Signals

From PostgreSQL wiki

Kinds of signals

Sender→receiver | Signal | Description | Replacement | Patch
administrator→postmaster | SIGTERM | smart shutdown | keep, and add control socket? |
administrator→postmaster | SIGQUIT | immediate shutdown | keep, and add control socket? |
administrator→postmaster | SIGINT | fast shutdown | keep, and add control socket? |
administrator→postmaster | SIGHUP | reload | keep, and add control socket? |
postmaster→all | SIGQUIT | immediate shutdown, quickdie(), _exit() | keep |
postmaster→all | SIGTERM | exit at next CFI() | SendInterrupt(INTERRUPT_DIE)? |
postmaster→avlauncher | SIGUSR2 | start autovacuum | ? |
postmaster→checkpointer | SIGUSR2 | shutdown | ? |
postmaster→walsenders | SIGUSR2 | finish and shutdown | ? |
postmaster→pgarch | SIGUSR2 | shutdown | ? |
postmaster→startup | SIGUSR2 | promote | ? |
postmaster→client backend | SIGINT | cancel query | SendInterrupt(INTERRUPT_CANCEL)? |
postmaster→all | SIGKILL | timed out while waiting for shutdown | keep |
postmaster→any | SIGUSR1 | bgworker state change notification | SetLatch() | SendInterrupt() proposal
postmaster→all | SIGHUP | reload config | SendInterrupt(INTERRUPT_RELOAD_CONFIG)? |
kernel→postmaster | SIGCHLD | child state change notification | keep, but refactor? |
kernel→backend | SIGINFO | postmaster exited | keep for now (but see below for MT redesign ideas) |
kernel→backend | SIGALRM | itimer | keep for now, but change handlers to do RaiseInterrupt(INTERRUPT_XXX) | SendInterrupt() proposal
backend→backend | SIGURG | latch wakeup | keep for now, but later replace with ? |
backend→backend | SIGUSR1 | SendProcSignal(pid, PROCSIG_XXX) | SendInterrupt(INTERRUPT_XXX, procno) | SendInterrupt() proposal
backend→postmaster | SIGUSR1 | SendPostmasterSignal(PMSIGNAL_XXX) | ? |

Thoughts on administrator→postmaster

The signals that pg_ctl and other control programs send are mostly Unix conventions, and it seems OK to keep them. But perhaps we should also have a control pipe?

On Windows, we already have a control socket for pretending to send SIGHUP etc. to the postmaster. Maybe we should stop pretending Windows has Unix signals, and support a general control pipe? That way we could also implement richer communication, like "what state are you in? what is recovery progress?".

Thoughts on backend→postmaster

Idea #1: We could teach the postmaster to accept a shared latch. Some say it can't, because that would expose it to shared memory corruption risks, but in fact it already has some exposure through the PMSIGNAL_ vector. Perhaps it could have a "robust" latch mode that always takes the slow path (system call), so that it is no less robust than the current PMSIGNAL_ mechanism. Specifically, it would not be possible for one backend to trash memory in a way that prevents another backend from waking the postmaster.

Idea #2: We could use a pipe/socketpair to talk to the postmaster and send it richer messages.
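
A minimal sketch of Idea #2, assuming a Unix-domain datagram socketpair that is created before fork so every child inherits the sending end. The message layout, the PM_MSG_ kinds and the pm_ function names are hypothetical and exist only for illustration; nothing here is existing PostgreSQL code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical message kinds; these are not existing PMSIGNAL_ values. */
enum { PM_MSG_START_WORKER = 1, PM_MSG_ADVANCE_STATE = 2 };

/* Fixed-size message: 4 + 4 + 56 = 64 bytes, no padding. */
typedef struct {
    uint32_t kind;
    int32_t  sender_procno;
    char     payload[56];       /* NUL-terminated free-form detail */
} PmMessage;

/* SOCK_DGRAM keeps message boundaries: one recv() yields one whole message. */
static int pm_channel_open(int fds[2])
{
    return socketpair(AF_UNIX, SOCK_DGRAM, 0, fds);
}

/* Backend side: send one message. Returns 0 on success, -1 on failure. */
static int pm_send(int fd, uint32_t kind, int32_t procno, const char *payload)
{
    PmMessage m;

    assert(fd >= 0);
    memset(&m, 0, sizeof m);
    m.kind = kind;
    m.sender_procno = procno;
    if (payload != NULL) {
        size_t n = strlen(payload);

        if (n >= sizeof m.payload)
            n = sizeof m.payload - 1;   /* truncate, keep the terminator */
        memcpy(m.payload, payload, n);
    }
    return send(fd, &m, sizeof m, 0) == (ssize_t) sizeof m ? 0 : -1;
}

/* Postmaster side: receive one message. Returns 0 on success, -1 on failure. */
static int pm_recv(int fd, PmMessage *out)
{
    return recv(fd, out, sizeof *out, 0) == (ssize_t) sizeof *out ? 0 : -1;
}
```

A datagram socket is used here because it never delivers a partial message, whereas a plain pipe only guarantees atomicity for writes of at most PIPE_BUF bytes. The receiving end could be added to the postmaster's wait set like any other descriptor.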

In general, a lot of messages to the postmaster would probably go away in a multithreaded model anyway, because it would no longer be in charge of starting new backends.

Thoughts on postmaster→backend

To replace the current SIGUSR1/SIGUSR2 signals, in an intermediate phase, we could decide that it is OK for the postmaster to use SetLatch(). If we are worried about shared memory corruption, we could decide that it has to use a "robust" SetLatch() in the multi-process model, meaning it doesn't check shmem, it just always uses the system call slow path.
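
To make the "robust" idea concrete, here is a toy model, not the real latch code. It assumes a latch has two shared flags and one wakeup system call, and uses a pipe write as a stand-in for that system call. ToyLatch, toy_set_latch() and the syscalls counter are all hypothetical names for illustration. The point is only that the robust mode never lets the contents of shared memory decide whether the wakeup is sent.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <unistd.h>

typedef struct {
    atomic_bool is_set;          /* would live in shared memory */
    atomic_bool maybe_sleeping;  /* would live in shared memory */
    int         wake_fd;         /* stand-in for the real wakeup system call */
    int         syscalls;        /* instrumentation for this sketch only */
} ToyLatch;

static void toy_latch_init(ToyLatch *l, int wake_fd)
{
    atomic_init(&l->is_set, false);
    atomic_init(&l->maybe_sleeping, false);
    l->wake_fd = wake_fd;
    l->syscalls = 0;
}

/* The slow path: always costs a system call. */
static void toy_latch_wake(ToyLatch *l)
{
    char b = 0;

    assert(l->wake_fd >= 0);
    if (write(l->wake_fd, &b, 1) == 1)
        l->syscalls++;
}

static void toy_set_latch(ToyLatch *l, bool robust)
{
    if (robust) {
        /* Robust mode: do not trust shared flags, always wake the owner. */
        atomic_store(&l->is_set, true);
        toy_latch_wake(l);
        return;
    }
    /* Fast path: skip the system call if the flags say it is unnecessary. */
    if (atomic_exchange(&l->is_set, true))
        return;                 /* already set */
    if (!atomic_load(&l->maybe_sleeping))
        return;                 /* owner is not waiting */
    toy_latch_wake(l);
}
```

In this model a corrupted is_set or maybe_sleeping flag can suppress a wakeup on the fast path, but not in robust mode, at the cost of one system call per SetLatch().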

In the multi-threaded future, some of those probably go away and are replaced with communication between backends (you don't need to ask the postmaster to start a worker). Some communication is still needed. Should there be a single socketpair connecting the postmaster to the backend container process, through which it can coordinate eg promotion, shutdown? Perhaps that implies a special monitor thread inside the backend container process that would forward such communications, ie it receives eg "PROMOTE\n" through a pipe and the monitor thread generates SendInterrupt() and/or raw SetLatch() calls as required?
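
A rough sketch of what the monitor thread's read side might look like, assuming newline-terminated text commands such as "PROMOTE\n" arrive on a pipe. The ControlReader type, the CTL_ names and the "SHUTDOWN" command are hypothetical. Where a real implementation would call SendInterrupt() or SetLatch(), this sketch only records the last command it parsed.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

typedef enum { CTL_NONE = 0, CTL_PROMOTE, CTL_SHUTDOWN, CTL_UNKNOWN } ControlCommand;

typedef struct {
    char           buf[256];    /* bytes of a not yet complete line */
    size_t         used;
    ControlCommand last;        /* most recent complete command */
} ControlReader;

static ControlCommand control_parse(const char *line)
{
    if (strcmp(line, "PROMOTE") == 0)
        return CTL_PROMOTE;
    if (strcmp(line, "SHUTDOWN") == 0)
        return CTL_SHUTDOWN;
    return CTL_UNKNOWN;
}

/*
 * Do one read() and dispatch every complete line. Returns the number of
 * commands dispatched, 0 if no line was completed (or EOF), -1 on error.
 */
static int control_feed(ControlReader *r, int fd)
{
    ssize_t n;
    int     count = 0;
    char   *nl;

    assert(r->used < sizeof r->buf);
    if (r->used == sizeof r->buf - 1)
        r->used = 0;            /* overlong line with no newline: drop it */
    n = read(fd, r->buf + r->used, sizeof r->buf - 1 - r->used);
    if (n <= 0)
        return (int) n;
    r->used += (size_t) n;
    while ((nl = memchr(r->buf, '\n', r->used)) != NULL) {
        size_t len = (size_t) (nl - r->buf);

        *nl = '\0';
        r->last = control_parse(r->buf);    /* real code: SendInterrupt() */
        count++;
        memmove(r->buf, nl + 1, r->used - len - 1);
        r->used -= len + 1;
    }
    return count;
}
```

The reader keeps partial lines between calls, so a command split across two writes is still delivered exactly once.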

Thoughts on latch wakeups with threads

Idea #1: We could do pthread_kill(pthread_t, SIGURG) on the sending side. Then for WAIT_USE_POLL give each backend its own self-pipe, for WAIT_USE_EPOLL it might already work with one shared signalfd or maybe they need one each (?), and for WAIT_USE_KQUEUE no change is needed. (And Windows just works, native events.)

Idea #2: pthread_kill(), then on the receiving side, WAIT_USE_POLL could switch to ppoll() (finally standardised in POSIX 2024, atomic signal masking; Solaris has it and it is currently the only user of WAIT_USE_POLL?), and for WAIT_USE_EPOLL we could switch to epoll_pwait2() (get rid of the signalfd and use atomic signal masking). Again no change for WAIT_USE_KQUEUE. (And Windows just works, native events.)
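
A minimal sketch of the receiving side of Idea #2, assuming SIGURG stays blocked except while the thread is inside ppoll(). The function names are hypothetical. For brevity the sketch uses sigprocmask(), which is only well defined in a single-threaded process; a threaded build would use pthread_sigmask() instead, and the sender would use pthread_kill(). `_GNU_SOURCE` is defined because glibc declares ppoll() under that feature macro.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <stddef.h>
#include <time.h>

static volatile sig_atomic_t wakeup_seen = 0;

static void wakeup_handler(int signo)
{
    (void) signo;
    wakeup_seen = 1;
}

/* Install the handler and block SIGURG outside of waits. */
static int latch_signal_init(void)
{
    struct sigaction sa;
    sigset_t         block;

    sa.sa_handler = wakeup_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGURG, &sa, NULL) != 0)
        return -1;
    sigemptyset(&block);
    sigaddset(&block, SIGURG);
    return sigprocmask(SIG_BLOCK, &block, NULL);
}

/*
 * Wait for fd readiness or a wakeup signal. Returns 1 if woken by SIGURG,
 * 0 on timeout or fd readiness, -1 on error.
 */
static int latch_wait(struct pollfd *fds, nfds_t nfds, int timeout_ms)
{
    struct timespec ts;
    sigset_t        wait_mask;
    int             rc;

    assert(timeout_ms >= 0);
    ts.tv_sec = timeout_ms / 1000;
    ts.tv_nsec = (long) (timeout_ms % 1000) * 1000000L;

    /* Wait mask is the current mask minus SIGURG. */
    if (sigprocmask(SIG_SETMASK, NULL, &wait_mask) != 0)
        return -1;
    sigdelset(&wait_mask, SIGURG);

    /* The mask swap and the wait happen atomically inside the kernel. */
    rc = ppoll(fds, nfds, &ts, &wait_mask);
    if (rc < 0 && errno == EINTR && wakeup_seen) {
        wakeup_seen = 0;
        return 1;
    }
    return rc < 0 ? -1 : 0;
}
```

Because the signal can only be delivered while ppoll() is running, a wakeup sent just before the wait stays pending and interrupts the wait immediately, so there is no window in which it can be lost and no self-pipe is needed.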

Idea #3: We could give every backend a pipe, and write a byte to it. We could have done that already, but in a multi-process model it might create a MaxBackends^2 explosion of duplicated kernel descriptors. Should be OK for single-process multi-thread mode, and on Linux it replaces the current per-backend signalfd.

Idea #4: We could give every backend a pipe as a fallback, but use better options when available: With kqueue you can send a custom wakeup event directly to someone else's kqueue from inside the same process. For Linux we could replace the current signalfd that is in the epoll with an eventfd, which anyone can write into. For Linux we might eventually want to switch to a per-backend uring, in which case any thread could post a custom wakeup to any other backend's uring directly.
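
A Linux-only sketch of the eventfd variant from Idea #4, assuming each backend owns one epoll instance with one eventfd registered in it. BackendWaiter and the waiter_ functions are hypothetical names; the pipe fallback and the kqueue and uring variants are not shown.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

typedef struct {
    int epfd;   /* this backend's epoll instance */
    int efd;    /* eventfd that any thread may write to */
} BackendWaiter;

static int waiter_init(BackendWaiter *w)
{
    struct epoll_event ev;

    w->efd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
    if (w->efd < 0)
        return -1;
    w->epfd = epoll_create1(EPOLL_CLOEXEC);
    if (w->epfd < 0) {
        close(w->efd);
        return -1;
    }
    memset(&ev, 0, sizeof ev);
    ev.events = EPOLLIN;
    ev.data.fd = w->efd;
    if (epoll_ctl(w->epfd, EPOLL_CTL_ADD, w->efd, &ev) != 0) {
        close(w->epfd);
        close(w->efd);
        return -1;
    }
    return 0;
}

/* Sender side: add 1 to the eventfd counter. */
static int waiter_wake(const BackendWaiter *w)
{
    uint64_t one = 1;

    assert(w->efd >= 0);
    return write(w->efd, &one, sizeof one) == (ssize_t) sizeof one ? 0 : -1;
}

/* Returns 1 if woken, 0 on timeout, -1 on error. */
static int waiter_wait(BackendWaiter *w, int timeout_ms)
{
    struct epoll_event ev;
    uint64_t           count;
    int                n = epoll_wait(w->epfd, &ev, 1, timeout_ms);

    if (n <= 0)
        return n;
    if (ev.data.fd != w->efd)
        return 0;
    /* Reading resets the counter, so repeated wakeups coalesce into one. */
    if (read(w->efd, &count, sizeof count) != (ssize_t) sizeof count)
        return -1;
    return 1;
}
```

Unlike a signalfd, the eventfd needs no signal delivery at all: the sender only has to know the target's descriptor, which is straightforward when all backends are threads in one process.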

Thoughts on timers and SIGALRM

Each backend currently has its own separate itimer to manage various timeouts. In an intermediate phase that could continue, but the timer handlers could just call RaiseInterrupt(INTERRUPT_xxx), as shown in the SendInterrupt() patch. (This remaining manipulation of the interrupt bitmap from inside a signal handler is the reason why the SendInterrupt() patch relies on --disable-atomics being dropped, because it's not safe to use lock-based emulation from inside a signal handler.)
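
The atomics requirement can be illustrated with a small sketch, assuming the pending interrupts are a bitmap updated with C11 atomics. The names raise_interrupt(), consume_interrupts() and the INTR_ bits are hypothetical and are not the interface of the SendInterrupt() patch. The preprocessor check refuses to build where atomic ints are not always lock-free, since a lock-based fallback could deadlock if the handler interrupts a thread that holds the lock.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdatomic.h>
#include <stddef.h>

#if ATOMIC_INT_LOCK_FREE != 2
#error "this sketch needs always-lock-free atomic ints"
#endif

/* Hypothetical interrupt bits. */
#define INTR_STATEMENT_TIMEOUT (1u << 0)
#define INTR_LOCK_TIMEOUT      (1u << 1)

static atomic_uint pending_interrupts;

/* Safe to call from a signal handler: one lock-free read-modify-write. */
static void raise_interrupt(unsigned mask)
{
    atomic_fetch_or(&pending_interrupts, mask);
}

/* Clear the requested bits and return which of them were pending. */
static unsigned consume_interrupts(unsigned mask)
{
    assert(mask != 0);
    return atomic_fetch_and(&pending_interrupts, ~mask) & mask;
}

static void sigalrm_handler(int signo)
{
    (void) signo;
    raise_interrupt(INTR_STATEMENT_TIMEOUT);
}

static int install_timer_handler(void)
{
    struct sigaction sa;

    sa.sa_handler = sigalrm_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGALRM, &sa, NULL);
}
```

The handler does nothing except set a bit; the backend would notice it at its next interrupt check and act on it outside signal context.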

In a multi-threaded future, I think we'd probably need to invent our own timer monitor thread, that would maintain a schedule table and do SendInterrupt(INTERRUPT_XXX, target_procno) at the right times as requested, or something like that? Then SIGALRM would not be needed, but each backend could still configure its own timeout schedule separately. Right?
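
One possible shape for that schedule table, as a sketch only. It assumes a small fixed-size array scanned linearly, which a real implementation would likely replace with a heap. TimerSchedule, schedule_add(), schedule_run() and the callback type are hypothetical names, and record_fire() is a stand-in for SendInterrupt(INTERRUPT_XXX, target_procno). Locking between backends and the monitor thread is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_TIMERS 64

typedef struct {
    int64_t  deadline_us;   /* absolute time on some monotonic clock */
    int      procno;        /* target backend */
    uint32_t interrupt;     /* which interrupt to send */
    int      in_use;
} TimerEntry;

typedef struct {
    TimerEntry entries[MAX_TIMERS];
} TimerSchedule;

typedef void (*FireFn) (int procno, uint32_t interrupt, void *arg);

static void schedule_init(TimerSchedule *s)
{
    memset(s, 0, sizeof *s);
}

/* Returns the slot index, or -1 if the table is full. */
static int schedule_add(TimerSchedule *s, int64_t deadline_us,
                        int procno, uint32_t interrupt)
{
    for (int i = 0; i < MAX_TIMERS; i++) {
        if (!s->entries[i].in_use) {
            s->entries[i].deadline_us = deadline_us;
            s->entries[i].procno = procno;
            s->entries[i].interrupt = interrupt;
            s->entries[i].in_use = 1;
            return i;
        }
    }
    return -1;
}

/*
 * Fire every entry that is due at now_us. Returns the earliest remaining
 * deadline, or -1 if nothing is scheduled, so the monitor thread knows how
 * long it may sleep.
 */
static int64_t schedule_run(TimerSchedule *s, int64_t now_us,
                            FireFn fire, void *arg)
{
    int64_t next = -1;

    assert(s != NULL && fire != NULL);
    for (int i = 0; i < MAX_TIMERS; i++) {
        TimerEntry *e = &s->entries[i];

        if (!e->in_use)
            continue;
        if (e->deadline_us <= now_us) {
            e->in_use = 0;
            fire(e->procno, e->interrupt, arg);
        } else if (next < 0 || e->deadline_us < next) {
            next = e->deadline_us;
        }
    }
    return next;
}

/* Stand-in for SendInterrupt(): just records what would have been sent. */
typedef struct { int count; int last_procno; uint32_t last_interrupt; } FireLog;

static void record_fire(int procno, uint32_t interrupt, void *arg)
{
    FireLog *log = arg;

    log->count++;
    log->last_procno = procno;
    log->last_interrupt = interrupt;
}
```

The monitor thread's loop would then be: call schedule_run() with the current time, sleep until the returned deadline or until a backend changes the table, and repeat.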

Thoughts on Windows cleanup

Instead of giving every backend a named pipe to send fake signals to, we could delete all that stuff and keep just latches, and give both Unix and Windows a master pipe/control socket for the top level? Instead of generating a fake SIGCHLD, we could maybe add some way to consume WL_PROCESS_EXIT events to WaitEventSet, to abstract over Unix and Windows?

Thoughts on traces of reliance on EINTR in socket calls

There are a couple of leftover bits that still use blocking socket I/O. One I know of: RADIUS authentication. That's racy (one of the main problems latch multiplexing fixed) and we have to get rid of it. Then we can rip out most of the horrible socket wrapper code for Windows, which is known to be buggy.