Kinds of signals

Sender→receiver	Signal	Description	Replacement	Patch
administrator→postmaster	SIGTERM	smart shutdown	keep, and add control socket?
administrator→postmaster	SIGQUIT	immediate shutdown	keep, and add control socket?
administrator→postmaster	SIGINT	fast shutdown	keep, and add control socket?
administrator→postmaster	SIGHUP	reload	keep, and add control socket?
postmaster→all	SIGQUIT	immediate shutdown, quickdie(), _exit()	keep
postmaster→all	SIGTERM	exit at next CFI()	SendInterrupt(INTERRUPT_DIE)?
postmaster→avlauncher	SIGUSR2	start autovacuum	?
postmaster→checkpointer	SIGUSR2	shutdown	?
postmaster→walsenders	SIGUSR2	finish and shutdown	?
postmaster→pgarch	SIGUSR2	shutdown	?
postmaster→startup	SIGUSR2	promote	?
postmaster→client backend	SIGINT	cancel query	SendInterrupt(INTERRUPT_CANCEL)?
postmaster→all	SIGKILL	timed out while waiting for shutdown	keep
postmaster→any	SIGUSR1	bgworker state change notification	SetLatch()	SendInterrupt() proposal
postmaster→all	SIGHUP	reload config	SendInterrupt(INTERRUPT_RELOAD_CONFIG)?
kernel→postmaster	SIGCHLD	child state change notification	keep, but refactor?
kernel→backend	SIGINFO	postmaster exited	keep for now (but see below for MT redesign ideas)
kernel→backend	SIGALRM	itimer	keep for now, but change handlers to do RaiseInterrupt(INTERRUPT_XXX)	SendInterrupt() proposal
backend→backend	SIGURG	latch wakeup	keep for now, but later replace with ?
backend→backend	SIGUSR1	SendProcSignal(pid, PROCSIG_XXX)	SendInterrupt(INTERRUPT_XXX, procno)	SendInterrupt() proposal
backend→postmaster	SIGUSR1	SendPostmasterSignal(PMSIGNAL_XXX)	?

Thoughts on administrator→postmaster

The signals that pg_ctl and other control programs send are mostly Unix conventions, and it seems OK to keep them. But perhaps we should also have a control pipe?

On Windows, we already have a control socket for pretending to send SIGHUP etc. to the postmaster. Maybe we could stop pretending that Windows has Unix signals, and support a general control pipe instead? That way we could also implement richer communication, like "what state are you in? what is recovery progress?".

Thoughts on backend→postmaster

Idea #1: We could teach the postmaster to accept a shared latch. Some say it can't, because that would expose it to shared memory corruption risks, but in fact it already has some exposure through the PMSIGNAL_ vector. Perhaps it could have a "robust" latch mode that always takes the slow path (a system call), so that it is no less robust than the current PMSIGNAL_ mechanism. Specifically, in that mode it is not possible for one backend to trash memory in a way that prevents another backend from waking the postmaster.

Idea #2: We could use a pipe/socketpair to talk to the postmaster and send it richer messages.
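
A minimal sketch of what Idea #2 could look like, assuming a fixed-size request struct sent over a datagram socketpair so that message boundaries are preserved. The names PmRequest, PmRequestKind and the pm_* functions are hypothetical and are not part of PostgreSQL; EINTR handling and nonblocking mode are left out for brevity.

```c
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical request kinds, standing in for today's PMSIGNAL_XXX reasons. */
typedef enum PmRequestKind
{
    PMREQ_START_WORKER = 1,
    PMREQ_START_WALRECEIVER = 2
} PmRequestKind;

/* Hypothetical fixed-size message; richer than a single signal bit. */
typedef struct PmRequest
{
    uint32_t    kind;           /* a PmRequestKind value */
    uint32_t    sender_procno;  /* who is asking */
    uint64_t    payload;        /* request-specific detail */
} PmRequest;

/* Create the channel: fds[0] for the postmaster, fds[1] for backends. */
int
pm_channel_create(int fds[2])
{
    /* SOCK_DGRAM keeps each request as one indivisible message. */
    return socketpair(AF_UNIX, SOCK_DGRAM, 0, fds);
}

/* Backend side: send one request. Returns 0 on success, -1 on failure. */
int
pm_request_send(int fd, const PmRequest *req)
{
    ssize_t     n = send(fd, req, sizeof(*req), 0);

    return n == (ssize_t) sizeof(*req) ? 0 : -1;
}

/* Postmaster side: receive one request. Returns 0 on success, -1 on failure. */
int
pm_request_recv(int fd, PmRequest *req)
{
    ssize_t     n = recv(fd, req, sizeof(*req), 0);

    return n == (ssize_t) sizeof(*req) ? 0 : -1;
}
```

Because the postmaster only reads from a kernel object here, a backend that scribbles on shared memory cannot stop another backend's request from arriving, which is the same robustness argument as for the "robust" latch mode.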

In general, a lot of messages to the postmaster would probably go away in a multithreaded model anyway, because it would no longer be in charge of starting new backends.

Thoughts on postmaster→backend

To replace the current SIGUSR1/SIGUSR2 signals, in an intermediate phase, we could decide that it is OK for the postmaster to use SetLatch(). If we are worried about shared memory corruption, we could decide that it has to use a "robust" SetLatch() in the multi-process model, meaning it doesn't check shmem, it just always uses the system call slow path.

In the multi-threaded future, some of those probably go away and are replaced with communication between backends (you don't need to ask the postmaster to start a worker). Some communication is still needed. Should there be a single socketpair connecting the postmaster to the backend container process, through which it can coordinate e.g. promotion and shutdown? Perhaps that implies a special monitor thread inside the backend container process that would forward such communications, i.e. it receives e.g. "PROMOTE\n" through a pipe, and the monitor thread generates SendInterrupt() and/or raw SetLatch() calls as required?
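
The monitor-thread idea could be sketched roughly as follows, assuming a newline-terminated text protocol on the control descriptor. Only "PROMOTE" comes from the text above; "SHUTDOWN", the ControlCommand enum and all function names are hypothetical. The handler callback stands in for whatever SendInterrupt() and/or SetLatch() calls the real monitor thread would make.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical command set for the control channel. */
typedef enum ControlCommand
{
    CONTROL_UNKNOWN = 0,
    CONTROL_PROMOTE,
    CONTROL_SHUTDOWN
} ControlCommand;

typedef void (*ControlHandler) (ControlCommand cmd, void *arg);

/* Map one line (without its newline) to a command. */
ControlCommand
control_parse(const char *line)
{
    if (strcmp(line, "PROMOTE") == 0)
        return CONTROL_PROMOTE;
    if (strcmp(line, "SHUTDOWN") == 0)
        return CONTROL_SHUTDOWN;
    return CONTROL_UNKNOWN;
}

/* Example handler: counts commands in an int[3] indexed by ControlCommand. */
void
control_record(ControlCommand cmd, void *arg)
{
    ((int *) arg)[cmd]++;
}

/*
 * Body of the monitor thread: read lines until EOF and dispatch each one.
 * Returns the number of lines dispatched, or -1 on a read error.
 */
int
control_monitor_loop(int fd, ControlHandler handler, void *arg)
{
    char        buf[256];
    size_t      used = 0;
    int         dispatched = 0;

    for (;;)
    {
        char       *start = buf;
        char       *nl;
        ssize_t     n = read(fd, buf + used, sizeof(buf) - used);

        if (n < 0)
        {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            return dispatched;  /* peer closed the channel */
        used += (size_t) n;

        /* Dispatch every complete line currently in the buffer. */
        while ((nl = memchr(start, '\n', used - (size_t) (start - buf))) != NULL)
        {
            *nl = '\0';
            handler(control_parse(start), arg);
            dispatched++;
            start = nl + 1;
        }

        /* Keep any partial line for the next read. */
        used -= (size_t) (start - buf);
        memmove(buf, start, used);
        if (used == sizeof(buf))
            used = 0;           /* overlong line: discard it */
    }
}
```

A real monitor thread would replace control_record() with code that translates each command into interrupts for the affected backends.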

Thoughts on latch wakeups with threads

Idea #1: We could do pthread_kill(pthread_t, SIGURG) on the sending side. Then for WAIT_USE_POLL give each backend its own self-pipe, for WAIT_USE_EPOLL it might already work with one shared signalfd or maybe they need one each (?), and for WAIT_USE_KQUEUE no change is needed. (And Windows just works, native events.)

Idea #2: pthread_kill(), then on the receiving side, WAIT_USE_POLL could switch to ppoll() (finally standardised in POSIX 2024, atomic signal masking, Solaris has it and it is currently the only user of WAIT_USE_POLL?), for WAIT_USE_EPOLL we could switch to epoll_pwait2() (get rid of the signal pipe and use atomic signal masking). Again no change for WAIT_USE_KQUEUE. (And Windows just works, native events.)

Idea #3: We could give every backend a pipe, and write a byte to it. We could have done that already, but in a multi-process model it might create a MaxBackends^2 explosion of duplicated kernel descriptors. Should be OK for single-process multi-thread mode, and on Linux it replaces the current per-backend signalfd.
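
A rough sketch of the per-backend pipe in Idea #3, assuming nonblocking descriptors and poll() on the receiving side. The WakeupPipe type and wakeup_pipe_* functions are hypothetical names invented for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-backend wakeup channel. */
typedef struct WakeupPipe
{
    int         read_fd;
    int         write_fd;
} WakeupPipe;

/* Create the pipe and make both ends nonblocking. */
int
wakeup_pipe_init(WakeupPipe *wp)
{
    int         fds[2];

    if (pipe(fds) != 0)
        return -1;
    for (int i = 0; i < 2; i++)
    {
        int         flags = fcntl(fds[i], F_GETFL);

        if (flags < 0 || fcntl(fds[i], F_SETFL, flags | O_NONBLOCK) != 0)
        {
            close(fds[0]);
            close(fds[1]);
            return -1;
        }
    }
    wp->read_fd = fds[0];
    wp->write_fd = fds[1];
    return 0;
}

/* Any thread can wake the owner by writing one byte. */
int
wakeup_pipe_set(WakeupPipe *wp)
{
    for (;;)
    {
        ssize_t     n = write(wp->write_fd, "", 1);

        if (n == 1)
            return 0;
        if (n < 0 && errno == EINTR)
            continue;
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            return 0;           /* pipe full: a wakeup is already pending */
        return -1;
    }
}

/* Owner side: returns 1 if woken, 0 on timeout, -1 on error. */
int
wakeup_pipe_wait(WakeupPipe *wp, int timeout_ms)
{
    struct pollfd pfd;
    char        buf[64];
    int         rc;

    pfd.fd = wp->read_fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    do
    {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);
    if (rc <= 0)
        return rc;

    /* Drain everything so that repeated wakeups coalesce into one. */
    for (;;)
    {
        ssize_t     n = read(wp->read_fd, buf, sizeof(buf));

        if (n > 0)
            continue;
        if (n < 0 && errno == EINTR)
            continue;
        break;                  /* EAGAIN means the pipe is empty */
    }
    return 1;
}
```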

Idea #4: We could give every backend a pipe as a fallback, but use better options when available: With kqueue you can send a custom wakeup event directly to someone else's kqueue from inside the same process. For Linux we could replace the current signalfd that is in the epoll with an eventfd, which anyone can write into. For Linux we might eventually want to switch to a per-backend uring, in which case any thread could post a custom wakeup to any other backend's uring directly.
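
For the Linux part of Idea #4, an eventfd registered in an epoll set might look roughly like the following. This is a Linux-only sketch; the EventLatch type and event_latch_* functions are hypothetical names, and the kqueue and uring variants are not shown.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-backend wakeup object built on eventfd + epoll. */
typedef struct EventLatch
{
    int         event_fd;
    int         epoll_fd;
} EventLatch;

int
event_latch_init(EventLatch *l)
{
    struct epoll_event ev;

    l->event_fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
    if (l->event_fd < 0)
        return -1;
    l->epoll_fd = epoll_create1(EPOLL_CLOEXEC);
    if (l->epoll_fd < 0)
    {
        close(l->event_fd);
        return -1;
    }
    ev.events = EPOLLIN;
    ev.data.fd = l->event_fd;
    if (epoll_ctl(l->epoll_fd, EPOLL_CTL_ADD, l->event_fd, &ev) != 0)
    {
        close(l->epoll_fd);
        close(l->event_fd);
        return -1;
    }
    return 0;
}

/* Any thread may call this; it adds 1 to the eventfd counter. */
int
event_latch_set(EventLatch *l)
{
    uint64_t    one = 1;
    ssize_t     n;

    do
    {
        n = write(l->event_fd, &one, sizeof(one));
    } while (n < 0 && errno == EINTR);

    if (n == (ssize_t) sizeof(one))
        return 0;
    if (n < 0 && errno == EAGAIN)
        return 0;               /* counter saturated: wakeup already pending */
    return -1;
}

/* Owner side: returns 1 if woken, 0 on timeout, -1 on error. */
int
event_latch_wait(EventLatch *l, int timeout_ms)
{
    struct epoll_event ev;
    uint64_t    count;
    int         rc;

    do
    {
        rc = epoll_wait(l->epoll_fd, &ev, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);
    if (rc <= 0)
        return rc;

    /* Reading resets the counter to zero, coalescing repeated wakeups. */
    if (read(l->event_fd, &count, sizeof(count)) != (ssize_t) sizeof(count))
        return -1;
    return 1;
}
```

Compared with the pipe fallback, this uses one descriptor per backend instead of two, and no signal is involved at all.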

Thoughts on timers and SIGALRM

Each backend currently has its own separate itimer to manage various timeouts. In an intermediate phase that could continue, but the timer handlers could just call RaiseInterrupt(INTERRUPT_xxx), as shown in the SendInterrupt() patch. (This remaining manipulation of the interrupt bitmap from inside a signal handler is the reason why the SendInterrupt() patch relies on --disable-atomics being dropped, because it's not safe to use lock-based emulation from inside a signal handler.)

In a multi-threaded future, I think we'd probably need to invent our own timer monitor thread, that would maintain a schedule table and do SendInterrupt(INTERRUPT_XXX, target_procno) at the right times as requested, or something like that? Then SIGALRM would not be needed, but each backend could still configure its own timeout schedule separately. Right?
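
The schedule table for such a timer monitor thread could be sketched as below, with the thread itself left out: it would sleep until the deadline reported by timer_table_next_deadline() and then call timer_table_fire_due(). All names are hypothetical, the deliver callback stands in for SendInterrupt(INTERRUPT_XXX, target_procno), and a real version would need locking around the table.

```c
#include <stddef.h>
#include <stdint.h>

#define TIMER_TABLE_SIZE 64

/* One requested timeout: deliver `interrupt` to `target_procno` at a deadline. */
typedef struct TimerEntry
{
    int         in_use;
    uint64_t    deadline_ms;
    int         target_procno;
    uint32_t    interrupt;
} TimerEntry;

typedef struct TimerTable
{
    TimerEntry  entries[TIMER_TABLE_SIZE];
} TimerTable;

typedef void (*TimerDeliver) (uint32_t interrupt, int target_procno, void *arg);

/* Register a timeout. Returns the slot number, or -1 if the table is full. */
int
timer_table_add(TimerTable *t, uint64_t deadline_ms, int target_procno,
                uint32_t interrupt)
{
    for (int i = 0; i < TIMER_TABLE_SIZE; i++)
    {
        if (!t->entries[i].in_use)
        {
            t->entries[i].in_use = 1;
            t->entries[i].deadline_ms = deadline_ms;
            t->entries[i].target_procno = target_procno;
            t->entries[i].interrupt = interrupt;
            return i;
        }
    }
    return -1;
}

/* Find the earliest deadline. Returns 1 and sets *deadline_ms, or 0 if empty. */
int
timer_table_next_deadline(const TimerTable *t, uint64_t *deadline_ms)
{
    int         found = 0;

    for (int i = 0; i < TIMER_TABLE_SIZE; i++)
    {
        if (t->entries[i].in_use &&
            (!found || t->entries[i].deadline_ms < *deadline_ms))
        {
            *deadline_ms = t->entries[i].deadline_ms;
            found = 1;
        }
    }
    return found;
}

/* Deliver and remove every entry whose deadline has passed. Returns the count. */
int
timer_table_fire_due(TimerTable *t, uint64_t now_ms, TimerDeliver deliver,
                     void *arg)
{
    int         fired = 0;

    for (int i = 0; i < TIMER_TABLE_SIZE; i++)
    {
        if (t->entries[i].in_use && t->entries[i].deadline_ms <= now_ms)
        {
            t->entries[i].in_use = 0;
            deliver(t->entries[i].interrupt, t->entries[i].target_procno, arg);
            fired++;
        }
    }
    return fired;
}

/* Example callback: ORs the interrupt into a uint32_t array indexed by procno. */
void
timer_record_delivery(uint32_t interrupt, int target_procno, void *arg)
{
    ((uint32_t *) arg)[target_procno] |= interrupt;
}
```

Each backend still controls its own timeouts by adding and removing its own entries; only the waiting and firing are centralised.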

Thoughts on Windows cleanup

Instead of giving every backend a named pipe to send fake signals to, we could delete all that stuff and keep just latches, and give both Unix and Windows a master pipe/control socket for the top level? Instead of generating a fake SIGCHLD, we could maybe add some way to consume WL_PROCESS_EXIT events to WaitEventSet, to abstract over Unix and Windows?

Thoughts on traces of reliance on EINTR in socket calls

There are a couple of leftover bits that still use blocking socket I/O. One I know of: RADIUS authentication. That's racy (one of the main problems latch multiplexing fixed) and we have to get rid of it. Then we can rip out most of the horrible socket wrapper code for Windows which is known to be buggy.

Signals
