GDB and io_uring

By Jake Edge
March 31, 2021

A problem reported when attaching GDB to programs that use io_uring has led to a flurry of potential solutions, and one that was merged into Linux 5.12-rc5. The problem stemmed from a change made in the 5.12 merge window to how the threads used by io_uring were created, such that they became associated with the process using io_uring. Those "I/O threads" were treated specially in the kernel, but that led to the problem with GDB (and likely other ptrace()-using programs). The solution is to treat them like other threads because it turned out that trying to make them special caused more problems than it solved.

Stefan Metzmacher reported the problem to the io-uring mailing list on March 20. He tried to attach GDB to the process of a program using io_uring, but the debugger went "into an endless loop because it can't attach to the io_threads". PF_IO_WORKER threads are used by io_uring for operations that might block; he followed up the bug report with two patch sets that would hide these threads in various ways. The idea behind hiding them is that if GDB cannot see the threads, it will not attempt to attach to them. Prior to 5.12, the threads existed but were not associated with the io_uring-using process, so GDB would not see them.

It is, of course, less than desirable for developers to be unable to run a debugger on code that uses io_uring, especially since io_uring support in their application is likely to be relatively new, thus it may need more in the way of debugging. The maintainer of the io_uring subsystem, Jens Axboe, quickly stepped in to help Metzmacher solve the problem. Axboe posted a patch set that included a way to hide the PF_IO_WORKER threads, along with some tweaks to the signal handling for these threads; in particular, he removed the ability for them to receive signals at all.

That made Eric W. Biederman somewhat uncomfortable; he asked why the io_uring threads could not take signals, and SIGSTOP in particular. In order to attach to a running process, ptrace() uses SIGSTOP, but the I/O threads lack much of the normal user-space context that would allow handling signals. Linus Torvalds explained that signal handling is done when a thread returns to user space, but that does not happen for kernel threads; he described that further in another message:

SIGSTOP handling is fundamentally done at signal handling time, and signal handling is fundamentally done at "return to user space" time.
End result: you cannot send kernel threads any signals at all, unless it _explicitly_ handles them manually. SIGSTOP isn't different from user space delivery of an "actual" signal handler in this respect.
And practically speaking, the only signal a kernel thread generally can handle is SIGKILL (and exit on it).
[...] I really think IO threads need to not participate, because they simply cannot handle signals in any sane manner.

A few days later, Axboe posted another version of the patch set with a longer description of the problem and the proposed solution:

Stefan reports that attaching to a task with io_uring will leave gdb very confused and just repeatedly attempting to attach to the IO threads, even though it receives an -EPERM every time. This patchset proposes to skip PF_IO_WORKER threads as same_thread_group(), except for accounting purposes which we still desire.
We also skip listing the IO threads in /proc/<pid>/task/ so that gdb doesn't think it should stop and attach to them. This makes us consistent with earlier kernels, where these async threads were not related to the ring owning task, and hence gdb (and others) ignored them anyway.

But it seems those patches went a bit too far; Biederman pointed out that the threads would no longer show up in /proc at all, which would hide them from top and other diagnostic tools. Torvalds noted that it also hides them from ps, which makes him "think that hiding them is the wrong model". There was some discussion of putting the threads under a different name in the /proc hierarchy, but Axboe thought it might work for some utilities but would likely "mess up _something_". Biederman said that there needed to be some mechanism to tell GDB (and other debuggers) that these threads are special: "I suspect getting -EPERM (or possibly a different error code) when attempting attach is the right [way] to know that a thread is not available to be debugged."

Axboe mentioned in the patch cover letter that the underlying problem might really be a GDB bug and Biederman seemed to agree, but Oleg Nesterov took exception to that: "The kernel changed the rules, and this broke gdb." But Biederman argued that it was not strictly a regression; "It is gdb not handling new functionality." Though, even if it was a GDB bug, Axboe said, it simply takes too long for updates to GDB to make their way to users to fix things that way; beyond that, "I doubt that gdb is the only thing that'd fall over, not expecting threads in there that it cannot attach to."

So, some kind of solution where everything "just works" was desired—and that is seemingly exactly what Torvalds came up with:

Actually, maybe the right model is to simply make all the io threads take signals, and get rid of all the special cases.
Sure, the signals will never be delivered to user space, but if we

just made the thread loop do "get_signal()" when there are pending signals
allowed ptrace_attach on them

they'd look pretty much like regular threads that just never do the user-space part of signal handling.
The whole "signals are very special for IO threads" thing has caused so many problems, that maybe the solution is simply to _not_ make them special?

Axboe agreed that it made more sense to "just embrace the signals, and have everything just work by default". To that end, he tried attaching GDB after making changes along those lines and was successful. That led to version 1 of the "allow signals for IO threads" patch set on March 25 and, after a few fixes, version 2 on March 26. The latter was promptly picked up for 5.12-rc5 on March 28; it also reverts some of the earlier attempts at fixes that had been picked up for 5.12-rc4 a week earlier.

The idea is that the I/O threads simply take signals like other threads and processes rather than be a special case, which cleans things up substantially. As Axboe put it in the cover letter of the first version, it is plain to see in hindsight:

As with most other good ideas, it's obvious once you hear it. The fact that we end up with _zero_ special cases with this is a clear sign that this is the right way to do it indeed. The fact that this series is 2/3rds revert further drives that point home.

In the end, it would seem that a much better solution came about; at least in part by Torvalds stepping back and reconsidering some of the assumptions. While PF_IO_WORKER threads cannot really do anything with signals that get sent to them, there is no real need to reject them either. Once that was recognized, the patches were fairly straightforward it seems. And, meanwhile, an unpleasant situation for developers of io_uring-using code was avoided.

Index entries for this article
Kernel	io_uring

GDB and io_uring

Posted Mar 31, 2021 20:29 UTC (Wed) by comex (subscriber, #71521) [Link] (2 responses)

Suppose that userland has enqueued a large number of operations into the io_uring queue, and you send SIGSTOP to the process.

Presumably it won't freeze any operations that have already been dequeued and are in progress. But with this change in place, will the kernel eventually stop dequeueing *new* operations? Or will it just ignore the signal?

GDB and io_uring

Posted Mar 31, 2021 20:55 UTC (Wed) by andresfreund (subscriber, #69562) [Link]

It's my understanding that the signals effectively just are ignored. The article says that signals are processed at the userspace boundary - which these threads never reach....

GDB and io_uring

Posted Mar 31, 2021 22:02 UTC (Wed) by ebiederm (subscriber, #35028) [Link]

The I/O worker when it reaches the top level of it's loops and processes signals it will join in the stop of the entire process.

No additional work will be dequeued.

Signals with signal handlers must be processed by modifying the userspace process to execute the handlers.

STOP signals and fatal signals without handlers (such as SIGKILL and SIGSTOP) are processed without
returning to userspace. So the io workers can and do process then whenever they check for signals.