Fix ASIO one shot lost notifications problem #2897
Merged
Prior to this commit, the way ASIO notifications worked on
Linux epoll included an edge case/race condition that allowed
ASIO events to be lost.
As part of testing Wallaroo scalability, we encountered an issue
(WallarooLabs/wally#2547) where some of our TCP actors would
stop reading from the socket for no apparent reason, even though
there was data waiting on the socket. After much fighting, @SeanTAllen
inferred that the ASIO subsystem was losing notifications, so
the TCP actor was never notified about the data waiting on the
socket.
This commit changes things so that going forward:
- [...] instead use level triggered mode
- [...] the state of the `ev->readable` and `ev->writeable` booleans
- [...] for either reads or writes, until they reset the `ev->readable` and/or `ev->writeable` booleans and resubscribe for notifications
- [...] again in case of spurious notifications from epoll due to the
  use of level triggered mode (even though one shot should ensure
  that we only get one notification).
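One possible reading of the flag handling described above can be sketched as follows. This is only an illustration under assumptions, not the code from this PR: the type `sketch_event_t` and the functions `sketch_on_readable` and `sketch_resubscribe_read` are hypothetical names, and only the `readable`/`writeable` field names come from the description. The assumed rule is that a read report is forwarded only while `readable` is false, and that the owner clears the flag when it resubscribes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an ASIO event. Only the two flag names are
 * taken from the PR description; everything else is assumed. */
typedef struct sketch_event_t
{
  bool readable;
  bool writeable;
} sketch_event_t;

/* Called when epoll reports the fd as readable.
 * Returns true if the owner should be notified, false if the report is a
 * duplicate (for example a spurious level triggered wakeup) to be dropped. */
bool sketch_on_readable(sketch_event_t* ev)
{
  if(ev->readable)
    return false; /* owner has already been told; do not notify again */

  ev->readable = true;
  return true;
}

/* Called by the owner once it has finished reading and wants to be told
 * about new data again. */
void sketch_resubscribe_read(sketch_event_t* ev)
{
  ev->readable = false;
  /* A real implementation would re-arm the epoll subscription here. */
}
```

The write side would presumably mirror this with the `writeable` flag.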
These changes ensure that if there's data on the socket at the
time we (re)subscribe to reads, we will be told immediately by
epoll that there is a read notification.
Similarly, if the socket is writeable at the time we (re)subscribe
to writes, we will be told immediately by epoll that there is a
write notification.
Both of these ensure that the ASIO subsystem will always be told
by epoll when the socket is readable or writeable and avoid any
race conditions due to edge triggering and the state of the socket
at the time the subscription occurs.
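The resubscription behaviour described above can be observed with a small standalone Linux program. This is a minimal sketch that uses a Unix socket pair in place of a TCP socket; it is not the ASIO code itself, and the helper names `poll_once` and `rearm_reports_pending_data` are made up for the example. It subscribes with `EPOLLIN | EPOLLONESHOT` (no `EPOLLET`, so level triggered), leaves the data unread, and then re-arms with `EPOLL_CTL_MOD`.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Poll without blocking and return how many events epoll reports. */
static int poll_once(int epfd)
{
  struct epoll_event out;
  return epoll_wait(epfd, &out, 1, 0);
}

/* Returns 1 if re-arming a level triggered one shot subscription reports
 * data that was already waiting on the socket, 0 otherwise. */
int rearm_reports_pending_data(void)
{
  int sv[2];
  int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
  assert(rc == 0);

  int epfd = epoll_create1(0);
  assert(epfd >= 0);

  /* Data arrives on the socket before anyone subscribes. */
  ssize_t n = write(sv[1], "x", 1);
  assert(n == 1);

  struct epoll_event ev = {0};
  ev.events = EPOLLIN | EPOLLONESHOT; /* level triggered, one shot */
  ev.data.fd = sv[0];
  rc = epoll_ctl(epfd, EPOLL_CTL_ADD, sv[0], &ev);
  assert(rc == 0);

  int first = poll_once(epfd);  /* the pending byte is reported */
  int second = poll_once(epfd); /* one shot has disarmed the fd */

  /* Resubscribe without reading: the byte is still in the buffer. */
  rc = epoll_ctl(epfd, EPOLL_CTL_MOD, sv[0], &ev);
  assert(rc == 0);
  int third = poll_once(epfd);  /* reported again right after re-arming */

  close(epfd);
  close(sv[0]);
  close(sv[1]);
  (void)rc;
  (void)n;

  return (first == 1 && second == 0 && third == 1) ? 1 : 0;
}
```

Calling `rearm_reports_pending_data()` from a `main` on Linux should return 1: the subscription is silent after the first report, and becomes ready again as soon as it is re-armed while unread data is present.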
This PR also supersedes PR #2830 for the ASIO re-subscription related logic.