Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix race condition with socket close event from peer #2923

Merged
merged 3 commits into from
Nov 6, 2018

Conversation

slfritchie
Copy link
Contributor

There are a couple places where we can be notified that a TCP
socket peer has closed the connection on us:

  1. _event_notify(), via an event from the ASIO thread.
  2. _pending_reads(), where we try to read from the socket
    (via the @pony_os_recv()? FFI call which is partial!)
    and get an error.

If we discover the remote peer's close by path #2, then sometime
later the ASIO thread may notify us of the close via path #1.

Both paths will call @pony_asio_event_unsubscribe(event). There's
a sanity check + pony_assert(0) inside of the Pony runtime to
prevent double calls to @pony_asio_event_unsubscribe(event).
However, pony_assert() will only act if using a Pony "debug" build
of the runtime. So, in non-debug use (i.e., normal day-to-day use),
nobody notices the problem: it's only visible when using a debug
build (probably because you're debugging some other problem).

This commit adds a comment + check to avoid double-calls to
@pony_asio_event_unsubscribe(event). It's done by adding a new
Pony API FFI call, @pony_asio_event_get_disposable, that peeks
into the event's flag status (and not the flags argument to
_event_notify()!) to check if it's safe to unsubscribe.

Many thanks to @dipinhora for discussing the merits of how the
heck to fix this and to @nisanharamati for being curious about,
"Why is this thing sometimes crashing, but only when using a
debug build of Pony?"

@slfritchie slfritchie added the changelog - fixed Automatically add "Fixed" CHANGELOG entry on merge label Oct 19, 2018
@slfritchie slfritchie requested review from jemc and dipinhora October 19, 2018 04:25
if(ev == NULL)
return false;

return (ev->flags & ASIO_DISPOSABLE) != 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be return (ev->flags == ASIO_DISPOSABLE); because ASIO_DISPOSABLE has a value of 0 as defined in the enum in asio.h and because when ev->flags is assigned ASIO_DISPOSABLE it is assigned as ev->flags = ASIO_DISPOSABLE unsetting any other flags previously assigned.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ouch, yes, I'll fix that.

@slfritchie
Copy link
Contributor Author

Hrm, the intermittent problems with ARM platforms + runtime not shutting down promptly appears to be happening in other PR branches, e.g. #2917 and timeout failure https://circleci.com/gh/ponylang/ponyc/18748. I spent a bit of time Friday trying to get my Raspberry Pi 3 to try to debug, but installing LLVM 3.9.1 correctly on its Raspian distro was frustrating & ended in failure.

@mfelsche
Copy link
Contributor

@slfritchie if you rebase on master, we should get a smooth ci run this time :)

There are a couple places where we can be notified that a TCP
socket peer has closed the connection on us:

1. _event_notify(), via an event from the ASIO thread.
2. _pending_reads(), where we try to read from the socket
   (via the @pony_os_recv()? FFI call which is partial!)
   and get an error.

If we discover the remote peer's close by path #2, then sometime
later the ASIO thread may notify us of the close via path #1.

Both paths will call `@pony_asio_event_unsubscribe(event)`.  There's
a sanity check + `pony_assert(0)` inside of the Pony runtime to
prevent double calls to `@pony_asio_event_unsubscribe(event)`.
However, `pony_assert()` will only act if using a Pony "debug" build
of the runtime.  So, in non-debug use (i.e., normal day-to-day use),
nobody notices the problem: it's only visible when using a debug
build (probably because you're debugging some *other* problem).

This commit adds a comment + check to avoid double-calls to
`@pony_asio_event_unsubscribe(event)`.  It's done by adding a new
Pony API FFI call, `@pony_asio_event_get_disposable`, that peeks
into the event's flag status (and *not* the `flags` argument to
`_event_notify()`!) to check if it's safe to unsubscribe.

Many thanks to @dipinhora for discussing the merits of how the
heck to fix this and to @nisanharamati for being curious about,
"Why is this thing sometimes crashing, but only when using a
debug build of Pony?"
@slfritchie slfritchie force-pushed the socket-close-asio-race branch from be75fbc to f1c5956 Compare October 27, 2018 21:23
@slfritchie
Copy link
Contributor Author

Thanks @mfelsche, I've rebased and am waiting for the CI results to roll in.

@slfritchie
Copy link
Contributor Author

Any additional comments prior to merging?

@aturley aturley merged commit fc57433 into master Nov 6, 2018
ponylang-main added a commit that referenced this pull request Nov 6, 2018
@SeanTAllen SeanTAllen deleted the socket-close-asio-race branch January 23, 2019 13:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changelog - fixed Automatically add "Fixed" CHANGELOG entry on merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants