Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problem: REP leaves label msgs for dead REQ in pipe #2572

Merged
merged 1 commit into from
May 16, 2017

Conversation

bluca
Copy link
Member

@bluca bluca commented May 10, 2017

Solution: roll back the pipe if writing messages other than the
first fails in router::xsend.
Also add test case that reproduces the memory leak when ran with
valgrind.
Fixes #2567

Solution: roll back the pipe if writing messages other than the
first fails in router::xsend.
Also add test case that reproduces the memory leak when ran with
valgrind.
Fixes zeromq#2567
@bluca
Copy link
Member Author

bluca commented May 10, 2017

@somdoron I think this is the right thing to do, but I could be missing something, could you please have a quick look when you have time?

@bjovke bjovke merged commit 75cc201 into zeromq:master May 16, 2017
@somdoron
Copy link
Member

I think the fix is incorrect, take a look at the pipe terminated method, current out will be null and all msg will be dropped (future and past)

@bluca
Copy link
Member Author

bluca commented May 16, 2017

Thanks, will look again

@bjovke
Copy link
Contributor

bjovke commented May 16, 2017

@somdoron current_out = NULL; happens after current_out->rollback (); with this fix anyway.
The issue could be only that rollback() removes all messages from pipe, I really don't know if this is correct behavior?
"all msg will be dropped (future and past)" current_out = NULL; existed before this patch. What exactly do you mean by this?

@somdoron
Copy link
Member

Ok Sorry, I'm surprised we only find out about this now.
We also need to call the rollback in the xpipe_terminated (or only there and remove the current_out=null from the xsend, letting the xpipe_termianted handle it)

Also calling rollback in the router version is a mistake because it set the more_out to false, which is not what we want here (because it is a silent drop, setting more_out to false will cause an error on the next part, which the router will think is a routing id).

So I would just call outpipe.rollback() in both places.

@bjovke
Copy link
Contributor

bjovke commented May 16, 2017

What I've discovered in my test:

  • When message is received by ZMQ_REP server, zmq::stream_engine_t::in_event () is called, which eventually calls zmq::metadata_t::add_ref () for metadata member.
  • ref_cnt of metadata grows to 4.
  • Client is killed before server tries to respond.
  • ref_cnt of metadata is decreased to 1 after whole cleanup has ended and metadata is never freed.
  • With patch from @bluca ref_cnt correctly decreases to 0 and LIBZMQ_DELETE(metadata); is called in zmq::stream_engine_t destructor.
  • This correctly corresponds to something left referencing metadata, probably one message which is not destroyed.

@bjovke
Copy link
Contributor

bjovke commented May 16, 2017

I'm also not sure why there's current_out = NULL; in zmq::router_t::xsend, zmq::router_t::rollback and zmq::router_t::xpipe_terminated both set this to NULL in a more appropriate way.

@bluca
Copy link
Member Author

bluca commented May 16, 2017

Thanks for the comments&insights

As far as I can see, setting current_out to NULL in xsend has been around since forever, I can trace it back to before router.cpp existed, in xrep: b79d07b

Also calling rollback in the router version is a mistake because it set the more_out to false, which is not what we want here (because it is a silent drop, setting more_out to false will cause an error on the next part, which the router will think is a routing id).

Note that I intentionally called outpipe's rollback rather than router for this reason, already in this diff.

So if I followed correctly, the proposal is to resend this same PR, with the addition of calling pipe_->rollback() in xpipe_terminated ?

@somdoron
Copy link
Member

somdoron commented May 17, 2017 via email

@bluca
Copy link
Member Author

bluca commented May 17, 2017

Here it is: #2577

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants