ngx threads: replace list with rbtree #1215
Conversation
Force-pushed from 5e2602c to 1899416
All tests with uthreads pass fine locally:
@ZigzagAK This change is big. Are you actually seeing performance problems with the current list implementation in real-world profiling?
@ZigzagAK The Travis CI failures look like real problems. You need to apply the no-pool patch to the nginx core, otherwise nginx's memory pools may hide many memory bugs for small payloads like those in our existing test suite (furthermore, we should run the tests under ASan or Valgrind to make sure more memory issues can be caught, not just those that lead to segfaults).
Yes, I have a performance problem in https://github.com/ZigzagAK/lua-resty-cache-redis/tree/0.2:
https://github.com/ZigzagAK/lua-resty-cache-redis/blob/0.2/lib/resty/cache/redis_with_upsync.lua
https://github.com/ZigzagAK/lua-resty-cache-redis/blob/0.2/lib/resty/cache/redis.lua
I have a big document (130k rows), and for performance many light threads are created (and destroyed after they complete) to interact with Redis. This document loads in ~300 seconds with 100% core utilization inside ngx_http_lua_get_co_ctx because the lookup is O(n). With the rbtree, the same 130k-row document loads in 30 seconds with no more than 60% core utilization. With the list implementation, every batch is slower than the previous one because the list search is O(n).
flamegraph:
Please review my code. Maybe I forgot something.
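To make the workload concrete, here is a minimal sketch (not the author's actual code) of the batch pattern described above, assuming lua-resty-redis is available: each batch spawns a few dozen short-lived light threads that talk to Redis and then waits for all of them. The address, the `handle_row`/`load_document` names, and the per-row work are illustrative assumptions.

```lua
-- Hedged sketch of the batch pattern: each ngx.thread.spawn adds a
-- coroutine context to the request's coctx list, so with thousands of
-- spawns per request the linear scan in ngx_http_lua_get_co_ctx
-- becomes the hot spot seen in the flamegraph.
local redis = require "resty.redis"

local function handle_row(row)
    local red = redis:new()
    red:set_timeout(1000)
    local ok, err = red:connect("127.0.0.1", 6379)  -- hypothetical address
    if not ok then
        ngx.log(ngx.ERR, "redis connect failed: ", err)
        return
    end
    red:set(row.key, row.value)                     -- illustrative work
    red:set_keepalive(10000, 100)
end

local function load_document(rows, batch_size)
    for i = 1, #rows, batch_size do
        local threads = {}
        for j = i, math.min(i + batch_size - 1, #rows) do
            threads[#threads + 1] = ngx.thread.spawn(handle_row, rows[j])
        end
        for _, th in ipairs(threads) do
            ngx.thread.wait(th)                     -- join the whole batch
        end
    end
end
```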
@ZigzagAK
I applied the 'no pool' patch for 1.13.6 to nginx. A coredump was detected.
I successfully ran t/098-uthread-wait.t. With the list, this dead coroutine is still present in the list. If I turn off deleting the coctx from the rbtree, the test passes. But if the coctx stays in the rbtree, no memory is freed until the request completes.
Force-pushed from 06d8430 to bc01dda
Force-pushed from bc01dda to 933e96a
@ZigzagAK This looks like an unusual use case. Can I ask how many concurrent light threads you create and run at a time in a single OpenResty server instance? And how many Redis server instances are you running in the backend? I wonder if you could optimize your Lua logic to avoid creating too many light threads at the same time: create a limited number of light threads, and in each of these "worker" light threads repeatedly access Redis and consume more than one task. This might be much more efficient than even the rbtree light thread lookup approach. Creating too many concurrent light threads in a single request handler looks like a very unusual (if not very wrong) use case to me :)
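A minimal sketch of the recycled-worker pattern suggested here, under the assumption that the rows are already in a Lua table; all names are illustrative. Because light threads are cooperatively scheduled on one Lua VM, a shared index is safe as long as it is advanced without an intervening yield.

```lua
-- Hedged sketch: a small fixed pool of "worker" light threads, each
-- consuming many tasks, instead of one light thread per task.
-- handle_row is assumed to be the same per-row job as before.
local function run_with_worker_pool(rows, n_workers, handle_row)
    local next_idx = 0

    local function worker()
        while true do
            next_idx = next_idx + 1          -- no yield between the
            local row = rows[next_idx]       -- increment and the read
            if not row then
                return                       -- queue drained
            end
            handle_row(row)                  -- may yield (Redis I/O)
        end
    end

    local workers = {}
    for i = 1, n_workers do
        workers[i] = ngx.thread.spawn(worker)
    end
    for i = 1, n_workers do
        ngx.thread.wait(workers[i])
    end
end
```

With this shape, the number of coroutine contexts per request stays bounded by `n_workers` regardless of how many rows are processed.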
@agentzh The total number of threads is 130k / 50, about 2600 in total. At any one time only 50 light threads are active. I know that is not very good, but a limited number of light (green) threads makes the client code logic more complicated. If these were real OS threads I would never use this approach, but these are green coroutines, so the approach is reasonable.
At any moment only a limited number of threads are active (50-200, for example). The problem with the list is the 'dead' threads.
;-)
This is the log from loading 134k records (each record is first read from Redis and compared with the new one). With the list implementation, the total time is ~300 seconds.
@ZigzagAK True, light threads are much cheaper than OS threads, but they still do not come for free and still incur nontrivial overhead, especially for large numbers of light thread creations and destructions. The thing is that I really do not want to complicate the existing code base just to optimize for unusual use cases. That's not really worth it. I think you would see a big performance increase with a limited number of recycled light threads. As I said above, it should be even faster than the unlimited case with your rbtree optimization.
@ZigzagAK BTW, you can use the ngx.semaphore API to feed your worker light threads with new jobs very efficiently.
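A hedged sketch of the ngx.semaphore variant: a producer pushes jobs into a queue table and posts the semaphore once per job, and each worker waits on the semaphore before popping a job. The sentinel convention and all names here are assumptions for illustration, not the code that was actually written.

```lua
local semaphore = require "ngx.semaphore"

local function run_queue(jobs, n_workers, handle_job)
    local sema = semaphore.new()
    local queue, head, tail = {}, 1, 0
    local SENTINEL = {}                      -- marks "no more jobs"

    local function worker()
        while true do
            local ok, err = sema:wait(10)    -- block up to 10s for a job
            if not ok then
                ngx.log(ngx.ERR, "semaphore wait failed: ", err)
                return
            end
            local job = queue[head]
            queue[head] = nil
            head = head + 1
            if job == SENTINEL then
                return
            end
            handle_job(job)
        end
    end

    local workers = {}
    for i = 1, n_workers do
        workers[i] = ngx.thread.spawn(worker)
    end

    for _, job in ipairs(jobs) do            -- producer: enqueue and post
        tail = tail + 1
        queue[tail] = job
        sema:post(1)
    end
    for i = 1, n_workers do                  -- one sentinel per worker
        tail = tail + 1
        queue[tail] = SENTINEL
        sema:post(1)
    end

    for i = 1, n_workers do
        ngx.thread.wait(workers[i])
    end
end
```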
@agentzh Thanks for the idea with the semaphore. I have reimplemented the algorithm with a queue and a semaphore. It is twice as fast: the sync now takes 15 seconds.
What about this PR? Do you not want to replace the list with an rbtree?
@ZigzagAK Good to know. We can leave it as is for people who are interested. I prefer simplicity here, and I'm still not convinced we should migrate to the more complicated rbtree approach for this part.
Could searching a list of 'dead' threads (coroutines?) be somehow related to the problem I've described here: https://groups.google.com/d/msg/openresty-en/3BjMNEhVvN0/EEGAy4mEBgAJ ? Modifying the lua-resty-http code to implement a simplified body_reader (no coroutines for requests sent via simple request_uri() calls) improved performance by at least 10x.
This pull request is now in conflict :(
Closing this one since we have a better solution in #1800, which is superior in performance (…).
I hereby grant the copyright of the changes in this pull request to the authors of this lua-nginx-module project.