Performance Problem. #27
@zacksiri please give me more details about your environment and config, for instance: how many nodes, what cache topology you are rolling out (partitioned through the distributed adapter, maybe?), what operations you are testing, and how you are running the tests. In other words, if you can share some example code, that would be best.

On the other hand, a single Docker container with 1 core could be limiting; remember that one of the most important features of Erlang/Elixir is taking advantage of multicore architectures, so a single core, and a Docker container on top of that, might make for a very limited node.

Another thing to keep in mind: the distributed adapter uses "Distributed Erlang/Elixir", which is implemented on top of TCP, so in terms of performance it should be comparable with any other optimized protocol that uses TCP (like Redis or Memcached). But anyway, I wouldn't expect too much difference.

So if you can repeat the test with better resources for the Elixir (Nebulex) boxes as you mentioned, that would be great; also, give me as many details as possible. Thanks, I stay tuned!
@cabol I'm using the local adapter, to keep things simple for now. It's running on one node, and I'm not doing too much. I implemented a custom fetch function on the cache module. Here is my config:

```elixir
config :studio, MyApp.Cache,
  adapter: Nebulex.Adapters.Local,
  gc_interval: 86_400
```

And the fetch helper:

```elixir
@spec fetch(any(), keyword(), [{:do, any()}]) :: any()
def fetch(key, opts, do: calculate) do
  ttl = Keyword.get(opts, :ttl, 86_400)

  if __MODULE__.has_key?(key) do
    # Cache hit: return the stored value
    __MODULE__.get(key)
  else
    # Cache miss: run the computation and store the result
    {:ok, value} = calculate
    __MODULE__.set(key, value, ttl: ttl, return: :value)
  end
end
```

I'm going to allow the containers to use all the cores on the machine and see what happens.
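For reference, here is a hypothetical call site for the fetch/3 helper above (the key, TTL, and `load_token/0` are made up; `load_token/0` is assumed to return `{:ok, token}` so the `{:ok, value}` match succeeds):

```elixir
# Hypothetical usage of the custom fetch/3 helper; load_token/0 is an assumed
# function returning {:ok, token}. Since fetch/3 is a plain function (not a
# macro), the do: expression is evaluated before the call, even on a cache hit.
token = MyApp.Cache.fetch({:token, :external_service}, [ttl: 3_600], do: load_token())
```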
@zacksiri in that case it is weird; maybe, as you mentioned, there is something in the environment. Check this link: https://github.com/cabol/nebulex_examples/tree/master/nebulex_bench

There you will see the last bench I ran using the local adapter; you can run the bench tests again in your environment to compare (there is also a description of the environment I used). You can also run the local bench for Nebulex in your own environment. Let me know!
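In case it helps while reproducing, here is a minimal local benchmark sketch (this is not the nebulex_bench suite; it assumes Benchee is added as a dependency and MyApp.Cache is the local cache configured above):

```elixir
# Minimal sketch, assuming {:benchee, "~> 1.0"} (or similar) is in deps and
# MyApp.Cache is the Nebulex local cache from the config shown earlier.
MyApp.Cache.set("warm_key", "warm_value", ttl: 3_600)

Benchee.run(%{
  "set"        => fn -> MyApp.Cache.set("bench_key", "bench_value", ttl: 3_600) end,
  "get (hit)"  => fn -> MyApp.Cache.get("warm_key") end,
  "get (miss)" => fn -> MyApp.Cache.get("missing_key") end
})
```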
Yeah, I'm going to run some benches. With Redis the service is on a different node, but moving to Nebulex means all the load is in one place, so all the calls to the cache might add up and, for that reason, add to the response time. I will do some further research.
@zacksiri as far as I understand, you are using the local adapter, right? So you are running the tests on a single node, which makes sense: you are trying to bench Nebulex locally (single-node), which is fine. The problem is that the node you are using is very small (that's our premise), so what I'd do is run the same test, but using bigger instances or nodes. For example, on my laptop (4 cores, 8 GB RAM, 256 GB SSD) I got these results running the local bench.

I'd try to run the same in a Docker container (or a smaller instance), but meanwhile you can try the same thing. There is no need to move Nebulex to a single node; you can keep it on each node where you're running your app (that's the idea), but try to roll it out on bigger nodes, at least a bit bigger (2-4 cores and 8 GB of RAM, depending on how much data you want to cache).
I'm going to recreate the node using a container and limit the CPU to 1 core, to compare with your results. I haven't had the time today; I'll let you know what I find. I'll try scaling to 2 cores, then 4, and see how the results change.
That sounds good, stay tuned!!
Hey, here are the results for 1 core inside the container. It is significantly slower than when I run it on my laptop; on my laptop I'm getting similar results to yours. I think my cloud VM is really oversubscribed.

2 Cores

4 Cores
I just tried on 2 different cloud providers and I'm getting very similar results between them; the one I posted is actually better. Also, there doesn't seem to be a difference between container and VM. I think something doesn't add up: 2.5µs is fast, and if I make 100 calls it should still only add 0.25ms to my response time. Maybe I'm doing something wrong.
@cabol just wanted to let you know I've resolved the performance issue. It had nothing to do with Nebulex; it was my implementation making N+1 calls to the cache, which was saturating my CPU and making the response time go up. Once I removed the N+1, Nebulex is now faster than Redis, as expected. I'm so impressed with Nebulex that I'll be covering it in my Elixir Foundation video series and making it part of my company's standard tools. I'm also looking forward to the 1.0 release. Thank you for your hard work on this library; it's beautifully implemented.
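For anyone hitting something similar, here is a rough sketch of the difference between an N+1 cache access pattern and caching the collection in a single entry (module, function, and key names are hypothetical, not the reporter's actual code):

```elixir
# Hypothetical illustration of N+1 cache calls vs. a single cache entry;
# MyApp.Posts, load_posts/1, and the key shapes are made up for this sketch.
defmodule MyApp.Posts do
  # N+1: one cache call per id, which adds up (and burns CPU) for large lists
  def list_posts_n_plus_one(post_ids) do
    Enum.map(post_ids, &MyApp.Cache.get({:post, &1}))
  end

  # One cache call for the whole collection, computed once on a miss
  def list_posts(post_ids) do
    MyApp.Cache.get({:posts, post_ids}) ||
      MyApp.Cache.set({:posts, post_ids}, load_posts(post_ids), ttl: 3_600, return: :value)
  end

  # Placeholder for the real database query
  defp load_posts(_post_ids), do: []
end
```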
@zacksiri awesome, that's good to know. I like these kinds of issues a lot, because they help you improve or discover bugs or new issues, but I'm glad to hear you were able to fix yours. For any further issue(s), don't hesitate to ping me or open a new one. On the other hand, I'm also glad Nebulex is being useful for you (that's the goal), and I'd really appreciate it if you cover it in your Elixir Foundation series, that would be great!! And I'm working very hard to release the 1.0.0 version ASAP; there are a lot of improvements and new features. Thank you very much for your post and findings, it was very fruitful :)
So just to complete the issue with a screenshot: everything above 54e5fc6 is with Nebulex, everything below is Redis. Ignore the first request that says 156ms; it has to make the initial network call to cache a token, and everything after that call is using the Nebulex cache. I'm getting about a 3x improvement overall.
👍
I tried switching to Nebulex from my hand-rolled version using GenServer + Redis. There was quite a performance hit. The 3 bottom results (below 79734e2) were before Nebulex, and everything on top is Nebulex results.

Could it be that now the workload has shifted from Redis to the internal node, hence it is more affected by CPU (it's running in a container and I cap the CPU to 1 core)? I think further testing is needed.
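For context, here is a hypothetical sketch of what a hand-rolled GenServer-backed cache often looks like (this is not the reporter's actual implementation): a single process holding a map, which also means every read and write is serialized through that one process.

```elixir
# Hypothetical hand-rolled cache (not the reporter's code): one GenServer
# owning a map. Every get/put goes through this single process, so it can
# become a bottleneck under concurrent load.
defmodule MyApp.HandRolledCache do
  use GenServer

  def start_link(_opts), do: GenServer.start_link(__MODULE__, %{}, name: __MODULE__)

  def get(key), do: GenServer.call(__MODULE__, {:get, key})
  def put(key, value), do: GenServer.cast(__MODULE__, {:put, key, value})

  @impl true
  def init(state), do: {:ok, state}

  @impl true
  def handle_call({:get, key}, _from, state), do: {:reply, Map.get(state, key), state}

  @impl true
  def handle_cast({:put, key, value}, state), do: {:noreply, Map.put(state, key, value)}
end
```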