
Why is membase server so slow in response time?


I have a problem that membase is being very slow on my environment. I am running several production servers (Passenger) on rails 2.3.10 ruby 1.8.7. Those servers communicate with 2 membase machines in a cluster.

The membase machines each have 64GB of memory, a 100GB EBS volume attached, and 1GB of swap.

My problem is that membase is being VERY slow in response time and is actually the slowest part right now in all of the application lifecycle.

my question is: Why?

the rails gem I am using is memcache-northscale. the membase server is 1.7.1 (latest).

The server is doing between 2K-7K ops per second (for the cluster)

The response time from membase (based on NewRelic) is 250ms on average, which is HUGE and unreasonable.

Does anybody know why this is happening? What can I do in order to improve this time?


It's hard to say immediately with the data at hand, but here are a few things you may wish to dig into to narrow down where the issue may be.

First of all, do your stats with membase show a significant number of background fetches? This is in the Web UI statistics for "disk reads per second". If so, that's the likely culprit for the higher latencies.
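Besides the Web UI, the same numbers are available from the raw memcached `stats` output. As a minimal sketch (assuming the ep-engine stat names `ep_bg_fetched` and `cmd_get`, which you should verify against your server version), you could estimate what fraction of gets are falling through to disk:

```ruby
# Sketch: estimate what fraction of gets hit disk, given raw output of the
# memcached "stats" command (e.g. captured via telnet to the data port).
# Stat names ep_bg_fetched / cmd_get are assumed from membase's ep-engine;
# check them against your server version.

def parse_stats(raw)
  raw.lines.each_with_object({}) do |line, h|
    # memcached stats lines look like: "STAT <name> <value>"
    next unless line =~ /\ASTAT\s+(\S+)\s+(\S+)/
    h[$1] = $2
  end
end

def bg_fetch_ratio(stats)
  gets = stats.fetch("cmd_get").to_f
  return 0.0 if gets.zero?
  stats.fetch("ep_bg_fetched").to_f / gets
end

# Illustrative sample output, not real measurements
sample = <<~RAW
  STAT cmd_get 120000
  STAT ep_bg_fetched 30000
  STAT curr_items 5000000
  END
RAW

stats = parse_stats(sample)
ratio = bg_fetch_ratio(stats)
puts "background fetches per get: #{(ratio * 100).round(1)}%"  # 25.0%
```

If that ratio is more than a few percent, your working set no longer fits in RAM and disk fetches are dominating your latency.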

You can read more about the statistics and sizing in the manual, particularly the sections on statistics and cluster design considerations.

Second, you're reporting 250ms on average. Is this a sliding average, or overall? Do you have 90th or 99th percentile latencies? A few outlying disk fetches can inflate the average, when most requests (for example, those served from RAM that don't need disk fetches) are actually quite speedy.
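To make that concrete, here's a toy illustration (the latency numbers are invented, not measured) of how a handful of slow disk fetches can produce a ~250ms mean while the typical request is sub-millisecond:

```ruby
# Sketch: why a mean of 250ms can hide a fast common case. A few slow
# disk fetches dominate the average even when most requests are fast.
# All numbers below are illustrative assumptions.

def percentile(sorted, p)
  sorted[((sorted.length - 1) * p).round]
end

# 95 fast RAM hits at ~1ms, 5 slow disk fetches at ~5s
latencies_ms = [1.0] * 95 + [5000.0] * 5
sorted = latencies_ms.sort

mean = latencies_ms.sum / latencies_ms.length
p90  = percentile(sorted, 0.90)
p99  = percentile(sorted, 0.99)

puts "mean: #{mean.round}ms, p90: #{p90}ms, p99: #{p99}ms"
# mean: 251ms, p90: 1.0ms, p99: 5000.0ms
```

A mean near 250ms with a p90 of 1ms points at a small population of very slow requests, which is a very different problem from every request being uniformly slow.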

Are your systems spread throughout availability zones? What kind of instances are you using? Are the clients and servers in the same Amazon AWS region? I suspect the answer may be "yes" to the first, which means about 1.5ms overhead when using xlarge instances from recent measurements. This can matter if you're doing a lot of fetches synchronously and in serial in a given method.
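That per-request overhead compounds quickly when fetches are issued one at a time. A back-of-the-envelope sketch (using the ~1.5ms cross-AZ round-trip figure above; the fetch count is an arbitrary example):

```ruby
# Sketch: cost of serial fetches vs. a single multi-get.
# rtt_ms is the assumed cross-AZ round-trip; fetches is illustrative.

rtt_ms  = 1.5
fetches = 50

serial_cost   = fetches * rtt_ms  # one round-trip paid per key
multiget_cost = rtt_ms            # one round-trip for all keys (ignoring payload size)

puts "serial: #{serial_cost}ms, multi-get: ~#{multiget_cost}ms"
# serial: 75.0ms, multi-get: ~1.5ms
```

So 50 sequential gets in one controller action already cost 75ms of pure network time before the servers do any work; batching them into a multi-get pays the round-trip once.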

I expect it's all in one region, but it's worth double checking since those latencies sound like WAN latencies.

Finally, there is an updated Ruby gem, backwards compatible with Fauna. Couchbase, Inc. has been working to contribute these changes back upstream to Fauna. If possible, you may want to try the gem referenced here: http://www.couchbase.org/code/couchbase/ruby/2.0.0


You will also want to look at running Moxi on the client side. To access Membase, you need to go through a proxy (called Moxi). By default, it's installed on the server, which means you might make a request to one of the servers that doesn't actually have the key. Moxi will go get it... but then you're doubling the network traffic.

Installing Moxi on the client side will eliminate this extra network traffic: http://www.couchbase.org/wiki/display/membase/Moxi
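A client-side Moxi is typically started pointing at the cluster's streaming config URL, something along these lines (this is a hypothetical invocation; the exact flag syntax, URL path, and bucket name vary by moxi and membase version, so check `moxi --help` and the wiki page above):

```shell
# Hypothetical: run moxi locally on each app server, listening on 11211,
# pulling the cluster map from a membase node. Hostname, bucket path and
# flags are assumptions -- verify against your installed moxi version.
moxi -z url=http://membase1:8091/pools/default/bucketsStreaming/default \
     -p 11211
```

The Rails app then talks to `localhost:11211`, and the local Moxi routes each key directly to the node that owns it, removing the extra server-side hop.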

Perry
