Django cache performance

Developer https://www.devze.com 2023-04-01 04:04 Source: Internet

We now use Redis as the in-memory cache for our Django application (we used memcached before and saw no big difference in performance; we switched to Redis for its disk dump feature).

The problem is that the performance of the Django cache is, in my opinion, awful. We have a view with 102 cache hits (no misses), and it takes 81 ms (just the cache part, measured with Django Debug Toolbar). In my opinion, that is a huge amount of time. I know that querying the DB would take 10x longer (or even 100x), but even so, the cache performance is not good.

We run Redis (and memcached before it) on a separate host, connected to the other servers over a local network.

Is there any way to tweak cache performance in Django?

The problem is most likely the number of items that need to be retrieved for each page rather than the performance of the cache itself. 102 cache calls means a lot of time lost to network latency. With full control of the code you could probably fix that with multithreading or pipelining, but in this case you don't have that option - using a framework means getting much simpler code at the cost of lower performance on edge cases.
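One batching option that stays inside the framework is the `get_many` method of Django's cache API, which takes a list of keys and returns a dict of the ones found. Whether it really collapses into a single network round trip depends on the backend (the base implementation simply loops over `get`), so treat the following as a rough sketch. `CountingCache` is a hypothetical stand-in that only mimics `get` and `get_many` and counts round trips; in a real project you would pass `django.core.cache.cache` instead.

```python
class CountingCache:
    """Hypothetical stand-in for a Django cache object.

    It mimics only get() and get_many() and counts one "round trip"
    per call, assuming a backend that batches get_many().
    """

    def __init__(self, data):
        self._data = dict(data)
        self.round_trips = 0

    def get(self, key, default=None):
        self.round_trips += 1
        return self._data.get(key, default)

    def get_many(self, keys):
        self.round_trips += 1
        return {key: self._data[key] for key in keys if key in self._data}


def fetch_one_by_one(cache, keys):
    """One cache call per key: the pattern that produces 102 round trips."""
    return {key: cache.get(key) for key in keys}


def fetch_batched(cache, keys):
    """A single get_many() call; keys missing from the cache map to None."""
    found = cache.get_many(keys)
    return {key: found.get(key) for key in keys}


# Illustrative key names, not taken from the original application.
keys = [f"fragment:{i}" for i in range(102)]
data = {key: f"value-{key}" for key in keys}

slow = CountingCache(data)
fast = CountingCache(data)
slow_result = fetch_one_by_one(slow, keys)
fast_result = fetch_batched(fast, keys)

print(slow.round_trips, fast.round_trips)
```

This only helps when the keys for a view are known up front; lookups scattered across templates and helpers would need to be gathered first.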

The easiest fix is probably to move the Redis cache onto the web server, since a local request is much faster. That would complicate invalidating the cache, but you could probably fix that with replication: either do all writes to the master node and read from the local slave so that all nodes have the same cache, or write locally and use the master node to replicate a DEL command to all slaves when you need to invalidate an object.
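As a rough illustration of the first variant (write to the master, read locally), here is a settings sketch that assumes Django 4.0 or later with its built-in Redis backend, where the first `LOCATION` entry is used for writes and the remaining entries for reads. The hostnames are hypothetical placeholders, and the local instance would still have to be configured as a replica in Redis itself.

```python
# settings.py fragment (assumes Django 4.0+; hostnames are hypothetical)
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        "LOCATION": [
            # First entry: the leader, which receives all writes.
            "redis://cache-master.internal:6379",
            # Further entries: read replicas; here, one on the web server.
            "redis://127.0.0.1:6379",
        ],
    }
}
```

Replication is asynchronous, so a read issued right after a write may still see the old value on the local replica.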

Another thing to look at is if the performance is actually a problem. 300ms to load a page isn't too bad in terms of the experience for an individual user. It's only a problem if it means that you can't handle more than 3 pages per second across all users - unlikely in this case where the bottleneck is network latency rather than CPU or local I/O.

Network latency between hosts is likely the cause. Simply communicating with Redis on localhost takes 200us (microseconds) or more for small keys and values. Memcached also communicates over the network and so suffers from the same latency issue. Based on the numbers you shared (81 ms for 102 cache hits), each request takes about 800us.
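The per-request figure comes straight from the numbers in the question; a quick check, using only those two values:

```python
# Figures reported in the question (Django Debug Toolbar measurement).
total_cache_time_ms = 81
cache_hits = 102

# Average time per cache call, converted from milliseconds to microseconds.
per_call_us = total_cache_time_ms * 1000 / cache_hits

print(f"{per_call_us:.0f} us per cache call")  # prints "794 us per cache call"
```

That is roughly four times the localhost figure mentioned above, which is consistent with the extra time being spent on the network hop.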

Not all caches communicate over the network. A faster method is to memory-map parts of the cache directly into the process' memory. If you're using multiple webservers then they'll each have their own cache but if you route requests consistently (by IP, username, etc.) you can decrease cache misses. Assuming you've moved your database to a separate host machine, you've likely got spare disk cycles available on your webservers.

If you want to try this approach, consider using DiskCache, an Apache2-licensed, disk- and file-backed cache library written in pure Python and compatible with Django. DiskCache includes a number of cache benchmarks and Django cache benchmarks. Keys and small values are memory-mapped into the Django process memory, so retrieval is extremely fast (3-12 times faster than your setup). As the benchmarks show, "get" latency is even lower than Memcached's (on localhost). There are also a number of tunable settings that you can customize to your liking.
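A minimal settings sketch for trying it out, assuming the `diskcache` package is installed and using its `diskcache.DjangoCache` backend. The directory path and the specific values are placeholders to adjust, not recommendations.

```python
# settings.py fragment (path and values are illustrative placeholders)
CACHES = {
    "default": {
        "BACKEND": "diskcache.DjangoCache",
        "LOCATION": "/path/to/cache/directory",
        "TIMEOUT": 300,             # default timeout per key, in seconds
        "SHARDS": 8,                # number of shards the cache is split into
        "DATABASE_TIMEOUT": 0.010,  # timeout per cache database transaction, in seconds
        "OPTIONS": {
            "size_limit": 2 ** 30,  # cap the cache at about 1 gigabyte
        },
    }
}
```

Because each web server keeps its own cache directory, invalidation is local to that server, which is why the consistent request routing mentioned above matters.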