Plone caching in two simple steps
Plone is slow. Sure, it does an awful amount of work for every request, but it really shows: in a standard install of Zope and Plone on nice hardware with plenty of memory, getting Plone’s index page takes about 250ms. Other, less complex objects such as CSS files or images take about a tenth of that. Unfortunately, as Plone’s default skin is amazingly complex (and for good reasons), you need to fetch anything between 10 and 50 objects out of the database before your browser stops looking busy. You can’t serve more than a couple of concurrent users with that.
Fortunately setting up a reasonable bit of caching is really trivial, should require no extra software, no recompilation, no nothing, and you can see speedups of more than 25×. Here’s how!
I’m assuming you have Apache installed in front of your Plone, doing virtual hosting and what have you. So you have ab; use it to measure both the homepage and a static file, e.g.
$ ab -n 100 http://example.com/ > index.pre.ab
$ ab -n 100 http://example.com/plone.css > css.pre.ab
Look at these files: you’re particularly interested in the “Requests per second” line; it’s probably something around 3rps for the index and 30rps for the CSS.
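To pull just that figure out of the saved reports, something like the following should do. It is a minimal sketch that assumes ab’s usual report layout, where the rate is the fourth whitespace-separated field of the “Requests per second” line; the function name ab_rps is made up for this example.

```shell
# ab_rps FILE...: print the "Requests per second" figure from saved ab reports.
# Assumes ab's usual line format: "Requests per second:    3.21 [#/sec] (mean)"
ab_rps() {
    awk '/^Requests per second:/ { print FILENAME ": " $4 " rps" }' "$@"
}
```

Calling it as ab_rps index.pre.ab css.pre.ab prints one line per report.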
Step 1: start using mod_cache_disk
Decide where you want your disk cache to be. I usually choose /var/cache/www/<virtualhost>. Create the directory, and make sure that the user under which you run Apache can write to it:
# mkdir -p /var/cache/www/example/
# chown -R www-data:www-data /var/cache/www
Then add these two lines to the Plone site’s virtual host configuration:
CacheRoot /var/cache/www/example/
CacheEnable disk /
and do an apache2ctl configtest. If that fails, you might need to enable the mod_cache_disk Apache module and its dependencies (under Debian and derivatives, that is a2enmod cache_disk; on Apache 2.2, where the module is still called mod_disk_cache, it is a2enmod disk_cache). Once configtest passes, do a graceful restart with apache2ctl graceful, and it should be working! Well, for “static” content, at least. Test it with the CSS file you tested earlier, e.g.
$ ab -n 100 http://example.com/plone.css > css.post.ab
You should see the “Requests per second” line grow by a factor of more than 20, i.e. more than 600rps.
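If you’d rather not do the division in your head, the two reports can be compared directly. This is a rough sketch under the same assumption about ab’s report layout; ab_speedup is a hypothetical helper name, not something ab ships with.

```shell
# ab_speedup BEFORE AFTER: ratio of the "Requests per second" figures
# in two saved ab reports (AFTER divided by BEFORE).
ab_speedup() {
    awk '/^Requests per second:/ { rps[FILENAME] = $4 + 0 }
         END {
             before = rps[ARGV[1]]; after = rps[ARGV[2]]
             if (before <= 0) exit 1   # no usable figure in the first report
             printf "%.1fx\n", after / before
         }' "$1" "$2"
}
```

For the files above, that would be ab_speedup css.pre.ab css.post.ab.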
This is already a huge difference, as now all the extra “static” content will be served almost immediately when compared to the dynamic HTML page. Depending on what skin and products you have installed, you can probably now serve twice as many concurrent users. Not bad for something that should take you less than five minutes! (unless you’re like me, and typo the CacheRoot…) If you similarly test the Plone site root, you’ll notice that it isn’t any speedier. We’ll fix that in step 2…
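Another way to see the same split, without ab, is to look at the response headers: a response answered from a cache normally carries an Age header, while one that went all the way to Zope does not. A rough sketch, assuming curl is installed; the helper names served_from_cache and check_cached are invented here.

```shell
# served_from_cache: read HTTP response headers on stdin and succeed if they
# contain an Age header, which a cache adds to responses it answers itself.
served_from_cache() {
    grep -qi '^Age:'
}

# check_cached URL: do a normal GET, keep only the headers, and report.
check_cached() {
    if curl -s -D - -o /dev/null "$1" | served_from_cache; then
        echo "$1: served from cache"
    else
        echo "$1: not cached"
    fi
}
```

Run check_cached http://example.com/plone.css a couple of times (the first request after a restart only fills the cache), then try it on http://example.com/ for comparison.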
Step 2: enable caching of HTML
If you’d rather, you could install CacheFu; if so, skip to the last paragraph. CacheFu is designed to work with Squid, which you can’t really use on VPS accounts, for example; fortunately, it doesn’t actually depend on Squid to work.
Now, we’re going to customize the skin via the ZMI. This is just for testing; to do it properly, you’d customize the skin product, right? Right. So! To the ZMI!
First, go to the Accelerated HTTP Cache Manager object (click on the thing called “HTTPCache”). Make sure “Cache anonymous connections only?” is disabled; when enabled, that option turns off caching for all authenticated requests, including things like static content that doesn’t change anyway. Then, just to be sure, go to the “Associate” tab; in the “Locate cacheable objects” radiogroup select “Associated with this cache manager”, hit “Locate”, and check that you’re getting a boatload of content. You should, unless you broke stuff (or are using an ancient version of Plone), in which case you should find all “static” content and associate it.
Next, customize global_cache_settings, which is used by Plone’s main_template. Usually it lives in plone_templates (under portal_skins), but you might have another version further up the skin stack. What we want to do is add Vary headers so that Apache properly juggles multiple concurrent logged-in users (Vary tells the cache that the URL is not the only thing to use to determine what content is being asked for). We also want to add a Last-Modified header so Apache will cache the page. Finally, you need to decide what to do about your registered users: they might be happy to reload a page before they see changes to its content, or that might be beyond them. If it is beyond them, you want to disallow caching of pages for authenticated users. Let’s look at the first case first:
<metal:cacheheaders define-macro="cacheheaders">
<metal:block tal:define="dummy python:request.RESPONSE.setHeader('Content-Type', 'text/html;;charset=%s' % charset)" />
<metal:block tal:define="dummy python:request.RESPONSE.setHeader('Content-Language', lang)" />
<metal:block tal:define="dummy python:request.RESPONSE.setHeader('Vary', 'Accept-Language,Accept-Encoding,Cookie')" />
<metal:block tal:define="dummy python:request.RESPONSE.setHeader('Cache-Control', 'max-age=0,s-maxage=300')" />
<metal:block tal:define="dummy python:request.RESPONSE.setHeader('Last-Modified', here.modified().toZone('GMT').rfc822())" />
</metal:cacheheaders>
This makes pages expire from the cache after five minutes, which might be a little low; change the 300 to the number of seconds you think is ok. Users editing content who want to see fresh content before the five minutes are up would need to hard-refresh in their browser.
Otherwise, if you’d rather authenticated users’ pages weren’t cached at all, you say just that explicitly:
<metal:cacheheaders define-macro="cacheheaders">
<metal:block tal:define="dummy python:request.RESPONSE.setHeader('Content-Type', 'text/html;;charset=%s' % charset)" />
<metal:block tal:define="dummy python:request.RESPONSE.setHeader('Content-Language', lang)" />
<metal:block tal:define="dummy python:request.RESPONSE.setHeader('Vary', 'Accept-Language,Accept-Encoding,Cookie')" />
<tal:block condition="isAnon">
<metal:block tal:define="dummy python:request.RESPONSE.setHeader('Cache-Control', 'max-age=0,s-maxage=300')" />
<metal:block tal:define="dummy python:request.RESPONSE.setHeader('Last-Modified', here.modified().toZone('GMT').rfc822())" />
</tal:block>
<tal:block condition="not:isAnon">
<metal:block tal:define="dummy python:request.RESPONSE.setHeader('Cache-Control', 'must-revalidate,max-age=0,no-cache')" />
</tal:block>
</metal:cacheheaders>
And that’s it! If you retest the index page,
$ ab -n 100 http://example.com/ > index.post.ab
you should be seeing times similar to those for your static content. Your whole homepage, including all files, CSS, everything, should now be ready to ship out to the user in less than 50ms! That means that you can now handle more than 20 concurrent users. Whee!
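If the index page doesn’t get any faster, one thing worth checking is whether the headers from the customized macro actually reach Apache. The sketch below lists whichever of the three headers set above are missing from a response; missing_cache_headers is a made-up name, and the check is meant for an anonymous request (in the second variant, logged-in users deliberately get no Last-Modified).

```shell
# missing_cache_headers: read response headers on stdin and print the name
# of each header the cacheheaders macro should have set but which is absent.
missing_cache_headers() {
    headers=$(cat)
    for name in Vary Cache-Control Last-Modified; do
        printf '%s\n' "$headers" | grep -qi "^$name:" || echo "$name"
    done
}
```

Feed it the headers of a real GET, for instance curl -s -D - -o /dev/null http://example.com/ | missing_cache_headers; no output means all three are present. A HEAD request (curl -I) may not render the page template at all, so it is not a reliable test here.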
References
- Caching with Apache, by Matt Rohrer, for Learning Lab Denmark.
- The Apache documentation for mod_cache and mod_disk_cache.