Plone caching in two simple steps

Posted on May 24, 2007 by john.
Categories: apache, plone.

Plone is slow. Sure, it does an awful lot of work for every request, but it really shows: in a standard install of Zope and Plone on nice hardware with plenty of memory, getting Plone’s index page takes about 250ms. Other, less complex objects such as CSS files or images take about a tenth of that. Unfortunately, as Plone’s default skin is amazingly complex (and for good reasons), you need to fetch anywhere between 10 and 50 objects out of the database before your browser stops looking busy. You can’t serve more than a couple of concurrent users with that.
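
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes (my simplification, not a measurement) that a single Zope thread serves every object of a page view one after the other, at the timings quoted above. The function name and constants are made up for illustration.

```python
# Rough timings quoted in the post, in milliseconds.
INDEX_MS = 250   # the dynamic index page
OBJECT_MS = 25   # a CSS file or image: about a tenth of the index page


def page_cost_ms(extra_objects):
    """Zope time for one page view: the index page plus its extra objects.

    Assumption: everything is served serially by one thread.
    """
    return INDEX_MS + extra_objects * OBJECT_MS


for n in (10, 50):
    cost = page_cost_ms(n)
    print("%d objects: %d ms per page view, %.2f page views per second"
          % (n, cost, 1000.0 / cost))
```

Under those assumptions a single page view costs somewhere between half a second and a second and a half of Zope time, which is why a couple of concurrent users is about the limit.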

Fortunately, setting up a reasonable bit of caching is really trivial: it requires no extra software, no recompilation, no nothing, and you can see speedups of more than 20×. Here’s how!

I’m assuming you have Apache installed in front of your Plone, doing virtual hosting and what have you. So you have ab; use it to measure both the homepage and a static file, e.g.

    $ ab -n 100 http://example.com/ > index.pre.ab
    $ ab -n 100 http://example.com/plone.css > css.pre.ab

Look at these files: you’re interested particularly in the “Requests per second” line; it’s probably something around 3rps for the index and 30rps for the css.

Step 1: start using mod_disk_cache

Decide where you want your disk cache to be. I usually choose /var/cache/www/<virtualhost>. Create the directory, and make sure that the user under which you run Apache can write to that directory:

    # mkdir -p /var/cache/www/example/
    # chown -R www-data:www-data /var/cache/www

Then, add these two lines to the Plone site’s virtual host:

    CacheRoot /var/cache/www/example/
    CacheEnable disk /

and run apache2ctl configtest. If that fails, you might need to enable the mod_disk_cache Apache module and its dependencies (under Debian and derivatives, that is a2enmod disk_cache). Once configtest passes, do a graceful restart with apache2ctl graceful, and it should be working! Well, for “static” content, at least. Test it with the CSS file you tested earlier, e.g.

    $ ab -n 100 http://example.com/plone.css > css.post.ab

You should see the “Requests per second” line grow by a factor of more than 20, i.e. more than 600rps.
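
If you’d rather not eyeball the two reports, a few lines of Python can pull the figure out and compute the ratio. This is only a minimal sketch: the function names are made up here, and it assumes ab prints its usual “Requests per second: N [#/sec] (mean)” line.

```python
import re


def requests_per_second(ab_report):
    """Return the mean requests-per-second figure from ab's text report."""
    match = re.search(r"^Requests per second:\s+([\d.]+)", ab_report,
                      re.MULTILINE)
    if match is None:
        raise ValueError("no 'Requests per second' line found")
    return float(match.group(1))


def speedup(before_report, after_report):
    """Ratio of the 'after' throughput to the 'before' throughput."""
    return requests_per_second(after_report) / requests_per_second(before_report)
```

With the files created above you would call it as speedup(open("css.pre.ab").read(), open("css.post.ab").read()).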

This is already a huge difference, as now all the extra “static” content will be served almost immediately when compared to the dynamic HTML page. Depending on what skin and products you have installed, you can probably now serve double the number of concurrent users. Not bad for something that should take you less than five minutes! (unless you’re like me, and typo the CacheRoot…) If you similarly test the Plone site root, you’ll notice that it isn’t any speedier. We’ll fix that in step 2…

Step 2: enable caching of HTML

If you’d rather, you could install CacheFu. If so, skip to the last paragraph. Unfortunately, CacheFu is designed to work with Squid, which you can’t really use on VPS accounts for example. Fortunately, it doesn’t actually depend on Squid to work :)

Now, we’re going to customize the skin via the ZMI. This is just for testing; to do it properly, you’d customize the skin product, right? Right. So! To the ZMI!

First, go to the Accelerated HTTP Cache Manager object (click on the thing called “HTTPCache”). Make sure “Cache anonymous connections only?” is unchecked; when it is checked, nothing is cached for authenticated users, not even static content that doesn’t change anyway. Then, just to be sure, go to the “Associate” tab; in the “Locate cacheable objects” radiogroup select “Associated with this cache manager”, hit “Locate”, and check to see you’re getting a boatload of content. You should, unless you broke stuff (or are using an ancient version of Plone), in which case you should find all “static” content and associate it.

Next, customize global_cache_settings, which is used by Plone’s main_template. Usually it lives in plone_templates (under portal_skins), but you might have another version further up the skin stack. What we want to do is add Vary headers so that Apache properly juggles multiple concurrent logged-in users (Vary tells the cache that the URL is not the only thing to use to determine what content is being asked for). We also want to add a Last-Modified header so Apache will cache the page. Finally, decide what your registered users can cope with: they may be fine with having to reload before they see content that has changed, or that might be beyond them. If it is beyond them, you want to disallow caching of pages for authenticated users. Let’s look at the first case first:

      <metal:cacheheaders define-macro="cacheheaders">
        <metal:block tal:define="dummy python:request.RESPONSE.setHeader('Content-Type', 'text/html;;charset=%s' % charset)" />
        <metal:block tal:define="dummy python:request.RESPONSE.setHeader('Content-Language', lang)" />
        <metal:block tal:define="dummy python:request.RESPONSE.setHeader('Vary', 'Accept-Language,Accept-Encoding,Cookie')" />
        <metal:block tal:define="dummy python:request.RESPONSE.setHeader('Cache-Control', 'max-age=0,s-maxage=300')" />
        <metal:block tal:define="dummy python:request.RESPONSE.setHeader('Last-Modified', here.modified().toZone('GMT').rfc822())" />
      </metal:cacheheaders>

This makes pages expire after five minutes, which might be a little low; change the 300 to the number of seconds you think is ok. Users who edit content and want to see the fresh version before the five minutes are up would need to hard-refresh in their browser.
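
If the interplay of Vary, max-age and s-maxage feels abstract, here is a toy model in Python of how a shared cache following the usual HTTP caching rules might read those headers. It is a sketch for intuition only, not Apache’s actual code; all the function names are invented for this example, and it ignores details such as heuristic freshness.

```python
def parse_cache_control(value):
    """Split a Cache-Control value into a {directive: argument-or-None} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip() or None
    return directives


def freshness_lifetime(cache_control, shared):
    """Seconds a response may be reused without asking the backend again.

    A shared cache (the proxy in front of Plone) prefers s-maxage;
    a private browser cache only looks at max-age.
    """
    directives = parse_cache_control(cache_control)
    if shared and directives.get("s-maxage") is not None:
        return int(directives["s-maxage"])
    if directives.get("max-age") is not None:
        return int(directives["max-age"])
    return 0  # simplification: no heuristic freshness in this sketch


def cache_key(url, request_headers, vary):
    """Build a lookup key from the URL plus the request headers named in Vary."""
    # Header names are case-insensitive, so normalise them first.
    headers = dict((k.lower(), v) for k, v in request_headers.items())
    names = [n.strip().lower() for n in vary.split(",") if n.strip()]
    return (url, tuple((n, headers.get(n, "")) for n in names))
```

With 'max-age=0,s-maxage=300' the shared cache gets 300 seconds while the browser gets 0, so browsers always ask and the cache in front of Plone answers. And because Cookie is listed in Vary, two users with different session cookies end up with different cache keys for the same URL.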

Otherwise, if you’d rather authenticated users’ pages weren’t cached at all, you specifically say just that:

      <metal:cacheheaders define-macro="cacheheaders">
        <metal:block tal:define="dummy python:request.RESPONSE.setHeader('Content-Type', 'text/html;;charset=%s' % charset)" />
        <metal:block tal:define="dummy python:request.RESPONSE.setHeader('Content-Language', lang)" />
        <metal:block tal:define="dummy python:request.RESPONSE.setHeader('Vary', 'Accept-Language,Accept-Encoding,Cookie')" />
        <tal:block condition="isAnon">
          <metal:block tal:define="dummy python:request.RESPONSE.setHeader('Cache-Control', 'max-age=0,s-maxage=300')" />
          <metal:block tal:define="dummy python:request.RESPONSE.setHeader('Last-Modified', here.modified().toZone('GMT').rfc822())" />
        </tal:block>
        <tal:block condition="not:isAnon">
          <metal:block tal:define="dummy python:request.RESPONSE.setHeader('Cache-Control', 'must-revalidate,max-age=0,no-cache')" />
        </tal:block>
      </metal:cacheheaders>
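
For readers who find TAL hard to scan, the decision the template above makes can be restated in plain Python. This is only an illustration of the logic, not code that Plone runs; the function name and its parameters are hypothetical, and the standard library’s formatdate stands in for Zope’s DateTime formatting.

```python
from email.utils import formatdate


def cache_headers(is_anon, modified_timestamp, ttl=300):
    """Return the caching headers the template would set for one response.

    is_anon: True for anonymous visitors, False for logged-in users.
    modified_timestamp: last modification time, in seconds since the epoch.
    ttl: seconds the shared cache may keep an anonymous page.
    """
    headers = {"Vary": "Accept-Language,Accept-Encoding,Cookie"}
    if is_anon:
        # Anonymous pages: cacheable by the proxy, never by the browser.
        headers["Cache-Control"] = "max-age=0,s-maxage=%d" % ttl
        headers["Last-Modified"] = formatdate(modified_timestamp, usegmt=True)
    else:
        # Logged-in pages: tell every cache to leave them alone.
        headers["Cache-Control"] = "must-revalidate,max-age=0,no-cache"
    return headers
```

Note that the logged-in branch sets no Last-Modified header at all, which mirrors the template.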

And that’s it! If you retest the index page,

    $ ab -n 100 http://example.com/ > index.post.ab

you should be seeing times similar to those for your static files. Your whole homepage, including all files, CSS, everything, should now be ready to ship out to the user in less than 50ms! That means that you can now handle more than 20 concurrent users. Whee!

10 comments.

Pingback on May 27th, 2007.

[...] Plone caching in two simple steps @ kibble corp very similar to the approach we’re using, how-to fodder? (tags: plone howto) [...]

Comment on May 28th, 2007.

John-

Nice little how to! There’s a lot of need to demystify simple caching scenarios, and this how-to helps quite a bit.

A couple of thoughts/questions:

1) We saw some nice decreases in bytes transmitted from enabling mod_gzip in Apache — it’s especially effective on those big JS & CSS files. No big ab results, though.

2) Given how smart CacheFu is, and given the fact that it works just fine w/o Squid, why would you want to bother with manual configuration of caching headers? Doesn’t CacheFu do a better job with even less fuss?

john
Comment on May 28th, 2007.

Jon,

Re: (1) is hard to measure, as you say. I’ll be looking into that soon; however, I seem to remember issues with (some versions of?) MSIE not understanding compressed CSS or JS.

Re: (2), CacheFu didn’t seem to work out of the box on some of the ancient (2.0.x) versions of Plone I was working with :). And, some people don’t like adding products to an already running instance… Otherwise, yes, CacheFu is the way to go.

Comment on May 29th, 2007.

John-

Ah, I hadn’t realized you were referring to a Plone 2.0 site. That makes more sense.

We’ve not experienced IE issues with gzipped stuff — I think that recent Microsoft service packs for IE have more or less fixed the issue, but honestly I haven’t researched it too deeply. We haven’t gotten any complaints. :-)

john
Comment on May 29th, 2007.

Yeah, I was going to be specific and mention versions and such, but then I realized it didn’t really add anything, and diluted the simplicity of it all.

Good news re gzipped stuff on IE. Now I don’t have to research! whee :)

Pingback on May 30th, 2007.

[...] Sir, your on-disk cache is full. Posted on May 30th, 2007 by john. Categories: apache. I forgot to mention two things in my two-step micro howto on plone caching. [...]

Javier
Comment on October 9th, 2007.

Ummm, my last post is not showing what I want. Let’s do it right.

Where it says

python:request.RESPONSE.setHeader('Content-Type', 'text/html;charset=%s' % charset)

it should say:

python:request.RESPONSE.setHeader('Content-Type', 'text/html;;charset=%s' % charset)

john
Comment on October 10th, 2007.

Javier,

You are, of course, correct. Thank you. Fixed.

duc toan
Comment on May 26th, 2008.

I followed your tutorial and everything seems ok.
I tested on my local machine, and it worked even when the Plone site was down. But when I use the IE browser, or when other people access my Plone, it doesn’t work.
What can I do so that requests are served from the Apache cache even when the Plone site is down?

john
Comment on May 26th, 2008.

Duc,
I don’t think Apache is smart enough a cache to be able to survive Plone being down. I’m not sure Squid is, either. If you need that, you might want to look at one of the staging solutions; we’ve used a custom-grown one for serving Except’s site for a while (until we switched to having a “real” server that could run Zope).
Cheers,
