Wednesday, July 2, 2008

WP Super Cache

I recently installed WP Super Cache to speed up my WordPress blogs. The regular WP-Cache plug-in saves processed copies of pages, reducing the database and PHP processing overhead, but the pages are still served via PHP. WP Super Cache takes this to the next level, using mod_rewrite to send requests directly to cached HTML files:

RewriteCond %{REQUEST_METHOD} !=POST
RewriteCond %{QUERY_STRING} !.*s=.*
RewriteCond %{QUERY_STRING} !.*attachment_id=.*
RewriteCond %{HTTP_COOKIE} !^.*(comment_author_|wordpress|wp-postpass_).*$
RewriteCond %{DOCUMENT_ROOT}/blog/wp-content/cache/supercache/%{HTTP_HOST}/blog/$1/index.html -f
RewriteRule ^(.*) /blog/wp-content/cache/supercache/%{HTTP_HOST}/blog/$1/index.html [L]

If a page isn’t yet cached, the -f test will fail, and the request will be directed to the regular WordPress PHP script (using WP-Cache). It’s a nice system, but I hit a few snags setting it up, so I wanted to document them here.
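
As a rough model of what those conditions decide, here is a minimal Python sketch. It is not what Apache runs, only an illustration of the logic. The function name `supercache_target`, the `cached_files` set (a stand-in for the `-f` file test), and the `example.com` host are hypothetical, and the sketch assumes the path is given relative to the blog directory.

```python
import posixpath
import re

# Cache location taken from the rewrite rules above.
CACHE_ROOT = "/blog/wp-content/cache/supercache"


def supercache_target(method, query, cookie, host, path, cached_files):
    """Return the cached file to serve, or None to fall through to PHP."""
    # RewriteCond %{REQUEST_METHOD} !=POST
    if method == "POST":
        return None
    # Skip searches and attachment pages (query string contains s= or attachment_id=).
    if re.search(r"s=", query) or re.search(r"attachment_id=", query):
        return None
    # Skip visitors with comment, login, or post-password cookies.
    if re.search(r"comment_author_|wordpress|wp-postpass_", cookie):
        return None
    # Stand-in for the -f test: serve only if the cached index.html exists.
    target = posixpath.normpath(f"{CACHE_ROOT}/{host}/blog/{path}/index.html")
    return target if target in cached_files else None


# Hypothetical cache contents for illustration.
cached = {
    "/blog/wp-content/cache/supercache/example.com/blog/2008/07/02/wp-super-cache/index.html"
}
print(supercache_target("GET", "", "", "example.com", "2008/07/02/wp-super-cache/", cached))
print(supercache_target("GET", "s=apache", "", "example.com", "2008/07/02/wp-super-cache/", cached))
```

The first call returns the cached path because every condition passes; the second returns None because the query string contains s=, so the request would fall through to PHP.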

The first problem was that the super cache was not being generated. Every request was being handled by WP-Cache. It turns out that WP Super Cache (sensibly) doesn’t add a page to the super cache unless $_GET is empty. I was using a .htaccess file from a much older version of WordPress, and it was packing the query string with elements of the path:

RewriteRule ^archives/?$ /blog/index.php?pagename=archives [QSA]
RewriteRule ^category/(.*)/(feed|rdf|rss|rss2|atom)/?$ /blog/wp-feed.php?category_name=$1&feed=$2 [QSA]
RewriteRule ^category/?(.*) /blog/index.php?category_name=$1 [QSA]
RewriteRule ^author/(.*)/(feed|rdf|rss|rss2|atom)/?$ /blog/wp-feed.php?author_name=$1&feed=$2 [QSA]
RewriteRule ^author/?(.*) /blog/index.php?author_name=$1 [QSA]
RewriteRule ^([0-9]{4})/?([0-9]{1,2})?/?([0-9]{1,2})?/?([_0-9a-z-]+)?/?([0-9]+)?/?$ /blog/index.php?year=$1&monthnum=$2&day=$3&name=$4&page=$5 [QSA]
RewriteRule ^([0-9]{4})/?([0-9]{1,2})/([0-9]{1,2})/([_0-9a-z-]+)/(feed|rdf|rss|rss2|atom)/?$ /blog/wp-feed.php?year=$1&monthnum=$2&day=$3&name=$4&feed=$5 [QSA]
RewriteRule ^([0-9]{4})/?([0-9]{1,2})/([0-9]{1,2})/([_0-9a-z-]+)/trackback/?$ /blog/wp-trackback.php?year=$1&monthnum=$2&day=$3&name=$4 [QSA]
RewriteRule ^feed/?([_0-9a-z-]+)?/?$ /blog/wp-feed.php?feed=$1 [QSA]
RewriteRule ^comments/feed/?([_0-9a-z-]+)?/?$ /blog/wp-feed.php?feed=$1&withcomments=1 [QSA]
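
To see why $_GET was never empty under those rules, here is a small Python sketch that applies the date-based permalink rule to a sample path. It only imitates the substitution that mod_rewrite performs. The helper name `old_rule_query` and the sample path are hypothetical, and the sketch assumes that unmatched groups become empty strings.

```python
import re
from urllib.parse import parse_qsl

# The date-based permalink pattern from the old .htaccess rules above.
OLD_RULE = re.compile(
    r"^([0-9]{4})/?([0-9]{1,2})?/?([0-9]{1,2})?/?([_0-9a-z-]+)?/?([0-9]+)?/?$"
)
# Query parameter names, in the order the rule substitutes $1 through $5.
KEYS = ("year", "monthnum", "day", "name", "page")


def old_rule_query(path):
    """Return the query parameters the old rule would add, or None if it does not match."""
    match = OLD_RULE.match(path)
    if match is None:
        return None
    # Unmatched optional groups are treated as empty strings (an assumption of this sketch).
    query = "&".join(f"{key}={value or ''}" for key, value in zip(KEYS, match.groups()))
    return dict(parse_qsl(query, keep_blank_values=True))


params = old_rule_query("2008/07/02/wp-super-cache/")
print(params)
```

Because the rule always appends year, monthnum, and the other parameters, every permalink request reached PHP with a populated query string, so nothing qualified for the super cache.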

Updating the .htaccess file to let WordPress itself parse the path:

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /blog/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /blog/index.php [L]
</IfModule>
# END WordPress

fixed that problem.

Now the posts were being saved to the super cache, and served from it, but the feeds were only using the regular cache. WP Super Cache is hard-coded not to cache feeds in order to avoid serving them with an incorrect content type (a.k.a. media type or MIME type). Feeds are supposed to be served as application/rss+xml or application/atom+xml, and Feed Validator will complain if they aren’t. But the super-cached files are all stored as index.html so Apache will serve them as text/html. (This isn’t a problem when using WP-Cache, because it sends the proper HTTP header depending on the type of feed being generated.)

To fix this, I modified WP Super Cache to generate local .htaccess files inside the super cache:

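if (is_feed()) {
    // Strip any slashes from the requested feed type.
    $type = get_query_var('feed');
    $type = str_replace('/', '', $type);
    // Choose the media type that matches the feed format.
    switch ($type) {
        case 'atom':
            $mediaType = "application/atom+xml";
            break;
        case 'rdf':
            $mediaType = "application/rdf+xml";
            break;
        case 'rss':
        case 'rss2':
        default:
            $mediaType = "application/rss+xml";
    }
    // $dir comes from the surrounding plugin code and ends with a slash.
    // Write a local .htaccess so Apache serves index.html with the feed's media type.
    $htaccess = @fopen("{$dir}.htaccess", 'w');
    if ($htaccess) {
        fputs($htaccess, "AddType $mediaType .html");
        fclose($htaccess);
    }
}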

For example, the file blog/wp-content/cache/supercache/mjtsai.com/blog/feed/rss2/.htaccess changes the content type for the index.html file:

AddType application/rss+xml .html

Then I modified WP Super Cache’s wp-cache-phase2.php to remove the is_feed() check from the line:

if( !empty( $_GET ) || is_feed() || ( $super_cache_enabled == true && is_dir( substr( $supercachedir, 0, -1 ) . '.disabled' ) ) )

Now the feeds are super cached and served with the correct content types.

12 Comments

Enjoyed the post, but one question:

Where did you put the first modification (the one that generates the .htaccess file for RSS Feeds)? I put it in the function "wp_cache_ob_callback" at line 189 (after the block "if( $user_info == '' || $do_cache === true )") and it seems to be working, but I was wondering where you put it.

I put it slightly below that, just above “$new_cache = true;”. Your way should work, too, though.

It looks as if my RSS/Atom feeds aren't updated when viewed in IE or in many RSS readers, but in FF they work fine. I haven't yet put the second or third code fragments into my WP-Supercache plugin, but I wanted to check/verify that that code is still relevant a little over a year later and might fix my problem. Feeds I'm talking about are at http://www.studlife.com/feed/atom/ or http://www.studlife.com/news/feed/atom/. This is very frustrating, because RSS is not at all reliable for my site right now - hoping you can help. Thanks!

And by second/third code fragments, I actually meant the last few code fragments where you add to the Super Cache code. Thanks!

Scott: As far as I am aware, it still works with WordPress 2.8.4 and WP Super Cache 0.9.6.1.

Michael, thanks for your response. Where in the Super Cache plugin does the if (is_feed()) code snippet go? I imagine somewhere in wp_cache.php?

BTW, I couldn't imagine that this is a widespread issue for all users of WP Super Cache... or is it? Do they all have stale feeds? Why has this not been fixed in the plugin itself?

Actually, because of the line in the last code snippet in wp-cache-phase2.php that checks if( is_feed() ), shouldn't WP Super Cache not attempt to cache feeds? If so, I don't understand what it is about WP Super Cache that is making my feeds (listed above) stale.

Scott: is_feed() is described in the post. Most users don't have stale feeds because without my patch the feeds aren't (super) cached.

Hmm, well for some reason I haven't applied anything that you posted but my feeds ARE stale for logged-out users when WP Super Cache is enabled. As soon as I disable caching, the feeds get up to date. Any idea why this might be happening?

I am working on a similar issue for a customer who is having huge CPU loads with a tiny WordPress installation due to the feeds not being cached with super-cache 0.9.6.1. I did see some notes for a newer release of super-cache that mentioned RSS feeds, so perhaps the author has finally gotten around to caching feeds (though I wonder if the comments about "stale" feeds are due to a bug in a later release of super-cache).

I used your notes to help solve my problem, with these additions:
I modified the root .htaccess to redirect /feed, /?feed=rss, and /?feed=rss2 to /feed/ so no PHP code is needed for any of those accesses.

I also added another bit to the feed/.htaccess to allow the compressed feeds to output with the right MIME type too.
For my server, I had to use ForceType rather than just AddType:

<Files *.gz>
ForceType application/rss+xml
</Files>
<Files *.html>
ForceType application/rss+xml
</Files>

Thanks for your help though, it was a good start on getting the load issues under control.

I upgraded supercache this morning and verified that the latest version also doesn't cache RSS feeds, though it has an option to disable caching, which would imply it would cache the feeds if you left that option unset....?

Personally, I switched to W3 Total Cache. For me it is a good alternative. I also installed PHP Speedy. For the moment I'm satisfied with the result.
