itbrokeand.ifixit.com

Sometimes things break, and then we've got to fix them.

Matryoshka: A Configurable Caching Library for PHP

Like most websites, we make heavy use of caching to reduce load on our servers and decrease page response times. Our caching daemon of choice is memcached. The PHP extensions are certainly usable and provide all of the core functionality you could need. However, many of the patterns we rely on to make day-to-day caching easier aren't provided by the extensions.

One tool that we wrote is Matryoshka: an open source caching library for PHP which makes common operations easier and allows for on-the-fly configuration.

Configurable Behavior

Matryoshka is designed to be very configurable. You can add functionality on the fly simply by wrapping an existing Backend with a new one. We use this extensively to prefix keys, modify expiration times, disable cache gets, gather metrics, etc.

Start off with a Memcache instance:

// From the native extension.
$memcache = new Memcache();
$memcache->pconnect('localhost', 11211);
$cache = Matryoshka\Memcache::create($memcache);

Then prefix all the keys:

$prefixedCache = new Matryoshka\Prefix($cache, 'prefix-');
// The key ends up being "prefix-key".
$prefixedCache->set('key', 'value');
$value = $prefixedCache->get('key');

Finally, double all expiration times for the prefixed backend:

$doubleExpiration = function($expiration) {
   return $expiration * 2;
};
$cache = new Matryoshka\ExpirationChange(
   $prefixedCache,
   $doubleExpiration
);
// Results in an expiration time of 20 for "prefix-key".
$cache->set('key', 'value', 10);

By using composition, caching configurations can be assembled on the fly quite easily. The core API is identical for all backends so the caller doesn't need to be aware of the exact configuration. Common configurations and base backends (like memcached connections) can be made into singletons or provided using dependency injection in your application. Additionally, this architecture results in very maintainable and testable code because each class has exactly one job.
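
To make the wrapping idea concrete, here is a minimal, self-contained sketch of the same composition pattern. It is not Matryoshka's source: the SketchBackend interface, the three Sketch* classes, and the in-memory array standing in for memcached are all hypothetical names invented for this illustration.

```php
<?php
// Hypothetical names throughout; this only illustrates the wrapping pattern.
interface SketchBackend {
    public function set(string $key, $value, int $expiration = 0): bool;
    public function get(string $key);
}

// Innermost layer: an in-memory array standing in for memcached.
class SketchArrayBackend implements SketchBackend {
    public array $items = [];
    public array $expirations = [];

    public function set(string $key, $value, int $expiration = 0): bool {
        $this->items[$key] = $value;
        $this->expirations[$key] = $expiration;
        return true;
    }

    public function get(string $key) {
        // A miss is reported as false.
        return $this->items[$key] ?? false;
    }
}

// Wrapper with one job: prepend a prefix to every key.
class SketchPrefix implements SketchBackend {
    private SketchBackend $inner;
    private string $prefix;

    public function __construct(SketchBackend $inner, string $prefix) {
        $this->inner = $inner;
        $this->prefix = $prefix;
    }

    public function set(string $key, $value, int $expiration = 0): bool {
        return $this->inner->set($this->prefix . $key, $value, $expiration);
    }

    public function get(string $key) {
        return $this->inner->get($this->prefix . $key);
    }
}

// Wrapper with one job: pass expiration times through a callback.
class SketchExpirationChange implements SketchBackend {
    private SketchBackend $inner;
    private $change;

    public function __construct(SketchBackend $inner, callable $change) {
        $this->inner = $inner;
        $this->change = $change;
    }

    public function set(string $key, $value, int $expiration = 0): bool {
        return $this->inner->set($key, $value, ($this->change)($expiration));
    }

    public function get(string $key) {
        return $this->inner->get($key);
    }
}

// Assemble the layers from the inside out.
$store = new SketchArrayBackend();
$cache = new SketchExpirationChange(
    new SketchPrefix($store, 'prefix-'),
    function (int $expiration): int {
        return $expiration * 2;
    }
);
$cache->set('key', 'value', 10);
$value = $cache->get('key');
```

Because every layer implements the same interface, the caller only ever sees set and get, and each wrapper can be tested on its own against the array backend.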

Scopes

Cache invalidation is hard. To make it easier, Matryoshka provides "cache scopes" to invalidate a group of keys at once. This works by prefixing every key with a unique value, which is itself stored in the backend under a key derived from the scope name.

$cache = new Matryoshka\Scope($memcachedBackend, 'name');

// This results in a get request to memcached for 'scope-name'
// which results in something like '0fb4ae36'. This `set` call
// then results in a key of '0fb4ae36-key'.
$cache->set('key', 'value');
$cache->set('key2', 'value2');
$value = $cache->get('key'); // '0fb4ae36-key' => 'value'

// Deleting the scope results in a new scope value e.g. 'e093f71e'.
$cache->deleteScope();

// Both of these result in a miss because the scope has a new
// value so the keys are now prefixed with 'e093f71e-'.
$value = $cache->get('key'); // 'e093f71e-key' => false
$value2 = $cache->get('key2'); // 'e093f71e-key2' => false

We have found this to be particularly useful for scoping keys to code deploys. We simply put any caches that should be invalidated under the 'deploy' scope, which is deleted any time we deploy code.

Cache keys can also have dynamic scopes. In a generic example of a blog with posts, your scope name could be "post-{$postid}". Then all keys under that post's scope can be invalidated whenever that specific post is modified.

Cache scopes are an implementation of generational caching, which we make heavy use of throughout our application.
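
As a rough illustration of the generational idea, the sketch below implements a scope on top of a plain ArrayObject. It is an assumption-laden stand-in, not Matryoshka's implementation: the SketchScopedCache class, the 'scope-' key prefix, and the random 8-character hex value are chosen only to mirror the example above.

```php
<?php
// Hypothetical class; an ArrayObject stands in for the memcached backend.
final class SketchScopedCache {
    private ArrayObject $store;
    private string $scopeName;

    public function __construct(ArrayObject $store, string $scopeName) {
        $this->store = $store;
        $this->scopeName = $scopeName;
    }

    // Look up the scope's current value, creating one if it is missing.
    private function scopePrefix(): string {
        $scopeKey = 'scope-' . $this->scopeName;
        if (!isset($this->store[$scopeKey])) {
            $this->store[$scopeKey] = bin2hex(random_bytes(4));
        }
        return $this->store[$scopeKey] . '-';
    }

    public function set(string $key, $value): void {
        $this->store[$this->scopePrefix() . $key] = $value;
    }

    public function get(string $key) {
        // A miss is reported as false.
        return $this->store[$this->scopePrefix() . $key] ?? false;
    }

    // Dropping the scope value orphans every key written under it.
    public function deleteScope(): void {
        unset($this->store['scope-' . $this->scopeName]);
    }
}

$store = new ArrayObject();
$postId = 42; // hypothetical post id
$postCache = new SketchScopedCache($store, "post-{$postId}");

$postCache->set('title', 'Hello');
$postCache->set('comments', ['first']);
$before = $postCache->get('title');

// The post was modified, so invalidate everything cached for it.
$postCache->deleteScope();
$after = $postCache->get('title');
```

Note that nothing is actually deleted except the scope value. In this sketch the old entries simply become unreachable; with a real caching daemon they would be left to expire or be evicted.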

Note: Cache scopes help with cache invalidation but they unfortunately don't make naming things any easier.

Helper Functions

Matryoshka adds a few helper functions to make common operations easier. getAndSet makes populating the cache dead simple:

// Calls the provided callback if the key is not found and sets
// it in the cache before returning the value to the caller.
$value = $cache->getAndSet('key', function() {
   return 'value';
});
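
The pattern behind a helper like this can be sketched in a few lines. The function below is hypothetical (sketchGetAndSet is not part of Matryoshka), uses an ArrayObject in place of a real backend, and assumes a miss is signalled by false, as in the scope example above.

```php
<?php
// Hypothetical helper illustrating the read-through pattern.
function sketchGetAndSet(ArrayObject $store, string $key, callable $callback) {
    $value = $store[$key] ?? false;
    if ($value === false) {
        // Miss: generate the value and store it for the next caller.
        $value = $callback();
        $store[$key] = $value;
    }
    return $value;
}

$store = new ArrayObject();
$calls = 0;
$generate = function () use (&$calls) {
    $calls++;
    return 'value';
};

$first = sketchGetAndSet($store, 'key', $generate);  // miss, runs callback
$second = sketchGetAndSet($store, 'key', $generate); // hit, callback skipped
```

One caveat of this sketch: a cached value of false is indistinguishable from a miss, so the callback would run every time for such a value.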

Similarly, getAndSetMultiple makes doing multi-gets significantly easier:

// Array of key => id. The ids can be anything used to identify
// the resource that the key represents.
$keys = [
   'key-1-a' => [1, 'a'],
   'key-2-b' => [2, 'b']
];
// Calls the provided callback for any missed keys so the missing
// values can be generated and set before returning them to the
// caller. The values are returned in the same order as the
// provided keys.
$values = $cache->getAndSetMultiple($keys, function($missing) {
   // Use the ids to fill in the missing values.
   foreach ($missing as $key => $primaryKey) {
      $missing[$key] = getValueFromDb($primaryKey);
   }

   // Return the new values to be cached and merged with the hits.
   return $missing;
});
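
The multi-key version follows the same shape: split the keys into hits and misses, hand only the misses to the callback, store what comes back, and merge. The sketch below is a hypothetical stand-in (sketchGetAndSetMultiple and the ArrayObject store are assumptions, and a real backend would fetch all keys in a single multi-get rather than looping).

```php
<?php
// Hypothetical helper; $keys maps cache key => id, as in the example above.
function sketchGetAndSetMultiple(ArrayObject $store, array $keys, callable $callback): array {
    $found = [];
    $missing = [];
    foreach ($keys as $key => $id) {
        if (isset($store[$key])) {
            $found[$key] = $store[$key];
        } else {
            $missing[$key] = $id;
        }
    }

    // Only call back when something was actually missed.
    $generated = $missing ? $callback($missing) : [];
    foreach ($generated as $key => $value) {
        $store[$key] = $value;
    }

    // Rebuild the result in the order of the provided keys.
    $result = [];
    foreach ($keys as $key => $id) {
        $result[$key] = array_key_exists($key, $found)
            ? $found[$key]
            : ($generated[$key] ?? false);
    }
    return $result;
}

$store = new ArrayObject(['post-1' => 'cached one']);
$keys = ['post-1' => 1, 'post-2' => 2, 'post-3' => 3];
$requested = [];

$values = sketchGetAndSetMultiple($store, $keys, function (array $missing) use (&$requested) {
    $requested = $missing; // record which ids the callback was asked for
    foreach ($missing as $key => $id) {
        $missing[$key] = "generated {$id}";
    }
    return $missing;
});
```

Passing the ids along with the keys means the callback can load all missing resources in one batch, for example with a single database query, instead of one lookup per key.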

Try it out!

You can install Matryoshka with composer from Packagist or by cloning the repo into your project. A complete list of backends as well as more examples are available in the readme. Right now, memcached (specifically via the Memcache extension) is the only supported caching daemon, but adding others is very easy. We encourage you to try it out and contribute any caching techniques that you find useful in your own applications.

Happy caching!
