Caching Anyone?

We have been working on design patterns and implementation for adding caching to our REST services using the ServiceStack framework. And I have to say, the design patterns took quite a while to iron out through experimentation, but in retrospect, implementing them with ServiceStack has been relatively painless.

This post is long and technical, and only for those who want to support caching of their REST services.

So, why have we gone there with caching our REST services so soon in our product lifecycle?  

Answer: Because we are continuously evolving a bunch of REST services that our business depends on. We realised (the hard way) that if we wanted to minimize the risk of producing slow user experiences, we needed to optimize the data going back and forth over the wire. To do that, we needed our clients to reuse (as much as they can) any responses they have already downloaded that haven't yet changed (something that HTTP Caching can address). AND we also wanted to save on compute time and reduce the load on these services from having to re-calculate the same kinds of responses over and over again, sometimes within a few seconds of each other (something that Service Caching can address).

This is not just about saving stuff you have already downloaded (caching); this is about detecting when that stuff may have changed, and when you need to get the update efficiently and reliably (cache management). As Phil Karlton famously quipped, cache invalidation is one of the only two hard things in Computer Science. That's the trickier part, and the place many web developers haven't been or had to go before. In 20+ years of web development, I know I've not had to go too deep here before, and I suspect many others still haven't either. So how easy is it to go there?

You will need some study time. 

[BTW: For context, it's worth noting that we have many REST APIs under constant evolution (50+): a few different types of services at the 'back-end' which run the core business, and a WebAPI service at the 'front-end' that helps display the desktop and mobile UX to our clients. Consequently, the 'front-end' WebAPI does a lot of aggregation of data from the 'back-end' APIs, as it is primarily concerned with shaping all this data for user interaction. Aggregation, incidentally, brings in a whole set of other issues when HTTP caching is introduced.] [Edit: see the comments too]

Along our learning journey into applying both HTTP Caching and Service Caching to our REST services, we made some significant discoveries about a topic that, surprisingly, we had not been all that familiar with before. We needed some practical guidance. Good luck finding that with your favourite search engine.

  • The first major lesson we learned was just how poorly many web developers (ourselves included) understand HTTP caching (if at all). Have you ever studied RFC2616, section 13 before?
  • The second lesson was that there are not a lot of design patterns out there you can pick up and run with if you need to implement HTTP caching for your REST services. Lots of info and ideas (much of it misleading, of course), and very little actionable guidance to start with. Not a good start, as the devil, as usual, is in the detail with this one.
  • The third is that HTTP Caching and Service Caching are two entirely different beasts (with different goals and policies) that are vaguely related but not the same, and should not be conflated. You kinda need both to really improve the overall end-to-end performance of a REST service, but one does not replace the other. It's a 'both' decision, people.
  • And the fourth lesson we learned was how tough it is dealing with freshness between services that aggregate responses from each other.

Furthermore, at the time we started this journey, our service development framework ServiceStack had built-in hooks for us to build both HTTP Caching and Service Caching (thank god), but we still had to do a lot of work to understand how to pull the various pieces together, and implement a bunch of components to make an effective end-to-end caching strategy. Thank god for request and response filters!

[BTW: by the time you read this, that will have changed significantly, which is serendipitous for us and the whole ServiceStack community. However, part of the reason we wanted to write this post is to give further guidance on how to expand your caching strategy with ServiceStack to accommodate concepts that are not present out of the box in ServiceStack's caching strategy.]

The Patterns

Like most things, when investigating caching, it does depend on what you are trying to achieve and why. And understanding the different goals between these two patterns is key. 

On the Service Caching side, there is a fairly well-known pattern: cache most of your GET responses (for longish durations, i.e. minutes), and refresh those cached GET responses on a POST/PUT/DELETE request for the same resource - as outlined (in part) in the ServiceStack caching guidance. Depending on what verbs your specific resource supports, in many cases the API self-manages its 'freshness' using this pattern.

[Note: There is a missing part to what is often described in this pattern: for many GET APIs that return 'search'-type results containing multiple resources, you'll also want to refresh the cached GET search results if there is a chance that a POST of a new resource will add to the search results.]

On the HTTP Caching side, it turned out, there are only so many general caching requirements that can be met with the current HTTP 1.1 standard. Not that HTTP 1.1 is lacking in any way; it is that the requirements for HTTP caching are pretty well defined and accommodated by the specification - pretty ingenious actually. Given that, it seems like those design patterns should be easily definable in a simple, actionable specification agnostic of platform or language.

[One day, I see someone taking the bold step of setting up a site for standardised design patterns where they are defined in some generic unit test format that anyone can read and translate into their own language. I digress however.]

If you are not familiar with HTTP caching, you can find more practical guidance in RFC7234 (which supersedes RFC2616 section 13). It is a typically computer-sciencey specification with tons of potential ambiguity (unless you have hours to study it), so it is not surprising that a lot of developers don't know exactly how to make it actionable in practice.

There are some great and credible references on how (logically) HTTP caching should work, and how Service Caching should be done, notably:

To summarize all this: in HTTP Caching, basically, there are two concepts to grasp.

  • The first, and most familiar, is HTTP Expiration. That is where the resource server (your web service) declares that a particular resource has a TTL (time to live, or freshness), expressed either in seconds or as an absolute UTC time (watch out there!). Clients that make a request read that information from caching response headers, and those headers describe 'advice' or 'directives' that help the client decide how to cache and expire the response. It's a very familiar concept to most web developers, even if implementing it is not that common, because many web servers take care of this kind of stuff automatically for static content. But that is not the case with REST services, where the content is much more dynamic. In general, you will have to do something explicit in code or configuration to make this happen with your responses. And as soon as you do that, you need to understand the second concept in HTTP caching, or you will encounter numerous intermittent red lights in the integration testing of your API, for unpredictable time-based reasons.
  • The second concept is HTTP Validation. That is where the resource server (your web service) is programmed explicitly to validate some metadata about a response that it previously sent to a client, such as an ETag or a Last-Modified timestamp. The client explicitly asks the server to validate either an ETag or a Last-Modified date of a response that it has in its cache, by sending a special GET request ('If-None-Match' for ETags, or 'If-Modified-Since' for Last-Modified date stamps). The server performs the validation using information in the headers of the request, against information it has about the original response. The server responds either by sending a '304 - Not Modified' response (with empty content, and some up-to-date caching headers) if the validation succeeds, or with a 2XX status code along with the new response (plus up-to-date caching headers) if the validation fails.
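
The server-side validation decision can be sketched as a tiny pure function. This is our own illustrative sketch (none of these names are ServiceStack APIs), covering only the ETag comparison; a real server also handles 'If-Modified-Since' and weak validators:

```csharp
using System;

public static class HttpValidation
{
    // Returns 304 when the client's If-None-Match value matches the current
    // ETag (meaning: the cached copy is still good), otherwise 200
    // (meaning: send the full, fresh response body).
    public static int ValidateETag(string ifNoneMatch, string currentETag)
    {
        if (string.IsNullOrEmpty(ifNoneMatch) || string.IsNullOrEmpty(currentETag))
        {
            return 200; // nothing to validate against; serve the full response
        }

        // ETags are compared as opaque strings; '*' matches any stored ETag
        return (ifNoneMatch == "*" || ifNoneMatch == currentETag) ? 304 : 200;
    }
}
```

The 304 branch is the whole payoff: the client keeps its cached body, and only a handful of headers cross the wire.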

Before we get into the weeds about how you can implement any of this, I think it is worth noting that HTTP Expiration on its own has very limited utility for REST services without HTTP Validation. Why? Because unlike static images/CSS/HTML, the content of REST service responses may change at any time for anyone, frequently or infrequently, and having fresh versions of those 'representations' is critical to the usability of many REST APIs. The measure of volatility is harder to assess with APIs. So waiting for client caches to time out and expire does not produce a very responsive (as in, 'up-to-date') API, and makes for pretty poor user experiences. Expiring the cache too often, again, does not make a responsive API - it demands too many fetches. So assuming that you can just defer this commitment and fine-tune that TTL sometime in the future will cost you a lot of time, research and resources.

Service Caching versus HTTP Caching

First off, let's get something clear from the get-go. Service Caching DOES NOT EQUAL HTTP Caching. They have different design goals and different implementations, though they may share similar pieces of the overall architecture. In practice, Service Caching is likely to inform/enable HTTP Caching, but unlikely vice versa.

Service Caching is likely to involve a distributed cache (like Redis/Memcached etc. in a scalable REST service) that needs to remember responses it has already calculated for a caller, for some duration (in secs/mins).

It is a key part of a caching strategy to reduce the workload that your service has to perform in calculating responses, and free up resources to handle more requests. Let your service REST damn it! This will help your service (and data repos) scale better to increased load.

It helps to cache as much as you can for as long as you can, serving responses from memory rather than recalculating them.
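
That 'serve it from memory or recalculate it' idea can be sketched minimally like this. This is an in-memory stand-in, all names invented for illustration; in a real scalable service the store would be a distributed cache (e.g. Redis behind ServiceStack's ICacheClient):

```csharp
using System;
using System.Collections.Concurrent;

// A minimal sketch of the get-or-compute Service Caching pattern.
public class SimpleServiceCache
{
    private class Entry
    {
        public object Value;
        public DateTime ExpiresAtUtc;
    }

    private readonly ConcurrentDictionary<string, Entry> store =
        new ConcurrentDictionary<string, Entry>();

    public object GetOrAdd(string cacheKey, TimeSpan expiresIn, Func<object> calculateResponse)
    {
        Entry entry;
        if (store.TryGetValue(cacheKey, out entry) && entry.ExpiresAtUtc > DateTime.UtcNow)
        {
            return entry.Value; // served from memory, no recalculation
        }

        var response = calculateResponse(); // the expensive part we want to avoid repeating
        store[cacheKey] = new Entry { Value = response, ExpiresAtUtc = DateTime.UtcNow.Add(expiresIn) };
        return response;
    }
}
```

ServiceStack's built-in `ToOptimizedResultUsingCache()` plays this role for real, with serialization and compression thrown in.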

But beware! You will still need to authorize the caller on each call, and you still need to be aware of who the caller is, because many REST responses can look different depending on who is calling the same URI. So, let's summarise the main concerns here:

  • Reduce compute load and increase availability, by improving response times (for scalable services using distributed caching)
  • Maintaining accurate Authorization of the caller
  • Dealing with multiple representations of the same resource for different callers (and never mixing them)

HTTP Caching involves using a cache on a client (i.e. the browser cache, or your own cache for your own client) to remember responses (and the caching 'advice') provided by a service for some duration (secs). This advice informs clients when to expire and when to validate the cached response. You are depending on the client to do the right things, as directed by the service.

It is a key part of your caching strategy to reduce the amount of data that is transmitted over the wire between client and service. The goal is to reduce the network latency of fetching data, to help improve your desktop/mobile UX, and to reduce the load on the service as more people use it concurrently.

Also, when you have more and more users use your service, it makes it a cinch to stand up intermediary caches geographically closer to your users so they can share these responses.

It helps for the client to only ask for fresh responses when the data that it has cached is actually out of date (i.e. there needs to be a mechanism for it to tell when that time is right).

But beware! You still need to serve the right representation to the right caller, and you need to always be serving the most up-to-date representation to that caller. So, let's summarise the main concerns here:

  • Improving network latency times (using a client cache of some kind, and using HTTP validation)
  • Serving the right representation of a resource to the right caller
  • Always serving the most up-to-date representation to that caller
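
The client-side behaviour HTTP Caching relies on can be sketched as a small decision function: use the cached copy while fresh, revalidate with a conditional GET once stale. This is a hedged illustration (real HTTP caches weigh many more directives; all names here are ours):

```csharp
using System;

public enum CacheAction { UseCached, Revalidate, FetchFull }

public static class ClientCachePolicy
{
    public static CacheAction Decide(DateTime? cachedAtUtc, TimeSpan? maxAge, string eTag, DateTime nowUtc)
    {
        if (cachedAtUtc == null)
        {
            return CacheAction.FetchFull; // nothing cached yet: full GET
        }

        if (maxAge.HasValue && nowUtc < cachedAtUtc.Value.Add(maxAge.Value))
        {
            return CacheAction.UseCached; // still fresh: no request at all
        }

        // Stale: revalidate ('If-None-Match') if we have a validator,
        // otherwise re-fetch in full.
        return eTag != null ? CacheAction.Revalidate : CacheAction.FetchFull;
    }
}
```

Note how, without a validator (ETag), a stale entry forces a full fetch - which is exactly why HTTP Expiration alone has limited utility for REST services.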

So, it turns out in practice that you can absolutely have Service Caching without HTTP Caching, and you can have HTTP Caching without Service Caching, and both will improve the performance of your REST services dramatically (one or two orders of magnitude perhaps). But working together, they far outperform each other separately.

In an architecture where you have many REST services, at different layers, like we do, we like to think that each REST service itself has both a Service Cache at the front of it, and a Client Cache at the back of it, for talking to other services out there in our architecture or on the Internet. So for us in practice HTTP caching is really thought of as client-side caching for the rest of this discussion.

So how did we implement both Service Caching and Client Caching in our architecture using ServiceStack?

Service Caching (with ServiceStack)

We applied the following policy for Service Caching on the front side of each of our ServiceStack REST services:

  • For each GET service operation in a resource, we decide (case by case) if the response should/could be cached or NOT. Very few should not be, like images perhaps, and certain authorization functions perhaps. It's entirely your call.
  • If we are caching this GET response, we decide on the expiration duration, which for us is a T-Shirt size duration (i.e. VeryShort, Short, Medium, Long, VeryLong) for how long we should cache the GET response. Many REST resources may change infrequently or frequently depending on exactly what our users are doing, and so predicting when they change is near impossible, so the default we chose was a duration of 'Short'. There are some REST resources that change very rarely (if ever at all), like uploaded images or location data, so they get a 'VeryLong' duration instead. The actual values of these T-Shirt size durations we define in the configuration for our deployed service, so that we can tune the durations over time without recompiling the code. For example, 'Short' for us is currently 5 mins, 'VeryLong' is 12 hours, 'VeryShort' is 60 seconds. Relatively speaking, these are long durations, because the API is managing 'freshness' itself.
  • Then we decide if the GET response (representation) will be unique depending on which user is calling. You won't know this upfront until you have implemented the API itself. But some API's verify the caller, or produce different responses for different users. Some produce the same representation of a resource for all users.
  • For each POST/PUT/DELETE verb in the same resource, we decide what possible side effects that POST/PUT/DELETE would have on each of the cached GET representations. If a cached GET representation is likely to be impacted, we wipe out all the GET representations that are cached upon the request for the POST/PUT/DELETE. Remember that some POSTs can create new resources that would be included in the cached results of some of the GET verbs (i.e. search results).
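
As a rough illustration of the T-Shirt sizes, here is what the mapping might look like. The 'VeryShort', 'Short' and 'VeryLong' values match the figures quoted above; 'Medium' and 'Long' are assumptions, and in our services all of these live in deployment configuration so they can be tuned without recompiling:

```csharp
using System;
using System.Collections.Generic;

public enum CacheExpirationDuration { VeryShort, Short, Medium, Long, VeryLong }

public static class CacheDurations
{
    // Example values only; in production these come from configuration.
    private static readonly Dictionary<CacheExpirationDuration, TimeSpan> Configured =
        new Dictionary<CacheExpirationDuration, TimeSpan>
        {
            { CacheExpirationDuration.VeryShort, TimeSpan.FromSeconds(60) },
            { CacheExpirationDuration.Short, TimeSpan.FromMinutes(5) },
            { CacheExpirationDuration.Medium, TimeSpan.FromMinutes(30) }, // assumed value
            { CacheExpirationDuration.Long, TimeSpan.FromHours(1) },      // assumed value
            { CacheExpirationDuration.VeryLong, TimeSpan.FromHours(12) },
        };

    public static TimeSpan ExpiresIn(CacheExpirationDuration size)
    {
        return Configured[size];
    }
}
```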

The way we implement this caching policy is using declarative attributes on each of the service operations of each of the REST services we have.

We have a simple configuration for the caching T-Shirt sizes. And we have a base class (for all our services) that has a method for fetching a cached response (from an ICacheClient), and for generating new responses by calling a delegate which includes all the code to calculate a new response, which is then cached. We also calculate an ETag for every cached response (an MD5 digest), and we save that ETag alongside our cached response. Our ICacheClient in production is Redis, and we calculate a cache key based off the actual request (URI PathInfo + QueryString + the calling user's ID).
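
Deriving an ETag as an MD5 digest of the serialized response body can be sketched like this (illustrative only; our real implementation lives in the base class, and the double quotes follow the HTTP convention for entity tags):

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class ETagGenerator
{
    // Hash the serialized response so that any change to the body
    // produces a different (opaque) ETag value.
    public static string FromResponseBody(string serializedResponse)
    {
        using (var md5 = MD5.Create())
        {
            var hash = md5.ComputeHash(Encoding.UTF8.GetBytes(serializedResponse));
            return "\"" + BitConverter.ToString(hash).Replace("-", string.Empty).ToLowerInvariant() + "\"";
        }
    }
}
```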

Also, the code that returns the cached response runs long after the caller has been identified and authorized.

Here is the code for a typical REST service in our architecture. We like cars:

internal partial class Cars : ServiceBase, ICars
{
    public ICarsManager CarsManager { get; set; }

    // Route: /cars/{Id}
    [CacheResponse(CacheExpirationDuration.Short, CacheRepresentation.PerUser)]
    [HttpClientShouldCacheResponse(CacheExpirationDuration.Short)]
    public object Get(GetCar body)
    {
        return ProcessCachedRequest(body, HttpStatusCode.OK, () =>
        {
            var response = this.CarsManager.GetCar(this.Request, body);

            return response;
        });
    }

    // Route: /cars/search
    [CacheResponse(CacheExpirationDuration.Short, CacheRepresentation.AllUsers)]
    [HttpClientShouldCacheResponse(CacheExpirationDuration.Short)]
    public object Get(SearchCars body)
    {
        return ProcessCachedRequest(body, HttpStatusCode.OK, () =>
        {
            var response = this.CarsManager.SearchCars(this.Request, body);

            return response;
        });
    }

    // Route: /cars
    [CacheResetRelatedResponses("/cars/search")]
    public CreateCarResponse Post(CreateCar body)
    {
        return ProcessRequest(body, HttpStatusCode.Created, () =>
        {
            var response = this.CarsManager.CreateCar(this.Request, body);
            this.SetLocationHeader(GetCreateCarResponseId(response));

            return response;
        });
    }

    // Route: /cars/{Id}
    [CacheResetRelatedResponses("/cars/{Id}")]
    [CacheResetRelatedResponses("/cars/search")]
    [CacheResetRelatedResponses("/cars")]
    public UpdateCarResponse Put(UpdateCar body)
    {
        return ProcessRequest(body, HttpStatusCode.Accepted, () =>
        {
            var response = this.CarsManager.UpdateCar(this.Request, body);

            return response;
        });
    }

    // Route: /cars/{Id}
    [CacheResetRelatedResponses("/cars/{Id}")]
    [CacheResetRelatedResponses("/cars/search")]
    [CacheResetRelatedResponses("/cars")]
    public DeleteCarResponse Delete(DeleteCar body)
    {
        return ProcessRequest(body, HttpStatusCode.Accepted, () =>
        {
            var response = this.CarsManager.DeleteCar(this.Request, body);

            return response;
        });
    }


    ... Other service operations
}

 

A few things to observe here:

  1. Our service derives from our base class 'ServiceBase', which includes the methods 'ProcessRequest' and 'ProcessCachedRequest', which we will talk about soon.
  2. The injected ICarsManager is just our way, in our architecture, of delegating the actual response creation to another layer that has the actual code that applies various business rules and constraints, fetches data from the data store, etc. Your implementation would be different.
  3. The GET service operations that will have cached responses return `object` rather than their associated `IReturn<Dto>`. That's because the 'ProcessCachedRequest()' method uses ServiceStack's `ToOptimizedResultUsingCache()` method, which returns a `CompressedResult`, not the DTO result.
  4. The two GET operations are attributed with `[CacheResponse(CacheExpirationDuration.Short, CacheRepresentation.PerUser)]`, which declares the T-Shirt size expiration and the response representation (i.e. whether the response is specific to the calling user, or is the same for all users). As you can see, the 'GET /cars/{Id}' response will return data that is unique to each user who is calling, but the 'GET /cars/search' response will be the same for whomever is calling. This is a hugely significant part of any caching strategy.
  5. The two GET operations are also attributed with `[HttpClientShouldCacheResponse(CacheExpirationDuration.Short)]`, which is our lead-in to HTTP caching, which we will talk about later. For now, all you need to know is that these attributes will yield HTTP response caching headers that 'advise' the HTTP client on how to cache, and whether or not the response is cacheable at all. Without this attribute, no HTTP caching headers are generated for the response, and the default on the internet is not to cache a response, especially if the response is over HTTPS. The T-Shirt size durations here have vastly different value ranges than the ones used for Service Caching. For example, 'Short' here is 5 secs, 'VeryLong' is an hour, and 'VeryShort' is 1 sec at present. That is because we want the HTTP clients to be validating on a different frequency than the actual caching of responses on the service.
  6. The POST operation is attributed with a `[CacheResetRelatedResponses("/cars/search")]` attribute. The argument is a GET route (somewhere else in the API) whose response, if previously cached, will be wiped out when this request runs. In the case of the 'cars' API, we know that if a new car resource is created in this POST operation, then the 'GET /cars/search' operation response should be re-calculated, since it could now include the newly POST'ed resource.
  7. The PUT and DELETE operations are attributed with several `[CacheResetRelatedResponses]` attributes, because they either update or delete a specific resource; how many of these attributes are present, and what routes they have, depends on how many GET APIs your resource has and whether an update affects their data. In this case, we only had two cached GET operations. What is interesting here is that the route of the `[CacheResetRelatedResponses("/cars/{Id}")]` attribute includes a substitution '{Id}' that must be made at runtime, so that only the specified cached result is wiped from the cache. This attribute in fact uses the current PUT/DELETE request to make that substitution, based on its DTO, so that only the current 'car' is wiped from the cache, rather than all 'cars'. This means that your REST API must be designed so that the GET/PUT/POST/DELETE verbs have consistent and harmonious routes (as REST recommends anyway).
  8. Out of interest, we have over 50 of these kinds of services in our architecture (at present), each with on average 5-10 different verbs. You may have noticed that this code looks very regular as it is, since we have a pattern toolkit generate it for us from a simple visual DSL, based on a few properties for each verb. This utterly avoids manual human error when you are building lots of these things, and keeps everything absolutely consistent.
REST Toolkit, showing the metadata required to generate a ServiceStack service interface that includes service caching.


So what do these `[CacheResponse]` and `[CacheResetRelatedResponses]` attributes do? And what about that `ProcessCachedRequest()` method? And any other pieces we need here?

Let's talk attributes and the declarative.

The `[CacheResponse]` attribute is simply a RequestFilterAttribute, that you place on a GET service operation:

[AttributeUsage(AttributeTargets.Method, Inherited = false)]
public class CacheResponseAttribute : RequestFilterAttribute
{
    internal const int FilterPriority = RequireRolesAttribute.FilterPriority + 1;
    private CacheExpirationDuration expiresIn;

    protected CacheResponseAttribute()
        : this(CacheExpirationDuration.Short)
    {
    }

    public CacheResponseAttribute(CacheRepresentation representation = CacheRepresentation.PerRequest)
        : this(DefaultDuration, representation)
    {
    }

    public CacheResponseAttribute(CacheExpirationDuration expiresIn,
        CacheRepresentation representation = CacheRepresentation.PerRequest)
        : base(ApplyTo.Get)
    {
        this.expiresIn = expiresIn;
        Representation = representation;
        Priority = FilterPriority;
    }

    public CacheRepresentation Representation { get; private set; }

    public TimeSpan ExpiresIn
    {
        get { return GetExpiresInFromConfiguration(Configuration, this.expiresIn); }
    }

    public override void Execute(IRequest req, IResponse res, object requestDto)
    {
        req.Items.Set(RequestItemKey, new CacheInfo
        {
            ExpiresIn = ExpiresIn,
            Representation = Representation
        });
    }
}

All it does is calculate the actual duration (in secs) from the configured `CacheExpirationDuration`, using values in our configuration file (omitted). Then it saves that duration and the `CacheRepresentation` in a structure that is put into the `IRequest.Items` collection for use later. That's it.

Note: This attribute must run after any Authorization filters you may have in your service (i.e. Priority - see https://github.com/ServiceStack/ServiceStack/wiki/Order-of-Operations for guidance). Forget that, and you could end up serving cached representations for one user to another user, or worse, to any anonymous caller!!! Definitely a mistake to avoid for caching rookies!

The `[CacheResetRelatedResponses]` attribute is simply another RequestFilterAttribute, that you place on a POST/PUT/DELETE service operations (you can have many):  

[AttributeUsage(AttributeTargets.Method, Inherited = false, AllowMultiple = true)]
public class CacheResetRelatedResponsesAttribute : RequestFilterAttribute
{
    internal const int FilterPriority = RequireRolesAttribute.FilterPriority + 1;

    protected CacheResetRelatedResponsesAttribute()
        : base(ApplyTo.Put | ApplyTo.Delete | ApplyTo.Post)
    {
        Priority = FilterPriority;
    }

    public CacheResetRelatedResponsesAttribute(string routePattern)
        : this()
    {
        Guard.AgainstNullOrEmpty(() => routePattern, routePattern);

        RoutePattern = routePattern;
    }

    public string RoutePattern { get; set; }

    public override void Execute(IRequest req, IResponse res, object requestDto)
    {
        if (req.Items.ContainsKey(RequestItemKey))
        {
            var existingInfo = (CacheResetInfo)req.Items[RequestItemKey];
            if (!existingInfo.RoutePatterns.Contains(RoutePattern))
            {
                existingInfo.RoutePatterns.Add(RoutePattern);
            }
        }
        else
        {
            req.Items.Add(RequestItemKey, new CacheResetInfo
            {
                RoutePatterns = new List<string> { RoutePattern }
            });
        }
    }
}

It simply takes the declared routes and saves them to the `IRequest.Items` collection for the service operation to intercept later in the request pipeline.

OK, so both these filters run on a service operation, and they both pass declarative information into the current request pipeline. At some time shortly after they run, the service operation itself is executed and the `ServiceBase.ProcessCachedRequest()` method is run, which will read that information and act upon it.

Here's what essentially happens (for cached GET responses only):

    public ICurrentCaller Caller { get; set; }

    public IServiceInterfaceCacheClient ServiceInterfaceCache { get; set; }

    protected object ProcessCachedRequest(object request, HttpStatusCode code, Func<object> action)
    {
        var cacheInfo = Request.Items.GetValueOrDefault(CacheResponseAttribute.RequestItemKey) as CacheInfo ??
            CacheResponseAttribute.GetDefaultCacheInfo(Configuration);

        var response = ServiceInterfaceCache.GetCachedResponse(Request, Response, Caller,
            cacheInfo.Representation, cacheInfo.ExpiresIn, () => action());

        var cacheResetInfo = Request.Items.GetValueOrDefault(CacheResetRelatedResponsesAttribute.RequestItemKey) as CacheResetInfo;
        if (cacheResetInfo != null)
        {
            cacheResetInfo.RoutePatterns
                .ForEach(pattern =>
                {
                    ServiceInterfaceCache.RemoveAllRelated(Request, pattern);
                });
        }

        SetResponseCode(code);

        return response;
    }

The `ICurrentCaller` implementation obtains the ID of the calling user. In our case we get that from the OAuth 'access_token' that came in the request. Your implementation may be different.

  • The `IServiceInterfaceCache.GetCachedResponse` method essentially uses the `ToOptimizedResultUsingCache()` method already built into ServiceStack, but also calculates and caches an 'ETag' value for the response, as well as caching the response itself (if none is already cached). It then generates the following response headers: 'Cache-Control: no-cache, max-age=<duration>', 'Expires: <now+duration>', 'ETag: <etag>'.
  • The `IServiceInterfaceCache.RemoveAllRelated()` method essentially goes through all the configured routes, makes any substitutions from data in the current request DTO, and wipes out any cached values matching them.
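
For the curious, the caching headers on a cached 200 response end up looking something like this (all values illustrative, for a 'Short' HTTP duration of 300 seconds):

```
HTTP/1.1 200 OK
Content-Type: application/json
Cache-Control: no-cache, max-age=300
Expires: Sat, 23 Apr 2016 10:05:00 GMT
ETag: "fe12e8a51c3c6dd8f92cc0f6e4b6b384"
```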

Note: it is funny that the 'Cache-Control' header includes the keyword 'no-cache'. Many developers first think that this keyword tells the client 'DO NOT CACHE THIS RESPONSE', but actually it means 'Please do not rely on the value in your cache being fresh, and please, please, please re-validate it first before using it'.

It is also worth noting that we calculate cache keys in this format: "{IRequest.PathInfo}?{IRequest.QueryString}.{UserId}". We add the user ID on the end of the cache key, obviously so we can cache responses for different users if specified by the GET service operation, but also so that we can use a '*' wildcard when it comes time to wipe out all cached entries for a particular request. In essence, we want to wipe out each and every 'PerUser' cached response for the same URI, so basically we wipe all entries with this cache key prefix: "{IRequest.PathInfo}?{IRequest.QueryString}". Redis, for example, has terrible performance penalties for doing searches on cache keys, so we have to have a simple cleanup strategy, using wildcards.
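
Here is a minimal sketch of that key scheme and the wildcard-style wipe, using an in-memory dictionary in place of Redis (the prefix match stands in for a 'KEYS pattern*' scan; all names are illustrative):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CacheKeys
{
    // Cache key format: "{PathInfo}?{QueryString}.{UserId}"
    public static string For(string pathInfo, string queryString, string userId)
    {
        return pathInfo + "?" + queryString + "." + userId;
    }

    // Wipe every 'PerUser' cached representation of the same URI,
    // whichever user it was cached for.
    public static int RemoveAllForUri(IDictionary<string, object> cache, string pathInfo, string queryString)
    {
        var prefix = pathInfo + "?" + queryString + ".";
        var doomed = cache.Keys.Where(key => key.StartsWith(prefix, StringComparison.Ordinal)).ToList();
        foreach (var key in doomed)
        {
            cache.Remove(key);
        }
        return doomed.Count;
    }
}
```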

The last piece of Service Caching puzzle that we need to mention is the piece that enables HTTP Validation to work at all. This is a bridge to HTTP Caching, and strictly speaking is part of the HTTP Caching policy, but we are adding it here for now. 

HTTP Validation requires that the client sends an 'If-None-Match' (or 'If-Modified-Since') header in a GET request when it wants to validate that a cached response it has (in its cache) has not changed yet, to avoid the needless download of a fresh version of that response. The theory is that this validation check is faster and lighter than a full request.

So the client would call one of our GET service operations and include in the GET request the 'If-None-Match: <ETag>' header, with the 'ETag' that it received in the response headers of a previous GET.

So without getting into how the client does this yet (read later in HTTP Caching), we need a way for the service to quickly and efficiently check the ETag of a cached response it has against the ETag in an 'If-None-Match' header presented by a client, and if they match, send back a '304 - Not Modified'.

This is straightforward because not only does each service operation cache the response, but it also caches the ETag, and Expiration information along with it. This enables us to create a simple decoupled GlobalRequestFilter that looks out for the 'If-None-Match' header, and when one comes in, it then looks in the service cache for the request. If it finds a cached response, and the ETag matches, it responds with '304 - Not Modified'.

If this filter runs before any service operation, we have a very efficient, and decoupled mechanism to do HTTP Validation, that is cheap in terms of compute and network latency.

This GlobalFilter is configured in the AppHost.Configure() of all our services:

public static Action<IRequest, IResponse, object> HandleHttpExpirationByETagFilter()
    {
        return (request, response, dto) =>
        {
            if (request.Verb != HttpMethods.Get)
            {
                return;
            }

            var eTagFromRequest = request.Headers[HttpHeaders.IfNoneMatch];
            if (!eTagFromRequest.HasValue())
            {
                return;
            }

            var configuration = request.TryResolve<IConfigurationSettings>();
            var serviceCache = request.TryResolve<IServiceInterfaceCacheClient>();
            var caller = request.TryResolve<ICurrentCaller>();

            if (eTagFromRequest.EqualsIgnoreCase(CachingConstants.PurgeETag))
            {
                PurgeCache(request, serviceCache);
                return;
            }

            var eTagFromCache = GetETagFromCache(request, caller, serviceCache);
            if (eTagFromCache == null)
            {
                return;
            }

            if (!IsETagMatch(eTagFromCache, eTagFromRequest))
            {
                return;
            }

            response.AddCachingExpirationHeaders(GetConfiguredHttpCacheDuration(configuration), eTagFromCache);
            response.StatusCode = (int)HttpStatusCode.NotModified;
            response.EndRequestWithNoContent();
        };
    }

It is a simple filter, but it must run before any other request filter, and it must be quick to make HTTP Validation worthwhile.
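For completeness, the registration in AppHost.Configure() might look something like this (a sketch; the `CacheFilters` class name is an assumption, and ServiceStack runs global request filters in registration order):

```csharp
// Sketch: wiring the filter up first in AppHost.Configure(), so it runs
// before any other global request filter.
public override void Configure(Container container)
{
    GlobalRequestFilters.Add(CacheFilters.HandleHttpExpirationByETagFilter());

    // ... all other global request filters are registered after it
}
```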

  • Notice that when it sends a '304 - Not Modified' it also sets the following cache headers: 'Cache-Control: no-cache, max-age=<duration>', 'Expires: <now+duration>' and 'ETag: <etag>', just so the client can update its cache again.
  • It must not return any content with a '304' response.
  • Because this filter runs before all other filters, if the ETags don't match (typically because the server response has been updated since the client last cached it), the service operation will run shortly after this check and send back the freshest response, which the client can then cache and use.

Note: There is an interesting caveat here. It turns out that the MS HTTP clients, including `HttpWebRequest` and `HttpClient`, throw an exception when they see a '304 - Not Modified' in the response stream! So MS clients will never have access to the additional caching information that this filter (and others like it) returns, even though it should be there according to the HTTP 1.1 spec. I guess Microsoft had their reasons for throwing instead of returning a '304', but it is somewhat inconvenient for client caches on the MS platform not to have that updated caching information. 

OK, that is it for Service Caching. It amounts to: (1) a set of declarative RequestFilter attributes for each service operation you want to cache and reset, (2) a base class to do the actual caching (using ICacheClient) and setting HTTP Expiration headers, and (3) a global RequestFilter to handle HTTP Validation with 'If-None-Match' and ETags.

The salient points to remember are:

  • Service Caching is designed to avoid re-calculating responses that can be re-used over again until they either expire, or are updated through their own API. In fact, because many resource APIs are self-managing, you can theoretically cache them for very long periods of time (i.e. hours).
  • Your Service Caching strategy provides the mechanisms for HTTP Caching, by supplying caching 'advice' in the HTTP Expiration headers of cached responses, and by handling HTTP Validation of cached responses in client caches.

Next, we can move on to Client Caching with ServiceStack and how we address that.

Client Caching (with ServiceStack)

To complete the HTTP Caching picture, your clients need to have a cache, and they need to comply with the HTTP caching 'advice' provided by your service (in the form of HTTP response headers) for each response.

In an architecture like ours, just about every REST service communicates with other REST services, and therefore they all have the opportunity to cache responses from those other services. It's beautiful when it's all working end to end.

At the very front end of our architecture is a browser or mobile app that is also a client, and must also have a cache. Browsers already do the right thing, but our mobile app needs a cache of its own.

We make extensive use of the JsonServiceClient and the other typed clients in ServiceStack, so the job for us became how to add client caching to them.

We applied the following policy for Client Caching for each of our clients:

  • When a response is requested by the client, ask for it from the client cache first.
  • If the response is not cached (i.e. it never was, or it expired), then make a call for a new response from the origin service.
  • When any response comes back (assuming a 2XX status code), cache the response, along with any caching 'advice' that comes with it. Expect headers like 'Cache-Control' or 'Expires', and 'ETag' or 'Last-Modified'. As long as we get either 'Cache-Control' or 'Expires', and not 'Cache-Control: no-store', we remember the 'max-age' or 'Expires' value and the ETag along with the response.
  • If a response is found in the cache, AND it has an ETag, then make an HTTP Validation call to the origin service, setting the 'If-None-Match' header with the ETag in the GET request.
  • If the 'If-None-Match' validation check [throws] a '304 - Not Modified' (exception), we update the expiry of the originally cached response (and renew the caching advice) and then return the cached response.
  • If the validation check returns a 2XX status with some data, then we cache that new response, and any caching advice included, and return the newly cached response.

The way we implemented this client cache was by extending our own JsonServiceClient and adding a 'ClientCache' property. If the 'ClientCache' property is set, we use the cache; otherwise we don't.

[BTW: We already implement our own JsonServiceClient because, in our architecture, all of our services are protected by OAuth, so we decided to extend the JsonServiceClient to manage the retrieval and renewal of OAuth 'access-tokens' for us. Adding client caching was just another convenience.]

ServiceStack's JsonServiceClient already has the hooks that are needed to implement client caching. They are the `ResultsFilter` and the `ResultsFilterResponse` delegates, which are called just before a request is made, and just after a response is returned, respectively. By hooking these delegates, you get the chance to serve cached responses from a cache, and store responses in your cache, as described here: https://github.com/ServiceStack/ServiceStack/wiki/C%23-client#custom-client-caching-strategy

Using another `ICacheClient` instance under the covers, we implemented an `IServiceClientCache` class that does what is described in the policy above, and set it on the JsonServiceClient instance.

 

This is our extended JsonServiceClient:

public class JsonServiceClient : ServiceStack.JsonServiceClient, IJsonServiceClient
{
    public JsonServiceClient(string baseUrl)
        : base(baseUrl)
    {
        ResultsFilter = FetchResponseFromClientCache;
        ResultsFilterResponse = CacheResponseInClientCache;
    }

    public IServiceClientCache ClientCache { get; set; }

    public TResponse Purge<TResponse>(IReturn<TResponse> requestDto)
    {
        return Purge<TResponse>((object)requestDto);
    }

    public TResponse Purge<TResponse>(object requestDto)
    {
        Action<HttpWebRequest> filter = request =>
        {
            request.Headers.Add(HttpHeaders.IfNoneMatch, CachingConstants.PurgeETag);
        };

        try
        {
            RequestFilters.Add(filter);
            return Get<TResponse>(requestDto);
        }
        finally
        {
            RequestFilters.Remove(filter);
        }
    }


    ... other custom JsonServiceClient stuff

    private void CacheResponseInClientCache(WebResponse webResponse, object response, string httpMethod,
        string requestUri, object request)
    {
        if (ClientCache == null)
        {
            return;
        }

        ClientCache.CacheResponse(webResponse, response, request, httpMethod, requestUri);
    }

    private object FetchResponseFromClientCache(Type responseType, string httpMethod, string requestUri,
        object request)
    {
        if (ClientCache == null)
        {
            return null;
        }

        var validationClient = new JsonServiceClient(BaseUri)
        {
            RequestFilter = RequestFilter,
            RequestFilters = RequestFilters,
            CookieContainer = CookieContainer,
            ResultsFilter = null,
            ResultsFilterResponse = null,
            ClientCache = null,
        };
        Headers.ToDictionary().ForEach((name, value) =>
        {
            validationClient.Headers.Add(name, value);
        });

        return ClientCache.GetCachedResponse(responseType, request, httpMethod, requestUri, validationClient);
    }
}

 

A few things to observe here:

  1. In the constructor we register the delegates `ResultsFilter` and `ResultsFilterResponse`
  2. The `ClientCache` property is injectable and optional, as can be seen in the delegates themselves
  3. The 'Purge' methods we will discuss later.
  4. The 'FetchResponseFromClientCache' method (called when the JsonServiceClient wants to make a request across the wire) actually creates a new instance of a JsonServiceClient (validationClient), clones the headers, any filters and the CookieContainer from the current instance, then passes that validationClient into the 'IServiceClientCache.GetCachedResponse()' method. This is done because, if the response is found in the client cache and it has an ETag, we need to call across the wire to do our HTTP Validation check with the 'If-None-Match'. So we need to use the validationClient instance to do that, which means it had better have all the context (headers, filters, cookies etc.) that the calling JsonServiceClient had (for example, any Authorization or CSRF headers included in the request).
  5. The 'IServiceClientCache.GetCachedResponse()' method essentially looks in the cache for a cached response, and if none is found, returns null. If one is found in the cache, and it has an ETag, then it makes the validation call, and if it gets '304 - Not Modified' it returns the cached response. Otherwise, it returns null, which signals to the `ResultsFilter` delegate that the client must go across the wire to get a fresh response.
  6. 'CacheResponseInClientCache()' simply stores the response in the cache, using the Request.AbsoluteUri as the cache key.

This is what `IServiceClientCache.GetCachedResponse()` looks like:

    public object GetCachedResponse(Type responseType, object request, string httpMethod, string requestUri,
        IJsonServiceClient client)
    {
        if (httpMethod.NotEqualsIgnoreCase(HttpMethods.Get))
        {
            return null;
        }

        var cachedResponse = Storage.Get(requestUri, responseType);
        if (cachedResponse == null)
        {
            return null;
        }

        if (IsWithinRevalidateBuffer(cachedResponse.LastValidated))
        {
            return cachedResponse.Response;
        }

        if (!cachedResponse.ETag.HasValue())
        {
            return cachedResponse.Response;
        }

        try
        {
            client.Headers.Add(HttpHeaders.IfNoneMatch, cachedResponse.ETag);
            client.Get<HttpWebResponse>(requestUri);
        }
        catch (WebServiceException ex)
        {
            if (ex.StatusCode == (int)HttpStatusCode.NotModified)
            {
                //TODO: If .NET did not throw exception for 304, we would update the cached expiry with the returned headers
                Storage.Store(requestUri, cachedResponse.Response, cachedResponse.ExpiresIn, cachedResponse.ETag,
                    GetTimeNow());
            }

            return cachedResponse.Response;
        }
        finally
        {
            client.Headers.Remove(HttpHeaders.IfNoneMatch);
        }

        // TODO: Since we cannot return the typed response (of the HttpWebResponse) to the caller, we instead return null, and delete cachedresponse
        Storage.Delete(requestUri);

        return null;
    }

 

A couple observations:

  1. The 'Storage.Get()' method returns a structure with the response and the expiration and validation 'advice' that was stored with the response, as well as a timestamp of the last time it was re-validated.
  2. The `IsWithinRevalidateBuffer()` check is done because we found in practice that, in some cases, particularly when aggregating data, our client code was making repeated calls for the same resource very close together in time. This buffer (currently about 3 seconds) prevents us making re-validation calls across the wire so close together for the same resource. It is simply an optimization that, in practice, saves a bunch of 'If-None-Match' checks across the wire in a time-frame where we think the response is unlikely to have changed.
  3. Unfortunately, the MS .NET clients that JsonServiceClient is built upon throw an exception when they get a '304 - Not Modified'. That's a shame. We have to deal with it though, in a non-ideal way.
  4. Unfortunately, with ServiceStack, if you want to read the HTTP response headers you have to do a client.Get<HttpWebResponse>(), which forces you to de-serialise the response yourself, a minefield we didn't want to cross. (It would be nice to have a static method on JsonServiceClient to do this for us.) Instead, we decided to return null, triggering the calling client to make the GET request again to fetch the fresh response. Not ideal, and something we could and probably should improve on, to save one more call across the wire.

This is what `IServiceClientCache.CacheResponse()` looks like:

    public void CacheResponse(WebResponse webResponse, object response, object request, string httpMethod,
        string requestUri)
    {
        if (httpMethod.NotEqualsIgnoreCase(HttpMethods.Get))
        {
            return;
        }

        var rules = CachingAdvisor.GetCachingAdvice(webResponse);
        if (rules.ExpiresIn == 0)
        {
            return;
        }

        Storage.Store(requestUri, response, rules.ExpiresIn, rules.ETag, DateTime.MinValue);
    }

Nice and simple, just cache the response for next time.

The `CachingAdvisor.GetCachingAdvice()` method simply processes the possible HTTP response caching headers (i.e. Cache-Control, Expires, ETag, etc.); we then save that advice along with the response, and act on it when we are asked for the response from the cache later.
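A minimal sketch of what such an advisor could look like (the `CachingAdvice` shape here is an assumption inferred from the `rules.ExpiresIn` and `rules.ETag` usage above, and the header parsing is deliberately simplified):

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical shape of the caching 'advice' extracted from a response.
public class CachingAdvice
{
    public long ExpiresIn { get; set; } // in seconds; 0 means "don't cache"
    public string ETag { get; set; }
}

public static class CachingAdvisor
{
    public static CachingAdvice GetCachingAdvice(WebResponse webResponse)
    {
        var advice = new CachingAdvice();

        // Prefer 'Cache-Control: max-age', but honour 'no-store'
        var cacheControl = webResponse.Headers["Cache-Control"];
        if (cacheControl != null && !cacheControl.Contains("no-store"))
        {
            var match = Regex.Match(cacheControl, @"max-age=(\d+)");
            if (match.Success)
            {
                advice.ExpiresIn = long.Parse(match.Groups[1].Value);
            }
        }

        // Fall back to an absolute 'Expires' date
        if (advice.ExpiresIn == 0)
        {
            DateTime expiresAt;
            var expires = webResponse.Headers["Expires"];
            if (expires != null && DateTime.TryParse(expires, out expiresAt))
            {
                advice.ExpiresIn = (long)Math.Max(0,
                    (expiresAt.ToUniversalTime() - DateTime.UtcNow).TotalSeconds);
            }
        }

        advice.ETag = webResponse.Headers["ETag"];
        return advice;
    }
}
```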

And that's about it for client caching.

Making It Simpler

Now, as it turns out, while we were finalising the client caching side of this pattern, and after much conferring on the ServiceStack forums (https://forums.servicestack.net/t/jsonserviceclient-http-client-cache/2115/58) on how to integrate it with ServiceStack, Demis (ServiceStack's creator) was also taking client-side caching in ServiceStack to the next level, in the pre-release version of ServiceStack v4.0.55, which just happened to also contain a bug-fix that we needed to get our client caching working. You could call that serendipitous too! Good on ya Demis!

So getting some of the benefits you have seen above is a lot simpler in ServiceStack now than it was then. Demis has done a bunch of work to make adding HTTP Expiration and HTTP Validation headers to your services a snap, and has provided a server-side filter to handle HTTP Validation. AND, he has created a client-side cache for the JsonServiceClient that complies with those headers.
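For reference, the built-in approach looks roughly like this (based on the v4.0.55+ release notes; `GetCar` and `Repository` are hypothetical, and you should check the ServiceStack docs for the exact APIs in your version):

```csharp
// Server side: declaratively cache a service response for 60 seconds.
// ETag-based HTTP Validation is handled by the built-in HttpCacheFeature plugin.
[CacheResponse(Duration = 60)]
public object Get(GetCar request)
{
    return Repository.GetCar(request.Id);
}

// Client side: wrap the typed client with the built-in HTTP client cache,
// which honours the Expiration and Validation headers automatically.
var client = new JsonServiceClient(baseUrl).WithCache();
```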

That should be enough to get many started with caching in their ServiceStack services and clients for now.

I hope this article provides some further guidance on how to go to the next level with caching in your services, especially if you want to support declarative service caching, 'PerUser' representations, and managing expiration of service caches separately from expiration and validation of client-side caches.