No matter how well databases and file systems scale, they will still be slower than what the application expects, especially for high-visibility applications where responsiveness is taken for granted.

Therefore, these apps typically persist data into an “eventually consistent” data layer and, on top of that, add a distributed cache layer that holds the information the app requires immediately, keeping it responsive.

It’s also possible that you have multiple distributed cache clusters, each focused on its specific dataset and the operations around it.

Let’s look at a simplified view of how a distributed cache works. The cache distributes the data across the nodes of the cluster. Each data object is replicated within the cluster: the master copy lives on one node, with one or two backup copies on other nodes. Replication between master and backups can also occur asynchronously, but a distributed cache typically applies synchronous replication for at least one backup.
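
As a concrete illustration, here is a minimal sketch of configuring that replication in Hazelcast (assuming Hazelcast 5.x APIs; the map name “messages” is purely illustrative), with one synchronous and one asynchronous backup:

```java
import com.hazelcast.config.Config;
import com.hazelcast.config.MapConfig;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

public class CacheSetup {
    public static void main(String[] args) {
        Config config = new Config();
        // One synchronous backup: a put() returns only after the backup
        // copy is written. One extra asynchronous backup on top of that.
        MapConfig mapConfig = new MapConfig("messages")
                .setBackupCount(1)
                .setAsyncBackupCount(1);
        config.addMapConfig(mapConfig);
        HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);
    }
}
```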

Distributed caches (dist caches) can hold different types of information, including:

  1. Reference data – e.g., country info – a single, infrequent publisher.
  2. “High-frequency” data – e.g., messages – multiple publishers and rapidly changing data. For example, a cache storing the latest 10 messages as a ‘point in time’ snapshot; each snapshot update needs to be atomic.
  3. Distributed state – e.g., request counters – multiple updaters incrementing a counter, where each increment needs to be atomic. For example, Hazelcast’s PNCounter (a minimal sketch follows this list).

    An interesting example of cluster-wide state management is NGINX’s runtime state sharing feature, on top of which its cluster-wide rate limiting is implemented. Every node of the cluster maintains its own copy of the state, and any update is propagated to all nodes in the cluster.
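
To make use case 3 concrete, here is a minimal sketch using Hazelcast’s PNCounter (again assuming Hazelcast 5.x APIs; the counter name “requests” is illustrative). A PNCounter is a CRDT, so concurrent increments from multiple nodes converge without any locking:

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.crdt.pncounter.PNCounter;

public class RequestCounter {
    public static void main(String[] args) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        // CRDT counter: concurrent increments from multiple members
        // merge deterministically, no distributed lock required
        PNCounter requests = hz.getPNCounter("requests");
        requests.incrementAndGet();
        System.out.println("Requests so far: " + requests.get());
        hz.shutdown();
    }
}
```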

Use case 1 above is the typical one across the majority of installations.

When a single process updates the dataset, there is no need for write synchronization. This is case (1), and it’s straightforward: similar in concept to a single writer thread with multiple reader threads. The only point to handle is that while a write is in progress, all reads should either block or use the snapshot taken before the in-progress write. Typically, distributed cache implementations handle this write/read sequencing transparently.
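
As a sketch of this single-writer pattern (using Hazelcast again, with a hypothetical “countries” reference-data map), the lone publisher simply writes, and readers simply read; the cache sequences the two transparently:

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.map.IMap;

public class ReferenceDataPublisher {
    public static void main(String[] args) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        IMap<String, String> countries = hz.getMap("countries");
        // Single, infrequent publisher: plain puts are enough, since the
        // cache transparently sequences each write against concurrent reads.
        countries.put("IN", "India");
        countries.put("DE", "Germany");
        System.out.println(countries.get("IN")); // readers just call get()
        hz.shutdown();
    }
}
```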

But for use cases 2 and 3, the question that arises is “who performs this atomic update?”

Possible update scenarios for use cases 2 and 3 are:

  1. A client of the dist cache can perform an atomic update, but since we have many ‘write’ clients, their write operations need to be sequenced. If not sequenced, the data object could end up in an inconsistent state.

    To sequence across clients, each client must first obtain a distributed lock, perform the write operation, and then release the lock. Distributed locks are usually available – for example, Hazelcast’s FencedLock and Redis’s Redlock (a FencedLock sketch appears after this list).

    But the biggest issue with locks is that they can increase contention in the system if not tuned properly. A distributed lock also adds latency, due to the network round trips needed to acquire and release it across multiple clients.
  2. The other approach is to hand the update over to the dist cache itself. The execution is guaranteed to be atomic, since the cache controls all operations within the cluster. The caveat here is that the operation(s) handed off to the cache need to be as quick as possible; otherwise, you start introducing bottlenecks.

    Examples of approach 2 (sketched after this list) are:
  • Hazelcast – EntryProcessor and the Executor Service.
  • Redis – EVAL, which executes a Lua script.
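
Here is a minimal sketch of update scenario 1 with Hazelcast’s FencedLock (assuming Hazelcast 5.x, where FencedLock is provided by the CP subsystem; the map, key, and lock names are hypothetical). Each writer acquires the lock, performs its read-modify-write, and releases the lock:

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.cp.lock.FencedLock;
import com.hazelcast.map.IMap;

public class LockedUpdate {
    public static void main(String[] args) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        IMap<String, Long> stats = hz.getMap("stats");
        FencedLock lock = hz.getCPSubsystem().getLock("stats-lock");

        lock.lock(); // sequences writers across the whole cluster
        try {
            // Read-modify-write is safe only while the lock is held
            long current = stats.getOrDefault("hits", 0L);
            stats.put("hits", current + 1);
        } finally {
            lock.unlock();
        }
        hz.shutdown();
    }
}
```

Note the cost: the lock acquire, the read, the write, and the release each involve a network round trip – exactly the latency overhead described above.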
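
And a minimal sketch of update scenario 2 with a Hazelcast EntryProcessor (same assumptions and hypothetical names as above). The update is shipped to the member that owns the key and executes there atomically, so no client-side lock is needed:

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.map.EntryProcessor;
import com.hazelcast.map.IMap;
import java.util.Map;

public class IncrementProcessor implements EntryProcessor<String, Long, Object> {
    @Override
    public Object process(Map.Entry<String, Long> entry) {
        // Runs on the member that owns the key; Hazelcast ensures no other
        // operation touches this entry while process() executes
        Long current = entry.getValue();
        entry.setValue(current == null ? 1L : current + 1);
        return null; // returning null keeps the response payload minimal
    }

    public static void main(String[] args) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        IMap<String, Long> stats = hz.getMap("stats");
        stats.executeOnKey("hits", new IncrementProcessor());
        System.out.println(stats.get("hits"));
        hz.shutdown();
    }
}
```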

I strongly recommend reading the documentation to better understand the features and configuration options available. A distributed cache that is not tuned correctly for the use case can introduce latency into the application.

For example, when using the Hazelcast EntryProcessor, it’s best to use the OBJECT in-memory format instead of BINARY. Otherwise, every EntryProcessor update requires SerDe processing: deserializing the stored value into an object for the update and then serializing it back for storage. Another optimization is to ensure that the EntryProcessor does not return a response when the client does not require the result.
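
A minimal sketch of that storage-format tuning (assuming Hazelcast 5.x APIs; the map name “stats” is hypothetical):

```java
import com.hazelcast.config.Config;
import com.hazelcast.config.InMemoryFormat;
import com.hazelcast.config.MapConfig;

public class EntryProcessorTuning {
    public static void main(String[] args) {
        Config config = new Config();
        // Store values in deserialized OBJECT form so that each
        // EntryProcessor invocation skips the SerDe round trip
        config.addMapConfig(new MapConfig("stats")
                .setInMemoryFormat(InMemoryFormat.OBJECT));
    }
}
```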

I hope you have found this article insightful. Now, get ready to deep-dive into a code demo that showcases what we’ve discussed here. That’s Part 2 of this article – stay tuned!