Say I have a webapp running on some number of load-balanced EC2 servers, storing and retrieving metadata from SimpleDB with larger chunks of data stored on S3 (due to the whole 1 KB limitation of SimpleDB). Since S3 is pretty high latency and I don't want to be making a ton of requests over there anyway, I'll want a caching layer for the info... enter ElastiCache.

OK, so I provision an ElastiCache server with endpoint X and hardcode X into my app on EC2. It runs happily until I get a few hundred thousand new users, and all of a sudden my cache server is woefully underpowered for the demand. Fortunately I can just start up a few new, larger cache servers... but then I realize I've got endpoints X, Y, and Z and my app only knows to try X, so I still have a problem.

So right now I'm just trying to get my head wrapped around the various pieces to this puzzle, and I haven't gotten to the coding part yet, but won't this be an issue? I've read the documentation for ElastiCache and it mentions that it is a cache cluster, but then each server in the cluster seems to have its own endpoint. Is there a way for an app running on EC2 to know about all the cache servers that are running, and more to the point which one contains the data for a particular key? Is it possible to ask the cluster as a whole to store or retrieve a piece of information?

Ty W
  • I doubt it matters a great deal to the answer to this question, but for what it's worth my EC2 app will more than likely be written in PHP. – Ty W Sep 16 '11 at 14:33
  • I noticed a feature request along these same lines at https://forums.aws.amazon.com/thread.jspa?threadID=74852, so unfortunately it looks like having a config file with your cache endpoints in it is about all you can do for now. – Ty W Sep 22 '11 at 06:13

3 Answers

Today AWS announced Auto Discovery for ElastiCache. Your problem is solved: http://aws.typepad.com/aws/2012/11/amazon-elasticache-now-with-auto-discovery.html
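For a rough idea of what Auto Discovery gives the client: as I understand the AWS documentation, the configuration endpoint answers a `config get cluster` command with a version number followed by a space-separated list of `hostname|ip|port` entries. The sketch below only parses such a payload; the layout is an assumption to verify against the current docs, `parse_cluster_config` is a hypothetical helper name, and the sample hostnames and IPs are made up.

```php
<?php
// Parse the payload of a "config get cluster" reply.
// ASSUMED layout: line 1 is a config version number, line 2 is a
// space-separated list of "hostname|ip|port" entries.
function parse_cluster_config(string $payload): array {
    $lines = preg_split('/\r?\n/', trim($payload));
    $result = ['version' => (int) $lines[0], 'nodes' => []];
    if (count($lines) < 2) {
        return $result; // no node list present
    }
    $entries = preg_split('/\s+/', trim($lines[1]), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($entries as $entry) {
        $parts = explode('|', $entry);
        if (count($parts) !== 3) {
            continue; // skip anything that is not host|ip|port
        }
        $result['nodes'][] = [
            'host' => $parts[0],
            'ip'   => $parts[1],
            'port' => (int) $parts[2],
        ];
    }
    return $result;
}

// Made-up sample payload, for illustration only.
$sample = "12\n"
    . "cluster-name.xyzxyz.0001.use1.cache.amazonaws.com|10.0.0.1|11211 "
    . "cluster-name.xyzxyz.0002.use1.cache.amazonaws.com|10.0.0.2|11211\n";

$config = parse_cluster_config($sample);
foreach ($config['nodes'] as $node) {
    echo $node['host'] . ':' . $node['port'] . PHP_EOL;
}
```

An Auto Discovery-aware client does this polling for you, so the app only needs to know the single configuration endpoint.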

Elroy Flynn

Amazon's ElastiCache Auto Discovery client is absolutely horrible. It is basically impossible to install, which is crazy because it should be very simple.

I wrote a simple function in PHP to generate an ElastiCache node URL given the number of nodes you have running. Yes, you have to update your code if you change the number of nodes (or perhaps put this value in an env var).

It maps the same keys to the same nodes:

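// Map a cache key to one node's endpoint. The same key always maps to the
// same node as long as $num_nodes stays the same.
function get_elasticache_node_url( $key, $config_url, $num_nodes ) {
  // 15 hex digits = 60 bits, which fits in a 64-bit PHP integer.
  $node = hexdec( substr( md5($key), 0, 15) ) % $num_nodes + 1;
  $nodestr = str_pad($node, 4, "0", STR_PAD_LEFT); // e.g. 4 becomes "0004"
  return str_replace('.cfg.','.'.$nodestr.'.',$config_url);
}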

$num_nodes = 10;
$config_url = 'cluster-name.xyzxyz.cfg.use1.cache.amazonaws.com';

echo get_elasticache_node_url("key1", $config_url, $num_nodes ) . PHP_EOL;
echo get_elasticache_node_url("key2", $config_url, $num_nodes ) . PHP_EOL;

Output:

cluster-name.xyzxyz.0001.use1.cache.amazonaws.com
cluster-name.xyzxyz.0004.use1.cache.amazonaws.com
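
One thing to watch with this modulo mapping: when $num_nodes changes, most keys land on a different node, so the cache is effectively cold afterwards. Here is a minimal sketch to measure that, reusing the same hash as above. `node_for_key` and `count_moved_keys` are hypothetical helper names for illustration, and it assumes 64-bit PHP.

```php
<?php
// Same modulo mapping as get_elasticache_node_url, but returning only
// the node number (1..$num_nodes).
function node_for_key(string $key, int $num_nodes): int {
    return hexdec(substr(md5($key), 0, 15)) % $num_nodes + 1;
}

// Count how many of $total sample keys ("key0", "key1", ...) land on a
// different node when the cluster changes from $before to $after nodes.
function count_moved_keys(int $total, int $before, int $after): int {
    $moved = 0;
    for ($i = 0; $i < $total; $i++) {
        $key = "key" . $i;
        if (node_for_key($key, $before) !== node_for_key($key, $after)) {
            $moved++;
        }
    }
    return $moved;
}

$moved = count_moved_keys(1000, 10, 11);
echo "Keys that changed node: $moved of 1000" . PHP_EOL;
```

Going from 10 to 11 nodes, only keys whose hash leaves the same remainder for both counts stay put, which is a small minority. So plan on re-populating the cache whenever you change the node count.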
Nate

If your app is deployed from version control (I hope it is), you'd just edit the configuration file and re-deploy the application. I don't see a huge problem with this approach, but maybe I am missing the obvious.

Let me know.

Till
  • I was hoping there was an ElastiCache mechanism to present the cache cluster to the application as if it were a single server: not having to worry about which endpoint to hit for a particular key, not having to worry about the cache config when spinning up or shutting down a cache node, etc. It would seem that such a thing does not exist yet. – Ty W Oct 31 '11 at 06:14
  • It doesn't. Generally, AWS ElastiCache provides you with nodes. How those nodes are used (e.g. whether you use all of them as one giant store, or mirror them, etc.) is up to you. This is just how memcache works. You can probably wrap your discovery into a periodic API call, but I wouldn't recommend that. We re-deploy for these changes. – Till Oct 31 '11 at 16:46
  • I wouldn't put extended magic around this. Just adding a node is usually not a good idea, nor something your application can deal with transparently. For example, when I add another node to our ElastiCache cluster, I need to rebalance the cache cluster, so to speak. Usually it's easier to add it to the configuration, adjust the settings for ext/memcache (we use PHP), and then start from scratch by emptying the cache and letting it re-populate. – Till Nov 08 '11 at 17:42
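
The rebalancing issue mentioned in these comments is what consistent hashing is meant to soften: keys and nodes are placed on the same hash ring, and adding a node only takes over the keys that fall just before its points. The sketch below is a simplified pure-PHP ring for illustration only, not what ext/memcache does internally. `ring_position`, `build_ring`, and `ring_lookup` are hypothetical names, and the node IDs are made up.

```php
<?php
// Position on the ring: first 7 hex digits of md5 (28 bits, fits any PHP int).
function ring_position(string $value): int {
    return hexdec(substr(md5($value), 0, 7));
}

// Build a ring: each node gets $replicas points, sorted by position.
function build_ring(array $nodes, int $replicas = 100): array {
    $ring = [];
    foreach ($nodes as $node) {
        for ($i = 0; $i < $replicas; $i++) {
            $ring[ring_position("$node#$i")] = $node;
        }
    }
    ksort($ring);
    return $ring;
}

// A key belongs to the first node point at or after its own position,
// wrapping around to the first point on the ring. Assumes a non-empty ring.
function ring_lookup(array $ring, string $key): string {
    $pos = ring_position($key);
    foreach ($ring as $point => $node) {
        if ($point >= $pos) {
            return $node;
        }
    }
    return reset($ring);
}

// Made-up node IDs: compare a 10-node ring with an 11-node ring.
$eleven = [];
for ($n = 1; $n <= 11; $n++) {
    $eleven[] = 'node' . $n;
}
$old = build_ring(array_slice($eleven, 0, 10));
$new = build_ring($eleven);

$moved = 0;
for ($i = 0; $i < 1000; $i++) {
    if (ring_lookup($old, "key$i") !== ring_lookup($new, "key$i")) {
        $moved++;
    }
}
echo "Keys that changed node: $moved of 1000" . PHP_EOL;
```

Only a fraction of the keys should move here, roughly the new node's share of the ring. In practice you would not hand-roll this: as far as I know, ext/memcache has a `memcache.hash_strategy` ini setting and ext/memcached a `Memcached::DISTRIBUTION_CONSISTENT` option for the same purpose, so check the docs for whichever client you use.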