
I am just incrementing scores in a sorted set. That is the only command I am running, about 10-30 commands per second, from a Java application using the Jedis client. Since I am only updating scores, I don't care about the response either. My concern is that each ZINCRBY command is being put into its own TCP packet, and that my thread has to wait for the reply before it is allowed to send the next ZINCRBY command.

So I want to implement pipelining to batch, say, 50 commands at a time. Here's where I see a code/design-pattern smell: isn't this pattern common enough that the driver should handle it? It appears that the .NET "StackExchange.Redis" driver does command batching automatically, but the Java drivers don't seem to have this feature. Is my idea of writing a custom Redis command buffer class, which puts incoming commands into a pipeline and calls sync() after 50 items, really needed?

Also, I noticed this in my logs, as I am using Jedis via Spring Data Redis:

20160929 06:48:27.393 [Twitter4J Async Dispatcher[0]] DEBUG o.s.d.r.c.RedisConnectionUtils # Closing Redis Connection
20160929 06:48:27.393 [Twitter4J Async Dispatcher[0]] DEBUG o.s.d.r.c.RedisConnectionUtils # Opening RedisConnection
20160929 06:48:27.393 [Twitter4J Async Dispatcher[0]] DEBUG o.s.d.r.c.RedisConnectionUtils # Closing Redis Connection
20160929 06:48:27.393 [Twitter4J Async Dispatcher[0]] DEBUG o.s.d.r.c.RedisConnectionUtils # Opening RedisConnection
20160929 06:48:27.393 [Twitter4J Async Dispatcher[0]] DEBUG o.s.d.r.c.RedisConnectionUtils # Closing Redis Connection
20160929 06:48:27.394 [Twitter4J Async Dispatcher[0]] DEBUG o.s.d.r.c.RedisConnectionUtils # Opening RedisConnection
20160929 06:48:27.394 [Twitter4J Async Dispatcher[0]] DEBUG o.s.d.r.c.RedisConnectionUtils # Closing Redis Connection
20160929 06:48:27.629 [Twitter4J Async Dispatcher[0]] DEBUG o.s.d.r.c.RedisConnectionUtils # Opening RedisConnection
20160929 06:48:27.630 [Twitter4J Async Dispatcher[0]] DEBUG o.s.d.r.c.RedisConnectionUtils # Closing Redis Connection
20160929 06:48:27.630 [Twitter4J Async Dispatcher[0]] DEBUG o.s.d.r.c.RedisConnectionUtils # Opening RedisConnection
20160929 06:48:27.631 [Twitter4J Async Dispatcher[0]] DEBUG o.s.d.r.c.RedisConnectionUtils # Closing Redis Connection
20160929 06:48:27.631 [Twitter4J Async Dispatcher[0]] DEBUG o.s.d.r.c.RedisConnectionUtils # Opening RedisConnection
20160929 06:48:27.631 [Twitter4J Async Dispatcher[0]] DEBUG o.s.d.r.c.RedisConnectionUtils # Closing Redis Connection
20160929 06:48:27.631 [Twitter4J Async Dispatcher[0]] DEBUG o.s.d.r.c.RedisConnectionUtils # Opening RedisConnection
20160929 06:48:27.631 [Twitter4J Async Dispatcher[0]] DEBUG o.s.d.r.c.RedisConnectionUtils # Closing Redis Connection
20160929 06:48:27.632 [Twitter4J Async Dispatcher[0]] DEBUG o.s.d.r.c.RedisConnectionUtils # Opening RedisConnection
20160929 06:48:27.632 [Twitter4J Async Dispatcher[0]] DEBUG o.s.d.r.c.RedisConnectionUtils # Closing Redis Connection
20160929 06:48:27.632 [Twitter4J Async Dispatcher[0]] DEBUG o.s.d.r.c.RedisConnectionUtils # Opening RedisConnection
20160929 06:48:27.632 [Twitter4J Async Dispatcher[0]] DEBUG o.s.d.r.c.RedisConnectionUtils # Closing Redis Connection
20160929 06:48:27.632 [Twitter4J Async Dispatcher[0]] DEBUG o.s.d.r.c.RedisConnectionUtils # Opening RedisConnection
20160929 06:48:27.633 [Twitter4J Async Dispatcher[0]] DEBUG o.s.d.r.c.RedisConnectionUtils # Closing Redis Connection
20160929 06:48:27.633 [Twitter4J Async Dispatcher[0]] DEBUG o.s.d.r.c.RedisConnectionUtils # Opening RedisConnection
20160929 06:48:27.633 [Twitter4J Async Dispatcher[0]] DEBUG o.s.d.r.c.RedisConnectionUtils # Closing Redis Connection
20160929 06:48:27.633 [Twitter4J Async Dispatcher[0]] DEBUG o.s.d.r.c.RedisConnectionUtils # Opening RedisConnection
20160929 06:48:27.633 [Twitter4J Async Dispatcher[0]] DEBUG o.s.d.r.c.RedisConnectionUtils # Closing Redis Connection

So it appears that the connection is closed after every naively executed command (via the Spring-provided template pattern). I think that closing the connection forces the TCP buffer to send a single command per packet, which seems pretty inefficient to me since sockets eat up a fair amount of CPU. The Spring Data Redis API does allow direct access to the Jedis client, though, and won't close the connection while a pipeline is open, so writing the "pipeline buffer" on top of that is an option.

In short, should I create/leverage a buffer that writes to a Redis pipeline and then flushes after X commands? I simply don't like the idea of wasting all these CPU cycles (higher AWS bill) running each command naively, and I am curious whether there is a better design pattern for my scenario, whether switching to a different Java Redis client would solve this problem, or whether some Java library already buffers commands in a Redis pipeline.

Zombies

2 Answers


I think we need to dig a bit more into the details, since you're mentioning several different aspects.

In general, all Java Redis clients (Jedis, Lettuce, Redisson and many more) write commands directly to the TCP channel by default, so each command is sent as one or more TCP packets. Lettuce and Redisson use an asynchronous/event-driven programming model under the hood to write Redis commands to the TCP channel. Jedis forces you to await the command result since it exposes a blocking API. Redisson and Lettuce expose different kinds of API (asynchronous using Futures, or reactive using RxJava/Reactive Streams) that do not force you to await the command result.

Batching/buffering is another technique: commands are collected in client memory and sent to Redis as a batch. This works with both the Lettuce and Redisson clients. Jedis, in its pipelining mode, writes commands directly to the TCP channel.

Command batching has some implications. It's possible to auto-batch commands (say in sizes of 10 or 50), but this requires some attention from users. Batching always requires a final synchronization step so that commands do not linger in the queue, unsent, because the batch size has not been reached yet.
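To make the batch-size and final-synchronization points concrete, here is a minimal sketch of a flush-after-N buffer using only the JDK. The `CommandBuffer` class and its `sink` callback are hypothetical names invented for this illustration, not part of Jedis, Lettuce or Redisson. The sink stands in for whatever actually ships a batch to Redis, for example a pipeline sync in Jedis or `connection.flushCommands()` in Lettuce.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/**
 * Hypothetical helper (not a Jedis/Lettuce/Redisson class): collects commands
 * in memory and hands them to a sink once batchSize commands have accumulated.
 */
final class CommandBuffer<T> implements AutoCloseable {
    private final int batchSize;
    private final Consumer<List<T>> sink; // assumption: stands in for the real pipeline flush
    private List<T> pending;

    public CommandBuffer(int batchSize, Consumer<List<T>> sink) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
        this.sink = sink;
        this.pending = new ArrayList<>(batchSize);
    }

    /** Queues one command and flushes automatically when the batch is full. */
    public synchronized void add(T command) {
        pending.add(command);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    /** Sends whatever is queued, even if the batch is not full yet. */
    public synchronized void flush() {
        if (pending.isEmpty()) {
            return;
        }
        List<T> batch = pending;
        pending = new ArrayList<>(batchSize);
        sink.accept(batch);
    }

    /** The final synchronization: nothing may linger in the queue on shutdown. */
    @Override
    public void close() {
        flush();
    }

    public static void main(String[] args) {
        List<List<String>> sent = new ArrayList<>();
        try (CommandBuffer<String> buffer = new CommandBuffer<>(3, sent::add)) {
            for (int i = 1; i <= 7; i++) {
                buffer.add("ZINCRBY scores 1 member" + i);
            }
        }
        // Seven commands with a batch size of 3 give batches of 3, 3 and 1;
        // the last one is only sent because close() flushes the remainder.
        for (List<String> batch : sent) {
            System.out.println(batch.size() + " command(s): " + batch);
        }
    }
}
```

With a steady stream of commands the size trigger alone is usually enough. If traffic can stall, a scheduled task that calls `flush()` every few hundred milliseconds would cover the lingering-command case as well.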

Spring Data Redis uses Jedis and Lettuce to expose its functionality, so it has to work with the common denominator of both drivers. You can set up Spring Data Redis to use connection pooling with Jedis, so you benefit from pooled connections that are not closed each time you interact with Redis.

mp911de
  • Thank you. I intend to use Lettuce after doing some research. In my use case, I am reading from the streaming twitter API and have enough consistent redis commands to send (all fire and forget) that I will easily fill my size 50 buffer up, and then invoke `connection.flushCommands();`. I was wondering if flush-after-x-commands or other fire-and-forget optimization functionality already existed in any of the Java drivers like it does in the .net StackExchange.redis driver (see http://stackoverflow.com/questions/27796054/pipelining-vs-batching-in-stackexchange-redis) – Zombies Sep 29 '16 at 14:46
  • I forgot to add a conclusion to the post: It's important to know what you're optimizing for. Each approach comes with its own properties and that must fit your approach and use-case. Thanks for the flush-after-x-commands pointer. This may be of interest. – mp911de Sep 29 '16 at 14:49

Pipelining is a common pattern with Redis for reducing the cost of communication. Jedis is a recommended Java driver for Redis and supports pipelining. Lettuce is an alternative that supports pipelining too.

Pascal Le Merrer