34

Is there a Redis data structure that allows atomically popping (get + remove) multiple elements that it contains?

There are the well-known SPOP and RPOP commands, but they always return a single value. So when I need the first N values from a set or list, I have to call the command N times, which is expensive. Let's say the set/list contains millions of items. Is there anything like SPOPM "setName" 1000, which would return and remove 1000 random items from a set, or RPOPM "listName" 1000, which would return and remove the 1000 right-most items from a list?

I know there are commands like SRANDMEMBER and LRANGE, but they do not remove the items from the data structure. The items can be deleted separately. However, if several clients read from the same data structure, some items can be read more than once and some can be deleted without ever being read! Atomicity is therefore what my question is about.

Also, I am fine if the time complexity of such an operation is higher. I doubt it will be more expensive than issuing N (say 1000, as in the previous example) separate requests to the Redis server.

I also know about the separate transaction support. However, this sentence from the Redis docs discourages me from using it for parallel processes that modify the set (by destructively reading from it):
When using WATCH, EXEC will execute commands only if the watched keys were not modified, allowing for a check-and-set mechanism.

Pavel S.

9 Answers

20

Use LRANGE with LTRIM in a pipeline. As long as the pipeline is wrapped in MULTI and EXEC (the default in many Redis client libraries), it runs as one atomic transaction. Your worry above about WATCH and EXEC does not apply here, because LRANGE and LTRIM run as one transaction and no command from any other client can come between them. Try it out.

Eli
  • 2
    Does a redis pipeline guarantee atomicity? I think what you mean is a redis [transaction](http://redis.io/topics/transactions). – plmw Dec 17 '13 at 23:47
  • 1
    pipelining itself without `MULTI` and `EXEC` is not, but every Redis library I've ever used has these on by default, and I didn't want to complicate the issue. So, yes, if you're using some odd Redis library where pipelines do not have `MULTI` and `EXEC` on by default, you should turn them on. – Eli Dec 17 '13 at 23:51
  • I'm concerned about the following: suppose the list has fewer than 100 members, and you do `lrange 0 99` followed by `ltrim 100 -1` - the first command will fail, and the second will truncate the list to nothing. – Emil Jul 31 '15 at 15:51
  • 1
    @Emil the first command will not fail. It will return up to 100 members, even if that number is less. See docs here: http://redis.io/commands/LRANGE#out-of-range-indexes – Eli Jul 31 '15 at 19:32
  • Will this block any push operations? Is this a recommended approach while using a Reactive Framework? – Archmede Jul 24 '18 at 14:13
  • I'm confused about how `LTRIM` works when there is only 1 member in the list. I can't get LTRIM working when only 1 element is remaining. I think that this answer might be bugged / incomplete. @Eli – Gravy Jan 22 '20 at 20:51
13

To expand on Eli's response with a complete example for list collections, using lrange and ltrim builtins instead of Lua:

127.0.0.1:6379> lpush a 0 1 2 3 4 5 6 7 8 9
(integer) 10
127.0.0.1:6379> lrange a 0 3        # read 4 items off the top of the stack
1) "9"
2) "8"
3) "7"
4) "6"
127.0.0.1:6379> ltrim a 4 -1        # remove those 4 items
OK
127.0.0.1:6379> lrange a 0 999      # remaining items
1) "5"
2) "4"
3) "3"
4) "2"
5) "1"
6) "0"

If you wanted to make the operation atomic, you would wrap the lrange and ltrim in multi and exec commands.

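Also, as noted elsewhere, you should probably ltrim by the number of items returned, not the number of items you asked for. E.g., if you did lrange a 0 99 but got 50 items, you would ltrim a 50 -1, not ltrim a 100 -1. This only matters when the two commands are not wrapped in a transaction: inside multi/exec nothing can be pushed in between, and ltrim a 100 -1 on a 50-item list simply leaves it empty.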

To implement queue semantics instead of a stack, replace lpush with rpush.
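The index rules behind this can be checked without a server. The following is a pure-Python sketch that models how LRANGE and LTRIM treat inclusive and out-of-range indexes, as described in the Redis documentation. The names `lrange`, `ltrim` and `pop_many` are illustrative stand-ins that operate on a plain Python list; they are not a Redis client API.

```python
def _clamp(length, start, stop):
    """Translate Redis-style inclusive indexes into clamped positions."""
    if start < 0:
        start = max(length + start, 0)
    if stop < 0:
        stop = length + stop
    return start, min(stop, length - 1)


def lrange(items, start, stop):
    """Model of LRANGE: inclusive range, out-of-range indexes never raise."""
    start, stop = _clamp(len(items), start, stop)
    return [] if start > stop else items[start:stop + 1]


def ltrim(items, start, stop):
    """Model of LTRIM: keep exactly the range LRANGE would return."""
    return lrange(items, start, stop)


def pop_many(items, count):
    """Return (batch, remaining) as LRANGE 0 count-1 plus LTRIM count -1 would."""
    if count <= 0:
        # LRANGE key 0 -1 means "whole list", so a zero count is special-cased.
        return [], list(items)
    return lrange(items, 0, count - 1), ltrim(items, count, -1)


stack = ["9", "8", "7", "6", "5", "4", "3", "2", "1", "0"]
batch, rest = pop_many(stack, 4)
print(batch)  # ['9', '8', '7', '6']
print(rest)   # ['5', '4', '3', '2', '1', '0']

# Asking for more than the list holds returns what is there and leaves nothing.
print(pop_many(["a", "b"], 100))  # (['a', 'b'], [])
```

The same model covers the single-element case: popping 1 item from a 1-item list returns that item and leaves an empty list, which on a real server means the key is removed.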

Granitosaurus
thom_nic
  • 1
    You might want to consider changing "pop 4 items off the top of the stack" -> "read 4 items from the head of the stack". 'pop' has the overall meaning to read + delete at once, most of the time from the rear of the structure. – Geert-Jan Apr 02 '17 at 19:41
  • 3
    You cannot use `lrange` and `ltrim` in `multi` and `exec` since redis does not support intra-transaction dependency – cppcoder Aug 21 '17 at 13:35
  • I hope I can still get an answer... Is there an option to "spop count" from redis list in an atomic way? And in a way that make sure the data is returned from the oldest to the newest. Spop returns random values which might make the oldest data to be handled in a very late stage. – Mickey Hovel Feb 04 '19 at 06:41
10

Starting from Redis 3.2, the command SPOP has a [count] argument to retrieve multiple elements from a set.

See http://redis.io/commands/spop#count-argument-extension
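From Python this could look roughly like the sketch below. It assumes `client` is a redis-py client (for example `redis.Redis()`) connected to a Redis 3.2+ server; the helper name `pop_random_members` is made up for illustration.

```python
def pop_random_members(client, key, count):
    """Atomically remove and return up to `count` random members of a set.

    `client` is assumed to be a redis-py client; its spop() accepts an
    optional count and maps to a single SPOP command on the server.
    """
    # An empty or missing set yields an empty reply; normalize to a list.
    return list(client.spop(key, count) or [])


# Hypothetical usage, assuming a local server and the redis-py package:
#   import redis
#   members = pop_random_members(redis.Redis(), "setName", 1000)
```

If the set holds fewer than `count` members, the reply simply contains all of them.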

Alessandro Cosentino
  • 2
    Since OP asked for set *or list* and `SPOP` only works for set, I added a complete example of how to do this for a list: http://stackoverflow.com/a/43130793/213983 – thom_nic Mar 31 '17 at 01:07
4

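If you want a Lua script, this should be fast and easy:

-- KEYS[1] = list key, ARGV[1] = number of items to pop
local result = redis.call('lrange', KEYS[1], 0, ARGV[1] - 1)  -- read the first N items
redis.call('ltrim', KEYS[1], ARGV[1], -1)                     -- drop those N items
return result

Then you don't have to loop.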
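A minimal sketch of invoking that script from Python follows. It assumes `client` is a redis-py client, whose `eval()` takes the script text, the number of keys, and then the keys followed by the arguments. The names `POP_MANY_LUA` and `pop_many_lua` are hypothetical.

```python
POP_MANY_LUA = """
local result = redis.call('lrange', KEYS[1], 0, ARGV[1] - 1)
redis.call('ltrim', KEYS[1], ARGV[1], -1)
return result
"""


def pop_many_lua(client, key, count):
    """Run the LRANGE + LTRIM script in one round trip.

    `client` is assumed to be a redis-py client. Redis runs the whole
    script without interleaving commands from other clients.
    """
    if count <= 0:
        # ARGV[1] = 0 would turn the LRANGE into "0 -1", i.e. the whole list.
        return []
    return client.eval(POP_MANY_LUA, 1, key, count)


# Hypothetical usage, assuming a local server and the redis-py package:
#   import redis
#   items = pop_many_lua(redis.Redis(), "listName", 1000)
```

The guard for a non-positive count is a design choice of this sketch, not part of the original script.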

update: I tried to do this with srandmember (in 2.6) with the following script:

local members = redis.call('srandmember', KEYS[1], ARGV[1])
redis.call('srem', KEYS[1], table.concat(table, ' '))
return members

but I get an error:

error: -ERR Error running script (call to f_6188a714abd44c1c65513b9f7531e5312b72ec9b): 
Write commands not allowed after non deterministic commands

I don't know if future versions will allow this, but I assume not. I think it would be a problem for replication.

Yehosef
  • 1
    or in "normal" redis, which is perfectly OK for this situation: `MULTI; LRANGE key 0 N-1; LTRIM key N -1; EXEC;` – Justin Apr 09 '15 at 21:29
  • 1
    The OP seems to not be interested in using transactions - see the end of the question. But in any event you should add your suggestion as an answer instead of a comment to my answer. – Yehosef Apr 11 '15 at 21:51
  • Can you try this redis.call('srem', KEYS[1], unpack(members)) – Hrishikesh Mishra Oct 13 '16 at 04:52
3

Here is a python snippet that can achieve this using redis-py and pipeline:

from redis import StrictRedis

client = StrictRedis()

def get_messages(q_name, prefetch_count=100):
    pipe = client.pipeline()
    pipe.lrange(q_name, 0, prefetch_count - 1)  # Get msgs (w/o pop)
    pipe.ltrim(q_name, prefetch_count, -1)  # Trim (pop) list to new value
    messages, trim_success = pipe.execute()
    return messages

I was thinking that I could just do a for loop of pops, but that would not be efficient even with a pipeline, especially if the list queue is smaller than prefetch_count. I have a full RedisQueue class implemented here if you want to look. Hope it helps!

radtek
2

Redis 4.0+ now supports modules which add all kinds of new functionality and data types with much faster and safer processing than Lua scripts or multi/exec pipelines.

Redis Labs, the current sponsor behind Redis, has a useful set of extension modules called redex here: https://github.com/RedisLabsModules/redex

The rxlists module adds several list operations including LMPOP and RMPOP so you can atomically pop multiple values from a Redis list. The logic is still O(n) (basically doing a single pop in a loop) but all you have to do is install the module once and just send that custom command. I use it on lists with millions of items and thousands popped at once generating 500MB+ of network traffic without issue.

Mani Gandham
0

I think you should look at the Lua support in Redis. If you write a Lua script and execute it on Redis, it is guaranteed to be atomic (because Redis executes commands on a single thread). No other queries will be served before your Lua script ends (i.e., you can't implement a big task in Lua, or Redis will get slow).

So, in this script you issue your SPOP or RPOP calls, append the result of each Redis command to a Lua array, for instance, and then return the array to your Redis client.

What the documentation is saying about MULTI with WATCH is that it is optimistic locking: EXEC is refused if the watched value was modified in the meantime, and the client has to retry the whole WATCH/MULTI/EXEC sequence until it goes through. If you have many writes on the watched value, it will be slower than 'pessimistic' locking (as in many SQL databases: PostgreSQL, MySQL...), which in some manner 'stops the world' so that the query is executed first. Pessimistic locking is not implemented in Redis. You can implement it yourself if you want, but it is complex and maybe you don't need it (if there are not so many writes on this value, optimistic locking should be quite enough).
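The retry behaviour of optimistic locking can be illustrated with a toy model. The sketch below is pure Python and does not talk to Redis: `VersionedList` is a made-up in-memory stand-in for one list key, and its version counter plays the role of the "was the watched key modified?" check that EXEC performs after WATCH.

```python
class VersionedList:
    """Toy in-memory stand-in for one Redis list key (not a Redis API)."""

    def __init__(self, items):
        self.items = list(items)
        self.version = 0  # bumped on every write

    def push(self, value):
        self.items.append(value)
        self.version += 1

    def trim_if_unchanged(self, seen_version, count):
        """Like EXEC after WATCH: write only if nobody wrote in between."""
        if self.version != seen_version:
            return False
        del self.items[:count]
        self.version += 1
        return True


def pop_many_optimistic(store, count, max_retries=10, between=None):
    """Read, then commit; start over whenever another writer got in first."""
    for attempt in range(max_retries):
        seen = store.version             # corresponds to WATCH
        batch = store.items[:count]      # corresponds to LRANGE 0 count-1
        if between is not None:
            between(attempt)             # hook to simulate a concurrent writer
        if store.trim_if_unchanged(seen, len(batch)):  # MULTI / LTRIM / EXEC
            return batch
    raise RuntimeError("gave up after too many conflicting writes")


queue = VersionedList(["a", "b", "c"])
# A writer interferes during the first attempt only, forcing exactly one retry.
batch = pop_many_optimistic(
    queue, 2, between=lambda n: queue.push("d") if n == 0 else None
)
print(batch, queue.items)  # ['a', 'b'] ['c', 'd']
```

With many concurrent writers the loop may spin several times or give up, which is the cost of the optimistic approach described above.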

zenbeni
  • Why the extra hassle of Lua when you can just do this in a single transaction/pipeline using built-in Redis functions? – Eli Dec 18 '13 at 21:51
  • @Eli because it can be some kind of pessimistic locking, so it is way different to MULTI. – zenbeni Dec 18 '13 at 22:55
  • Also: you can't use in a multi of many redis requests, the return of one redis query to build the next one. You can easily with LUA. – zenbeni Dec 18 '13 at 22:57
  • you're not using the return of one query to build the next. You just LRANGE and then LTRIM on a list. You don't need the results from LRANGE to LTRIM. – Eli Dec 19 '13 at 00:16
  • And in his case he doesn't need to use WATCH, just EXEC and MULTI, which doesn't need to lock anything, but executes as an atomic operation where it's guaranteed that nothing can possibly happen between LRANGE and LTRIM. http://redis.io/topics/transactions – Eli Dec 19 '13 at 00:26
0

You can probably try a Lua script (script.lua) like this:

local result = {}
-- ARGV[1] is the number of items to pop; start at 1 so exactly N pops are issued
for i = 1, tonumber(ARGV[1]) do
    local val = redis.call('RPOP', KEYS[1])
    if not val then
        break  -- the list is empty, stop early
    end
    table.insert(result, val)
end
return result

You can call it this way:

redis-cli eval "$(cat script.lua)" 1 "listName" 1000
Philippe T.
  • 1
    This answer ignores the entire basis of the question: "Therefore, when I need first N values from set/list, I need to call the command N-times, which is expensive. Let's say the set/list contains millions of items." – BHSPitMonkey Aug 20 '14 at 22:58
  • Really, maybe a misunderstanding , but the goal is to get N item from List/set in a command? "Is there a Redis data structure, which would allow atomic operation of popping (get+remove) multiple elements, which it contains?" ... "which would return and remove 1000 random items from set or RPOPM "listName" 1000"... = redis-cli eval "$(cat script.lua)" 1 "listName" 1000 – Philippe T. Aug 21 '14 at 07:09
0

Starting from Redis 6.2, you can pass a count argument to set how many elements are popped from the list. count is available for both LPOP and RPOP. This is the pull request that implements the count feature.

redis> rpush foo a b c d e f g
(integer) 7
redis> lrange foo 0 -1
1) "a"
2) "b"
3) "c"
4) "d"
5) "e"
6) "f"
7) "g"
redis> lpop foo
"a"
redis> lrange foo 0 -1
1) "b"
2) "c"
3) "d"
4) "e"
5) "f"
6) "g"
redis> lpop foo 3
1) "b"
2) "c"
3) "d"
redis> lrange foo 0 -1
1) "e"
2) "f"
3) "g"
redis> rpop foo 2
1) "g"
2) "f"
redis> 
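A client-side sketch of the same calls, assuming `client` is a redis-py 4.0+ client whose `lpop()` and `rpop()` accept an optional count; the helper name `pop_batch` is made up for illustration.

```python
def pop_batch(client, key, count, from_left=True):
    """Pop up to `count` items with a single LPOP or RPOP call (Redis 6.2+).

    `client` is assumed to be a redis-py client. When the key does not
    exist the server replies with nil, which redis-py surfaces as None,
    so that case is normalized to an empty list here.
    """
    pop = client.lpop if from_left else client.rpop
    return pop(key, count) or []


# Hypothetical usage, assuming a local server and the redis-py package:
#   import redis
#   items = pop_batch(redis.Redis(), "foo", 3)
```

Because the whole batch is removed by one command, no MULTI/EXEC wrapper or script is needed for atomicity.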
Ersoy