An easy-to-understand tutorial on the fundamentals of Redis data structures

Redis has five basic data structures, string, list, hash, set and zset. they are the daily development of the frequency of use is very high application of the most widely used data structures, the five data structures are eaten through, you have mastered half of the Redis application knowledge.

string

Let’s start with string, which represents a mutable array of bytes that we can initialize, get the length of the string, get the substrings of the string, overwrite the substrings of the string, and append the substrings.

Redis strings are dynamic strings, strings that can be modified, the internal structure is similar to the implementation of Java’s ArrayList, the use of pre-allocated redundant space to reduce the frequent allocation of memory, as shown in the figure, the actual internal allocation of space for the current string capacity is generally higher than the actual length of the string len. when the length of the string less than 1M When the length of the string is less than 1M, the expansion is to double the existing space, if more than 1M, the expansion will only expand 1M more space at a time. It should be noted that the maximum length of the string is 512M.

Initializing strings Requires “variable name” and “content of variable”.

> set ireader beijing.zhangyue.keji.gufen.youxian.gongsi
OK

Get the content of the string Provide the “variable name”.

> get ireader
"beijing.zhangyue.keji.gufen.youxian.gongsi"

Get the length of the string Provide the “variable name”.

> strlen ireader
(integer) 42

Get substring Provide “variable name” and start and end position [start, end].

> getrange ireader 28 34
"youxian"

Override substring Provide “Variable Name” as well as Start Position and Destination substring.

> setrange ireader 28 wooxian
(integer) 42  
> get ireader
"beijing.zhangyue.keji.gufen.wooxian.gongsi"

additional substring

> append ireader .hao
(integer) 46 
> get ireader
"beijing.zhangyue.keji.gufen.wooxian.gongsi.hao"

Unfortunately strings do not provide string insertion methods and substring deletion methods.

Counter If the content of the string is an integer, then it is also possible to use the string as a counter.

> set ireader 42
OK
> get ireader
"42"
> incrby ireader 100
(integer) 142
> get ireader
"142"
> decrby ireader 100
(integer) 42
> get ireader
"42"
> incr ireader  # = incrby ireader 1
(integer) 43
> decr ireader  # = decrby ireader 1
(integer) 42

The counter has a range, it can’t be more than Long.Max and can’t be less than Long.MIN

> set ireader 9223372036854775807
OK
> incr ireader
(error) ERR increment or decrement would overflow
> set ireader -9223372036854775808
OK
> decr ireader
(error) ERR increment or decrement would overflow

Expiration and Deletion Strings can be actively deleted using the del directive. You can use the expire directive to set an expiration time, at which point they will be automatically deleted, which is a passive deletion. The lifetime of a string can be obtained using the ttl directive.

list

Redis will list data structure named list instead of array, because the list of the storage structure with a chained table instead of an array, and the chained table or bidirectional chained table. Because it is a chained list, so the random positioning performance is weak, the first and last insertion and deletion performance is better. If the list of list length is very long, when using the time complexity of the relevant operations we must pay attention to the chain table.

Negative subscripts The position of the elements of a chain table is represented by the natural number 0,1,2,....n-1 , and can also be represented by the negative number -1,-2,...-n , -1 represents the “penultimate”, -2 represents the “penultimate”, then -n represents the first element, and the corresponding subscript is 0 .

Queue/Stack Chained tables can be used to append and remove elements from the head and tail of the table. Combined with the rpush/rpop/lpush/lpop instructions, a chained table can be used as a queue or as a stack, both left and right.


> rpush ireader go
(integer) 1
> rpush ireader java python
(integer) 3
> lpop ireader
"go"
> lpop ireader
"java"
> lpop ireader
"python"

> lpush ireader go java python
(integer) 3
> rpop ireader
"go"
...

> rpush ireader go java python
(integer) 3
> rpop ireader 
"python"
...

> lpush ireader go java python
(integer) 3
> lpop ireader
"python"
...

In everyday applications, lists are often used as asynchronous queues.

Length Use the llen command to get the length of a linked table.

> rpush ireader go java python
(integer) 3
> llen ireader
(integer) 3

Random read You can use the lindex instruction to access the element at the specified position, and the lrange instruction to get a list of the children of a linked table, with start and end subscript parameters.

> rpush ireader go java python
(integer) 3
> lindex ireader 1
"java"
> lrange ireader 0 2
1) "go"
2) "java"
3) "python"
> lrange ireader 0 -1  
1) "go"
2) "java"
3) "python"

When using lrange to get all the elements, you need to provide end_index, if there is no negative subscript, you need to first get the length through the llen instruction to get the value of end_index, with a negative subscript, using -1 instead of end_index can achieve the same effect.

Modifying Elements Use the lset command to modify an element at a specified location.

> rpush ireader go java python
(integer) 3
> lset ireader 1 javascript
OK
> lrange ireader 0 -1
1) "go"
2) "javascript"
3) "python"

Inserting Elements Using the linsert instruction to insert elements in the middle of the list, experienced programmers know that when inserting elements, we often can not figure out whether to insert in front of the specified position or after the insertion, so antirez in the linsert instruction to add the direction of the parameter before/after to show the instructions to indicate the insertion of the front and back. But what is unexpected is that the linsert instruction does not insert by specifying a position, but by specifying a specific value. This is because in a distributed environment, the elements of a list are always changing, meaning that the subscripts of the elements calculated at one moment may not be the subscripts you expect at the next moment.

> rpush ireader go java python
(integer) 3
> linsert ireader before java ruby
(integer) 4
> lrange ireader 0 -1
1) "go"
2) "ruby"
3) "java"
4) "python"

Up to the current position, I haven’t found an application scenario where insertion is specified in a practical application.

Deleting Elements The deletion operation of a list does not determine the elements by specifying subscripts either; you need to specify the maximum number of elements to be deleted as well as the value of the element.

> rpush ireader go java python
(integer) 3
> lrem ireader 1 java
(integer) 1
> lrange ireader 0 -1
1) "go"
2) "python"

Fixed-length lists In real-world application scenarios, we sometimes encounter the need for “fixed-length lists”. For example, if you want to display the list of winning usernames in real time in the form of a lantern, because there are too many winning users, and the number that can be displayed is usually less than 100, then the fixed-length list will be used here. The command to maintain a fixed-length list is ltrim, which takes two arguments, start and end, indicating that the subscript range of the list needs to be preserved, and all elements outside the range will be removed.

> rpush ireader go java python javascript ruby erlang rust cpp
(integer) 8
> ltrim ireader -3 -1
OK
> lrange ireader 0 -1
1) "erlang"
2) "rust"
3) "cpp"

If the end of the specified parameter corresponds to a real subscript that is smaller than start, the effect is equivalent to that of the del directive, since such a parameter indicates the need to keep the subscript range of the list element empty.

Quick List

If you dig a little deeper, you’ll find that the underlying Redis storage is not yet a simple linkedlist, but a structure called a quicklist. First of all a contiguous block of memory is used to store the list when there are fewer elements in it, and this structure is a ziplist, which is a compressed list. It stores all the elements next to each other and allocates a contiguous block of memory. It is only when there is a lot of data that it is changed to a quicklist, because a normal linked list would require too much additional pointer space and would be a waste of space. For example, if this list contains only int-type data, it needs two additional pointers, prev and next, so Redis combines chained lists and ziplists to form a quicklist, which means that multiple ziplists can be used with bidirectional pointers in a chain. This not only meets the fast insertion and deletion performance, but also does not appear too much space redundancy.

hash (computing)

Hash is equivalent to the Java language HashMap or Python language dict, in the realization of the structure it uses a two-dimensional structure, the first dimension is an array, the second dimension is a chained table, the contents of the key and value of the hash is stored in the chained table, the array is stored in the head of the chained table pointer. When looking for elements through the key, first calculate the key hashcode, and then use the hashcode of the length of the array to take the mode to locate the head of the chain table, and then traversing the chain table to obtain the corresponding value value, the role of the chain table is used to produce a “hash collision” of the elements of the string. and HashMap is no different. The length of the first dimension of the hash array is also 2^n.

Adding Elements You can add key-value pairs one at a time using hset, or you can add multiple key-value pairs at a time using hmset.

> hset ireader go fast
(integer) 1
> hmset ireader java fast python slow
OK

Getting Elements You can locate the value corresponding to a specific key by hget, get the values corresponding to multiple keys by hmget, get all key-value pairs by using hgetall, and get all key lists and value lists by using hkeys and hvals, respectively. These operations are similar to the Map interface of the Java language.

> hmset ireader go fast java fast python slow
OK
> hget ireader go
"fast"
> hmget ireader go python
1) "fast"
2) "slow"
> hgetall ireader
1) "go"
2) "fast"
3) "java"
4) "fast"
5) "python"
6) "slow"
> hkeys ireader
1) "go"
2) "java"
3) "python"
> hvals ireader
1) "fast"
2) "fast"
3) "slow"

Deleting Elements You can use hdel to delete a specified key. hdel supports deleting multiple keys at the same time.

> hmset ireader go fast java fast python slow
OK
> hdel ireader go
(integer) 1
> hdel ireader java python
(integer) 2

Determine whether the element exists Usually we use hget to get the key corresponding to the value is empty until the corresponding element exists, but if the value of the string length is particularly large, through this way to determine the existence of the element with or without a little waste, then you can use hexists command.

> hmset ireader go fast java fast python slow
OK
> hexists ireader go
(integer) 1

Counters The hash structure can also be used as a counter, for each key inside as a separate counter. If the value value is not an integer, calling the hincrby instruction will result in an error.

> hincrby ireader go 1
(integer) 1
> hincrby ireader python 4
(integer) 4
> hincrby ireader java 4
(integer) 4
> hgetall ireader
1) "go"
2) "1"
3) "python"
4) "4"
5) "java"
6) "4"
> hset ireader rust good
(integer) 1
> hincrby ireader rust 1
(error) ERR hash value is not an integer

Expansion Expansion is needed when the elements inside the hash are crowded (hash collisions are frequent). Expansion involves requesting a new array of twice the size, and then reallocating all key-value pairs into a chained table (rehash) corresponding to the subscripts of the new array. If the hash structure is large, say millions of key-value pairs, then a complete rehash can take a long time. This is a bit stressful for single-threaded Redis. So Redis uses a progressive rehash scheme. It keeps both the old and new hash structures, and gradually migrates the elements of the old structure to the new structure in subsequent timed tasks and hash structure read/write commands. This can avoid thread lag caused by the expansion.

Shrinking Redis hash structure not only expansion and contraction, from this point onwards, it is more powerful than Java’s HashMap, Java’s HashMap only expansion. The principle of shrinkage and expansion is the same, but the new array size than the old array is twice as small.

set

Java programmers know that the internal implementation of HashSet uses HashMap, except that all values point to the same object.Redis set structure is the same, it also uses the internal hash structure, all values point to the same internal value.

Adding Elements Multiple elements can be added at once

> sadd ireader go java python
(integer) 3

Read elements Use members to list all elements, use scard to get the length of the set, and use srandmember to get a random count of elements, defaulting to 1 if the count parameter is not provided.

> sadd ireader go java python
(integer) 3
> smembers ireader
1) "java"
2) "python"
3) "go"
> scard ireader
(integer) 3
> srandmember ireader
"java"

Deleting Elements Use rem to delete one or more elements, and spop to delete a random element.

> sadd ireader go java python rust erlang
(integer) 5
> srem ireader go java
(integer) 2
> spop ireader
"erlang"

Determine if an element exists Use the sismember directive, which can only receive a single element

> sadd ireader go java python rust erlang
(integer) 5
> sismember ireader rust
(integer) 1
> sismember ireader javascript
(integer) 0

sortedset

SortedSet(zset) is a very special data structure provided by Redis, on the one hand, it is equivalent to the Java data structure Map<String, Double> , you can give a weight to each element value score , on the other hand, it is similar to TreeSet , the internal elements will be sorted according to the weight score, you can get the rank of each element, but also through the scope of the score to get the list of elements.

The underlying implementation of zset uses two data structures, the first is a hash and the second is a jump list. The purpose of the hash is to associate the element value with the weight score, to guarantee the uniqueness of the element value, and to find the corresponding score value through the element value. The purpose of the jump list is to give the element value sorting, according to the scope of the score to get the list of elements.

Adding Elements One or more value/score pairs can be added with the zadd command, with the scores placed first.

> zadd ireader 4.0 python
(integer) 1
> zadd ireader 4.0 java 1.0 go
(integer) 2

Length The number of elements of zset can be obtained by the command zcard.

> zcard ireader
(integer) 3

Deleting Elements Elements in zset can be deleted with the command zrem, and more than one can be deleted at a time.

> zrem ireader go python
(integer) 2

Counter Like the hash structure, zset can be used as a counter.

> zadd ireader 4.0 python 4.0 java 1.0 go
(integer) 3
> zincrby ireader 1.0 python
"5"

Get rank and score Get the weight of the specified element with the zscore directive, the positive rank of the specified element with the zrank directive, and the negative rank [bottom one] of the specified element with the zrevrank directive. Positive is from smallest to largest, negative is from largest to smallest.

> zscore ireader python
"5"
> zrank ireader go  
(integer) 0
> zrank ireader java
(integer) 1
> zrank ireader python
(integer) 2
> zrevrank ireader python
(integer) 0

Get the list of elements according to the ranking range Get the corresponding list of elements by specifying the ranking range parameter through the zrange directive, carrying the withscores parameter to get the weights of the elements together. Get the list of elements by negative ranking [inverse] with the zrevrange directive. Positive direction is from small to large, negative direction is from large to small.

> zrange ireader 0 -1  
1) "go"
2) "java"
3) "python"
> zrange ireader 0 -1 withscores
1) "go"
2) "1"
3) "java"
4) "4"
5) "python"
6) "5"
> zrevrange ireader 0 -1 withscores
1) "python"
2) "5"
3) "java"
4) "4"
5) "go"
6) "1"

Get list based on score range Get a list of corresponding elements by specifying a score range with the zrangebyscore directive. Get a list of elements in reverse order via the zrevrangebyscore directive. Positive direction is from small to large, negative direction is from large to small. The parameter -inf means negative infinity, and +inf means positive infinity.

> zrangebyscore ireader 0 5
1) "go"
2) "java"
3) "python"
> zrangebyscore ireader -inf +inf withscores
1) "go"
2) "1"
3) "java"
4) "4"
5) "python"
6) "5"
> zrevrangebyscore ireader +inf -inf withscores  
1) "python"
2) "5"
3) "java"
4) "4"
5) "go"
6) "1"

Removing a list of elements based on ranges You can remove multiple elements at once, either by rank range or by score range.

> zremrangebyrank ireader 0 1
(integer) 2  
> zadd ireader 4.0 java 1.0 go
(integer) 2
> zremrangebyscore ireader -inf 4
(integer) 2
> zrange ireader 0 -1
1) "python"

Jump Lists The sorting function inside zset is realized by the “jump list” data structure, which is very special and complex. Readers should be prepared for the depth of this piece.

Because zset has to support random insertions and deletions, it’s not good to use an array to represent it. Let’s look at a normal chained table structure.

We need this chained table to be sorted by the SCORE value. This means that when a new element needs to be inserted, it needs to be located at a specific insertion point so that the chain can continue to be ordered. Usually we will find the insertion point by binary search, but the object of the binary search must be an array, only arrays can support fast positional positioning, the chain table can not do, so what to do?

Think of a startup that begins with just a few people, where everyone on the team is equal and is a co-founder. As the company grows, the number of people gradually becomes larger, and the cost of team communication increases. This is when the team leader system is introduced to divide the team. Each team will have a team leader. Meetings are held in teams, and multiple team leaders have their own meeting schedules. As the company expands further, another level is added – the department – and each department elects a representative from the list of team leaders to act as a minister. The ministers will also have their own schedule of high-level meetings with each other.

A jump list is something like this hierarchy, where all the elements of the bottom level are strung together. Then every few elements pick out a delegate, and string those delegates together again using another level of pointers. Then a second level of representatives is picked out of those and strung up again. Eventually a pyramid structure is formed.

Think about the location of your hometown in the world map: Asia->China->Anhui Province->Anqing City->Firyang County->Tangou Town->Tianmacun->xxxx No. It’s a similar structure.

A “Jump List” is “jumpy” because the elements within it may be “multi-tasking”. For example, the element in the center of the picture above is at L0, L1 and L2 at the same time, so it can quickly “jump” between different levels.

When locating the insertion point, you first locate it at the top level, then dive down to the next level of locating, and all the way down to the bottom level to find the right place to insert the new element. You may ask how that newly inserted element can have a chance to ‘wear several hats’?

An easy-to-understand tutorial on the fundamentals of Redis data structures

string

list

hash (computing)

set

sortedset

By lzz

Related Post

Leave a Reply Cancel reply

You Missed

8 Python practical scripts, save them for future use!

Python logging library logging summary – probably the best article summarizing the logging library so far

I hear you know Python?

An article on collection manipulation functions in Kotlin