A long cheatsheet for Redis

Setting up Redis: (on Ubuntu 16.04)

Install Redis server
apt-get install redis-server

Enable Redis as a service (start on system boot)
systemctl enable redis-server.service

Note:  The service configuration can be found at: /etc/systemd/system/redis.service
There you can see that Redis will be running as user redis:
User=redis
Group=redis

The Redis server will always restart after reboot:
Restart=always

Stopping the Redis server
redis-cli shutdown

Starting the Redis server
/usr/bin/redis-server /etc/redis/redis.conf

Edit Redis configuration
nano /etc/redis/redis.conf

Make server accessible from outside
By default, Redis listens only on the loopback interface. You will see in its configuration file:
bind 127.0.0.1
We can change this to listen on all interfaces:
bind 0.0.0.0
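
After changing the bind address (or any other setting in the configuration file), restart the service so the new value takes effect (using the systemd unit name shown above):
systemctl restart redis-server.service

Keep in mind that binding to 0.0.0.0 exposes Redis to the network, so combine this with the firewall and password measures described in the Security section below.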

Set a memory limit for Redis
maxmemory 500mb

With the following policy, stored keys are not evicted in order to make space for new keys; instead, writes of new data fail once the memory limit is reached (useful if you use Redis as a session store)
maxmemory-policy noeviction
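
These settings can also be inspected or changed at runtime with CONFIG GET / CONFIG SET (a quick sketch; runtime changes are lost on restart unless you also run CONFIG REWRITE or edit the configuration file):
redis-cli config get maxmemory-policy
redis-cli config set maxmemory 500mb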

Check your Redis installation:

Check Redis server version
redis-server --version

Check Redis status
systemctl status redis

Check Redis is listening at the right port
netstat -nlpt | grep 6379

Check the IP address and port Redis is bound to
$ sudo netstat -plunt | grep -i redis

Check (local) Redis functionality
$ redis-cli
127.0.0.1:6379> ping
PONG
127.0.0.1:6379> GET mykey
(nil)
127.0.0.1:6379> SET mykey 23
OK

Connect to remote Redis server
$ sudo apt-get install redis-tools
$ redis-cli -h 10.144.62.3 -p 30000

Persistency:

Using persistency is optional. There are 2 options:

RDB : Takes snapshots of your dataset at specified intervals.  It allows faster restarts (compared to AOF) when the dataset is big.

The following line in the Redis configuration file will make Redis automatically dump the dataset to disk every 60 seconds if at least 1000 keys changed:

save 60 1000

You can disable saving completely by commenting out all “save” lines.
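
A snapshot can also be triggered manually: BGSAVE forks a child process that dumps the dataset in the background, and LASTSAVE returns the Unix timestamp of the last successful save.
redis-cli bgsave
redis-cli lastsave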

AOF : Logs all write operations. The log is used when the server restarts in order to reconstruct the original dataset. Redis can rewrite the log when it gets too big.

You can turn on the AOF in your configuration file:

appendonly yes

Set the AOF file name with:

appendfilename "appendonly.aof"

Configure how often the accumulated changes will be written to the AOF file:

appendfsync everysec     (other options are “no” and “always”)

From now on, every time Redis receives a command that changes the dataset (e.g. SET) it will append it to the AOF. When you restart Redis it will replay the AOF to rebuild the state. For log rewriting, check the configuration file for the auto-aof-rewrite-percentage parameter.
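
A rewrite can also be triggered manually, and the current rewrite threshold can be checked at runtime:
redis-cli bgrewriteaof
redis-cli config get auto-aof-rewrite-percentage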

The 2 options can be used together. Generally speaking, AOF sacrifices some performance for better durability. The place where any persistence file (RDB or AOF) is saved is defined in the configuration file:

dir /var/lib/redis

Security:

Filesystem permissions:

1.  Check permissions on the directory Redis uses for data persistence. It should not be accessible by normal users. By default it will probably be:
dir /var/lib/redis
and should be accessible only by the redis user (permissions: 700).

2.  Permissions on the Redis configuration file are also important. Normally, it will be accessible only by root (permissions: 640 or 600).
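
For example, assuming the default paths and the redis user/group mentioned above, the permissions can be tightened like this:
chown -R redis:redis /var/lib/redis
chmod 700 /var/lib/redis
chown root:redis /etc/redis/redis.conf
chmod 640 /etc/redis/redis.conf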

Firewall:

This is a sample iptables configuration, starting from a point with no rules at all.

Any incoming traffic we do not explicitly allow will be dropped
-P INPUT DROP

No forwarding. This is not a router.
-P FORWARD DROP

Any outgoing traffic we do not explicitly allow will be dropped
-P OUTPUT DROP

Create a new chain that will handle the Redis traffic
-N redis-protection

Allow incoming  traffic to loopback interface
-A INPUT -i lo -j ACCEPT

Allow Established and Related Incoming Connections
-A INPUT -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT

Allow All Incoming SSH connections on port 22
-A INPUT -p tcp -m tcp --dport 22 -m conntrack --ctstate NEW,ESTABLISHED -j ACCEPT

We want to be able to ping our server
-A INPUT -p icmp -m icmp --icmp-type 8 -j ACCEPT

Send incoming TCP traffic for port 6379 through the “redis-protection” chain
-A INPUT -p tcp -m tcp --dport 6379 -j redis-protection

Allow outgoing traffic to loopback interface
-A OUTPUT -o lo -j ACCEPT

Allow outgoing traffic that belongs to established connections
-A OUTPUT -m conntrack --ctstate ESTABLISHED -j ACCEPT

Allow traffic from our web server
-A redis-protection -s <my_web_server_ip>/32 -j ACCEPT

Allow traffic from Redsmin server (for those who use Redsmin)
-A redis-protection -s 62.210.222.165/32 -j ACCEPT
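
Note that iptables rules are not persistent across reboots by default. One way to persist them on Ubuntu 16.04 (assuming you are fine with the iptables-persistent package) is:
apt-get install iptables-persistent
netfilter-persistent save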

Renaming/Disabling Redis commands:

As the documentation in Redis configuration file mentions: it is possible to change the name of dangerous commands in a shared environment. For instance the CONFIG command may be renamed into something hard to guess so that it will still be available for internal-use tools but not available for general clients.

rename-command CONFIG b840fc02d524045429941cc15f59e41cb7be6c52

It is also possible to completely kill a command by renaming it into an empty string:

rename-command CONFIG ""

Note that changing the name of commands that are logged into the AOF file or transmitted to slaves may cause problems. So, do the renaming when AOF is not used or, even better, right after the installation of Redis. Do not forget to rename the commands in all instances. If a renamed command is logged to the AOF and we try to replay the AOF file in a Redis instance where the command is not renamed, we will end up with inconsistencies (the renamed command cannot be replayed).
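
One way (among many) to generate a hard-to-guess name for a renamed command:
openssl rand -hex 20
Use the generated string as the new command name in the rename-command directive.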

Setup a password:

As the Redis documentation says: “Redis is designed to be accessed by trusted clients inside trusted environments. This means that usually it is not a good idea to expose the Redis instance directly to the internet or, in general, to an environment where untrusted clients can directly access the Redis TCP port or UNIX socket.” So, the only reason I can see to enable a password is in case one of your web application servers has been hacked but the application itself is not compromised (the attacker can access the server as a low-privileged user and from there reach Redis). In that case, we can enable the use of a password with:

requirepass <a_very_long_random_password>

Since Redis is fast and applies no throttling (it would make no sense), a brute-force attack can break weak passwords. So, choose a long and strong one.
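
Once requirepass is set (and the server restarted, or the value applied at runtime with CONFIG SET requirepass), clients must authenticate before issuing commands:
$ redis-cli
127.0.0.1:6379> AUTH <a_very_long_random_password>
OK

You can also pass the password with redis-cli -a, but keep in mind it may end up in your shell history.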

Useful redis-cli commands:

A ping!  🙂  You should get a PONG as response.
ping

List all keys
KEYS *

Count all keys
dbsize

Display information about memory usage
info memory

Note: Memory RSS (Resident Set Size) is the memory (in bytes) the operating system has allocated to Redis. If the ratio used_memory_rss/used_memory (reported by INFO as mem_fragmentation_ratio) is greater than ~1.5, it signifies memory fragmentation. The fragmented memory can be recovered by restarting the server.
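
A quick way to check the relevant fields (as reported by INFO):
redis-cli info memory | grep -E 'used_memory:|used_memory_rss:|mem_fragmentation_ratio'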

Check Redis response latency
redis-cli --latency -h <host> -p <port>

It measures the time, in milliseconds, for the Redis server to respond to the PING command.

“samples” : This is the number of times redis-cli issued the PING command and received a response.

“min” : The minimum delay between the time the CLI issued PING and the time the reply was received.

“max” : The maximum delay between the time the CLI issued PING and the time the reply to the command was received.

“avg” : The average response time in milliseconds for all sampled data.
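
To track how latency evolves over time instead of taking a single measurement, redis-cli also offers a history mode (the -i option sets the length, in seconds, of each sampling frame; the default is 15):
redis-cli --latency-history -h <host> -p <port> -i 5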

Replication and Clustering:

This is such a long subject that the only thing worth cheatsheet-ing is some central ideas about replication and clustering that can be hard for newcomers.

1. Redis replication (adding slave nodes) is mainly used to increase the availability of our data. Redis clustering (adding master nodes) is mainly used to increase capacity. Both of them can improve performance, just in different ways. If no replication or clustering is used, the concepts of “master” and “slave” become meaningless. We just talk about standalone Redis instances.

2. A Redis cluster is a data sharding strategy. It requires at least 3 (master) nodes. A Redis cluster is not required to use replication. Replication can be used with or without a cluster. Of course, if there is no cluster, the promotion of a slave node to master, in case the latter fails, can be done manually (see the sketch at the end of this list).

3. If you are struggling (e.g. scaling up is reaching its limits or has become financially inefficient) to manage a high number of “writes”, you probably need to add nodes to your cluster. If a high number of “reads” is your problem, then adding replication nodes is probably the way to go.

4. It is perfectly possible to host 2 Redis nodes, a master and a slave, on the same server, as long as the slave acts as a replica of a master node that resides on another server.

5. Think carefully about your eviction policy. You may not even need one. For example, if you use Redis for session storage, it makes no sense to throw away an existing active session in order to store a new one.

6. Pick a Redis client carefully. Not all clients support all Redis features or, at least, not at the same degree. Especially, when it comes to clustering and replication. A list of Redis clients can be found here: https://redis.io/clients

7. A Redis client can be configured to use only masters, only slaves, or both, for “read” actions. Not all clients support this feature.

8. Do not expect to treat Redis nodes that are designated for caching the same way you treat Redis nodes used as a data store (e.g. session store). For example, repartitioning (mapping the hash slots to a new set of nodes) is usually not a big problem in the caching case, but requires data migration in the data store case.

9. Redis has sacrificed consistency in favor of performance. Data replication happens asynchronously and not immediately after a write finishes on a master node.

10. Data sharding can be achieved even without a Redis cluster by using standalone Redis nodes and letting the client handle the routing (map each key to a node). This can be more efficient under normal conditions (no round-trip is required in order to redirect the client to the correct node) but the administration of the sharding strategy becomes more cumbersome. Especially when Redis is used as a data store, it is difficult or inefficient to manually handle the addition/removal of nodes, the failure of a master node or other troublesome situations.

11. Since clients for a Redis cluster do not know in advance which node hosts each key, an extra round-trip is required in order to acquire this knowledge (unless we are lucky) the first time the client deals with a specific key. After that, many clients cache the mapping for this key (see the sketch at the end of this list).

12. When masters are accompanied by slaves, persistence is usually enabled on the master. Otherwise, when the master restarts (e.g. due to a crash) it will come back with an empty data set (no persistence), the slaves will replicate from it and lose their copy of the data, too. So, when disabling the persistence of a master node it is also advisable to disable the auto-restart.

13. It is not easy to have a viable Redis cluster without replication. Yes, a caching Redis is not that sensitive as one that acts as a data store. But, still, a caching infrastructure is not a game. If your database can handle the load absorbed by your caching layer, then you may not even be using a caching layer. So, sometimes, the warm up time of a restarting Redis node may bring you serious troubles. Be sure this is not your case by testing such scenarios.