[Redis] Redis Persistence

An introduction to data persistence in Redis

1. RDB

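1.1 Introduction to RDB

RDB: Redis Database
RDB writes a snapshot of the in-memory dataset to disk at specified time intervals. On recovery, the snapshot file is read straight back into memory.
Redis forks a separate child process to do the persistence work. The child first writes the data to a temporary file, and once the whole dump has finished, that temporary file replaces the previously persisted file. The main process performs no disk I/O during this procedure, which keeps its performance very high. If you need to restore a large dataset and are not especially sensitive to how complete the restored data is, RDB is more efficient than AOF.
Drawback of RDB: the changes made after the last snapshot may be lost.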
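The "write a temporary file, then replace the old one" step described above can be sketched in Python. This is only an illustration of the pattern, not the real RDB writer: the names save_snapshot and load_snapshot are made up for this example, and JSON stands in for the binary RDB format.

```python
import json
import os
import tempfile


def save_snapshot(data, path):
    """Write `data` to `path` through a temporary file and an atomic rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(prefix="temp-", suffix=".rdb", dir=directory)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f)  # JSON is a stand-in for the real RDB encoding
            f.flush()
            os.fsync(f.fileno())  # make sure the bytes are on disk before the swap
        os.replace(tmp_path, path)  # readers see either the old or the new file
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_snapshot(path):
    """Read a snapshot written by save_snapshot back into memory."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)
```

Because the old file is replaced only after the new one is complete, a crash in the middle of a dump leaves the previous snapshot untouched.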

1.2 Fork

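fork creates a copy of the current process. All data in the new process (variables, environment variables, program counter, and so on) has the same values as in the original process, yet it is an entirely new process that runs as a child of the original one.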
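A small Python sketch can make this behaviour visible. It assumes a Unix-like system where os.fork is available, and the function name child_view_after_fork is invented for the example. The child reports the data it sees through a pipe while the parent keeps modifying its own copy.

```python
import os


def child_view_after_fork(data):
    """Fork, let the parent modify `data`, and return what the child saw."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: it holds a copy of `data` as it was at fork time.
        try:
            os.close(read_fd)
            os.write(write_fd, repr(sorted(data.items())).encode("utf-8"))
            os.close(write_fd)
        finally:
            os._exit(0)  # leave immediately without running parent cleanup code
    # Parent process: keep changing the data after the fork.
    os.close(write_fd)
    data["counter"] = "changed-by-parent"
    with os.fdopen(read_fd, "rb") as pipe:
        view = pipe.read().decode("utf-8")
    os.waitpid(pid, 0)  # reap the child
    return view
```

Whatever the parent writes after the fork never shows up in the child's view, which is why a forked child can dump a consistent snapshot while the main process keeps serving writes.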

1.3 Configuration file

################################ SNAPSHOTTING  ################################
#
# Save the DB on disk:
#
#   save <seconds> <changes>
#
#   Will save the DB if both the given number of seconds and the given
#   number of write operations against the DB occurred.
#
#   In the example below the behaviour will be to save:
#   after 900 sec (15 min) if at least 1 key changed
#   after 300 sec (5 min) if at least 10 keys changed
#   after 60 sec if at least 10000 keys changed
#
#   Note: you can disable saving completely by commenting out all "save" lines.
#
#   It is also possible to remove all the previously configured save
#   points by adding a save directive with a single empty string argument
#   like in the following example:
#
#   save ""

save 900 1
save 300 10
save 60 10000

# By default Redis will stop accepting writes if RDB snapshots are enabled
# (at least one save point) and the latest background save failed.
# This will make the user aware (in a hard way) that data is not persisting
# on disk properly, otherwise chances are that no one will notice and some
# disaster will happen.
#
# If the background saving process will start working again Redis will
# automatically allow writes again.
#
# However if you have setup your proper monitoring of the Redis server
# and persistence, you may want to disable this feature so that Redis will
# continue to work as usual even if there are problems with disk,
# permissions, and so forth.
stop-writes-on-bgsave-error yes

# Compress string objects using LZF when dump .rdb databases?
# For default that's set to 'yes' as it's almost always a win.
# If you want to save some CPU in the saving child set it to 'no' but
# the dataset will likely be bigger if you have compressible values or keys.
rdbcompression yes

# Since version 5 of RDB a CRC64 checksum is placed at the end of the file.
# This makes the format more resistant to corruption but there is a performance
# hit to pay (around 10%) when saving and loading RDB files, so you can disable it
# for maximum performances.
#
# RDB files created with checksum disabled have a checksum of zero that will
# tell the loading code to skip the check.
rdbchecksum yes

# The filename where to dump the DB
dbfilename dump.rdb

# Remove RDB files used by replication in instances without persistence
# enabled. By default this option is disabled, however there are environments
# where for regulations or other security concerns, RDB files persisted on
# disk by masters in order to feed replicas, or stored on disk by replicas
# in order to load them for the initial synchronization, should be deleted
# ASAP. Note that this option ONLY WORKS in instances that have both AOF
# and RDB persistence disabled, otherwise is completely ignored.
#
# An alternative (and sometimes better) way to obtain the same effect is
# to use diskless replication on both master and replicas instances. However
# in the case of replicas, diskless is not always an option.
rdb-del-sync-files no

# The working directory.
#
# The DB will be written inside this directory, with the filename specified
# above using the 'dbfilename' configuration directive.
#
# The Append Only File will also be created inside this directory.
#
# Note that you must specify a directory here, not a file name.
dir ./

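1.4 How to trigger an RDB snapshot

(1) The save settings in the configuration file
(2) The save or bgsave command
- save: only saves; everything else is blocked, and the server can be used again only after the save completes
- bgsave: Redis performs the snapshot asynchronously in the background
(3) Running the flushall command, although this is pointless because the data has already been cleared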
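How the save points from the configuration file decide when a snapshot is due can be sketched as follows. This is a simplified model rather than the Redis source: should_save and its parameter names are assumptions made for the example, and the default list mirrors the three save lines shown in 1.3.

```python
# (seconds, changes) pairs, mirroring: save 900 1 / save 300 10 / save 60 10000
DEFAULT_SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]


def should_save(seconds_since_last_save, changes_since_last_save,
                save_points=DEFAULT_SAVE_POINTS):
    """Return True if any save point has both of its conditions met."""
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= changes
        for seconds, changes in save_points
    )
```

For example, should_save(301, 10) is true because the 300-second rule is satisfied, while should_save(61, 9999) is false because no rule has both its time and its change count met. An empty list, like save "", never triggers a snapshot.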

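1.5 Data recovery

Move the backup file (dump.rdb) into the Redis working directory (the directory set by dir in the configuration file, which can be checked with CONFIG GET dir), then start the server.

1.6 Advantages

Well suited to restoring large datasets
A good fit when the requirements on data integrity and consistency are not strict

1.7 Disadvantages

Backups are taken at intervals, so if Redis shuts down unexpectedly, every change made after the last snapshot is lost
At fork time the data in memory is cloned, so memory growth of up to roughly 2x has to be planned for (this is the worst case: with copy-on-write, only the pages modified during the save are actually duplicated)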

2. AOF

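2.1 Introduction to AOF

AOF: Append Only File
AOF records every write operation in the form of a log: every write command Redis has executed is recorded (reads are not). The file may only be appended to, never modified in place. On startup Redis reads this file to rebuild the data. In other words, after a restart Redis executes the write commands in the log from beginning to end to restore the data.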
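Recovery by replaying the log can be illustrated with a toy model. The sketch below is not the Redis loader: replay is a made-up function, the log is a list of plain-text lines instead of the real AOF encoding, and only SET, DEL and INCR are handled.

```python
def replay(log_lines):
    """Rebuild the key/value state by executing logged write commands in order."""
    state = {}
    for line in log_lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        command, args = parts[0].upper(), parts[1:]
        if command == "SET":
            key, value = args
            state[key] = value
        elif command == "DEL":
            for key in args:
                state.pop(key, None)
        elif command == "INCR":
            (key,) = args
            state[key] = str(int(state.get(key, "0")) + 1)
        else:
            raise ValueError("unsupported command in this sketch: " + command)
    return state
```

Replaying the lines SET a 1, INCR a, SET b hello, DEL b, INCR c in this order leaves a at "2" and c at "1", the same state the server held before the restart.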

2.2 Configuration file

############################## APPEND ONLY MODE ###############################

# By default Redis asynchronously dumps the dataset on disk. This mode is
# good enough in many applications, but an issue with the Redis process or
# a power outage may result into a few minutes of writes lost (depending on
# the configured save points).
#
# The Append Only File is an alternative persistence mode that provides
# much better durability. For instance using the default data fsync policy
# (see later in the config file) Redis can lose just one second of writes in a
# dramatic event like a server power outage, or a single write if something
# wrong with the Redis process itself happens, but the operating system is
# still running correctly.
#
# AOF and RDB persistence can be enabled at the same time without problems.
# If the AOF is enabled on startup Redis will load the AOF, that is the file
# with the better durability guarantees.
#
# Please check http://redis.io/topics/persistence for more information.

# Whether to enable AOF
appendonly no

# The name of the append only file (default: "appendonly.aof")

appendfilename "appendonly.aof"

# The fsync() call tells the Operating System to actually write data on disk
# instead of waiting for more data in the output buffer. Some OS will really flush
# data on disk, some other OS will just try to do it ASAP.
#
# Redis supports three different modes:
#
# no: don't fsync, just let the OS flush the data when it wants. Faster.
# always: fsync after every write to the append only log. Slow, Safest.
# everysec: fsync only one time every second. Compromise.
#
# The default is "everysec", as that's usually the right compromise between
# speed and data safety. It's up to you to understand if you can relax this to
# "no" that will let the operating system flush the output buffer when
# it wants, for better performances (but if you can live with the idea of
# some data loss consider the default persistence mode that's snapshotting),
# or on the contrary, use "always" that's very slow but a bit safer than
# everysec.
#
# More details please check the following article:
# http://antirez.com/post/redis-persistence-demystified.html
#
# If unsure, use "everysec".

# appendfsync always
appendfsync everysec
# appendfsync no

# When the AOF fsync policy is set to always or everysec, and a background
# saving process (a background save or AOF log background rewriting) is
# performing a lot of I/O against the disk, in some Linux configurations
# Redis may block too long on the fsync() call. Note that there is no fix for
# this currently, as even performing fsync in a different thread will block
# our synchronous write(2) call.
#
# In order to mitigate this problem it's possible to use the following option
# that will prevent fsync() from being called in the main process while a
# BGSAVE or BGREWRITEAOF is in progress.
#
# This means that while another child is saving, the durability of Redis is
# the same as "appendfsync none". In practical terms, this means that it is
# possible to lose up to 30 seconds of log in the worst scenario (with the
# default Linux settings).
#
# If you have latency problems turn this to "yes". Otherwise leave it as
# "no" that is the safest pick from the point of view of durability.

# Whether to skip fsync() in the main process while a rewrite is in progress
no-appendfsync-on-rewrite no

# Automatic rewrite of the append only file.
# Redis is able to automatically rewrite the log file implicitly calling
# BGREWRITEAOF when the AOF log size grows by the specified percentage.
#
# This is how it works: Redis remembers the size of the AOF file after the
# latest rewrite (if no rewrite has happened since the restart, the size of
# the AOF at startup is used).
#
# This base size is compared to the current size. If the current size is
# bigger than the specified percentage, the rewrite is triggered. Also
# you need to specify a minimal size for the AOF file to be rewritten, this
# is useful to avoid rewriting the AOF file even if the percentage increase
# is reached but it is still pretty small.
#
# Specify a percentage of zero in order to disable the automatic AOF
# rewrite feature.

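# Rewrite once the AOF has grown by 100%, i.e. to twice its size after the last rewrite...
auto-aof-rewrite-percentage 100
# ...and only if the file is also larger than 64 MB
auto-aof-rewrite-min-size 64mb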

# An AOF file may be found to be truncated at the end during the Redis
# startup process, when the AOF data gets loaded back into memory.
# This may happen when the system where Redis is running
# crashes, especially when an ext4 filesystem is mounted without the
# data=ordered option (however this can't happen when Redis itself
# crashes or aborts but the operating system still works correctly).
#
# Redis can either exit with an error when this happens, or load as much
# data as possible (the default now) and start if the AOF file is found
# to be truncated at the end. The following option controls this behavior.
#
# If aof-load-truncated is set to yes, a truncated AOF file is loaded and
# the Redis server starts emitting a log to inform the user of the event.
# Otherwise if the option is set to no, the server aborts with an error
# and refuses to start. When the option is set to no, the user requires
# to fix the AOF file using the "redis-check-aof" utility before to restart
# the server.
#
# Note that if the AOF file will be found to be corrupted in the middle
# the server will still exit with an error. This option only applies when
# Redis will try to read more data from the AOF file but not enough bytes
# will be found.
aof-load-truncated yes

# When rewriting the AOF file, Redis is able to use an RDB preamble in the
# AOF file for faster rewrites and recoveries. When this option is turned
# on the rewritten AOF file is composed of two different stanzas:
#
#   [RDB file][AOF tail]
#
# When loading Redis recognizes that the AOF file starts with the "REDIS"
# string and loads the prefixed RDB file, and continues loading the AOF
# tail.
aof-use-rdb-preamble yes

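2.3 RDB and AOF

RDB and AOF can be enabled at the same time. On startup the server loads the AOF first, and if the AOF file is corrupted the server fails to start (a file that is merely truncated at the end is still loaded when aof-load-truncated is yes).
The AOF can be repaired with redis-check-aof --fix appendonly.aof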

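2.4 Rewrite

AOF works by appending to a file, so the file keeps growing. The rewrite mechanism was added to keep this in check.
When the AOF file size exceeds the configured threshold, Redis compacts the file's contents, keeping only the minimal set of commands needed to restore the data. The compaction can also be started manually with the bgrewriteaof command.

How rewrite works:
- When the AOF file has grown too large, Redis forks a new process to rewrite the file (again writing a temporary file first and renaming it at the end). The new process walks through the data in its memory and emits one SET statement per record. The rewrite does not read the old AOF file; it writes the entire in-memory database out as commands into a new AOF file.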
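The idea that a rewrite is generated from memory rather than from the old log can be shown with a toy model. Both functions below are hypothetical names for this sketch, commands are plain tuples, and only string keys with SET and DEL are covered.

```python
def apply_log(commands):
    """Execute (name, *args) write commands in order and return the final state."""
    state = {}
    for name, *args in commands:
        if name == "SET":
            key, value = args
            state[key] = value
        elif name == "DEL":
            state.pop(args[0], None)
        else:
            raise ValueError("unsupported command in this sketch: " + name)
    return state


def rewrite_log(state):
    """Build a fresh log from the in-memory state: one SET per key, old log unread."""
    return [("SET", key, value) for key, value in state.items()]
```

A log of five commands such as SET a 1, SET a 2, SET b x, DEL b, SET a 3 collapses to the single command SET a 3, and replaying the rewritten log produces exactly the same state.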

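Trigger mechanism:
- Redis records the AOF size at the last rewrite. With the default configuration, a rewrite is triggered when the AOF file has doubled in size since the last rewrite and is larger than 64 MB.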
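The two conditions can be written down as a small check. This is a sketch of the rule as described in the configuration comments, not code taken from Redis; should_rewrite and its parameter names are assumptions, with defaults matching auto-aof-rewrite-percentage 100 and auto-aof-rewrite-min-size 64mb.

```python
MB = 1024 * 1024


def should_rewrite(current_size, base_size, percentage=100, min_size=64 * MB):
    """Decide whether an automatic AOF rewrite is due.

    current_size: AOF size now, in bytes
    base_size:    AOF size after the last rewrite (or at startup), in bytes
    """
    if percentage == 0:
        return False  # a percentage of zero disables automatic rewrites
    if current_size <= min_size:
        return False  # still too small to be worth rewriting
    base = base_size if base_size > 0 else 1  # avoid dividing by zero
    growth = current_size * 100 // base - 100  # growth over the base, in percent
    return growth >= percentage
```

With the defaults, a file that was 64 MB after the last rewrite triggers a new rewrite at 128 MB but not at 100 MB, and a file that grew from 1 MB to 10 MB does not trigger one because it is still under the minimum size.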

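2.5 Advantages

The AOF sync policy is configurable: sync every second (everysec), sync on every write (always), or never sync explicitly (no)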
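The difference between the three policies comes down to when fsync is called. The class below is a simplified model with invented names (AppendLog, fsync_calls, the injectable clock). In particular it checks the one-second interval at append time, whereas a real server would more likely use a timer or background thread.

```python
import os
import time


class AppendLog:
    """Append-only log with an fsync policy of "always", "everysec" or "no"."""

    def __init__(self, path, policy="everysec", clock=time.monotonic):
        if policy not in ("always", "everysec", "no"):
            raise ValueError("unknown policy: " + policy)
        self._file = open(path, "ab")
        self._policy = policy
        self._clock = clock
        self._last_sync = clock()
        self.fsync_calls = 0  # exposed so the effect of each policy is visible

    def append(self, record):
        self._file.write(record)
        self._file.flush()  # hand the bytes to the operating system
        if self._policy == "always":
            self._sync()  # durable after every write, slowest
        elif self._policy == "everysec" and self._clock() - self._last_sync >= 1.0:
            self._sync()  # at most about one second of writes at risk
        # policy "no": leave flushing to disk entirely to the operating system

    def _sync(self):
        os.fsync(self._file.fileno())
        self._last_sync = self._clock()
        self.fsync_calls += 1

    def close(self):
        self._file.close()
```

Writing the same records under each policy and comparing fsync_calls shows the trade-off directly: always syncs once per record, everysec about once per second, and no never syncs on its own.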

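2.6 Disadvantages

For the same dataset, the AOF file is much larger than the RDB file, and recovery is slower than with RDB
AOF runs slower than RDB. The every-second sync policy performs well, and with no sync the performance is the same as RDB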

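3. Summary

RDB persistence takes snapshots of your data at specified time intervals.

AOF persistence records every write operation sent to the server and re-executes those commands on restart to restore the original data. Each write is appended to the end of the file in Redis protocol format.
Redis can also rewrite the AOF file in the background so that the file does not grow too large.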
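To give a feel for what "Redis protocol format" means, the sketch below encodes one command as a RESP array of bulk strings, which is how a command appears on the wire. The helper name encode_command is made up for this example, and the sketch does not cover anything else an AOF file may contain.

```python
def encode_command(*args):
    """Encode a command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]  # array header: number of arguments
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        # each argument: $<length in bytes>, CRLF, the raw bytes, CRLF
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)
```

For instance, encode_command("SET", "key", "value") yields the bytes *3, $3, SET, $3, key, $5, value, each followed by CRLF.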

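Cache only: if you only need the data to exist while the server is running, you can do without any persistence.

Enabling both persistence methods at the same time:
- When Redis restarts it loads the AOF file first to restore the original data. Because the RDB data is not real-time, with both enabled the server looks only for the AOF file on restart.
- Even so, using AOF alone is not recommended: RDB is better suited to backing up the database and is worth keeping as a fallback.

Performance recommendations
- Use the RDB file for backup purposes only; keeping just the save 900 1 rule (one change within 15 minutes) is enough.
- The benefit of AOF is that in the worst case only about 2 seconds of data are lost, but it brings continuous I/O. The AOF also has to be rewritten eventually, and a rewrite causes some blocking. Set the base size for AOF rewrite (auto-aof-rewrite-min-size) to 5 GB or more.
- Master-Slave Replication, covered later, can take the place of AOF and save a large amount of I/O. The cost is that if the master and slave go down at the same time, more than ten minutes of data will be lost.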