《Redis原理分析丨Redis 作為 LRU 緩存的實(shí)現(xiàn)原理》要點(diǎn):
本文介紹了Redis原理分析丨Redis 作為 LRU 緩存的實(shí)現(xiàn)原理,希望對您有用。如果有疑問,可以聯(lián)系我們。
文 | zxszcaijin
本文轉(zhuǎn)載自:blog.chinaunix.net
在使用redis作為緩存的場景下,內(nèi)存淘汰策略決定的redis的內(nèi)存使用效率.在大部門場景下,我們會采用LRU(Least Recently Used)來作為redis的淘汰策略.本文將由淺入深的介紹redis lru策略的具體實(shí)現(xiàn).
首先我們來科普下,什么是LRU ?(以下來自維基百科)
Discards the least recently used items first. This algorithm requires keeping track of what was used when, which is expensive if one wants to make sure the algorithm always discards the least recently used item. General implementations of this technique require keeping "age bits" for cache-lines and track the "Least Recently Used" cache-line based on age-bits. In such an implementation, every time a cache-line is used, the age of all other cache-lines changes.
簡而言之,就是每次淘汰最近最少使用的元素 .一般的實(shí)現(xiàn),都是采用對存儲在內(nèi)存的元素采用 'age bits’ 來標(biāo)記該元素從上次拜訪到現(xiàn)在為止的時(shí)長,從而在每次用LRU淘汰時(shí),淘汰這些最長時(shí)間未被拜訪的元素.
這里我們先實(shí)現(xiàn)一個簡單的LRU Cache,以便于后續(xù)內(nèi)容的理解 .(來自leetcot,不外這里我重新用Python語言實(shí)現(xiàn)了)實(shí)現(xiàn)該緩存滿足如下兩點(diǎn):
1.get(key) - 如果該元素(總是正數(shù))存在,將該元素移動到lru頭部,并返回該元素的值,不然返回-1.
2.set(key,value) - 設(shè)置一個key的值為value(如果該元素存在),并將該元素移動到LRU頭部.不然插入一個key,且值為value.如果在設(shè)置前檢查到,該key插入后,會超過cache的容量,則根據(jù)LRU策略,刪除最近最少使用的key.
闡發(fā)
這里我們采用雙向鏈表來實(shí)現(xiàn)元素(k-v鍵值對)的存儲,同時(shí)采用hash表來存儲相關(guān)的key與item的對應(yīng)關(guān)系.這樣,我們既能在O(1)的時(shí)間對key進(jìn)行操作,同時(shí)又能利用Double LinkedList的添加和刪除節(jié)點(diǎn)的方便性.(get/set都能在O(1)內(nèi)完成).
具體實(shí)現(xiàn)(Python語言)
class Node:
key=None
value=None
pre=None
next=None
def __init__(self,key,value):
self.key=key
self.value=value
class LRUCache:
capacity=0
map={} # key is string ,and value is Node object
head=None
end=None
def __init__(self,capacity):
self.capacity=capacity
def get(self,key):
if key in self.map:
node=self.map[key]
self.remove(node)
self.setHead(node)
return node.value
else:
return -1
def getAllKeys(self):
tmpNode=None
if self.head:
tmpNode=self.head
while tmpNode:
print (tmpNode.key,tmpNode.value)
tmpNode=tmpNode.next
def remove(self,n):
if n.pre:
n.pre.next=n.next
else:
self.head=n.next
if n.next:
n.next.pre=n.pre
else:
self.end=n.pre
def setHead(self,n):
n.next=self.head
n.pre=None
if self.head:
self.head.pre=n
self.head=n
if not self.end:
self.end=self.head
def set(self,key,value):
if key in self.map:
oldNode=self.map[key]
oldNode.value=value
self.remove(oldNode)
self.setHead(oldNode)
else:
node=Node(key,value)
if len(self.map) >= self.capacity:
self.map.pop(self.end.key)
self.remove(self.end)
self.setHead(node)
else:
self.setHead(node)
self.map[key]=node
def main():
cache=LRUCache(100)
#d->c->b->a
cache.set('a','1')
cache.set('b','2')
cache.set('c',3)
cache.set('d',4)
#遍歷lru鏈表
cache.getAllKeys()
#改動('a','1') ==> ('a',5),使該節(jié)點(diǎn)從LRU尾端移動到開頭.
cache.set('a',5)
#LRU鏈表變?yōu)?a->d->c->b
cache.getAllKeys()
#拜訪key='c'的節(jié)點(diǎn),是該節(jié)點(diǎn)從移動到LRU頭部
cache.get('c')
#LRU鏈表變?yōu)?c->a->d->b
cache.getAllKeys()
if __name__ == '__main__':
main()
通過上面簡單的介紹與實(shí)現(xiàn),現(xiàn)在我們基本已經(jīng)了解了什么是LRU,下面我們來看看LRU算法在redis 內(nèi)部的實(shí)現(xiàn)細(xì)節(jié),以及其會在什么情況下帶來問題.在redis內(nèi)部,是通過全局結(jié)構(gòu)體redisServer 保留redis啟動之后相關(guān)的信息.比如:
struct redisServer {
pid_t pid; /* Main process pid. */
char *configfile; /* Absolute config file path, or NULL */
…..
unsigned lruclock:LRU_BITS; /* Clock for LRU eviction */
...
};
serverCron運(yùn)行頻率hz等lruclock:LRU_BITS,此中存儲了服務(wù)器自啟動之后的lru時(shí)鐘,該時(shí)鐘是全局的lru時(shí)鐘.該時(shí)鐘100ms(可以通過hz來調(diào)整,默認(rèn)情況hz=10,因此每1000ms/10=100ms執(zhí)行一次定時(shí)任務(wù))更新一次.
接下來我們看看LRU時(shí)鐘的具體實(shí)現(xiàn):
server.lruclock = getLRUClock();
getLRUClock函數(shù)如下:
#define LRU_CLOCK_RESOLUTION 1000 /* LRU clock resolution in ms */
#define LRU_BITS 24
#define LRU_CLOCK_MAX ((1<<LRU_BITS)-1) /* Max value of obj->lru */
/* Return the LRU clock, based on the clock resolution. This is a time
* in a reduced-bits format that can be used to set and check the
* object->lru field of redisObject structures. */
unsigned int getLRUClock(void) {
return (mstime()/LRU_CLOCK_RESOLUTION) & LRU_CLOCK_MAX;
}
因此lrulock最大能到(2**24-1)/3600/24 = 194天,如果跨越了這個時(shí)間,lrulock重新開始.
對于redis server來說,server.lrulock表示的是一個全局的lrulock,那么對于每個redisObject都有一個本身的lrulock.這樣每redisObject就可以根據(jù)本身的lrulock和全局的server.lrulock比較,來確定是否能夠被淘汰掉.
redis key對應(yīng)的value的寄存對象:
typedef struct redisObject {
unsigned type:4;
unsigned encoding:4;
unsigned lru:LRU_BITS; /* LRU time (relative to server.lruclock) or
* LFU data (least significant 8 bits frequency
* and most significant 16 bits decreas time). */
int refcount;
void *ptr;
} robj
那么什么時(shí)候,lru會被更新呢 ?拜訪該key,lru都會被更新,這樣該key就能及時(shí)的被移動到lru頭部,從而避免從lru中淘汰.下面是這一部分的實(shí)現(xiàn):
/* Low level key lookup API, not actually called directly from commands
* implementations that should instead rely on lookupKeyRead(),
* lookupKeyWrite() and lookupKeyReadWithFlags(). */
robj *lookupKey(redisDb *db, robj *key, int flags) {
dictEntry *de = dictFind(db->dict,key->ptr);
if (de) {
robj *val = dictGetVal(de);
/* Update the access time for the ageing algorithm.
* Don't do it if we have a saving child, as this will trigger
* a copy on write madness. */
if (server.rdb_child_pid == -1 &&
server.aof_child_pid == -1 &&
!(flags & LOOKUP_NOTOUCH))
{
if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
unsigned long ldt = val->lru >> 8;
unsigned long counter = LFULogIncr(val->lru & 255);
val->lru = (ldt << 8) | counter;
} else {
val->lru = LRU_CLOCK();
}
}
return val;
} else {
return NULL;
}
}
接下來,我們在來闡發(fā),key的lru淘汰策略如何實(shí)現(xiàn),分別有哪幾種:
# MAXMEMORY POLICY: how Redis will select what to remove when maxmemory
# is reached. You can select among five behaviors:
#
# volatile-lru -> Evict using approximated LRU among the keys with an expire set. //在設(shè)置了過期光陰的key中,使用近似的lru淘汰策略
# allkeys-lru -> Evict any key using approximated LRU. //所有的key均使用近似的lru淘汰策略
# volatile-lfu -> Evict using approximated LFU among the keys with an expire set. //在設(shè)置了過期光陰的key中,使用lfu淘汰策略
# allkeys-lfu -> Evict any key using approximated LFU. //所有的key均使用lfu淘汰策略
# volatile-random -> Remove a random key among the ones with an expire set. //在設(shè)置了過期光陰的key中,使用隨機(jī)淘汰策略
# allkeys-random -> Remove a random key, any key. //所有的key均使用隨機(jī)淘汰策略
# volatile-ttl -> Remove the key with the nearest expire time (minor TTL) //使用ttl淘汰策略
# noeviction -> Don't evict anything, just return an error on write operations . //不允許淘汰,在寫操作發(fā)生,但內(nèi)存不夠時(shí),將會返回差錯
#
# LRU means Least Recently Used
# LFU means Least Frequently Used
#
# Both LRU, LFU and volatile-ttl are implemented using approximated
# randomized algorithms.
這里暫不討論LFU,TTL淘汰算法和noeviction的情況,僅僅討論lru所有場景下的,淘汰策略具體實(shí)現(xiàn).(LFU和TTL將在下一篇文章中詳細(xì)闡發(fā)).
LRU淘汰的場景:
1.主動淘汰.
1.1 通過定時(shí)任務(wù)serverCron按期的清理過期的key.
2.被動淘汰
2.1 每次寫入key時(shí),發(fā)現(xiàn)內(nèi)存不夠,調(diào)用activeExpireCycle釋放一部門內(nèi)存.
2.2 每次拜訪相關(guān)的key,如果發(fā)現(xiàn)key過期,直接釋放掉該key相關(guān)的內(nèi)存.
首先我們來闡發(fā)LRU主動淘汰的場景:
serverCron每間隔1000/hz ms會調(diào)用databasesCron辦法來檢測并淘汰過期的key.
void databasesCron(void){
/* Expire keys by random sampling. Not required for slaves
* as master will synthesize DELs for us. */
if (server.active_expire_enabled && server.masterhost == NULL)
activeExpireCycle(ACTIVE_EXPIRE_CYCLE_SLOW);
…..
}
主動淘汰是通過activeExpireCycle 來實(shí)現(xiàn)的,這部門的邏輯如下:
遍歷至多16個DB .【由宏CRON_DBS_PER_CALL界說,默認(rèn)為16】
隨機(jī)挑選20個帶過期時(shí)間的key.【由宏ACTIVE_EXPIRE_CYCLE_LOOKUPS_PER_LOOP界說,默認(rèn)20】
如果key過時(shí),則將key相關(guān)的內(nèi)存釋放,或者放入失效隊(duì)列.
如果操作時(shí)間超過允許的限定時(shí)間,至多25ms.(timelimit = 1000000*ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC/server.hz/100,,ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC=25,server.hz默認(rèn)為10), 則此次淘汰操作結(jié)束返回,不然進(jìn)入5.
如果該DB下,有超過5個key (ACTIVE_EXPIRE_CYCLE_LOOKUPS_PER_LOOP/4=5) 實(shí)際失效,則進(jìn)入 2,不然選擇下一個DB,再次進(jìn)入2.
遍歷完成,結(jié)束.
流程圖如下
注:(圖中年夜于等于%5的可以是實(shí)際過期的,應(yīng)改為年夜于等于%25的key是實(shí)際過期的.iteration++是在遍歷20個key的時(shí)候,每次加1).
被動淘汰 - 內(nèi)存不夠,挪用activeExpireCycle釋放
該步調(diào)的實(shí)現(xiàn)方式如下:
processCommand 函數(shù)關(guān)于內(nèi)存淘汰策略的邏輯:
/* Handle the maxmemory directive.
*
* First we try to free some memory if possible (if there are volatile
* keys in the dataset). If there are not the only thing we can do
* is returning an error. */
if (server.maxmemory) {
int retval = freeMemoryIfNeeded();
/* freeMemoryIfNeeded may flush slave output buffers. This may result
* into a slave, that may be the active client, to be freed. */
if (server.current_client == NULL) return C_ERR;
/* It was impossible to free enough memory, and the command the client
* is trying to execute is denied during OOM conditions? Error. */
if ((c->cmd->flags & CMD_DENYOOM) && retval == C_ERR) {
flagTransaction(c);
addReply(c, shared.oomerr);
return C_OK;
}
}
每次執(zhí)行命令前,都會調(diào)用freeMemoryIfNeeded來檢查內(nèi)存的情況,并釋放相應(yīng)的內(nèi)存,如果釋放后,內(nèi)存仍然不夠,直接向哀求的客戶端返回OOM.
具體的步調(diào)如下:
獲取redis server當(dāng)前已經(jīng)使用的內(nèi)存mem_reported.
如果mem_reported < server.maxmemory ,則返回ok.不然mem_used=mem_reported,進(jìn)入步驟3.
遍歷該redis的所slaves,mem_used減去所有slave占用的ClientOutputBuffer.
如果配置了AOF,mem_used減去AOF占用的空間.sdslen(server.aof_buf)+aofRewriteBufferSize().
如果mem_used < server.maxmemory,返回ok.不然進(jìn)入步驟6.
如果內(nèi)存策略配置為noeviction,返回錯誤.不然進(jìn)入7.
如果是LRU策略,如果是VOLATILE的LRU,則每次從可失效的數(shù)據(jù)集中,每次隨機(jī)采樣maxmemory_samples(默認(rèn)為5)個key,從中選取idletime最大的key進(jìn)行淘汰.不然,如果是ALLKEYS_LRU則從全局?jǐn)?shù)據(jù)中進(jìn)行采樣,每次隨機(jī)采樣maxmemory_samples(默認(rèn)為5)個key,并從中選擇idletime最大的key進(jìn)行淘汰.
如果釋放內(nèi)存之后,還是跨越了server.maxmemory,則繼續(xù)淘汰,只到釋放后剩下的內(nèi)存小于server.maxmemory為止.
被動淘汰 - 每次拜訪相關(guān)的key,如果發(fā)現(xiàn)key過期,直接釋放掉該key相關(guān)的內(nèi)存:
每次拜訪key,都會調(diào)用expireIfNeeded來判斷key是否過期,如果過期,則釋放掉,并返回null,否則返回key的值.
總結(jié)
1.redis做為緩存,經(jīng)常采納LRU的策略來淘汰數(shù)據(jù),所以如果同時(shí)過期的數(shù)據(jù)太多,就會導(dǎo)致redis發(fā)起主動檢測時(shí)耗費(fèi)的時(shí)間過長(最大為250ms),從而導(dǎo)致最大應(yīng)用超時(shí) >= 250ms.
timelimit = 1000000*ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC/server.hz/100
ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC=25
server.hz>=1(默認(rèn)為10)
timelimit <= 250ms
2.內(nèi)存使用率過高,則會導(dǎo)致內(nèi)存不夠,從而發(fā)起被動淘汰策略,從而使應(yīng)用拜訪超時(shí).
3.合理的調(diào)整hz參數(shù),從而控制每次主動淘汰的頻率,從而有效的緩解過時(shí)的key數(shù)量太多帶來的上述超時(shí)問題.
歡迎參與《Redis原理分析丨Redis 作為 LRU 緩存的實(shí)現(xiàn)原理》討論,分享您的想法,維易PHP學(xué)院為您提供專業(yè)教程。
轉(zhuǎn)載請注明本頁網(wǎng)址:
http://www.snjht.com/jiaocheng/9255.html