Data Types: String
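Before discussing the String data type, we first introduce a new object, RedisObject, which acts as the bridge between the user-facing data types and the underlying data structures.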
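typedef struct redisObject {
    unsigned type:4;        // user-facing data type (String/List/Hash/Set/ZSet, etc.), 4 bits
    unsigned encoding:4;    // encoding used by the data type (SDS/ziplist/intset/hashtable/skiplist, etc.), 4 bits
    unsigned lru:LRU_BITS;  // LRU time of the redisObject; LRU_BITS is 24 bits
    int refcount;           // reference count of the redisObject, 4 bytes
    void *ptr;              // pointer to the underlying data, 8 bytes
} robj;
// 16 bytes in total on a 64-bit system: 4 (bit fields) + 4 (refcount) + 8 (ptr)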
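The 16-byte figure can be checked with a small sketch. The struct below is a local copy named `demo_robj`, and the helper functions are hypothetical names introduced only for this illustration; they are not part of Redis. The sketch assumes a typical compiler where `unsigned` is 32 bits, so the three bit fields share one 4-byte unit.

```c
#include <assert.h>
#include <stddef.h>

#define LRU_BITS 24 /* same width as in the struct above */

/* Local copy of the layout for illustration only (not the Redis header). */
typedef struct demo_robj {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:LRU_BITS;
    int refcount;
    void *ptr;
} demo_robj;

/* Size of the object header as laid out by this compiler. */
size_t demo_robj_size(void) {
    /* The three bit fields add up to exactly 32 bits. */
    assert(4 + 4 + LRU_BITS == 32);
    return sizeof(demo_robj);
}

/* Expected size: 4 bytes of bit fields + refcount + pointer.
 * On a 64-bit system this is 4 + 4 + 8 = 16. */
size_t demo_robj_expected_size(void) {
    return 4 + sizeof(int) + sizeof(void *);
}
```

Comparing `demo_robj_size()` with `demo_robj_expected_size()` shows that no padding is added between the fields, which is why the header costs exactly 16 bytes when pointers are 8 bytes wide.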
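Advantages:
- It provides a uniform representation for all data types: everything is an object.
- A single data type can map to different underlying implementations, which saves memory, while callers above see one uniform object and the implementation details stay hidden.
- It supports object sharing and reference counting: a shared object is stored once and can be used many times, which saves memory.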
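The reference-counting idea from the last point can be sketched as follows. This is a minimal illustration under my own assumptions, not the Redis implementation: `demo_obj`, `demo_create`, `demo_incr_ref`, `demo_decr_ref`, and `demo_live_count` are hypothetical names, and the live-object counter exists only to make the behavior observable.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy object with a reference count (hypothetical, for illustration). */
typedef struct demo_obj {
    int refcount;
    long value;
} demo_obj;

static int demo_live_objects = 0; /* objects allocated and not yet freed */

/* Create an object owned by one holder (refcount starts at 1). */
demo_obj *demo_create(long value) {
    demo_obj *o = malloc(sizeof(*o));
    if (o == NULL) return NULL;
    o->refcount = 1;
    o->value = value;
    demo_live_objects++;
    return o;
}

/* A new holder starts using the object. */
void demo_incr_ref(demo_obj *o) {
    o->refcount++;
}

/* A holder is done; the last one to leave frees the memory. */
void demo_decr_ref(demo_obj *o) {
    assert(o->refcount > 0);
    if (--o->refcount == 0) {
        free(o);
        demo_live_objects--;
    }
}

int demo_live_count(void) {
    return demo_live_objects;
}
```

Two holders that share one object call `demo_incr_ref` once and `demo_decr_ref` twice in total; the memory is released only on the second release, so the value is stored once no matter how many holders use it.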
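External use cases of String
- Caching: session tokens, verification codes, serialized objects
- Simple leaderboards and counters
- Distributed locks
Internal use cases of String
- Inside Redis, apart from string literals, the vast majority of modifiable string values are represented with string.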
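Common String commands
Shared String objects
When the value is an integer in the range 0 to 9999 (that is, below OBJ_SHARED_INTEGERS, which is 10000) and the maxmemory policy permits it, Redis uses its default shared String object directly (common command replies also have shared objects). This improves efficiency and saves memory, somewhat like pre-warming hot data.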
// server.c: the default shared objects are created at startup
void createSharedObjects(void) {
    //...
    for (j = 0; j < OBJ_SHARED_INTEGERS; j++) {
        shared.integers[j] =
            makeObjectShared(createObject(OBJ_STRING,(void*)(long)j));
        shared.integers[j]->encoding = OBJ_ENCODING_INT;
    }
}
/* Create a string object from a long long value. When possible returns a
 * shared integer object, or at least an integer encoded one.
 *
 * If valueobj is non zero, the function avoids returning a shared
 * integer, because the object is going to be used as value in the Redis key
 * space (for instance when the INCR command is used), so we want LFU/LRU
 * values specific for each key. */
robj *createStringObjectFromLongLongWithOptions(long long value, int valueobj) {
    robj *o;
    if (server.maxmemory == 0 ||
        !(server.maxmemory_policy & MAXMEMORY_FLAG_NO_SHARED_INTEGERS))
    {
        /* If the maxmemory policy permits, we can still return shared integers
         * even if valueobj is true. */
        valueobj = 0;
    }
    // Within the range covered by the shared objects, use the shared object
    if (value >= 0 && value < OBJ_SHARED_INTEGERS && valueobj == 0) {
        incrRefCount(shared.integers[value]);
        o = shared.integers[value];
    } else {
        if (value >= LONG_MIN && value <= LONG_MAX) {
            o = createObject(OBJ_STRING, NULL);
            o->encoding = OBJ_ENCODING_INT;
            o->ptr = (void*)((long)value);
        } else {
            o = createObject(OBJ_STRING,sdsfromlonglong(value));
        }
    }
    return o;
}
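The lookup logic above can be condensed into a standalone sketch. It is a simplification under stated assumptions: all `demo_*` names are hypothetical, the maxmemory policy check is left out, and a plain `shared` flag stands in for whatever `makeObjectShared` does in Redis.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_SHARED_INTEGERS 10000 /* mirrors OBJ_SHARED_INTEGERS */

typedef struct demo_int_obj {
    int refcount;
    int shared; /* 1 if the object lives in the preallocated pool */
    long value;
} demo_int_obj;

static demo_int_obj demo_shared[DEMO_SHARED_INTEGERS];
static int demo_shared_ready = 0;

/* Fill the pool once, as createSharedObjects does at startup. */
static void demo_init_shared(void) {
    for (long j = 0; j < DEMO_SHARED_INTEGERS; j++) {
        demo_shared[j].refcount = 1; /* the pool itself holds one reference */
        demo_shared[j].shared = 1;
        demo_shared[j].value = j;
    }
    demo_shared_ready = 1;
}

/* Return the pooled object for 0..9999, otherwise allocate a private one. */
demo_int_obj *demo_int_from_long(long value) {
    if (!demo_shared_ready) demo_init_shared();
    if (value >= 0 && value < DEMO_SHARED_INTEGERS) {
        demo_shared[value].refcount++;
        return &demo_shared[value];
    }
    demo_int_obj *o = malloc(sizeof(*o));
    if (o == NULL) return NULL;
    o->refcount = 1;
    o->shared = 0;
    o->value = value;
    return o;
}

/* Drop one reference; only private objects are ever freed. */
void demo_int_release(demo_int_obj *o) {
    assert(o->refcount > 0);
    o->refcount--;
    if (o->refcount == 0 && !o->shared) free(o);
}
```

Asking twice for the value 5 returns the same pointer, while asking twice for 10000 returns two separate allocations, which is the boundary the shared pool draws.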
The three encodings of String
/* Try to encode a string object in order to save space */
robj *tryObjectEncoding(robj *o) {
    long value;
    sds s = o->ptr;
    size_t len;

    /* Make sure this is a string object, the only type we encode
     * in this function. Other types use encoded memory efficient
     * representations but are handled by the commands implementing
     * the type. */
    serverAssertWithInfo(NULL,o,o->type == OBJ_STRING);

    /* We try some specialized encoding only for objects that are
     * RAW or EMBSTR encoded, in other words objects that are still
     * in represented by an actually array of chars. */
    if (!sdsEncodedObject(o)) return o;

    /* It's not safe to encode shared objects: shared objects can be shared
     * everywhere in the "object space" of Redis and may end in places where
     * they are not handled. We handle them only as values in the keyspace. */
    if (o->refcount > 1) return o;

    /* Check if we can represent this string as a long integer.
     * Note that we are sure that a string larger than 20 chars is not
     * representable as a 32 nor 64 bit integer. */
    len = sdslen(s);
    if (len <= 20 && string2l(s,len,&value)) {
        /* This object is encodable as a long. Try to use a shared object.
         * Note that we avoid using shared integers when maxmemory is used
         * because every object needs to have a private LRU field for the LRU
         * algorithm to work well. */
        if ((server.maxmemory == 0 ||
            !(server.maxmemory_policy & MAXMEMORY_FLAG_NO_SHARED_INTEGERS)) &&
            value >= 0 &&
            value < OBJ_SHARED_INTEGERS)
        {
            decrRefCount(o);
            incrRefCount(shared.integers[value]);
            return shared.integers[value];
        } else {
            if (o->encoding == OBJ_ENCODING_RAW) {
                sdsfree(o->ptr);
                o->encoding = OBJ_ENCODING_INT;
                o->ptr = (void*) value;
                return o;
            } else if (o->encoding == OBJ_ENCODING_EMBSTR) {
                decrRefCount(o);
                return createStringObjectFromLongLongForValue(value);
            }
        }
    }

    /* If the string is small and is still RAW encoded,
     * try the EMBSTR encoding which is more efficient.
     * In this representation the object and the SDS string are allocated
     * in the same chunk of memory to save space and cache misses. */
    if (len <= OBJ_ENCODING_EMBSTR_SIZE_LIMIT) {
        robj *emb;

        if (o->encoding == OBJ_ENCODING_EMBSTR) return o;
        emb = createEmbeddedStringObject(s,sdslen(s));
        decrRefCount(o);
        return emb;
    }

    /* We can't encode the object...
     *
     * Do the last try, and at least optimize the SDS string inside
     * the string object to require little space, in case there
     * is more than 10% of free space at the end of the SDS string.
     *
     * We do that only for relatively large strings as this branch
     * is only entered if the length of the string is greater than
     * OBJ_ENCODING_EMBSTR_SIZE_LIMIT. */
    trimStringObjectIfNeeded(o);

    /* Return the original object. */
    return o;
}
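Encoding | Conversion rule |
---|---|
int | Length <= 20 and the value fits in a long (on 64-bit systems, from -2^63 to 2^63-1 = 9223372036854775807) |
embstr | Length <= 44 (and not encodable as int) |
raw | Everything else |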
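The rules in the table can be expressed as a small classifier. This is a sketch under my own assumptions rather than the Redis code path: `demo_classify` and `demo_parse_long` are hypothetical names, and the parser is a simplified stand-in for `string2l` that accepts only a strict decimal form (optional minus sign, no leading zeros, no whitespace).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define DEMO_EMBSTR_LIMIT 44 /* mirrors OBJ_ENCODING_EMBSTR_SIZE_LIMIT */

typedef enum { DEMO_ENC_INT, DEMO_ENC_EMBSTR, DEMO_ENC_RAW } demo_encoding;

/* Strict decimal parse; returns 1 and stores the value if s fits in a long. */
static int demo_parse_long(const char *s, size_t len, long *out) {
    size_t i = 0;
    int negative = 0;
    unsigned long acc = 0;
    unsigned long limit;

    if (len == 0) return 0;
    if (len == 1 && s[0] == '0') { *out = 0; return 1; }
    if (s[0] == '-') { negative = 1; i = 1; }
    if (i == len) return 0;                  /* a lone "-" is not a number */
    if (s[i] < '1' || s[i] > '9') return 0;  /* rejects leading zeros */

    /* Largest magnitude allowed: LONG_MAX, or LONG_MAX + 1 when negative. */
    limit = negative ? (unsigned long)LONG_MAX + 1UL : (unsigned long)LONG_MAX;
    for (; i < len; i++) {
        unsigned long digit;
        if (s[i] < '0' || s[i] > '9') return 0;
        digit = (unsigned long)(s[i] - '0');
        if (acc > (limit - digit) / 10) return 0; /* would exceed limit */
        acc = acc * 10 + digit;
    }
    if (negative) {
        *out = (acc == (unsigned long)LONG_MAX + 1UL) ? LONG_MIN : -(long)acc;
    } else {
        *out = (long)acc;
    }
    return 1;
}

/* Pick the encoding in the same order as the table: int, embstr, raw. */
demo_encoding demo_classify(const char *s) {
    size_t len;
    long value;

    assert(s != NULL);
    len = strlen(s);
    if (len <= 20 && demo_parse_long(s, len, &value)) return DEMO_ENC_INT;
    if (len <= DEMO_EMBSTR_LIMIT) return DEMO_ENC_EMBSTR;
    return DEMO_ENC_RAW;
}
```

For example, `demo_classify("12345")` yields the int encoding, `demo_classify("007")` falls through to embstr because of the leading zeros, and a 45-character string is classified as raw.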
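Notes
As usual, let's work out the cost (16 is the fixed overhead of the redisObject):
Encoding | Cost (bytes) | Remarks |
---|---|---|
int | 16 | Redis performs a single memory allocation. |
embstr | 16+3+1+len | Redis performs a single memory allocation, and the redisObject and the SDS are contiguous. The embstr/raw threshold is 44 = 64 - 20, where 20 = 16 (redisObject) + 3 (SDS header) + 1 (terminating null byte). |
raw | 16+sdsheader+len+1 | Redis performs two memory allocations, and the redisObject and the SDS are stored separately. |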
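The formulas in the table can be turned into a small calculator. This is a rough sketch: the `demo_*` names are hypothetical, the SDS header sizes for longer strings (3, 5, 9, and 17 bytes by length class) are my assumption about the packed SDS headers and do not come from the table, and rounding by the memory allocator is ignored.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_ROBJ_SIZE 16    /* fixed redisObject overhead from the table */
#define DEMO_EMBSTR_LIMIT 44 /* embstr/raw threshold */

/* Assumed SDS header size for a string of the given length. */
size_t demo_sds_header_size(size_t len) {
    if (len < (1u << 8)) return 3;                         /* 8-bit lengths */
    if (len < (1u << 16)) return 5;                        /* 16-bit lengths */
    if ((unsigned long long)len < (1ULL << 32)) return 9;  /* 32-bit lengths */
    return 17;                                             /* 64-bit lengths */
}

/* int: the value lives in the pointer field, so only the object is paid for. */
size_t demo_int_cost(void) {
    return DEMO_ROBJ_SIZE;
}

/* embstr: object + 3-byte header + data + terminating null, one allocation. */
size_t demo_embstr_cost(size_t len) {
    assert(len <= DEMO_EMBSTR_LIMIT);
    return DEMO_ROBJ_SIZE + 3 + 1 + len;
}

/* raw: object plus a separately allocated SDS (header + data + null). */
size_t demo_raw_cost(size_t len) {
    return DEMO_ROBJ_SIZE + demo_sds_header_size(len) + len + 1;
}
```

With these formulas `demo_embstr_cost(44)` is exactly 64, which matches the 44 = 64 - 20 threshold, while a 45-byte raw string costs 65 bytes spread over two allocations.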