SharedPreferences + MMAP + MMKV: How Data Persistence Works

June 3, 2022

Android

SharedPreferences

Introduction

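SharedPreferences (SP below) is one of the data persistence mechanisms Android provides. It suits storing and reading small amounts of data within a single process.
Because SharedPreferences is backed by an XML file and all persisted data is loaded in one go, SP is a poor fit for large amounts of data.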

The data lives in an XML file stored under /data/data//shared_prefs/

Usage

SharedPreferences setting = getSharedPreferences("hello", MODE_PRIVATE);
// Put setting into edit mode
SharedPreferences.Editor editor = setting.edit();
// Store a value
editor.putString("name", "jacky");
// Commit synchronously
editor.commit();
// Or write asynchronously: editor.apply();
// Read the value back
String name = setting.getString("name", "0");

Source code analysis

Obtaining an instance

How do we get a SharedPreferences object? getSharedPreferences is defined in ContextWrapper, so it can be called directly inside an Activity.

@Override
public SharedPreferences getSharedPreferences(String name, int mode) {
    return mBase.getSharedPreferences(name, mode);
}

The actual implementation is getSharedPreferences in ContextImpl:

@Override
public SharedPreferences getSharedPreferences(File file, int mode) {
    SharedPreferencesImpl sp;
    synchronized (ContextImpl.class) {
        final ArrayMap<File, SharedPreferencesImpl> cache = getSharedPreferencesCacheLocked();
        sp = cache.get(file);
        if (sp == null) {
            sp = new SharedPreferencesImpl(file, mode);
            cache.put(file, sp);
            return sp;
        }
    }
    return sp;
}
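
The caching idea in the listing above (one SharedPreferencesImpl per backing file, created on first use under a class-level lock) can be sketched in plain Java. This is only an illustration: PrefsCacheSketch and Prefs are hypothetical names, not Android classes.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

// Hypothetical names throughout; this is not the Android implementation.
final class PrefsCacheSketch {
    // Stand-in for SharedPreferencesImpl: one object per backing file.
    static final class Prefs {
        final File file;

        Prefs(File file) {
            this.file = file;
        }
    }

    private static final Map<File, Prefs> CACHE = new HashMap<>();

    // Returns the cached instance for the file, creating it on first use.
    static Prefs get(File file) {
        synchronized (PrefsCacheSketch.class) {
            return CACHE.computeIfAbsent(file, Prefs::new);
        }
    }
}
```

Two lookups with equal File paths return the same object, which is why the in-memory Map stays consistent for every caller inside one process.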

Initialization

Initialization: the file is read through File, the data is loaded, and the contents are parsed with XmlUtils.
SharedPreferencesImpl -> startLoadFromDisk -> loadFromDisk ->

private void loadFromDisk() {
    try {
        stat = Os.stat(mFile.getPath());
        if (mFile.canRead()) {
            BufferedInputStream str = null;
            try {
                str = new BufferedInputStream(
                        new FileInputStream(mFile), 16 * 1024);
                map = (Map<String, Object>) XmlUtils.readMapXml(str);
            } catch (Exception e) {
                Log.w(TAG, "Cannot read " + mFile.getAbsolutePath(), e);
            } finally {
                IoUtils.closeQuietly(str);
            }
        }
    }catch (Throwable t) {
        thrown = t;
    }
}

commit

commit -> enqueueDiskWrite -> writeToFile

public boolean commit() {
    long startTime = 0;

    if (DEBUG) {
        startTime = System.currentTimeMillis();
    }

    MemoryCommitResult mcr = commitToMemory();

    SharedPreferencesImpl.this.enqueueDiskWrite(
        mcr, null /* sync write on this thread okay */);
    try {
         // This await is what can block the main thread: with several commits in flight, each caller has to wait here.
        mcr.writtenToDiskLatch.await();
    } catch (InterruptedException e) {
        return false;
    } finally {
        if (DEBUG) {
            Log.d(TAG, mFile.getName() + ":" + mcr.memoryStateGeneration
                    + " committed after " + (System.currentTimeMillis() - startTime)
                    + " ms");
        }
    }
    notifyListeners(mcr);
    return mcr.writeToDiskResult;
}

private void enqueueDiskWrite(final MemoryCommitResult mcr,
                                final Runnable postWriteRunnable) {
    final boolean isFromSyncCommit = (postWriteRunnable == null);

    final Runnable writeToDiskRunnable = new Runnable() {
            @Override
            public void run() {
                synchronized (mWritingToDiskLock) {
                    writeToFile(mcr, isFromSyncCommit);
                }
                synchronized (mLock) {
                    mDiskWritesInFlight--;
                }
                if (postWriteRunnable != null) {
                    postWriteRunnable.run();
                }
            }
        };

    // true means this came from commit(): write on the current thread. Note the lock here.
    if (isFromSyncCommit) {
        boolean wasEmpty = false;
        synchronized (mLock) {
            wasEmpty = mDiskWritesInFlight == 1;
        }
        if (wasEmpty) {
            writeToDiskRunnable.run();
            return;
        }
    }
    // apply(): enqueue the write
    QueuedWork.queue(writeToDiskRunnable, !isFromSyncCommit);
}

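commit first runs the write task through enqueueDiskWrite and then parks the calling thread; once the write has completed, the thread that called commit is woken up. If that thread is the main thread and the write takes a long time, the main thread is blocked for that whole time.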
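
The blocking pattern can be reproduced outside Android with a CountDownLatch. The sketch below is a simplification under stated assumptions: BlockingCommitSketch is a hypothetical class, the "disk write" is just a sleep on a worker thread, and the real commit often performs the write on the calling thread itself, which blocks the caller just the same.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for the commit() path; not the Android implementation.
final class BlockingCommitSketch {
    // Simulates one disk write of writeMillis on a worker thread and returns
    // how long the calling thread was blocked, in milliseconds.
    static long commitAndMeasure(long writeMillis) {
        ExecutorService diskWriter = Executors.newSingleThreadExecutor();
        CountDownLatch writtenToDisk = new CountDownLatch(1);
        long start = System.nanoTime();
        diskWriter.execute(() -> {
            try {
                Thread.sleep(writeMillis); // pretend to write and fsync the XML file
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                writtenToDisk.countDown(); // wake up the committing thread
            }
        });
        try {
            writtenToDisk.await(); // the caller is parked here, like commit()
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            diskWriter.shutdown();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }
}
```

Calling commitAndMeasure(100) from a UI thread would freeze that thread for roughly the duration of the simulated write.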

writeToFile

The file is written with plain blocking I/O:

private void writeToFile(MemoryCommitResult mcr, boolean isFromSyncCommit) {
    long startTime = 0;
    long existsTime = 0;
    long backupExistsTime = 0;
    long outputStreamCreateTime = 0;
    long writeTime = 0;
    long fsyncTime = 0;
    long setPermTime = 0;
    long fstatTime = 0;
    long deleteTime = 0;
    try {
        FileOutputStream str = createFileOutputStream(mFile);

        if (DEBUG) {
            outputStreamCreateTime = System.currentTimeMillis();
        }

        if (str == null) {
            mcr.setDiskWriteResult(false, false);
            return;
        }
        XmlUtils.writeMapXml(mcr.mapToWriteToDisk, str);

        writeTime = System.currentTimeMillis();

        FileUtils.sync(str);

        fsyncTime = System.currentTimeMillis();

        str.close();
        ContextImpl.setFilePermissionsFromMode(mFile.getPath(), mMode, 0);
        return;
    } catch (XmlPullParserException e) {
        Log.w(TAG, "writeToFile: Got exception:", e);
    } catch (IOException e) {
        Log.w(TAG, "writeToFile: Got exception:", e);
    }
}

apply

apply -> asynchronous, delayed write

public void apply() {
    final long startTime = System.currentTimeMillis();

    final MemoryCommitResult mcr = commitToMemory();
    final Runnable awaitCommit = new Runnable() {
            @Override
            public void run() {
                try {
                    mcr.writtenToDiskLatch.await();
                } catch (InterruptedException ignored) {
                }

                if (DEBUG && mcr.wasWritten) {
                    Log.d(TAG, mFile.getName() + ":" + mcr.memoryStateGeneration
                            + " applied after " + (System.currentTimeMillis() - startTime)
                            + " ms");
                }
            }
        };
    // awaitCommit is registered with QueuedWork, i.e. added to LinkedList<Runnable> sFinishers = new LinkedList<>();
    QueuedWork.addFinisher(awaitCommit);

    Runnable postWriteRunnable = new Runnable() {
            @Override
            public void run() {
                awaitCommit.run();
                QueuedWork.removeFinisher(awaitCommit);
            }
        };

    SharedPreferencesImpl.this.enqueueDiskWrite(mcr, postWriteRunnable);

    // Okay to notify the listeners before it's hit disk
    // because the listeners should always get the same
    // SharedPreferences instance back, which has the
    // changes reflected in memory.
    notifyListeners(mcr);
}

The work is enqueued and a handler is created; the items in sWork are eventually executed one after another on the queued-work-looper thread.

 public static void queue(Runnable work, boolean shouldDelay) {
    Handler handler = getHandler();

    synchronized (sLock) {
        sWork.add(work);

        if (shouldDelay && sCanDelay) {
            handler.sendEmptyMessageDelayed(QueuedWorkHandler.MSG_RUN, DELAY);
        } else {
            handler.sendEmptyMessage(QueuedWorkHandler.MSG_RUN);
        }
    }
}

Eventually processPendingWork runs, which is simply a for loop that executes each item in turn:

private static void processPendingWork() {
    long startTime = 0;

    if (DEBUG) {
        startTime = System.currentTimeMillis();
    }

    synchronized (sProcessingWork) {
        LinkedList<Runnable> work;

        synchronized (sLock) {
            work = (LinkedList<Runnable>) sWork.clone();
            sWork.clear();

            // Remove all msg-s as all work will be processed now
            getHandler().removeMessages(QueuedWorkHandler.MSG_RUN);
        }

        if (work.size() > 0) {
            for (Runnable w : work) {
                w.run();
            }
        }
    }
}

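The write performed by apply also runs on a background thread, so the apply call itself does not stall the main thread. However, when ActivityThread executes handleStopActivity, handleServiceArgs, handlePauseActivity and similar methods, it calls QueuedWork.waitToFinish(), which blocks the main thread until the pending asynchronous work has finished. If that work takes too long, the UI janks.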

public static void waitToFinish() {
       long startTime = System.currentTimeMillis();
       boolean hadMessages = false;

       Handler handler = getHandler();

       synchronized (sLock) {
           if (handler.hasMessages(QueuedWorkHandler.MSG_RUN)) {
               // Delayed work will be processed at processPendingWork() below
               handler.removeMessages(QueuedWorkHandler.MSG_RUN);

               if (DEBUG) {
                   hadMessages = true;
                   Log.d(LOG_TAG, "waiting");
               }
           }

           // We should not delay any work as this might delay the finishers
           sCanDelay = false;
       }

       StrictMode.ThreadPolicy oldPolicy = StrictMode.allowThreadDiskWrites();
       try {
           processPendingWork();
       } finally {
           StrictMode.setThreadPolicy(oldPolicy);
       }

       try {
           while (true) {
               Runnable finisher;

               synchronized (sLock) {
                   finisher = sFinishers.poll();
               }

               if (finisher == null) {
                   break;
               }

               finisher.run();
           }
       } finally {
           sCanDelay = true;
       }

       synchronized (sLock) {
           long waitTime = System.currentTimeMillis() - startTime;

           if (waitTime > 0 || hadMessages) {
               mWaitTimes.add(Long.valueOf(waitTime).intValue());
               mNumWaits++;

               if (DEBUG || mNumWaits % 1024 == 0 || waitTime > MAX_WAIT_TIME_MILLIS) {
                   mWaitTimes.log(LOG_TAG, "waited: ");
               }
           }
       }
   }

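waitToFinish takes each entry out of the sFinishers queue and calls its run method. Recall that apply registered QueuedWork.addFinisher(awaitCommit), so awaitCommit gets executed here. The code inside awaitCommit is blocking: it waits until the write has completed before the thread can continue. As long as the write started by apply has not finished, the main thread stays stuck, which is exactly the problem described above.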

 final Runnable awaitCommit = new Runnable() {
    @Override
    public void run() {
        try {
            mcr.writtenToDiskLatch.await();
        } catch (InterruptedException ignored) {
        }

        if (DEBUG && mcr.wasWritten) {
            Log.d(TAG, mFile.getName() + ":" + mcr.memoryStateGeneration
                    + " applied after " + (System.currentTimeMillis() - startTime)
                    + " ms");
        }
    }
};
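
The interplay between apply and waitToFinish can be imitated in plain Java. This is a rough sketch, not the framework code: FinisherSketch and its methods are hypothetical, the write is a sleep on a background thread, and the real waitToFinish additionally runs pending writes on the calling thread before draining the finishers.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;

// Hypothetical imitation of apply() plus QueuedWork.waitToFinish().
final class FinisherSketch {
    private static final ConcurrentLinkedQueue<Runnable> FINISHERS =
            new ConcurrentLinkedQueue<>();

    // Like apply(): registers a finisher, starts the write in the background
    // and returns immediately.
    static void applyAsync(long writeMillis) {
        CountDownLatch written = new CountDownLatch(1);
        Runnable awaitCommit = () -> {
            try {
                written.await(); // blocks whoever runs this finisher
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        FINISHERS.add(awaitCommit);
        Thread writer = new Thread(() -> {
            try {
                Thread.sleep(writeMillis); // pretend to write the file
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            written.countDown();
            FINISHERS.remove(awaitCommit);
        });
        writer.setDaemon(true);
        writer.start();
    }

    // Like waitToFinish(): runs every remaining finisher on the calling thread
    // and returns how long that took, in milliseconds.
    static long waitToFinishMillis() {
        long start = System.nanoTime();
        Runnable finisher;
        while ((finisher = FINISHERS.poll()) != null) {
            finisher.run();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }
}
```

applyAsync returns almost instantly, but a waitToFinishMillis call made shortly afterwards has to sit out the rest of the write, which mirrors the lifecycle-time jank described above.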

Reading data

public int getInt(String key, int defValue) {
    synchronized (mLock) {
        awaitLoadedLocked();
        Integer v = (Integer)mMap.get(key);
        return v != null ? v : defValue;
    }
}

The key is awaitLoadedLocked: while the data has not finished loading, the calling thread waits, i.e. it is blocked.

private void awaitLoadedLocked() {
    if (!mLoaded) {
        // Raise an explicit StrictMode onReadFromDisk for this
        // thread, since the real read will be in a different
        // thread and otherwise ignored by StrictMode.
        BlockGuard.getThreadPolicy().onReadFromDisk();
    }
    while (!mLoaded) {
        try {
            mLock.wait();
        } catch (InterruptedException unused) {
        }
    }
    if (mThrowable != null) {
        throw new IllegalStateException(mThrowable);
    }
}

So reading data can block as well.
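
The same wait/notify guard can be tried in isolation. In this sketch LazyLoadedPrefsSketch is a hypothetical class, the "load from disk" is a sleep on a background thread, and the single stored entry ("answer" = 42) is made up for the demonstration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the awaitLoadedLocked() pattern.
final class LazyLoadedPrefsSketch {
    private final Object lock = new Object();
    private final Map<String, Integer> map = new HashMap<>();
    private boolean loaded = false;

    // Starts a background "load" that takes loadMillis.
    LazyLoadedPrefsSketch(long loadMillis) {
        Thread loader = new Thread(() -> {
            try {
                Thread.sleep(loadMillis); // pretend to read and parse the file
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            synchronized (lock) {
                map.put("answer", 42);
                loaded = true;
                lock.notifyAll(); // release every waiting reader
            }
        });
        loader.setDaemon(true);
        loader.start();
    }

    // Blocks until loading has finished, then reads from the in-memory map.
    int getInt(String key, int defValue) {
        synchronized (lock) {
            while (!loaded) {
                try {
                    lock.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return defValue;
                }
            }
            Integer v = map.get(key);
            return v != null ? v : defValue;
        }
    }
}
```

The first getInt call after construction waits for the loader; later calls return straight from memory.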

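Source code summary

As shown above, both methods first call commitToMemory to apply the changes in memory; in that respect they are identical. Both then call enqueueDiskWrite to persist the data, but commit usually writes the file directly on the current thread, whereas apply submits the write to a queue for delayed execution and returns immediately.
(The code here is from android-28; other versions may use a thread pool rather than a queue.)

Summary

Data is stored in XML format.

During initialization a worker thread reads the whole file with ordinary I/O (through a 16 KB read buffer), parses the XML and stores the result in an in-memory Map.

commit is a synchronous write that blocks the calling thread. If many writes are pending or they are slow, the caller is stuck, so do not call commit on the main thread. apply is an asynchronous (delayed) write: it returns no result, and data can be lost if the process dies before the write completes. apply does not block its caller, but slow writes can still block the main thread, because the main thread calls waitToFinish at certain lifecycle points and has to wait for the pending writes before it can continue.

Every update serializes all the data in the map to XML and overwrites the file (a full rewrite).
So is there an approach that improves on SP's XML format, I/O and concurrency problems?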

MMAP

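Traditional I/O

The operating system divides virtual memory into two parts: user space and kernel space. User space is where user program code runs; kernel space is where kernel code runs and is shared by all processes. They are isolated for safety, so the kernel is unaffected even if a user program crashes.
Writing a file goes through these steps:

1. The program calls write, passing the kernel the start address and length of the data to be written.
2. The kernel copies the data into the kernel page cache.
3. The operating system later copies the data to disk, completing the write.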

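MMAP

Linux initializes the contents of a virtual memory area by associating it with an object on disk; this process is called memory mapping.
Calling mmap on a file creates a mapping in the process's virtual address space. Once the mapping exists, that memory can be read and written through a pointer, and the system automatically writes the changes back to the corresponding file on disk.

Characteristics

With MMAP, reading or writing a file needs only a single copy between disk and user-visible memory, which reduces the number of copies and makes file operations more efficient.
MMAP maps a disk file into the process's address space, so operating on the memory amounts to operating on the file. No extra thread is needed, and access is close to ordinary memory speed once the pages are resident.
MMAP provides a block of memory that can be written at any time. The app just writes data into it, and the operating system is responsible for writing the memory back to the file, for example when memory runs low or the process exits.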
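
On the JVM the same mechanism is exposed through FileChannel.map. The following sketch writes a string through a mapping and reads it back with ordinary file I/O; MmapWriteSketch, the temp-file name and the 4096-byte mapping size are assumptions made for the example, and the text must fit into that mapping.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; only standard java.nio calls are used.
final class MmapWriteSketch {
    // Writes text into a fresh temp file through a memory mapping and
    // returns what a regular file read sees afterwards.
    static String roundTrip(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        try {
            Path file = Files.createTempFile("mmap-sketch", ".bin");
            file.toFile().deleteOnExit();
            try (FileChannel ch = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                // Map one 4 KB region; the file is extended to that size.
                MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
                buf.put(bytes); // a plain memory write, no write() call
                buf.force();    // ask the OS to flush the dirty pages now
            }
            byte[] onDisk = Files.readAllBytes(file);
            return new String(onDisk, 0, bytes.length, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The force() call is there only to make the demonstration deterministic; without it the operating system still writes the pages back on its own schedule.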
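
Examples

WeChat Mars: GitHub, documentation
Meituan Logan: GitHub, blog
NetEase android-mmap
Binder IPC

MMKV

MMKV is a key-value component based on mmap memory mapping. It uses protobuf for serialization and deserialization underneath, and offers high performance and good stability.

How it works

Memory preparation

mmap maps the file into memory and provides a block of memory that can be written at any time. The app just writes data into it and the operating system writes the memory back to the file, so there is no need to worry about losing data when the app crashes.

Data organization

MMKV uses the protobuf protocol for serialization; pb performs well in both speed and space usage.

Write optimization

The main use case is frequent writes and updates, so incremental updates are needed. MMKV serializes each changed key-value pair and appends it to the end of the memory block.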

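Space growth

Implementing incremental updates by appending introduces a new problem: with continuous appends the file size grows without bound. A trade-off between performance and space is needed.

Source code

Initialization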

MMKV_JNI void jniInitialize(JNIEnv *env, jobject obj, jstring rootDir, jstring cacheDir, jint logLevel) {
    if (!rootDir) {
        return;
    }
    // Get rootDir as a C string and pass it to MMKV::initializeMMKV for further initialization.
    const char *kstr = env->GetStringUTFChars(rootDir, nullptr);
    if (kstr) {
        MMKV::initializeMMKV(kstr, (MMKVLogLevel) logLevel);
        env->ReleaseStringUTFChars(rootDir, kstr);

        g_android_tmpDir = jstring2string(env, cacheDir);
    }
}

ThreadOnceToken_t once_control = ThreadOnceUninitialized;
void MMKV::initializeMMKV(const MMKVPath_t &rootDir, MMKVLogLevel logLevel) {
    g_currentLogLevel = logLevel;
    ThreadLock::ThreadOnce(&once_control, initialize);
    // Save the root directory path
    g_rootDir = rootDir;
    // Create the directory for that path
    mkPath(g_rootDir);
}

Obtaining an MMKV object

MMKV *MMKV::mmkvWithID(const string &mmapID, MMKVMode mode, string *cryptKey, MMKVPath_t *rootPath) {

    if (mmapID.empty()) {
        return nullptr;
    }
    // Take the lock
    SCOPED_LOCK(g_instanceLock);
    // Combine mmapID and rootPath into mmapKey
    auto mmapKey = mmapedKVKey(mmapID, rootPath);
    // Look up mmapKey in the map and return the existing MMKV object if there is one
    auto itr = g_instanceDic->find(mmapKey);
    if (itr != g_instanceDic->end()) {
        MMKV *kv = itr->second;
        return kv;
    }
   // Not found: build the path if needed, then construct the MMKV object and add it to the map
    if (rootPath) {
        MMKVPath_t specialPath = (*rootPath) + MMKV_PATH_SLASH + SPECIAL_CHARACTER_DIRECTORY_NAME;
        if (!isFileExist(specialPath)) {
            mkPath(specialPath);
        }
        MMKVInfo("prepare to load %s (id %s) from rootPath %s", mmapID.c_str(), mmapKey.c_str(), rootPath->c_str());
    }

    // Construct the object
    auto kv = new MMKV(mmapID, mode, cryptKey, rootPath);
    kv->m_mmapKey = mmapKey;
    (*g_instanceDic)[mmapKey] = kv;
    return kv;
}

Creating the path

extern bool mkPath(const MMKVPath_t &str) {
    // strdup copies the string into path.
    char *path = strdup(str.c_str());

    struct stat sb = {};
    bool done = false;
    char *slash = path;

    while (!done) {
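        // strspn skips the run of leading '/' characters
        slash += strspn(slash, "/");
        // strcspn then advances to the next '/' (or to the end of the string)
        slash += strcspn(slash, "/");
        // This splits the path into components one at a time and tells us when the whole path has been walked
        done = (*slash == '\0');
        *slash = '\0';

        // stat() the path built so far: create the directory with mode 0777 if it does not exist, and fail if it exists but is not a directory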

        if (stat(path, &sb) != 0) {
            if (errno != ENOENT || mkdir(path, 0777) != 0) {
                MMKVWarning("%s : %s", path, strerror(errno));
                free(path);
                return false;
            }
        } else if (!S_ISDIR(sb.st_mode)) {
            MMKVWarning("%s: %s", path, strerror(ENOTDIR));
            free(path);
            return false;
        }

        *slash = '/';
    }
    free(path);

    return true;
}

Constructing the object

MMKV::MMKV(const string &mmapID, MMKVMode mode, string *cryptKey, MMKVPath_t *rootPath){
    .......
     // Build the AES cipher object AESCrypt from the encryption key
    #    ifndef MMKV_DISABLE_CRYPT
    if (cryptKey && cryptKey->length() > 0) {
        m_dicCrypt = new MMKVMapCrypt();
        m_crypter = new AESCrypt(cryptKey->data(), cryptKey->length());
    } else {
        m_dic = new MMKVMap();
    }
#    else
    m_dic = new MMKVMap();
#    endif
     ...................................
     m_lock->initialize();
     ................
       // sensitive zone
       // After locking, loadFromFile reads the data from the file; the lock here is a cross-process shared file lock
    {
        SCOPED_LOCK(m_sharedProcessLock);
        loadFromFile();
    }
}
void initialize() {
    g_instanceDic = new unordered_map<string, MMKV *>;
    // Initialize a global thread lock
    g_instanceLock = new ThreadLock();
    g_instanceLock->initialize();
    // Set MMKV's default mmap size to the system page size, typically 4 KB
    mmkv::DEFAULT_MMAP_SIZE = mmkv::getPageSize();
}

Loading data: loadFromFile

void MMKV::loadFromFile() {
    // Read the meta file
    if (m_metaFile->isFileValid()) {
        m_metaInfo->read(m_metaFile->getMemory());
    }
    // Map the data file if it is not mapped yet
    if (!m_file->isFileValid()) {
        m_file->reloadFromFile();
    }

    // Start loading
    if (loadFromFile && m_actualSize > 0) {
        MMKVInfo("loading [%s] with crc %u sequence %u version %u", m_mmapID.c_str(), m_metaInfo->m_crcDigest,
                    m_metaInfo->m_sequence, m_metaInfo->m_version);
        // Wrap the mapped data in an MMBuffer (no copy)
        MMBuffer inputBuffer(ptr + Fixed32Size, m_actualSize, MMBufferNoCopy);
        // Clear the in-memory dictionary (the encrypted one if a crypter is set)
        if (m_crypter) {
            clearDictionary(m_dicCrypt);
        } else {
            clearDictionary(m_dic);
        }
         // Decode the MMBuffer into the map with MiniPBCoder
        if (needFullWriteback) {
#ifndef MMKV_DISABLE_CRYPT
            if (m_crypter) {
                MiniPBCoder::greedyDecodeMap(*m_dicCrypt, inputBuffer, m_crypter);
            } else
#endif
            {
                MiniPBCoder::greedyDecodeMap(*m_dic, inputBuffer);
            }
        } else {
#ifndef MMKV_DISABLE_CRYPT
            if (m_crypter) {
                MiniPBCoder::decodeMap(*m_dicCrypt, inputBuffer, m_crypter);
            } else
#endif
            {
                MiniPBCoder::decodeMap(*m_dic, inputBuffer);
            }
        }
         // Construct the CodedOutputData used for output
        m_output = new CodedOutputData(ptr + Fixed32Size, m_file->getFileSize() - Fixed32Size);
        m_output->seek(m_actualSize);
        if (needFullWriteback) {
            fullWriteback();
        }
    } else {
        // file not valid or empty, discard everything
        SCOPED_LOCK(m_exclusiveProcessLock);

        m_output = new CodedOutputData(ptr + Fixed32Size, m_file->getFileSize() - Fixed32Size);
        if (m_actualSize > 0) {
            writeActualSize(0, 0, nullptr, IncreaseSequence);
            sync(MMKV_SYNC);
        } else {
            writeActualSize(0, 0, nullptr, KeepSequence);
        }
    }
}

void MemoryFile::reloadFromFile() {
    ....................
    if (!m_diskFile.open()) {
        MMKVError("fail to open:%s, %s", m_diskFile.m_path.c_str(), strerror(errno));
    } else {
        FileLock fileLock(m_diskFile.m_fd);
        InterProcessLock lock(&fileLock, ExclusiveLockType);
        SCOPED_LOCK(&lock);
        // Query the current file size
        mmkv::getFileSize(m_diskFile.m_fd, m_size);
        // round up to (n * pagesize)
         // Align the file size to a multiple of the page size, padding the rest with zeros
        if (m_size < DEFAULT_MMAP_SIZE || (m_size % DEFAULT_MMAP_SIZE != 0)) {
            size_t roundSize = ((m_size / DEFAULT_MMAP_SIZE) + 1) * DEFAULT_MMAP_SIZE;
            truncate(roundSize);
        } else {
            auto ret = mmap();
            if (!ret) {
                doCleanMemoryCache(true);
            }
        }
#    ifdef MMKV_IOS
        tryResetFileProtection(m_diskFile.m_path);
#    endif
    }
}
bool File::open() {
    // Open the file
    m_fd = ::open(m_path.c_str(), OpenFlag2NativeFlag(m_flag), S_IRWXU);
    if (!isFileValid()) {
        MMKVError("fail to open [%s], %d(%s)", m_path.c_str(), errno, strerror(errno));
        return false;
    }
    MMKVInfo("open fd[%p], %s", m_fd, m_path.c_str());
    return true;
}

Mapping the file into memory with mmap

bool MemoryFile::mmap() {
    // Map the file into memory with mmap
    m_ptr = (char *) ::mmap(m_ptr, m_size, PROT_READ | PROT_WRITE, MAP_SHARED, m_diskFile.m_fd, 0);
    if (m_ptr == MAP_FAILED) {
        MMKVError("fail to mmap [%s], %s", m_diskFile.m_path.c_str(), strerror(errno));
        m_ptr = nullptr;
        return false;
    }

    return true;
}

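Writing

The Java-level MMKV class implements the SharedPreferences and SharedPreferences.Editor interfaces and provides methods such as putInt and putLong for modifying stored data; these calls eventually reach the native C++ methods: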

bool MMKV::set(int32_t value, MMKVKey_t key) {
    if (isKeyEmpty(key)) {
        return false;
    }
    size_t size = pbInt32Size(value);
    MMBuffer data(size);
    // Build the MMBuffer for the value and write the value into it through CodedOutputData
    CodedOutputData output(data.getPtr(), size);
    output.writeInt32(value);

    return setDataForKey(move(data), key);
}

The method computes the size the value occupies in protobuf encoding, constructs an MMBuffer of that size, writes the value into it and finally calls setDataForKey. CodedOutputData is the bridge to the buffer: data is written into an MMBuffer through it.
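
To make "the size the value occupies in protobuf encoding" concrete, here is a small sketch of protobuf's base-128 varint encoding. VarintSketch is a hypothetical class, and the value is treated as an unsigned 32-bit quantity; protobuf's signed int32 encoding of negative numbers takes 10 bytes and is not covered here.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper illustrating base-128 varints (unsigned 32-bit view).
final class VarintSketch {
    // Number of bytes the varint encoding of value needs.
    static int rawVarint32Size(int value) {
        int size = 1;
        while ((value & ~0x7F) != 0) {
            value >>>= 7; // each encoded byte carries 7 payload bits
            size++;
        }
        return size;
    }

    // Encodes value as a varint: low 7 bits first, high bit set on all but the last byte.
    static byte[] encodeVarint32(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
        return out.toByteArray();
    }
}
```

Small values therefore cost a single byte: 127 fits in one byte, while 300 needs two (0xAC 0x02).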

bool MMKV::setDataForKey(MMBuffer &&data, MMKVKey_t key, bool isDataHolder) {
    if ((!isDataHolder && data.length() == 0) || isKeyEmpty(key)) {
        return false;
    }
     // Take the write locks
    SCOPED_LOCK(m_lock);
    SCOPED_LOCK(m_exclusiveProcessLock);
    // Make sure the data has been loaded into memory
    checkLoadData();

#ifndef MMKV_DISABLE_CRYPT
    if (m_crypter) {
        if (isDataHolder) {
            auto sizeNeededForData = pbRawVarint32Size((uint32_t) data.length()) + data.length();
            if (!KeyValueHolderCrypt::isValueStoredAsOffset(sizeNeededForData)) {
                data = MiniPBCoder::encodeDataWithObject(data);
                isDataHolder = false;
            }
        }
        // Write data into the map
        auto itr = m_dicCrypt->find(key);
        if (itr != m_dicCrypt->end()) {
#    ifdef MMKV_APPLE
            auto ret = appendDataWithKey(data, key, itr->second, isDataHolder);
#    else
            auto ret = appendDataWithKey(data, key, isDataHolder);
#    endif
            if (!ret.first) {
                return false;
            }
            if (KeyValueHolderCrypt::isValueStoredAsOffset(ret.second.valueSize)) {
                KeyValueHolderCrypt kvHolder(ret.second.keySize, ret.second.valueSize, ret.second.offset);
                memcpy(&kvHolder.cryptStatus, &t_status, sizeof(t_status));
                itr->second = move(kvHolder);
            } else {
                itr->second = KeyValueHolderCrypt(move(data));
            }
        } else {
            auto ret = appendDataWithKey(data, key, isDataHolder);
            if (!ret.first) {
                return false;
            }
            if (KeyValueHolderCrypt::isValueStoredAsOffset(ret.second.valueSize)) {
                auto r = m_dicCrypt->emplace(
                    key, KeyValueHolderCrypt(ret.second.keySize, ret.second.valueSize, ret.second.offset));
                if (r.second) {
                    memcpy(&(r.first->second.cryptStatus), &t_status, sizeof(t_status));
                }
            } else {
                m_dicCrypt->emplace(key, KeyValueHolderCrypt(move(data)));
            }
        }
    } else
#endif // MMKV_DISABLE_CRYPT
    {
        auto itr = m_dic->find(key);
        if (itr != m_dic->end()) {
            auto ret = appendDataWithKey(data, itr->second, isDataHolder);
            if (!ret.first) {
                return false;
            }
            itr->second = std::move(ret.second);
        } else {
            auto ret = appendDataWithKey(data, key, isDataHolder);
            if (!ret.first) {
                return false;
            }
            m_dic->emplace(key, std::move(ret.second));
        }
    }
    m_hasFullWriteback = false;
#ifdef MMKV_APPLE
    [key retain];
#endif
    return true;
}

Once the data is loaded in memory, setDataForKey calls appendDataWithKey and stores the returned holder in the corresponding map:


KVHolderRet_t MMKV::appendDataWithKey(const MMBuffer &data, const KeyValueHolder &kvHolder, bool isDataHolder) {
    SCOPED_LOCK(m_exclusiveProcessLock);

    uint32_t keyLength = kvHolder.keySize;
    // size needed to encode the key
    // Compute the size the key takes in the mapped region
    size_t rawKeySize = keyLength + pbRawVarint32Size(keyLength);

    // ensureMemorySize() might change kvHolder.offset, so have to do it early
    {
        auto valueLength = static_cast<uint32_t>(data.length());
        if (isDataHolder) {
            valueLength += pbRawVarint32Size(valueLength);
        }
        auto size = rawKeySize + valueLength + pbRawVarint32Size(valueLength);
        // Make sure the remaining mapped space is large enough
        bool hasEnoughSize = ensureMemorySize(size);
        if (!hasEnoughSize) {
            return make_pair(false, KeyValueHolder());
        }
    }
    auto basePtr = (uint8_t *) m_file->getMemory() + Fixed32Size;
    MMBuffer keyData(basePtr + kvHolder.offset, rawKeySize, MMBufferNoCopy);

    return doAppendDataWithKey(data, keyData, isDataHolder, keyLength);
}

KVHolderRet_t MMKV::doAppendDataWithKey(const MMBuffer &data, const MMBuffer &keyData, bool isDataHolder, uint32_t originKeyLength) {
    auto isKeyEncoded = (originKeyLength < keyData.length());
    auto keyLength = static_cast<uint32_t>(keyData.length());
    auto valueLength = static_cast<uint32_t>(data.length());
    if (isDataHolder) {
        valueLength += pbRawVarint32Size(valueLength);
    }
    // size needed to encode the key
    size_t size = isKeyEncoded ? keyLength : (keyLength + pbRawVarint32Size(keyLength));
    // size needed to encode the value
    size += valueLength + pbRawVarint32Size(valueLength);

    SCOPED_LOCK(m_exclusiveProcessLock);

    bool hasEnoughSize = ensureMemorySize(size);
    if (!hasEnoughSize || !isFileValid()) {
        return make_pair(false, KeyValueHolder());
    }

#ifdef MMKV_IOS
    auto ret = guardForBackgroundWriting(m_output->curWritePointer(), size);
    if (!ret.first) {
        return make_pair(false, KeyValueHolder());
    }
#endif
#ifndef MMKV_DISABLE_CRYPT
    if (m_crypter) {
        if (KeyValueHolderCrypt::isValueStoredAsOffset(valueLength)) {
            m_crypter->getCurStatus(t_status);
        }
    }
#endif
    // Encode and write the data
    try {
        if (isKeyEncoded) {
            m_output->writeRawData(keyData);
        } else {
            m_output->writeData(keyData);
        }
        if (isDataHolder) {
            m_output->writeRawVarint32((int32_t) valueLength);
        }
        m_output->writeData(data); // note: write size of data
    } catch (std::exception &e) {
        MMKVError("%s", e.what());
        return make_pair(false, KeyValueHolder());
    }

    auto offset = static_cast<uint32_t>(m_actualSize);
    auto ptr = (uint8_t *) m_file->getMemory() + Fixed32Size + m_actualSize;
#ifndef MMKV_DISABLE_CRYPT
    if (m_crypter) {
        m_crypter->encrypt(ptr, ptr, size);
    }
#endif
    m_actualSize += size;
    updateCRCDigest(ptr, size);

    return make_pair(true, KeyValueHolder(originKeyLength, valueLength, offset));
}

void CodedOutputData::writeRawData(const MMBuffer &data) {
    size_t numberOfBytes = data.length();
    if (m_position + numberOfBytes > m_size) {
        auto msg = "m_position: " + to_string(m_position) + ", numberOfBytes: " + to_string(numberOfBytes) +
                   ", m_size: " + to_string(m_size);
        throw out_of_range(msg);
    }
    memcpy(m_ptr + m_position, data.getPtr(), numberOfBytes);
    m_position += numberOfBytes;
}

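Note that because protobuf does not support incremental updates, and to avoid the cost of rewriting everything, MMKV does not modify a record in place. It simply appends a new record at the end, even when the key already exists. When loading, only the last record for each key is kept, so the file clearly contains redundant data. I believe this design was chosen for performance. MMKV has a memory reorganization mechanism that deals with the redundant key-value data; it runs while making sure enough space is available (ensureMemorySize).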
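
The append-and-replay idea is easy to model. The sketch below is not MMKV's file format: AppendLogSketch is a hypothetical class that uses fixed 4-byte length prefixes instead of varints and an in-heap ByteBuffer instead of a mapped file. It also treats a zero-length value as a deletion, which is how the remove section further down describes MMKV's behaviour.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of an append-only key-value log with last-write-wins replay.
final class AppendLogSketch {
    private final ByteBuffer log;

    AppendLogSketch(int capacity) {
        log = ByteBuffer.allocate(capacity);
    }

    // Appends one [keyLen][key][valueLen][value] record; older records are never touched.
    void put(String key, String value) {
        byte[] k = key.getBytes(StandardCharsets.UTF_8);
        byte[] v = value.getBytes(StandardCharsets.UTF_8);
        log.putInt(k.length).put(k).putInt(v.length).put(v);
    }

    // Replays the log from the start; later records override earlier ones,
    // and a zero-length value removes the key.
    Map<String, String> replay() {
        Map<String, String> map = new LinkedHashMap<>();
        ByteBuffer in = log.duplicate();
        in.flip();
        while (in.hasRemaining()) {
            byte[] k = new byte[in.getInt()];
            in.get(k);
            byte[] v = new byte[in.getInt()];
            in.get(v);
            String key = new String(k, StandardCharsets.UTF_8);
            if (v.length == 0) {
                map.remove(key);
            } else {
                map.put(key, new String(v, StandardCharsets.UTF_8));
            }
        }
        return map;
    }

    // Bytes consumed so far, including records that have since been superseded.
    int bytesUsed() {
        return log.position();
    }
}
```

Writing the same key twice leaves both records in the log, so bytesUsed keeps growing even though replay returns a single entry; that gap is what the reorganization step reclaims.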

Memory reorganization: ensureMemorySize


// since we use append mode, when -[setData: forKey:] many times, space may not be enough
// try a full rewrite to make space
bool MMKV::ensureMemorySize(size_t newSize) {
    // If the remaining space is not enough for the write, try a reorganization: rewrite the data in the map to the protobuf file
    if (newSize >= m_output->spaceLeft() || (m_crypter ? m_dicCrypt->empty() : m_dic->empty())) {
        // try a full rewrite to make space
        auto fileSize = m_file->getFileSize();
        auto preparedData = m_crypter ? prepareEncode(*m_dicCrypt) : prepareEncode(*m_dic);
        auto sizeOfDic = preparedData.second;
        size_t lenNeeded = sizeOfDic + Fixed32Size + newSize;
        size_t dicCount = m_crypter ? m_dicCrypt->size() : m_dic->size();
        size_t avgItemSize = lenNeeded / std::max<size_t>(1, dicCount);
        size_t futureUsage = avgItemSize * std::max<size_t>(8, (dicCount + 1) / 2);
        // 1. no space for a full rewrite, double it
        // 2. or space is not large enough for future usage, double it to avoid frequently full rewrite
        if (lenNeeded >= fileSize || (lenNeeded + futureUsage) >= fileSize) {
            size_t oldSize = fileSize;
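            // If the space is still insufficient after reorganization, keep doubling the size until it fits, then remap the file with mmap
            do {
                 // Double the size until it is large enough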
                fileSize *= 2;
            } while (lenNeeded + futureUsage >= fileSize);
            // if we can't extend size, rollback to old state
            if (!m_file->truncate(fileSize)) {
                return false;
            }
        }
        return doFullWriteBack(move(preparedData), nullptr);
    }
    return true;
}


bool MemoryFile::truncate(size_t size) {
    // ...
    // Remap with mmap
    auto ret = mmap();
    if (!ret) {
        doCleanMemoryCache(true);
    }
    return ret;
}

bool MMKV::doFullWriteBack(pair<MMBuffer, size_t> preparedData, AESCrypt *newCrypter) {
    auto ptr = (uint8_t *) m_file->getMemory();
    auto totalSize = preparedData.second;
#ifdef MMKV_IOS
    auto ret = guardForBackgroundWriting(ptr + Fixed32Size, totalSize);
    if (!ret.first) {
        return false;
    }
#endif

#ifndef MMKV_DISABLE_CRYPT
    uint8_t newIV[AES_KEY_LEN];
    auto decrypter = m_crypter;
    auto encrypter = (newCrypter == InvalidCryptPtr) ? nullptr : (newCrypter ? newCrypter : m_crypter);
    if (encrypter) {
        AESCrypt::fillRandomIV(newIV);
        encrypter->resetIV(newIV, sizeof(newIV));
    }
#endif

    delete m_output;
    m_output = new CodedOutputData(ptr + Fixed32Size, m_file->getFileSize() - Fixed32Size);
#ifndef MMKV_DISABLE_CRYPT
    if (m_crypter) {
        memmoveDictionary(*m_dicCrypt, m_output, ptr, decrypter, encrypter, preparedData);
    } else {
#else
    {
        auto encrypter = m_crypter;
#endif
        memmoveDictionary(*m_dic, m_output, ptr, encrypter, totalSize);
    }

    m_actualSize = totalSize;
#ifndef MMKV_DISABLE_CRYPT
    if (encrypter) {
        recaculateCRCDigestWithIV(newIV);
    } else
#endif
    {
        recaculateCRCDigestWithIV(nullptr);
    }
    m_hasFullWriteback = true;
    // make sure lastConfirmedMetaInfo is saved
    sync(MMKV_SYNC);
    return true;
}

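Memory reorganization works as follows:

When the remaining mapped space is not enough for the content to be written, MMKV attempts a reorganization.
The reorganization clears the file and writes the data in the map back to it, which removes the redundant records.

If the space would still be insufficient after the reorganization, the mapped size is doubled repeatedly until it is large enough, and the file is remapped with mmap.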
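
The size decision inside ensureMemorySize can be isolated as pure arithmetic. The sketch below transcribes the listing's formula into Java; GrowthSketch and nextFileSize are hypothetical names, and Fixed32Size is assumed to be 4 bytes.

```java
// Hypothetical transcription of the growth rule from the ensureMemorySize listing.
final class GrowthSketch {
    static final long FIXED32_SIZE = 4; // assumption: size of the 4-byte length header

    // Returns the file size to use after a full rewrite of sizeOfDic bytes
    // plus a pending record of newSize bytes, given dicCount live entries.
    static long nextFileSize(long fileSize, long sizeOfDic, long newSize, long dicCount) {
        long lenNeeded = sizeOfDic + FIXED32_SIZE + newSize;
        long avgItemSize = lenNeeded / Math.max(1, dicCount);
        // Reserve room for roughly half as many items again, and at least 8.
        long futureUsage = avgItemSize * Math.max(8, (dicCount + 1) / 2);
        if (lenNeeded >= fileSize || lenNeeded + futureUsage >= fileSize) {
            do {
                fileSize *= 2; // double until the data plus headroom fits
            } while (lenNeeded + futureUsage >= fileSize);
        }
        return fileSize;
    }
}
```

For a 4096-byte file, 4000 bytes of live data in 100 entries and a 50-byte record give lenNeeded = 4054 and futureUsage = 2000, so the file is doubled once to 8192 bytes.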

Reading


int32_t MMKV::getInt32(MMKVKey_t key, int32_t defaultValue, bool *hasValue) {
    if (isKeyEmpty(key)) {
        if (hasValue != nullptr) {
            *hasValue = false;
        }
        return defaultValue;
    }
    SCOPED_LOCK(m_lock);
    auto data = getDataForKey(key);
    if (data.length() > 0) {
        try {
            CodedInputData input(data.getPtr(), data.length());
            if (hasValue != nullptr) {
                *hasValue = true;
            }
            return input.readInt32();
        } catch (std::exception &exception) {
            MMKVError("%s", exception.what());
        }
    }
    if (hasValue != nullptr) {
        *hasValue = false;
    }
    return defaultValue;
}


MMBuffer MMKV::getDataForKey(MMKVKey_t key) {
    checkLoadData();
#ifndef MMKV_DISABLE_CRYPT
    if (m_crypter) {
        auto itr = m_dicCrypt->find(key);
        if (itr != m_dicCrypt->end()) {
            auto basePtr = (uint8_t *) (m_file->getMemory()) + Fixed32Size;
            return itr->second.toMMBuffer(basePtr, m_crypter);
        }
    } else
#endif
    {
        auto itr = m_dic->find(key);
        if (itr != m_dic->end()) {
            auto basePtr = (uint8_t *) (m_file->getMemory()) + Fixed32Size;
            return itr->second.toMMBuffer(basePtr);
        }
    }
    MMBuffer nan;
    return nan;
}

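getInt32 calls getDataForKey to get the MMBuffer for the key, then reads the value out through CodedInputData and returns it. When the key is absent, getDataForKey returns the empty MMBuffer nan and getInt32 returns the default value.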

remove

void MMKV::removeValueForKey(MMKVKey_t key) {
    if (isKeyEmpty(key)) {
        return;
    }
    SCOPED_LOCK(m_lock);
    SCOPED_LOCK(m_exclusiveProcessLock);
    checkLoadData();

    removeDataForKey(key);
}

With the data loaded in memory, it calls removeDataForKey:


bool MMKV::removeDataForKey(MMKVKey_t key) {
    if (isKeyEmpty(key)) {
        return false;
    }
#ifndef MMKV_DISABLE_CRYPT
    if (m_crypter) {
        auto itr = m_dicCrypt->find(key);
        if (itr != m_dicCrypt->end()) {
            m_hasFullWriteback = false;
            // Construct an empty (zero-length) MMBuffer
            static MMBuffer nan;
#    ifdef MMKV_APPLE
            auto ret = appendDataWithKey(nan, key, itr->second);
            if (ret.first) {
                auto oldKey = itr->first;
                m_dicCrypt->erase(itr);
                [oldKey release];
            }
#    else
            auto ret = appendDataWithKey(nan, key);
            if (ret.first) {
                m_dicCrypt->erase(itr);
            }
#    endif
            return ret.first;
        }
    }
#endif
    // The non-encrypted branch (the same steps on m_dic) is omitted in this excerpt.

    return false;
}

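What happens here is that an empty MMBuffer (length 0) is appended to the protobuf file via appendDataWithKey, and the entry for the key is erased from the map. When the file is loaded later, a record with length 0 is treated as a deletion.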

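Summary

MMKV

1. Data is stored in a protobuf-encoded file, which is smaller.
2. mmap is used, so copying data to the file is faster.
3. Writes do not block the main thread; pages are loaded and written back by the OS through the page-fault mechanism, which helps avoid data loss.
4. Incremental updates are supported: whether or not the key already exists, the new record is simply appended after the existing data.
5. When the file runs out of space, a full rewrite is needed. The data is deduplicated first; if the space is then sufficient, the data is encoded in the MMKV file format and written over the whole file, otherwise the file is first expanded (*2).
6. Expansion sets the file size to *2, removes the mapping with munmap and maps again with mmap(size * 2); this happens in a do-while loop.
7. If the file is corrupted, the developer is notified through a callback, or by default the data is discarded and saved anew.
8. Multi-process use is supported through the file lock flock; CRC32 checks are used for multi-process data synchronization.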
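
A CRC32 check of the kind mentioned in the last point can be sketched with the JDK's java.util.zip.CRC32. CrcCheckSketch is a hypothetical class; where MMKV actually stores its digest and when it compares it are not modelled here.

```java
import java.util.zip.CRC32;

// Hypothetical helper showing how a stored CRC32 detects changed or corrupted data.
final class CrcCheckSketch {
    // CRC32 over the first length bytes of data.
    static long crcOf(byte[] data, int length) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, length);
        return crc.getValue();
    }

    // True when the data still matches the digest recorded earlier.
    static boolean isIntact(byte[] data, int length, long storedCrc) {
        return crcOf(data, length) == storedCrc;
    }
}
```

A process that finds a digest different from the one it last saw can conclude that the content was changed by someone else (or damaged) and reload it.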

Data comparison


Original link: http://nunu03.github.io/2022/06/03/SharedPreferences-MMAP-MMKV数据持久化原理解析/

Copyright notice: please credit the source when reposting.

chenyulong

