ISSN 0253-2778

CN 34-1054/N

Open AccessOpen Access JUSTC Original Paper

A read/write balanced high-performance key-value store

Cite this:
https://doi.org/10.3969/j.issn.0253-2778.2020.06.015
  • Received Date: 13 March 2020
  • Accepted Date: 04 June 2020
  • Rev Recd Date: 04 June 2020
  • Publish Date: 30 June 2020
  • Log-structured merge tree (LSM-tree) is widely used as the core data structure of modern key-value stores due to its ability to utilize sequential access performance of external storage. However, it suffers from expensive merge operations to maintain the layered and ordered data organization, which induces significant write amplification. Recent researches have proposed various optimizations to mitigate write amplification, but at the cost of query performance and space utilization. A new architecture for LSM-tree based key-value store is proposed. The key idea is to leverage the key-value separation design to mitigate merge overhead, while maintaining a certain degree of order for values by using a new tree structure called vTree to improve the range query performance. In the meantime, corresponding data merge and space reclaim algorithms were developed for the vTree. The experimental results show that the key-value store developed with this architecture has well balanced performance on write, point lookup, and range query, with low space cost.
    Log-structured merge tree (LSM-tree) is widely used as the core data structure of modern key-value stores due to its ability to utilize sequential access performance of external storage. However, it suffers from expensive merge operations to maintain the layered and ordered data organization, which induces significant write amplification. Recent researches have proposed various optimizations to mitigate write amplification, but at the cost of query performance and space utilization. A new architecture for LSM-tree based key-value store is proposed. The key idea is to leverage the key-value separation design to mitigate merge overhead, while maintaining a certain degree of order for values by using a new tree structure called vTree to improve the range query performance. In the meantime, corresponding data merge and space reclaim algorithms were developed for the vTree. The experimental results show that the key-value store developed with this architecture has well balanced performance on write, point lookup, and range query, with low space cost.
  • loading
  • [1]
    NISHTALA R, FUGAL H, GRIMM S, et al.Scaling memcache at facebook[C]// Presented as part of the 10th USENIX Symposium on Networked System Design and Implementation. 2013: 385-398.
    [2]
    O’NEIL P, CHENG E, GAWLICK D, et al. The log-structured merge-tree (lsm-tree)[J]. Acta Informatica, 1996, 33(4):351-385.
    [3]
    GHEMAWAT S, DEAN J. LevelDB[EB/OL]. [2020-06-01]. https://github.com/google/leveldb.
    [4]
    CHANG F, DEAN J, GHEMAWAT S, et al. Bigtable: A distributed storage system for structured data[J]. ACM Transactions on Computer Systems, 2008, 26(2):1-24.
    [5]
    Facebook Database Engineering Team. RocksDB[EB/OL]. [2020-06-01]. https://github.com/facebook/rocksdb.
    [6]
    RAJU P, KADEKODI R, CHIDAMBARAM V, et al. Pebblesdb: Building key-value stores using fragmented log-structured merge trees[C]// Proceedings of the 26th Symposium on Operating Systems Principles. 2017: 497-514.
    [7]
    WU X, XU Y, SHAO Z, et al. Lsm-trie: An lsm-tree-based ultra-large key-value store for small data items[C]// 2015 USENIX Annual Technical Conference. 2015: 71-82.
    [8]
    LU L, PILLAIT S, GOPALAKRISHNAN H, et al. Wisckey: Separating keys from values in ssd-conscious storage[J]. ACM Transactions on Storage, 2017, 13(1): 1-28.
    [9]
    CAO Z, DONG S, VEMURI S, et al.Characterizing, modeling, and benchmarking rocksdb key-value workloads at facebook[C]// 18th USENIX Conference on File and Storage Technologies. 2020: 209-22.
    [10]
    MATSUNOBU Y. Innodb to myrocks migration in main mysql database at facebook[C]// USENIX Association, May 2017.
    [11]
    SEARS R, RAMAKRISHNAN R. BLSM: A general purpose log structured merge tree[C]// Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data. 2012: 217-228.
    [12]
    KIRSCH A, MITZENMACHER M. Less hashing, same performance: Building a better bloom filter[C]// European Symposium on Algorithms. Springer, 2006: 456-467.
    [13]
    ATIKOGLU B, XU Y, FRACHTENBERG E, et al. Workload analysis of a large-scale key-value store[C]// Proceedings of the 12th ACM SIGMETRICS/PERFORMANCE joint international conference on Measurement and Modeling of Computer Systems. 2012: 53-64.
    [14]
    CHANH H, LIANG C J M, LI Y, et al. Hashkv: Enabling efficient updates inkv storage via hashing[C]// 2018 USENIX Annual Technical Conference. 2018: 1007-1019.
    [15]
    COOPER B F, SILBERSTEIN A, TAM E, et al. Benchmarking cloud serving systems with YCSB[C]// Proceedings of the 1st ACM Symposium on Cloud Computing. 2010: 143-154.
  • 加载中

Catalog

    [1]
    NISHTALA R, FUGAL H, GRIMM S, et al.Scaling memcache at facebook[C]// Presented as part of the 10th USENIX Symposium on Networked System Design and Implementation. 2013: 385-398.
    [2]
    O’NEIL P, CHENG E, GAWLICK D, et al. The log-structured merge-tree (lsm-tree)[J]. Acta Informatica, 1996, 33(4):351-385.
    [3]
    GHEMAWAT S, DEAN J. LevelDB[EB/OL]. [2020-06-01]. https://github.com/google/leveldb.
    [4]
    CHANG F, DEAN J, GHEMAWAT S, et al. Bigtable: A distributed storage system for structured data[J]. ACM Transactions on Computer Systems, 2008, 26(2):1-24.
    [5]
    Facebook Database Engineering Team. RocksDB[EB/OL]. [2020-06-01]. https://github.com/facebook/rocksdb.
    [6]
    RAJU P, KADEKODI R, CHIDAMBARAM V, et al. Pebblesdb: Building key-value stores using fragmented log-structured merge trees[C]// Proceedings of the 26th Symposium on Operating Systems Principles. 2017: 497-514.
    [7]
    WU X, XU Y, SHAO Z, et al. Lsm-trie: An lsm-tree-based ultra-large key-value store for small data items[C]// 2015 USENIX Annual Technical Conference. 2015: 71-82.
    [8]
    LU L, PILLAIT S, GOPALAKRISHNAN H, et al. Wisckey: Separating keys from values in ssd-conscious storage[J]. ACM Transactions on Storage, 2017, 13(1): 1-28.
    [9]
    CAO Z, DONG S, VEMURI S, et al.Characterizing, modeling, and benchmarking rocksdb key-value workloads at facebook[C]// 18th USENIX Conference on File and Storage Technologies. 2020: 209-22.
    [10]
    MATSUNOBU Y. Innodb to myrocks migration in main mysql database at facebook[C]// USENIX Association, May 2017.
    [11]
    SEARS R, RAMAKRISHNAN R. BLSM: A general purpose log structured merge tree[C]// Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data. 2012: 217-228.
    [12]
    KIRSCH A, MITZENMACHER M. Less hashing, same performance: Building a better bloom filter[C]// European Symposium on Algorithms. Springer, 2006: 456-467.
    [13]
    ATIKOGLU B, XU Y, FRACHTENBERG E, et al. Workload analysis of a large-scale key-value store[C]// Proceedings of the 12th ACM SIGMETRICS/PERFORMANCE joint international conference on Measurement and Modeling of Computer Systems. 2012: 53-64.
    [14]
    CHANH H, LIANG C J M, LI Y, et al. Hashkv: Enabling efficient updates inkv storage via hashing[C]// 2018 USENIX Annual Technical Conference. 2018: 1007-1019.
    [15]
    COOPER B F, SILBERSTEIN A, TAM E, et al. Benchmarking cloud serving systems with YCSB[C]// Proceedings of the 1st ACM Symposium on Cloud Computing. 2010: 143-154.

    Article Metrics

    Article views (55) PDF downloads(142)
    Proportional views

    /

    DownLoad:  Full-Size Img  PowerPoint
    Return
    Return