ISSN 0253-2778

CN 34-1054/N

Open AccessOpen Access JUSTC Original Paper

BDCode: An erasure code algorithm for big data storage systems

Cite this:
https://doi.org/10.3969/j.issn.0253-2778.2016.03.003
  • Received Date: 27 August 2015
  • Accepted Date: 01 December 2015
  • Rev Recd Date: 01 December 2015
  • Publish Date: 30 March 2016
  • An optimized algorithm, based on erasure coding technology towards the big data storage system that contains a lot of data, was proposed. By studying existing coding technologies and big data systems, this algorithm, named BDCode (big data code) can not only protect system reliability, but also improve the security and the utilization of storage space. Due to the high reliability and space saving rate of coding technology, coding mechanisms were introduced into big data systems. The storage nodes are divided into many virtual nodes to realize load balancing. By setting different virtual nodes’ storage groups for different codec servers , ensure system availability. And by using the parallel decoding computing of the nodes and the block of data, we can be ensured the recovery efficiency of the system can be proved when data is corrupted. Additionally, different users setting different coding parameters can improve the robustness of big data storage systems. We configured various data block m and calibration block k to improve the utilization rate in the quantitative experiments. The results show that parallel decoding speed can be nearly two times faster than the past serial decoding speed. The encoding efficiency with BDCode coding is, on average, 36.1% higher than using CRS and 58.2% higer than using RS coding. The decoding rate by using BDCode averages 19.3% higher than using CRS and 33.1% higher than using RS.
    An optimized algorithm, based on erasure coding technology towards the big data storage system that contains a lot of data, was proposed. By studying existing coding technologies and big data systems, this algorithm, named BDCode (big data code) can not only protect system reliability, but also improve the security and the utilization of storage space. Due to the high reliability and space saving rate of coding technology, coding mechanisms were introduced into big data systems. The storage nodes are divided into many virtual nodes to realize load balancing. By setting different virtual nodes’ storage groups for different codec servers , ensure system availability. And by using the parallel decoding computing of the nodes and the block of data, we can be ensured the recovery efficiency of the system can be proved when data is corrupted. Additionally, different users setting different coding parameters can improve the robustness of big data storage systems. We configured various data block m and calibration block k to improve the utilization rate in the quantitative experiments. The results show that parallel decoding speed can be nearly two times faster than the past serial decoding speed. The encoding efficiency with BDCode coding is, on average, 36.1% higher than using CRS and 58.2% higer than using RS coding. The decoding rate by using BDCode averages 19.3% higher than using CRS and 33.1% higher than using RS.
  • loading
  • [1]
    MORRIS R J T, TRUSKOWSKI B J. The evolution of storage systems[J]. IBM Systems Journal, 2003, 42(2): 205-217.
    [2]
    WANG Y, KAPRITSOS M, REN Z, et al. Robustness in the Salus scalable block store[C]// Proceedings of the 10th USENIX Symposium on Networked Systems Design and Implementation. Lombard, USA: ACM Press, 2013: 357-370.
    [3]
    LI Y, DHOTRE N, OHARA Y, et al. Horus: Fine-grained encryption-based security for large-scale storage[C]// Proceedings of the 11th USENIX Conference on File and Storage Systems. San Jose, USA: IEEE Press, 2013: 147-160.
    [4]
    TIWARI D, BOBOILA S, VAZHKUDAI S S, et al. Active flash: Towards energy-efficient, in-situ data analytics on extreme-scale machines[C]// Proceedings of the 11th USENIX Conference on File and Storage Systems. San Jose, USA: IEEE Press, 2013: 119-132.
    [5]
    WILKES J, GOLDING R, STAELIN C, et al. The HP AutoRAID hierarchical storage system[J]. ACM Transactions on Computer System, 1996, 14(1): 108-136.
    [6]
    GHEMAWAT S, GOBIOFF H, LEUNG S T. File and storage systems: The Google file system[C]// Proceedings of the 9th ACM Symposium on Operating Systems Principles. Bolton Landing, USA: IEEE Press, 2003: 29-43.
    [7]
    SHVACHKO K, KUANG H, RADIA S, et al. The Hadoop distributed file system[C]// Proceedings of Symposium on Mass Storage Systems & Technologies. Incline Village, USA: IEEE Press, 2010: 1-10.
    [8]
    Amazon simple storage service (S3)[EB/OL]. http://www.amazon.com/s3.
    [9]
    WEIL S A, BRANDT S A, MILLER E L, et al. Ceph: A scalable, high-performance distributed file system[C]// Proceedings of the 7th Conference on Operating Systems Design and Implementation. Seattle, USA: ACM Press, 2006: 307-320.
    [10]
    BLOMER J, KALFANE M, KARP R, et al. An XOR-based erasure-resilient coding scheme[R]. Technical Report TR-95-048, International Computer Science Institute, 1995.
    [11]
    PLANK J S. A tutorial on Reed-Solomon coding for fault-tolerance in RAID-like systems[J]. Software-Practice & Experience, 1997, 27(9): 995-1012.
    [12]
    STOICA I, MORRIS R, KARGER D, et al. Chord: A scalable peer-to-peer lookup service for Internet applications[J]. Proceedings of the ACM SIGCOMM Computer Communication Review, 2001, 31(4): 149-160.
    [13]
    Meister D, Brinkmann A. dedupv1: Improving deduplication throughput using solid state drives (SSD)[C]// Proceedings of the 26th Symposium on Mass Storage systems and Technology. Incline Village, NV: IEEE, 2010: 1-6.
    [14]
    XIA M Y, SAXENA M, BLAUM M et al. A tale of two erasure codes in HDFS[C]// Proceedings of the 13th USENIX Conference on File and Storage Technologies. Santa Clara, USA: ACM Press, 2015: 213-226.
    [15]
    FAN B, TANTISIRIROJ W, XIAO L, et al. Diskreduce: RAID for data-intensive scalable computing[C]// Proceedings of the 4th Annual Workshop on Petatascale Data Storage. Portlang, USA: ACM Press, 2009: 6-10.
    [16]
    VENKATARAMANI V, AMSDEN Z, BRONSON N, Get al. Tao: How Facebook serves the social graph[C]// Proceedings of the ACM SIGMOD International Conference on Management of Data. New York, USA: ACM Press, 2012: 791-792.
    [17]
    NISHTALA R, FUGAL H, GRIMM S, et al. Scaling memcache at Facebook[C]// Proceedings of the 10th USENIX Symposium on Networked Systems Design and Implementation. Berkeley, USA: ACM Press, 2013: 385-398.
    [18]
    Google-GFS2 Colossus [EB/OL]. http://www.quora.com/Colossus-Google- GFS2" Google, 2012.
    [19]
    TAMO I, BARG A. A family of optimal locally recoverable codes[J]. IEEE Transactions on Information Theory , 2014, 60(8): 4661-4676.
    [20]
    FENG G L, DENG R H, BAO F, et al. New efficient MDS array codes for RAID part I: Reed-Solomon-like codes for tolerating three disk failures[J]. IEEE Transactions on Computers, 2005, 54(9):1071-1080.
    [21]
    HAFNER J L. WEAVER Codes: Highly fault tolerant erasure codes for storage systems[C]// Proceedings of the 4th USENIX Conference on File and Storage Technology. San Francisco, USA: ACM Press, 2005: 211-224.
    [22]
    HAFNER J L. HoVer erasure codes for disk arrays[C]// International Conference on Dependable Systems and Networks. Philadelphia, USA: IEEE Press, 2006: 217-226.
    [23]
    XU L H, BRUCK J. X-Code: MDS array codes with optimal encoding[J]. IEEE Transactions on Information Theory, 1999, 45(1): 272-276.
    [24]
    YIN C, XIE C S, WAN J G, et al. BMCloud: Minimizing repair bandwidth and maintenance cost in cloud storage[J]. Mathematical Problems in Engineering, 2013, 2013(6): 1-11.
    [25]
    RASHMI K V, NAKKIRAN P, WANG J Y, et al. Having your cake and eating it too: Jointly optimal erasure codes for I/O, storage, and network-bandwidth[C]// Proceedings of the 13th USENIX Conference on File and Storage Technologies. Santa Clara, USA: ACM Press, 2015: 81-94.
    [26]
    STRZELCZAK P, ADAMCZYK E, HERMAN-IZYCKA U, et al. Concurrent Deletion in a Distributed Content-Addressable Storage System with Global Deduplication[C]// Proceedings of the 11th USENIX Conference on File and Storage Technologies. 2013: 161-174.
    [27]
    FU M, FENG D, HUA Y, et al. Design Tradeoffs for data deduplication performance in backup workloads[C]// Proceedings of the 13th USENIX Conference on File and Storage Technologies. Santa Clara, USA: ACM Press, 2015: 331-344.
    [28]
    REED I S, SOLOMON G. Polynomial codes over certain finite fields[J]. Journal of the Society for Industrial and Applied Mathematics, 1960, 8(2): 300-304.)
  • 加载中

Catalog

    [1]
    MORRIS R J T, TRUSKOWSKI B J. The evolution of storage systems[J]. IBM Systems Journal, 2003, 42(2): 205-217.
    [2]
    WANG Y, KAPRITSOS M, REN Z, et al. Robustness in the Salus scalable block store[C]// Proceedings of the 10th USENIX Symposium on Networked Systems Design and Implementation. Lombard, USA: ACM Press, 2013: 357-370.
    [3]
    LI Y, DHOTRE N, OHARA Y, et al. Horus: Fine-grained encryption-based security for large-scale storage[C]// Proceedings of the 11th USENIX Conference on File and Storage Systems. San Jose, USA: IEEE Press, 2013: 147-160.
    [4]
    TIWARI D, BOBOILA S, VAZHKUDAI S S, et al. Active flash: Towards energy-efficient, in-situ data analytics on extreme-scale machines[C]// Proceedings of the 11th USENIX Conference on File and Storage Systems. San Jose, USA: IEEE Press, 2013: 119-132.
    [5]
    WILKES J, GOLDING R, STAELIN C, et al. The HP AutoRAID hierarchical storage system[J]. ACM Transactions on Computer System, 1996, 14(1): 108-136.
    [6]
    GHEMAWAT S, GOBIOFF H, LEUNG S T. File and storage systems: The Google file system[C]// Proceedings of the 9th ACM Symposium on Operating Systems Principles. Bolton Landing, USA: IEEE Press, 2003: 29-43.
    [7]
    SHVACHKO K, KUANG H, RADIA S, et al. The Hadoop distributed file system[C]// Proceedings of Symposium on Mass Storage Systems & Technologies. Incline Village, USA: IEEE Press, 2010: 1-10.
    [8]
    Amazon simple storage service (S3)[EB/OL]. http://www.amazon.com/s3.
    [9]
    WEIL S A, BRANDT S A, MILLER E L, et al. Ceph: A scalable, high-performance distributed file system[C]// Proceedings of the 7th Conference on Operating Systems Design and Implementation. Seattle, USA: ACM Press, 2006: 307-320.
    [10]
    BLOMER J, KALFANE M, KARP R, et al. An XOR-based erasure-resilient coding scheme[R]. Technical Report TR-95-048, International Computer Science Institute, 1995.
    [11]
    PLANK J S. A tutorial on Reed-Solomon coding for fault-tolerance in RAID-like systems[J]. Software-Practice & Experience, 1997, 27(9): 995-1012.
    [12]
    STOICA I, MORRIS R, KARGER D, et al. Chord: A scalable peer-to-peer lookup service for Internet applications[J]. Proceedings of the ACM SIGCOMM Computer Communication Review, 2001, 31(4): 149-160.
    [13]
    Meister D, Brinkmann A. dedupv1: Improving deduplication throughput using solid state drives (SSD)[C]// Proceedings of the 26th Symposium on Mass Storage systems and Technology. Incline Village, NV: IEEE, 2010: 1-6.
    [14]
    XIA M Y, SAXENA M, BLAUM M et al. A tale of two erasure codes in HDFS[C]// Proceedings of the 13th USENIX Conference on File and Storage Technologies. Santa Clara, USA: ACM Press, 2015: 213-226.
    [15]
    FAN B, TANTISIRIROJ W, XIAO L, et al. Diskreduce: RAID for data-intensive scalable computing[C]// Proceedings of the 4th Annual Workshop on Petatascale Data Storage. Portlang, USA: ACM Press, 2009: 6-10.
    [16]
    VENKATARAMANI V, AMSDEN Z, BRONSON N, Get al. Tao: How Facebook serves the social graph[C]// Proceedings of the ACM SIGMOD International Conference on Management of Data. New York, USA: ACM Press, 2012: 791-792.
    [17]
    NISHTALA R, FUGAL H, GRIMM S, et al. Scaling memcache at Facebook[C]// Proceedings of the 10th USENIX Symposium on Networked Systems Design and Implementation. Berkeley, USA: ACM Press, 2013: 385-398.
    [18]
    Google-GFS2 Colossus [EB/OL]. http://www.quora.com/Colossus-Google- GFS2" Google, 2012.
    [19]
    TAMO I, BARG A. A family of optimal locally recoverable codes[J]. IEEE Transactions on Information Theory , 2014, 60(8): 4661-4676.
    [20]
    FENG G L, DENG R H, BAO F, et al. New efficient MDS array codes for RAID part I: Reed-Solomon-like codes for tolerating three disk failures[J]. IEEE Transactions on Computers, 2005, 54(9):1071-1080.
    [21]
    HAFNER J L. WEAVER Codes: Highly fault tolerant erasure codes for storage systems[C]// Proceedings of the 4th USENIX Conference on File and Storage Technology. San Francisco, USA: ACM Press, 2005: 211-224.
    [22]
    HAFNER J L. HoVer erasure codes for disk arrays[C]// International Conference on Dependable Systems and Networks. Philadelphia, USA: IEEE Press, 2006: 217-226.
    [23]
    XU L H, BRUCK J. X-Code: MDS array codes with optimal encoding[J]. IEEE Transactions on Information Theory, 1999, 45(1): 272-276.
    [24]
    YIN C, XIE C S, WAN J G, et al. BMCloud: Minimizing repair bandwidth and maintenance cost in cloud storage[J]. Mathematical Problems in Engineering, 2013, 2013(6): 1-11.
    [25]
    RASHMI K V, NAKKIRAN P, WANG J Y, et al. Having your cake and eating it too: Jointly optimal erasure codes for I/O, storage, and network-bandwidth[C]// Proceedings of the 13th USENIX Conference on File and Storage Technologies. Santa Clara, USA: ACM Press, 2015: 81-94.
    [26]
    STRZELCZAK P, ADAMCZYK E, HERMAN-IZYCKA U, et al. Concurrent Deletion in a Distributed Content-Addressable Storage System with Global Deduplication[C]// Proceedings of the 11th USENIX Conference on File and Storage Technologies. 2013: 161-174.
    [27]
    FU M, FENG D, HUA Y, et al. Design Tradeoffs for data deduplication performance in backup workloads[C]// Proceedings of the 13th USENIX Conference on File and Storage Technologies. Santa Clara, USA: ACM Press, 2015: 331-344.
    [28]
    REED I S, SOLOMON G. Polynomial codes over certain finite fields[J]. Journal of the Society for Industrial and Applied Mathematics, 1960, 8(2): 300-304.)

    Article Metrics

    Article views (33) PDF downloads(62)
    Proportional views

    /

    DownLoad:  Full-Size Img  PowerPoint
    Return
    Return