ISSN 0253-2778

CN 34-1054/N

Open Access JUSTC Original Paper

MCDS: Large-scale mobile communication data computation on just a PC

Cite this:
https://doi.org/10.3969/j.issn.0253-2778.2016.01.006
  • Received Date: 27 August 2015
  • Accepted Date: 29 September 2015
  • Revision Received Date: 29 September 2015
  • Publish Date: 30 January 2016
  • Mobile data is characterized by high volume, variety, velocity and value. Mobile communication data is an important part of mobile data and has great research value, so it is important to store and retrieve it efficiently. At present, using parallel technology for data mining has become mainstream, but this approach is costly in terms of hardware, and debugging and optimizing parallel algorithms is difficult. This paper proposes MCDS, a mobile communication data processing system that runs on a single PC. MCDS is based on GraphChi and improves on it in three respects: data format, sharding mechanism and memory replacement algorithm. Experimental results verify the effectiveness of MCDS, which provides a feasible experimental environment for mobile communication data mining.
  • [1]
    Kang U, Tong H H, Sun J M, et al. Gbase: An efficient analysis platform for large graphs[J]. The VLDB Journal, 2012, 21(5): 637-650.
    [2]
    Kang U, Tong H H, Sun J M, et al. Gbase: A scalable and general graph management system[C]// Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Diego, USA: ACM Press, 2011: 1091-1099.
    [3]
    Malewicz G, Austern M H, Bik A J C, et al. Pregel: A system for large-scale graph processing[C]// Proceedings of the ACM SIGMOD International Conference on Management of Data. Indianapolis, USA: ACM Press, 2010: 135-146.
    [4]
    Low Y C, Bickson D, Gonzalez J, et al. Distributed GraphLab: A framework for machine learning and data mining in the cloud[J]. Proceedings of the VLDB Endowment, 2012, 5(8): 716-727.
    [5]
    Gonzalez J E, Low Y C, Gu H J, et al. PowerGraph: Distributed graph-parallel computation on natural graphs[C]// Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation. Berkeley, USA: USENIX Association, 2012: 17-30.
    [6]
    Kyrola A, Blelloch G, Guestrin C. GraphChi: Large-scale graph computation on just a PC[C]// Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation. Berkeley, USA: USENIX Association, 2012: 31-46.
    [7]
    Han W S, Lee S, Park K, et al. TurboGraph: A fast parallel graph engine handling billion-scale graphs in a single PC[C]// Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM Press, 2013: 77-85.
    [8]
    Holme P, Saramäki J. Temporal networks[J]. Physics Reports, 2012, 519(3): 97-125.
    [9]
    Megiddo N, Modha D S. ARC: A Self-Tuning, Low Overhead Replacement Cache[C]// Proceedings of the 2nd USENIX Conference on File and Storage Technologies. San Francisco, USA: USENIX Association, 2003: 115-130.
    [10]
    Eagle N, Pentland A. Reality mining: Sensing complex social systems[J]. Personal and Ubiquitous Computing, 2006, 10(4): 255-268.
    [11]
    Ficek M, Kencl L. Spatial extension of the reality mining dataset[C]// Proceedings of the International Conference on Mobile Adhoc and Sensor Systems. San Francisco: IEEE Press, 2010: 666-673.
    [12]
    Pentland A. Reality Mining of Mobile Communications: Toward a New Deal on Data[M]. Springer, 2009.
    [13]
    Kiukkonen N, Blom J, Dousse O, et al. Towards rich mobile phone datasets: Lausanne data collection campaign[C]// Proceedings of the International Conference on Pervasive Services. Berlin, Germany: ACM Press, 2010: 1-7.