ISSN 0253-2778

CN 34-1054/N

2014 Vol. 44, No. 7

Display Method:
Original Paper
A density-based hierarchical clustering algorithm of gene data based on MapReduce
TU Jinjin, YANG Ming, GUO Lina
2014, 44(7): 537-543. doi: 10.3969/j.issn.0253-2778.2014.07.001
The amount of gene expression data scale is increasing sharply with the rapid development of bio-informatics technology, which poses a serious challenge for traditional clustering algorithms. Density-based hierarchical clustering (DHC) can solve the problem of the nested class of gene expression data and has good robustness, but for handling huge amounts of data. Therefore, a density-based hierarchical clustering algorithm on MapReduce(DisDHC) was proposed. It partitioned data sets into smaller blocks, clustered each block using DHC in parallel, gathered the result for re-clustering, and produced all density centers of each cluster. The experiments on GAL dataset, Cell cycle dataset, and Serum dataset show that DisDHC reduces clustering time and achieves high performance.
A novel anonymization method based on anatomy and reconstruction in LBS privacy preservation
LIN Yu, HAN Jianmin, YU Juan, JIA Jiong, ZHAN Huangbin
2014, 44(7): 544-553. doi: 10.3969/j.issn.0253-2778.2014.07.002
Most of the existing methods are realized by temporal and spatial cloaking techniques. However, these cloaking-based methods are disadvantageous due to their high computation loads and long response delays, which lowers service quality. To address these problems, a novel technique, anatomy and reconstruction, was proposed. This technique first partitions the LBS query set into several equivalence classes, making sure that each equivalence class satisfies the given anonymity constraints. Then it reconstructs the LBS queries in each equivalence class according to the predefined strategies separately, and generates a new set of anonymous queries. Considering various privacy requirements, a series of anonymity models were proposed, and a unified anonymization algorithm MBFAA was introduced to realize these models. Experimental results show that the proposed method can effectively implement all the anonymity models.
A novel video replica placement strategy for storage cloud-based CDN
YAO Shijia, ZHU Ming, CUI Haoming
2014, 44(7): 554-562. doi: 10.3969/j.issn.0253-2778.2014.07.003
Online video service needs the support of CDN (content delivery network) which can be costly. Using cloud storage nodes to deliver video content can help solve the problem. To guarantee the users QoS, CDN should pre-deploy the content files of online video service to the edge nodes which are close to the users. The existing GS (greedy site) placement algorithm can satisfy the QoS requirements at a relatively smaller cost when the historical log is provided. However, the GS algorithm will result in bad load balance among cloud storage nodes. A novel replica placement algorithm named GUCP (greedy user core preallocation) was proposed to solve this problem. The algorithm assigned users on overloaded nodes to lightly-loaded ones using the k-means algorithm, in order to balance the load. The numerical experimental results show that the cost and QoS performance of GUCP are very close to those of GS, while its performance of load balance is much better than that of GS.
Community detection based on structure and fitness
GAO Qihang, JING Liping, YU Jian, LIN Youfang
2014, 44(7): 563-569. doi: 10.3969/j.issn.0253-2778.2014.07.004
Many systems can be described as complex social networks, and increasing attention has been paid to the detection of social communities out of complex social networks. Structured-based community detection can be achieved locally without knowledge of the overall situation. The community fitness characteristics of social networks can help to identify community structures at different fitnesses. A new algorithm based on structure and fitness was proposed to test large generated networks and real networks. Experiments had shown its better efficiency and higher accuracy.
The k-means algorithm based on Finsler geometry
XU Qing, LI Fanzhang, ZOU Peng
2014, 44(7): 570-575. doi: 10.3969/j.issn.0253-2778.2014.07.005
The problems with the k-means algorithm that the optimization effect of similarity measure and criterion function is not ideal and the analysis performance of multi-dimensional manifold data is ineffective, a modified version based on Finsler geometry was proposed, which introduces Finsler metric. Experimental results in comparison with traditional k-means algorithm and SBKM algorithm on UCI data sets and ORL face image sets show the feasibility and effectiveness of the algorithm.
A posts recommendation method based on the collaborative filtering and PageRank
CAO Yang, LIU Song, GUO Jianyi, YU Zhengtao, ZHOU Feng, MAO Cunli
2014, 44(7): 576-581. doi: 10.3969/j.issn.0253-2778.2014.07.006
In order to solve the problem of information overload in the post bar, a method of information filtration was proposed based on the users commenting behavior. After analyzing the properties of the recommended posts, the importance of an individual user was evaluated by the PageRank algorithm, in which the weight of replies to the posts among users and the weight of reply intervals were taken into consideration. The users with a high PageRank score were then taken as a cluster center in k-means clustering. The similarity between two groups of users (one from the clustering analysis and the other from the recommending system) was calculated by a collaborative filtering algorithm. The posts with high correlations to the users were presented as the recommended results. Experimental results show that the proposed method performs better than the recommending methods in use.
An improved model for information dissemination and prediction on microblog networks
DING Xin, LIU Qicheng, ZHANG Wei
2014, 44(7): 582-589. doi: 10.3969/j.issn.0253-2778.2014.07.007
The rapid development of microblogging not only has led people into the We-Media era, but also creates favorable conditions for the rapid dissemination of information. Taking direct immunization into consideration, an improved SIR model was presented which uses differential equations to study the rules of information dissemination and to predict information on microblogs. The experiment result shows that the improved SIR model can effectively reflect the rules of information dissemination, and can predict microblog information dissemination more precisely than the microblog standard SIR model.
Research of a novel cloud task scheduling algorithm
ZHOU Fachao, WANG Zhijian, YE Feng
2014, 44(7): 590-598. doi: 10.3969/j.issn.0253-2778.2014.07.008
With its flexibility, guaranteed quality of service and on-demand features such as resource allocation model, cloud computing is often used to handle large computing tasks, so efficient task scheduling strategies of cloud computing play a vital role. Given the uncertainty in the number of tasks and the time of arrival at the server, and the fact that, users tend to have certain expectations (such as task priority, execution time, etc.) for the implementation of the tasks, reasonable allocation of computing resources for task scheduling to satisfy users QoS requirements is of great importance. A novel QoS-aware task scheduling mechanism (QTS) was proposed, this scheduling mechanism can best meet the users QoS requirements. By comparing QTS with RR, Max-Min and Min-Min scheduling policies by CloudSim simulation, it was found that QTS is a more effective task scheduling mechanism.
Universe model of abstract interpretation
WANG Zhenzhen, NI Qingjian, ZHANG Zhizheng, XING Hancheng
2014, 44(7): 599-604. doi: 10.3969/j.issn.0253-2778.2014.07.009
Since its introduction in 1977, abstract interpretation has inspired a lot of research and is now widely applied in program analyses and verification fields. Therefore, a universe model was constructed for existing studies on abstract interpretation, which unifies and is equivalent to all the current frameworks of abstract interpretation. Based on this, several fundamental problems were raised about abstract interpretation that need to be solved. This model and the relevant problems can be viewed as the basic points for further development of abstract interpretation theory.
Stator-flux-based sliding-mode MRAS speed estimation of doubly-fed wind power generator
WANG Qinglong, WANG Zengfu, YANG Shuying, XIE Zhen, ZHANG Xing
2014, 44(7): 605-611. doi: 10.3969/j.issn.0253-2778.2014.07.010
Under the execrable operating environment,the mechanical sensor for detecting the position and speed signals of the doubly-fed wind power generator leads to more faults and inconvenient maintenance. To solve the problem, a stator flux-based sliding-mode model reference adaptive system (MRAS) speed estimator was proposed. In the proposed sliding-mode MRAS estimator, the stator flux voltage model of the doubly-fed wind power generator was used to obtain the reference model and its current model as the adaptive model, and a slide-mode surface was designed according to the two model output cross product, and the reaching conditions for the sliding mode were analyzed by using the small-signal model. Meanwhile the continuous saturation function is introduced to overcome the high frequency chattering problem. And the performances of the proposed control method were tested in Matlab/SimuLink.
A firefly algorithm with chaotic diversity control
XU Huali, SU Shoubao, YAN Renrong, MA Yan
2014, 44(7): 612-617. doi: 10.3969/j.issn.0253-2778.2014.07.011
To overcome the disadvantage of premature convergence in the firefly algorithm, a firefly algorithm based on chaos diversity control (CDFA) was proposed. Applying chaotic mapping, CDFA achieved an initial firefly population that is high quality and uniformly distributed; it then disturbed some individuals with low fitness values by chaotic mapping in the process of the search so as to keep the groups activity and reduce the possibility of falling into local optimum; meanwhile, in order to increase the diversity of the population, the proposed algorithm used the physical reflection theory to control the position of the firefly outside the borders. Experimental results of bench mark functions show that CDFA can effectively improve the ability of the global search and local exploitation and has a better optimization precision and convergence rate than the basic FA.
Control scheme for networked systems based on model parameter identification
LI Meilian
2014, 44(7): 618-622. doi: 10.3969/j.issn.0253-2778.2014.07.012
In view of the limited bandwidth issues prevalently existing in the network control system (NCS), a system control scheme based on model parameter identification was presented. The system consists of several subsystems,each subsystem including an identification module, an update module and a controller module. In the closed-loop system update cycle, NCS constructed a dynamic model of the controlled object and reconstructed its state through online identification by the identification module, the update module updated the model parameter of the controlled object by the identified parameters, and finally the controller module output control variables. By designing rational cycle time and adjusting the matching degree of the module with the controlled object, the systems requirement for network bandwidth was reduced, and network utilization was improved, and thus the model-based control scheme was achieved. Matlab simulation was conducted in combination with the networked system of DC motors and the result shows the effectiveness of the proposed method.