ISSN 0253-2778

CN 34-1054/N

2017 Vol. 47, No. 1

Research Article
National matriculation test prediction based on support vector machines
ZHANG Li, LU Xingning, LU Conglin, WANG Bangjun, LI Fanzhang
2017, 47(1): 1-9. doi: 10.3969/j.issn.0253-2778.2017.01.001
Support vector machines (SVMs), a class of machine learning methods, are notable for their good generalization and powerful nonlinear processing ability. SVMs were applied to the national matriculation test (NMT), where scores from six mock exams are taken as training data to predict the final admission scores. Three situations were considered. First, the NMT scores were predicted using the scores in six simulation tests. Second, the admission batch was predicted using the scores in six simulation tests and the NMT. Third, the admission batch was predicted using the scores in six simulation tests and the estimated NMT scores. In all experiments, SVMs were compared with neural networks (NNs). Experimental results show that SVMs are much more stable and have better prediction ability.
Self-adaption fusion algorithm for lung cancer PET/CT based on Piella frame and DT-CWT
ZHOU Tao, LU Huiling, WEI Xinyu, XIA Yong
2017, 47(1): 10-17. doi: 10.3969/j.issn.0253-2778.2017.01.002
By analyzing the Piella framework and multi-scale analysis theory, four methods, or fusion paths, for constructing pixel-level fusion rules are presented on the basis of the Piella framework. A self-adaptive fusion algorithm for PET/CT based on the Piella framework and DT-CWT was proposed on the basis of the first fusion path. First, DT-CWT was used to decompose the registered PET and CT images into low-frequency and high-frequency components. Second, because lesions occupy only a small area of the whole image and handling the background of a medical image reasonably is vital for highlighting them, the low-frequency components were fused by a self-adaptive combination of membership functions. Third, because the high-frequency sub-bands reflect image details and edge information and strongly influence image sharpness and edge distortion, the energy difference of the decomposition coefficients was used as the matching measure, regional energy was used as the activity measure, and a combination of weighting and selection was used to determine the decision factor for the high-frequency components. Finally, two experiments were carried out: a comparison with other pixel-level fusion algorithms and an objective evaluation of the fusion effect. The experimental results show that the algorithm better retains and shows the edge and texture information of lesions.
A computing method for attribute importance based on BP neural network
PAN Qingxian, DONG Hongbin, HAN Qilong, WANG Yingjie, DING Rui
2017, 47(1): 18-25. doi: 10.3969/j.issn.0253-2778.2017.01.003
As an important method for machine learning, the artificial neural network has been applied successfully in artificial intelligence, pattern recognition, image processing and other fields. As the essence of neural network learning, the BP network uses error back propagation to correct weights continually in order to achieve the best fit. The multi-attribute decision-making problem is a hotspot in decision theory. When multiple attributes are involved, the importance degrees of the different attributes, i.e., the attribute weights, need to be analyzed. To address the correlation and importance of multiple input attributes for multi-classification outputs, a method for calculating the importance of complex input attributes based on a BP neural network was proposed. In addition, the BP neural network model for calculating the importance degrees of attributes was established by studying the number of nodes, the number of network layers, learning strategies and learning factors in neural networks. Teaching evaluation data from Yantai University are used to verify the feasibility and validity of the proposed method with a k-fold approach.
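The k-fold approach mentioned in the abstract can be illustrated with a minimal index-splitting sketch. The abstract does not say how the folds were formed or which k was used, so the function name `k_fold_splits`, the shuffling, and the seed are assumptions made for illustration only.

```python
import random


def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold validation.

    Hypothetical helper: the abstract only names "k-fold", not this procedure.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at offset i,
    # so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Each sample is used for testing exactly once across the k rounds.
splits = list(k_fold_splits(n_samples=10, k=5))
```

In each round a model would be trained on `train` and evaluated on `test`, and the k evaluation scores would then be averaged.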
An unsupervised boundary detection algorithm based on orientation contrast model
2017, 47(1): 26-31. doi: 10.3969/j.issn.0253-2778.2017.01.004
For large image sets on the Web, due to the absence of a ground truth boundary or the high cost of getting one, an unsupervised boundary detection algorithm based on an orientation contrast model was proposed. The model is especially suited for detecting object boundaries surrounded by natural textures. On the Rug image database, the algorithm outperforms the state-of-the-art unsupervised boundary detection algorithm, which verifies the validity of the model.
Prototype based relative attribute learning
LIU Dakun, QIN Xiaoqian
2017, 47(1): 32-39. doi: 10.3969/j.issn.0253-2778.2017.01.005
According to research on representation learning, a proper feature representation of data has a greater impact on classification than the classifier does. It has become almost the most important part of system design. In this paper, based on prototype theory in psychology, a new feature is proposed. Specifically, the prototype dataset is composed of representative data from extra datasets. Then, rank functions are derived from the relationship between the prototype dataset and any given dataset. Thus, any data can be represented via the rank functions, and the values of the functions are its new features. The proposed method has been evaluated on the MNIST and PubFig databases. Compared with the gray-scale feature and the attribute feature, the prototype based relative attribute is more reasonable and performs better.
A dominance-based multigranulation rough sets approach for dynamic updating approximations
HU Chengxiang, ZHAO Guozhu
2017, 47(1): 40-47. doi: 10.3969/j.issn.0253-2778.2017.01.006
As the collected data vary, useful information obtained dynamically from the information system plays an important role in decision making. The properties of updating approximations in dominance-based optimistic and pessimistic multigranulation rough sets were discussed. An approach to dynamically updating approximations in dominance-based optimistic and pessimistic multigranulation rough sets when a granulation structure is added in a multigranulation environment was presented. The approach does not need to recalculate the dominance classes and approximations of each granulation structure in the universe. The dominance classes of each object are calculated with respect to the added granulation structure only, and the approximations are then obtained from the updating properties, which improves the efficiency of updating approximations. The experimental results demonstrate the validity of the proposed approach in comparison with the static algorithm.
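The idea of updating rather than recomputing can be sketched for lower approximations, using the common textbook definitions: the optimistic lower approximation is the union of the single-granulation lower approximations, and the pessimistic one is their intersection. The abstract does not give the paper's exact definitions or algorithm, so the function names, the toy decision table, and the restriction to lower approximations of an upward union are all assumptions.

```python
def dominating_set(table, x, attrs):
    """Objects that are at least as good as x on every attribute in attrs."""
    return {y for y in table if all(table[y][a] >= table[x][a] for a in attrs)}


def single_lower(table, attrs, target):
    """Lower approximation of target under one granulation structure."""
    return {x for x in table if dominating_set(table, x, attrs) <= target}


def static_lower(table, granulations, target, optimistic=True):
    """Recompute from scratch over every granulation structure."""
    lowers = [single_lower(table, g, target) for g in granulations]
    return set.union(*lowers) if optimistic else set.intersection(*lowers)


def updated_lower(old_lower, table, new_attrs, target, optimistic=True):
    """Update an existing result using only the added granulation structure."""
    new = single_lower(table, new_attrs, target)
    return old_lower | new if optimistic else old_lower & new


# Toy data (hypothetical): four objects, three criteria, one target set
# standing in for an upward union of decision classes.
table = {
    "x1": {"a": 3, "b": 2, "c": 1},
    "x2": {"a": 2, "b": 3, "c": 3},
    "x3": {"a": 1, "b": 1, "c": 2},
    "x4": {"a": 2, "b": 1, "c": 1},
}
target = {"x1", "x2"}
old_granulations = [("a",), ("b",)]
added = ("c",)

results = {}
for optimistic in (True, False):
    old = static_lower(table, old_granulations, target, optimistic)
    incremental = updated_lower(old, table, added, target, optimistic)
    full = static_lower(table, old_granulations + [added], target, optimistic)
    results[optimistic] = (incremental, full)
```

Under these definitions the incremental result matches the full recomputation while touching only the dominance classes of the added granulation structure.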
A visualization method for analyzing sub-topics of hot events in microblogs
LI Yilin, ZHU Jiaqi, WU Yunkun, WANG Hongan
2017, 47(1): 48-56. doi: 10.3969/j.issn.0253-2778.2017.01.007
Abundant information can be gained from massive microblog data. Microblogs record the whole process of hot events and people’s reactions. It is increasingly important to obtain meaningful and useful information from microblogs, to form a clear picture of the evolution of a hot event, and to discover turning points in it. Existing solutions are mainly based on word frequency, which lacks an abstract description of sub-topics. This paper proposes a new interactive visualization method that combines topic extraction with word frequency statistics to visualize the evolution of sub-topics at different granularities. By observing the variation of word distributions in sub-topics over adjacent time intervals, turning-point events related to some sub-topics can be discovered, and the corresponding microblog contents can then be tracked with the aid of word co-occurrence graphs. During the interactive process, users can adjust the parameters of the method and eventually determine optimal values for a better understanding of turning-point events as well as the evolution of the hot event. Experiments are conducted on real Sina Weibo datasets, and the results demonstrate that this method is more effective than existing ones based on word frequency or topic trends alone.
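One way to quantify the variation of word distributions between adjacent time intervals is the Jensen-Shannon divergence. The abstract does not say which measure the method uses, so this choice, the function names, and the toy tokens below are assumptions for illustration.

```python
import math
from collections import Counter


def word_distribution(tokens, vocabulary):
    """Relative frequency of each vocabulary word in one time interval."""
    counts = Counter(tokens)
    total = sum(counts[w] for w in vocabulary)
    return [counts[w] / total for w in vocabulary]


def js_divergence(p, q):
    """Jensen-Shannon divergence with base-2 logs, so the value lies in [0, 1]."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(u, v):
        return sum(a * math.log2(a / b) for a, b in zip(u, v) if a > 0)

    return (kl(p, m) + kl(q, m)) / 2


# Toy sub-topic tokens for three consecutive intervals (hypothetical data).
intervals = [
    ["quake", "rescue", "quake"],
    ["quake", "rescue", "rescue"],
    ["donation", "fraud", "fraud"],
]
vocabulary = sorted({w for tokens in intervals for w in tokens})
dists = [word_distribution(tokens, vocabulary) for tokens in intervals]
divergences = [js_divergence(dists[i], dists[i + 1]) for i in range(len(dists) - 1)]
```

A large jump between two adjacent intervals, such as the second value in `divergences` here, would mark a candidate turning point to inspect further.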
Kinship classification through random bilinear classifier
QIN Xiaoqian, LIU Dakun, WANG Dong
2017, 47(1): 57-62. doi: 10.3969/j.issn.0253-2778.2017.01.008
Kinship verification has seen extensive applications in recent years, such as determination of the identity of a suspect and finding missing children. Recent research has demonstrated that machine learning algorithms can handle kinship verification fairly well. However, kinship verification has remained a major challenge in the field of computer vision, answering such questions as which parents a child in a photo belongs to. Understanding such questions would have a fundamental impact on the behavior of an artificial intelligent agent working in a human world. To address this issue, a random bilinear classifier (RBC) for kinship classification was presented by effectively exploring the dependence structure between child and parents in two aspects: similarity measure and classifier design. In addition, the stability of the random selection of samples was ensured by imposing the constraint of the similarity of those non-kin relationship image groups. Extensive experiments on TSKinFace and Family101 show that the proposed method can obtain better or comparable results.
Improving emotion expression extraction in Chinese microblogs via new words detection
WAN Qi, YU Zhonghua, CHEN Li, SONG Leilei, DIN Gejian
2017, 47(1): 63-69. doi: 10.3969/j.issn.0253-2778.2017.01.009
Emotion expression extraction is one of the important tasks of fine-grained sentiment mining. Existing methods handle this task poorly in Chinese microblogs because these contain many new words and non-standard words. This paper finds that a large number of new words occur in the emotional expressions of Chinese microblog text. A combined extraction model based on CRF is proposed, which incorporates new word detection into the task to improve on the original work. The experimental results show that new word detection correlates well with emotion expression extraction from Chinese microblogs, and that the F1 value increases by more than 2% on both the movie-domain and open-domain Chinese microblog data sets.
A k-medoids based clustering algorithm in location based social networks
LUO Weijia, QIAO Shaojie, HAN Nan, YUAN Changan, BI Yingzhou, SHU Hongping
2017, 47(1): 70-79. doi: 10.3969/j.issn.0253-2778.2017.01.010
Commonly used clustering algorithms have several drawbacks. To address these problems, an improved k-medoids algorithm based on an initial radius r was proposed for clustering location data. The algorithm is actually a density-based clustering approach. The difference is that the k value depends on the radius r. Extensive experiments are conducted on real check-in data, and the results show that the improved k-medoids algorithm based on the radius r is more stable. In addition, a comparison of the sum of squared distances between objects in the same cluster across different algorithms shows that the proposed algorithm obtains better clustering results and convergence speed when applied to location based social networks. Compared with the traditional k-medoids algorithm, the cost is clearly reduced; compared with the degraded k-medoids algorithm, the cost is reduced by between 1.2% and 2%.
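How k can depend on a radius r may be sketched as follows. The abstract does not describe the actual seeding or update rule, so the greedy seeding below (a point becomes a new medoid when it lies farther than r from every existing medoid) is an assumption, followed by the standard k-medoids alternation; `radius_seeded_k_medoids` is a hypothetical name.

```python
import math


def dist(p, q):
    """Euclidean distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def radius_seeded_k_medoids(points, r, max_iter=100):
    """Cluster 2-D points; the number of clusters follows from r.

    Assumed seeding rule, not taken from the paper.
    """
    medoids = []
    for p in points:
        if all(dist(p, m) > r for m in medoids):
            medoids.append(p)

    clusters = {}
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest medoid.
        clusters = {m: [] for m in medoids}
        for p in points:
            nearest = min(medoids, key=lambda m: dist(p, m))
            clusters[nearest].append(p)
        # Update step: the new medoid is the member with the smallest
        # total distance to the other members of its cluster.
        new_medoids = [
            min(members, key=lambda c: sum(dist(c, q) for q in members))
            for members in clusters.values()
        ]
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return clusters


# Two well-separated groups of check-in coordinates (hypothetical data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
small_r = radius_seeded_k_medoids(points, r=3)
large_r = radius_seeded_k_medoids(points, r=20)
```

With the small radius the two groups are recovered as separate clusters, while the large radius merges everything into one, which is the sense in which k is driven by r in this sketch.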
On automatic construction of the word base for historical program repository
SUN Weisong, SUN Xiaobing, LI Bin, YANG Hui
2017, 47(1): 80-86. doi: 10.3969/j.issn.0253-2778.2017.01.011
Developers and maintainers are finding it harder to understand and maintain software. Given a system under maintenance, developers may use a code search technique to locate the code of interest. However, it may be difficult for them to understand the elements of the system at hand and the relations among them. Thus, the returned code may not fit their needs. It is therefore necessary to have a word base that recovers the elements and their relations for a target system. A tool, WB4HPR (word base for historical program repository), was introduced, which focuses on building the word base for a specific system. WB4HPR can retrieve the words, recover the relationships between them, and display how these words evolve during program evolution, helping developers and maintainers comprehend the program while keeping the use of words consistent during the software maintenance process.