ISSN 0253-2778

CN 34-1054/N

Open Access JUSTC Original Paper

A dual encoder-based approach to predicting stock price by leveraging online social network

Cite this:
https://doi.org/10.3969/j.issn.0253-2778.2020.08.008
  • Received Date: 17 June 2020
  • Accepted Date: 02 July 2020
  • Rev Recd Date: 02 July 2020
  • Publish Date: 31 August 2020
  • We propose a dual-encoder model that encodes investor sentiment and technical indicators separately, so that both types of information improve the accuracy of the encoder-decoder model in predicting stock prices. For both the dual encoder and the decoder, we revise the gated recurrent unit (GRU) by removing the reset gate, letting the update gate take over its function, and replacing the tanh activation function with ReLU, which speeds up network training and improves model accuracy. We regard market sentiment as a discrete-time stochastic process: at any fixed time, market sentiment is a random variable following a certain probability distribution. Sentiment score formulas for investor sentiment are built with a pseudo-label-based sentiment classifier, and market sentiment is estimated through Bagging ensemble learning. An orthogonal-table experimental design is used to select the parameters of our dual-encoder-based model, which greatly reduces parameter-tuning time. Finally, experiments show that our dual-encoder-based model is more accurate than the encoder-decoder model, and that investor sentiment helps improve stock forecasting in our model.
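The revised GRU described in the abstract is, in structure, close to the minimal gated unit of Ref. [25] combined with the ReLU substitution of Ref. [26]: a single update gate both controls the state interpolation and gates the recurrent path where the reset gate used to act. Below is a minimal PyTorch sketch under that assumption; the class and variable names are ours, and the paper's exact gating equations may differ.

```python
import torch
import torch.nn as nn

class RevisedGRUCell(nn.Module):
    """Single-gate GRU cell sketch: the reset gate is removed, the update
    gate takes over its role, and ReLU replaces tanh in the candidate state."""

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.W_z = nn.Linear(input_size, hidden_size)   # input -> update gate
        self.U_z = nn.Linear(hidden_size, hidden_size)  # state -> update gate
        self.W_h = nn.Linear(input_size, hidden_size)   # input -> candidate
        self.U_h = nn.Linear(hidden_size, hidden_size)  # state -> candidate

    def forward(self, x: torch.Tensor, h_prev: torch.Tensor) -> torch.Tensor:
        # Update gate z_t; it also gates the recurrent path, standing in
        # for the removed reset gate.
        z = torch.sigmoid(self.W_z(x) + self.U_z(h_prev))
        # Candidate state, with ReLU in place of tanh.
        h_tilde = torch.relu(self.W_h(x) + self.U_h(z * h_prev))
        # Interpolate between the previous state and the candidate.
        return (1.0 - z) * h_prev + z * h_tilde
```

For the market-sentiment estimate, the abstract states only that per-post sentiment comes from a pseudo-label-based classifier and that market sentiment is then estimated by Bagging. The sketch below is hypothetical: it uses scikit-learn's BaggingClassifier over pre-extracted post features, and the score formula (mean of per-post labels) is our assumption, not the paper's.

```python
import numpy as np
from sklearn.ensemble import BaggingClassifier

def daily_sentiment_score(X_train, y_train, X_day):
    """Estimate one trading day's market sentiment from labeled posts.

    X_train, X_day: post feature matrices (feature extraction assumed done
    upstream, e.g. by the pseudo-label-based classifier); y_train in {-1, +1}.
    """
    ensemble = BaggingClassifier(n_estimators=25, random_state=0)  # bagged trees
    ensemble.fit(X_train, y_train)
    labels = ensemble.predict(X_day)   # classify each post on the day
    return float(np.mean(labels))      # sentiment score in [-1, 1]
```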
References

    [1]
    CHONG E, HAN C, PARK F C. Deep learning networks for stock market analysis and prediction: Methodology, data representations, and case studies[J]. Expert Systems with Applications, 2017, 83: 187-205.
    [2]
    FISCHER T, KRAUSS C. Deep learning with long short-term memory networks for financial market predictions[J]. European Journal of Operational Research, 2017, 270: 654-669.
    [3]
    QIN Y, SONG D, CHEN H, et al. A dual-stage attention-based recurrent neural network for time series prediction[DB/OL]. [2020-03-01]. https://arxiv.org/abs/1704.02971.
    [4]
    HALL R E, LIEBERMAN M. The stock market and the macroeconomy[M]// Economics: Principles and Applications. 2nd ed. Beijing: CITIC Press, 2003. (in Chinese)
    [5]
    DE LONG J B, SHLEIFER A, SUMMERS L H, et al. Positive feedback investment strategies and destabilizing rational speculation[J]. The Journal of Finance, 1990, 45: 379-395.
    [6]
    NOFER M, HINZ O. Using Twitter to predict the stock market[J]. Business & Information Systems Engineering, 2015, 57: 229-242.
    [7]
    PENG Y, JIANG H. Leverage financial news to predict stock price movements using word embeddings and deep neural networks[DB/OL]. [2020-03-01]. https://arxiv.org/abs/1506.07220.
    [8]
    CHEN W, YEO C K, LAU C T, et al. Leveraging social media news to predict stock index movement using RNN-boost[J]. Data & Knowledge Engineering, 2018, 118: 14-24.
    [9]
    BLEI D M, NG A Y, JORDAN M I. Latent Dirichlet allocation[J]. Journal of Machine Learning Research, 2003, 3: 993-1022.
    [10]
    BAHDANAU D, CHO K, BENGIO Y. Neural machine translation by jointly learning to align and translate[DB/OL]. [2020-03-01]. https://arxiv.org/abs/1409.0473.
    [11]
    CHO K, VAN MERRIENBOER B, GULCEHRE C, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation[C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Stroudsburg, PA: Association for Computational Linguistics, 2014: 1724-1734.
    [12]
    BAHDANAU D, CHOROWSKI J, SERDYUK D, et al. End-to-end attention-based large vocabulary speech recognition[DB/OL]. [2020-03-01]. https://arxiv.org/abs/1508.04395.
    [13]
    DEVLIN J, CHANG M W, LEE K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding[DB/OL]. [2020-03-01]. https://arxiv.org/abs/1810.04805.
    [14]
    LEE D H. Pseudo-label: The simple and efficient semi-supervised learning method for deep neural networks[C]// ICML 2013 Workshop: Challenges in Representation Learning (WREPL), Atlanta, Georgia, USA, 2013.
    [15]
    OLIVER A, ODENA A, RAFFEL C, et al. Realistic evaluation of deep semi-supervised learning algorithms[DB/OL]. [2020-03-01]. https://arxiv.org/abs/1804.09170.
    [16]
    LU X, NI B. BERT-CNN: A hierarchical patent classifier based on a pre-trained language model[DB/OL]. [2020-03-01]. https://arxiv.org/abs/1911.06241.
    [17]
    BENGIO Y, SIMARD P, FRASCONI P. Learning long-term dependencies with gradient descent is difficult[J]. IEEE Transactions on Neural Networks, 1994, 5(2): 157-166.
    [18]
    HOCHREITER S, SCHMIDHUBER J. Long short-term memory[J]. Neural Computation, 1997, 9(8): 1735-1780.
    [19]
    CHUNG J, GULCEHRE C, CHO K, et al. Empirical evaluation of gated recurrent neural networks on sequence modeling[DB/OL]. [2020-03-01]. https://arxiv.org/abs/1412.3555.
    [20]
    JOZEFOWICZ R, ZAREMBA W, SUTSKEVER I. An empirical exploration of recurrent network architectures[C]// ICML'15: Proceedings of the 32nd International Conference on Machine Learning. JMLR.org, 2015, 37: 2342-2350.
    [21]
    HEDAYAT A S, SLOANE N J A, STUFKEN J. Orthogonal Arrays: Theory and Applications[M]. New York: Springer, 1999.
    [22]
    KINGMA D P, BA J. Adam: A method for stochastic optimization[DB/OL]. [2020-03-01]. https://arxiv.org/abs/1412.6980.
    [23]
    KHAIDEM L, SAHA S, DEY S R. Predicting the direction of stock market prices using random forest[DB/OL]. [2020-03-01]. https://arxiv.org/abs/1605.00003.
    [24]
    BROWN R G, MEYER R F. The fundamental theorem of exponential smoothing[J]. Operations Research, 1961, 9(5): 673-685.
    [25]
    ZHOU G B, WU J, ZHANG C L, et al. Minimal gated unit for recurrent neural networks[DB/OL]. [2020-03-01]. https://arxiv.org/abs/1603.09420.
    [26]
    RAVANELLI M, BRAKEL P, OMOLOGO M, et al. Light gated recurrent units for speech recognition[J]. IEEE Transactions on Emerging Topics in Computational Intelligence, 2018, 2: 92-102.
