Impact of public news sentiment on stock market index return and volatility

  • Original Paper
Computational Management Science

Abstract

Recent advances in natural language processing have contributed to the development of market sentiment measures through the analysis of text from news providers and social media. The effectiveness of these sentiment variables depends on the techniques implemented and on the type of source on which they are based. In this paper, we investigate the impact of the release of public financial news on the S&P 500. Using automatic labeling techniques based on either stock index returns or dictionaries, we set up a classification problem based on long short-term memory neural networks to extract alternative proxies of investor sentiment. Our findings provide evidence that these sentiment measures affect the market within a 20-min time frame. We find that dictionary-based sentiment provides meaningful results that outperform those based on stock index returns, an approach that partly fails in the mapping process between news and financial returns.

Notes

  1. According to the authors, net buying pressure can be defined as “the difference between the number of buyer-initiated trades and the number of seller-initiated trades calibrated from the bid-ask quotes”.

  2. To avoid look-ahead bias, the testing set is both out-of-sample and out-of-time: all observations in the training and validation sets are earlier in time than the observations in the testing set.

  3. Further information can be found at https://www.merriam-webster.com/words-at-play/longest-words-ever.

  4. The integer values are discussed in Sect. 4.1, on the classification results of the stock index returns approach.

  5. An alternative approach would be to use value thresholds for returns, denoted as \(t_1<0\) and \(t_2>0\), which may be symmetric (\(t_1 = -t_2\)) or asymmetric (\(t_1 \ne -t_2\)). Although value thresholds for positive and negative returns may appear to be a simple and immediate solution for categorization, choosing appropriate values raises several issues. The threshold magnitudes depend on the financial asset being studied and on the data frequency, so their selection ultimately involves analyzing the distributional features of each dataset. The only exception is a purely positive/negative categorization, where a single threshold centered at zero can be used to identify the two classes.

  6. Full results are available from the authors upon request.

  7. GARCH-type models have been successfully applied in measuring the relationship between investor sentiment and stock market returns and volatility (e.g., Rupande et al. 2019).

  8. Full results are available from the authors upon request.

  9. Additionally, we have repeated the analysis using the changes in the VIX, which represents the forward-looking volatility of the S&P 500. Results are included in Appendix 1 and are qualitatively similar.

  10. The “correct” class of each article is the one that has been previously assigned through automatic labeling.

  11. We thank an anonymous Reviewer for this beneficial reference.
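
The out-of-time requirement in note 2 can be illustrated with a minimal Python sketch. It assumes each article carries a publication date and that two cutoff dates separate the training, validation, and testing periods; the function name `out_of_time_split`, the cutoff dates, and the sample articles are hypothetical and not taken from the paper.

```python
from datetime import date


def out_of_time_split(records, val_start, test_start):
    """Split (timestamp, payload) records into train/validation/test by date.

    Everything before `val_start` goes to training, everything from
    `val_start` up to (but excluding) `test_start` goes to validation,
    and the rest goes to testing. The cutoffs are illustrative assumptions.
    """
    if not val_start < test_start:
        raise ValueError("val_start must be earlier than test_start")
    train = [rec for rec in records if rec[0] < val_start]
    val = [rec for rec in records if val_start <= rec[0] < test_start]
    test = [rec for rec in records if rec[0] >= test_start]
    return train, val, test


# Ten hypothetical articles, deliberately supplied out of order.
articles = [
    (date(2020, 1, day), f"article {day}")
    for day in (5, 1, 9, 3, 7, 2, 10, 4, 8, 6)
]
train, val, test = out_of_time_split(
    articles, val_start=date(2020, 1, 7), test_start=date(2020, 1, 9)
)
```

Because membership is decided by date alone, no testing observation can precede a training or validation observation, whatever the order in which the articles were collected.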

References

  • Atkins A, Niranjan M, Gerding E (2018) Financial news predicts stock market volatility better than close price. J Finance Data Sci 4:120–137

  • Behrendt S, Schmidt A (2018) The Twitter myth revisited: intraday investor sentiment, Twitter activity and individual-level stock return volatility. J Banking Finance 96:355–367

  • Black F (1976) Studies of stock price changes. In: Proceedings of the 1976 meetings of the American Statistical Association, pp 177–181

  • Caporin M, Poli F (2017) Building news measures from textual data and an application to volatility forecasting. Econometrics 5:1–46

  • Chen S, Guo Z, Zhao X (2021) Predicting mortgage early delinquency with machine learning methods. Eur J Oper Res 290:358–372

  • Chung S-L, Hung C-H, Yeh C-Y (2012) When does investor sentiment predict stock returns? J Empir Financ 19:217–240

  • Costola M, Iacopini M, Santagiustina CRMA (2020) Google search volumes and the financial markets during the COVID-19 outbreak. Finance Res Lett 42:101884

  • Feng L, Fu T, Shi Y (2022) How does news sentiment affect the states of Japanese stock return volatility? Int Rev Financ Anal 84:102267

  • Frugier A (2016) Returns, volatility and investor sentiment: evidence from European stock markets. Res Int Bus Financ 38:45–55

  • Garcia D (2013) Sentiment during recessions. J Financ 68:1267–1300

  • Groß-Klußmann A, Hautsch N (2011) When machines read the news: using automated text analytics to quantify high frequency news-implied market reactions. J Empir Financ 18:321–340

  • Hajek P, Myskova R, Olej V (2021) Predicting stock return volatility using sentiment analysis of corporate annual reports. In: The essentials of machine learning in finance and accounting. Routledge, pp 75–96

  • Harrison J (2022) R-package ‘RSelenium’. https://github.com/ropensci/RSelenium

  • Harvard University (1960) General inquirer. http://www.wjh.harvard.edu/~inquirer/

  • Henry E (2008) Are investors influenced by how earnings press releases are written? J Bus Commun 45:363–407

  • Houlihan P, Creamer GG (2017) Can sentiment analysis and options volume anticipate future returns? Comput Econ 50:669–685

  • Huang X, Zhang W, Tang X, Zhang M, Surbiryala J, Iosifidis V, Liu Z, Zhang J (2021) LSTM based sentiment analysis for cryptocurrency prediction. In: International conference on database systems for advanced applications. Springer, pp 617–621

  • Iacopini M, Santagiustina CR (2021) Filtering the intensity of public concern from social media count data with jumps. J Roy Stat Soc Ser A (Stat Soc)

  • Jiang GJ, Tian YS (2005) The model-free implied volatility and its information content. Rev Financ Stud 18:1305–1342

  • Jin Z, Yang Y, Liu Y (2020) Stock closing price prediction based on sentiment analysis and LSTM. Neural Comput Appl 32:9713–9729

  • Karpathy A (2015) CS231n convolutional neural networks for visual recognition. Linear classification: Support Vector Machine, Softmax classifier. http://cs231n.github.io/linear-classify/#softmax

  • Li X, Xie H, Chen L, Wang J, Deng X (2014) News impact on stock price return via sentiment analysis. Knowl Based Syst 69:14–23

  • Liu Y, Qin Z, Li P, Wan T (2017) Stock volatility prediction using recurrent neural networks with sentiment analysis. In: International conference on industrial, engineering and other applications of applied intelligent systems. Springer, pp 192–201

  • Loughran T, McDonald B (2011) When is a liability not a liability? Textual analysis, dictionaries, and 10-Ks. J Financ 66:35–65

  • Loughran T, McDonald B (2015) The use of word lists in textual analysis. J Behav Financ 16:1–11

  • Mandal PK, Mahto R (2019) Deep CNN-LSTM with word embeddings for news headline sarcasm detection. In: 16th International conference on information technology-new generations (ITNG 2019). Springer, pp 495–498

  • Mangee N (2018) Stock returns and the tone of marketplace information: Does context matter? J Behav Financ 19:396–406

  • Medhat W, Hassan A, Korashy H (2014) Sentiment analysis algorithms and applications: a survey. Ain Shams Eng J 5:1093–1113

  • Nigam K, Lafferty J, McCallum A (1999) Using maximum entropy for text classification. In: IJCAI-99 workshop on machine learning for information filtering, Stockholm, Sweden, vol 1, pp 61–67

  • Pennington J, Socher R, Manning CD (2014) GloVe: global vectors for word representation. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), pp 1532–1543

  • Renault T (2017) Intraday online investor sentiment and return patterns in the US stock market. J Banking Finance 84:25–40

  • Rupande L, Muguto HT, Muzindutsi P-F (2019) Investor sentiment and stock return volatility: evidence from the Johannesburg stock exchange. Cogent Econ Finance 7:1–16

  • Schumaker RP, Chen H (2009) Textual analysis of stock market prediction using breaking financial news: the AZFin text system. ACM Trans Inf Syst (TOIS) 27:1–19

  • Shi Y, Ho K-Y, Liu W-M (2016) Public information arrival and stock return volatility: evidence from news sentiment and Markov regime-switching approach. Int Rev Econ Finance 42:291–312

  • Sohangir S, Wang D, Pomeranets A, Khoshgoftaar TM (2018) Big data: deep learning for financial sentiment analysis. J Big Data. https://doi.org/10.1186/s40537-017-0111-6

  • Souma W, Vodenska I, Aoyama H (2019) Enhanced news sentiment analysis using deep learning methods. J Comput Soc Sci 2:33–46

  • Uysal AK, Gunal S (2014) The impact of preprocessing on text classification. Inf Process Manag 50:104–112

  • Vicari M, Gaspari M (2021) Analysis of news sentiments using natural language processing and deep learning. AI Soc 36:931–937

  • Wan X, Yang J, Marinov S, Calliess J-P, Zohren S, Dong X (2021) Sentiment correlation in financial news networks and associated market movements. Sci Rep 11:1–12

  • Wang G, Wang T, Wang B, Sambasivan D, Zhang Z, Zheng H, Zhao BY (2015) Crowds on Wall Street: Extracting value from social investing platforms. In: Proceedings of the 18th ACM conference on computer supported cooperative work and social computing. ACM, pp 17–30

  • Wang C, Wang T, Yuan C, Rong JY (2022) Learning to trade on sentiment. J Econ Finance 46:308–323

  • Wickham H (2016) R-package ‘rvest’, p 156. https://cran.r-project.org/web/packages/rvest/rvest.pdf

  • Xing FZ, Cambria E, Welsch RE (2018) Natural language based financial forecasting: a survey. Artif Intell Rev 50:49–73

  • Yadav R, Kumar AV, Kumar A (2019) News-based supervised sentiment analysis for prediction of futures buying behaviour. IIMB Manag Rev 31:157–166

Author information

Corresponding author

Correspondence to Marco Corazza.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

We are very grateful to the Associate Editor and two anonymous Reviewers for their valuable suggestions and insightful remarks. We also thank Luca Coraggio of the University of Naples, Tomasz Gubiec of the University of Warsaw, Dian Kusumaningrum of Prasetiya Mulya University, Giovanni Zambruno of the University of Milano-Bicocca, and the other participants of the 2nd One-Day Workshop on Machine Learning for Finance, held at the Ca’ Foscari University of Venice in 2020.

A: Stock index returns approach using the VIX

In this appendix, we replicate the stock index returns approach using the VIX as the matching variable. The VIX is considered a better predictor of future volatility than historical volatility, since it is based on option prices that reflect the future expectations of market participants (see, for instance, Jiang and Tian 2005).

Due to shock asymmetry, volatility is usually higher when the S&P 500 Index returns are negative and might be lower when they are positive. Following the analysis in the main text, we do not impose any asymmetry in the weighting structure and therefore process news for the VIX with the same method used for returns. As discussed in the paper, news articles are classified as positive when the log return is higher than the 55th percentile, negative when it is lower than the 45th percentile, and neutral otherwise. The results are presented in Table 12. The period considered and the number of articles analyzed are the same as for stock returns. Table 13 reports the classification accuracy of the four models by time window and by use of the lagging method. Finally, Table 14 shows the four sentiment variables built from the classification methods presented in Table 13.
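
The percentile rule described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' implementation: the helper names `percentile` and `label_by_percentile` are hypothetical, percentiles are computed by linear interpolation (one of several common conventions), and the sample values are made up.

```python
def percentile(values, q):
    """Return the q-th percentile (0-100) of values, by linear interpolation."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100  # fractional index into the sorted data
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def label_by_percentile(log_changes, low_q=45, high_q=55):
    """Label each observation as positive, negative, or neutral.

    Values above the high_q-th percentile are positive, values below the
    low_q-th percentile are negative, and everything in between is neutral.
    """
    low = percentile(log_changes, low_q)
    high = percentile(log_changes, high_q)
    labels = []
    for change in log_changes:
        if change > high:
            labels.append("positive")
        elif change < low:
            labels.append("negative")
        else:
            labels.append("neutral")
    return labels


# Hypothetical log changes matched to eleven articles.
sample_changes = [i / 100 for i in range(-5, 6)]
labels = label_by_percentile(sample_changes)
```

With the 45th and 55th percentiles as cutoffs, only the middle tenth of the distribution is labeled neutral, so the three classes are far from balanced by construction.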

Table 12 Number of negative, neutral, and positive news from Reuters obtained through automatic labeling for the stock index return approach based on different classification methods
Table 13 Best accuracy of the classification obtained from the LSTM neural network applied to articles labeled with the log difference of the VIX Index
Table 14 Sentiment variables based on the classification obtained from the LSTM neural network applied on articles labeled with the log difference of the VIX Index

The estimates for the EGARCH model are presented in Tables 15 and 16 for \(S_{vix20}\) and \(S_{vix20lag}\), respectively. In both cases, the evidence is similar to that found for the sentiment built on the S&P 500. For instance, \(\lambda _m\) is significant and very close to zero in seven of the ten cases for \(S_{vix20}\) and in nine of the ten cases for \(S_{vix20lag}\). Here too, the sign is discordant across the different versions of the sentiment, confirming that the approach based on stock index returns is highly sensitive to the initial settings. Analogously, \(\lambda _m\) is significant in six and nine of the ten cases for \(S_{vix20}\) and \(S_{vix20lag}\), respectively. As the tables show, the coefficient exhibits different signs across the different versions of the sentiment. As in the main text, we conclude that the stock index returns approach is not reliable, since it fails in the mapping process between news and financial returns.

Table 15 \(S_{vix20}\) as exogenous variable in the mean and variance equations of the EGARCH(1,1) model with skewed Student’s t-distribution fitted on S&P 500 Index intraday log returns with 20-min time intervals
Table 16 \(S_{vix20lag}\) as lagged exogenous variable in the mean and variance equations of the EGARCH(1,1) model with skewed Student’s t-distribution fitted on S&P 500 Index intraday log returns with 20-min time intervals
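
For orientation, a textbook EGARCH(1,1) specification with an exogenous sentiment variable in both the mean and the variance equations can be written as below. This is a sketch under assumptions and not necessarily the exact parameterization estimated in Tables 15 and 16. Here \(r_t\) is the 20-min log return, \(S_t\) is the sentiment variable (replaced by \(S_{t-1}\) in the lagged version), and \(z_t\) follows a skewed Student's t-distribution. \(\lambda _m\) is assumed to be the sentiment coefficient in the mean equation; the symbols \(\mu\), \(\omega\), \(\alpha\), \(\gamma\), \(\beta\), and \(\lambda _v\) are hypothetical names introduced only for this illustration.

\[ r_t = \mu + \lambda _m S_t + \varepsilon _t, \qquad \varepsilon _t = \sigma _t z_t \]

\[ \ln \sigma _t^2 = \omega + \alpha \left( |z_{t-1}| - E|z_{t-1}| \right) + \gamma z_{t-1} + \beta \ln \sigma _{t-1}^2 + \lambda _v S_t \]

In this form, \(\gamma\) captures the shock asymmetry mentioned above: a negative \(\gamma\) means that negative shocks raise the log variance more than positive shocks of the same size.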

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Anese, G., Corazza, M., Costola, M. et al. Impact of public news sentiment on stock market index return and volatility. Comput Manag Sci 20, 20 (2023). https://doi.org/10.1007/s10287-023-00454-2

  • DOI: https://doi.org/10.1007/s10287-023-00454-2
