Macroeconomic forecasting is a classic problem that today is most often approached with time series analysis. Few attempts have been made with machine learning methods, and even fewer have incorporated unconventional data such as social media content.
We therefore use a Generative Adversarial Network (GAN) to predict U.S. unemployment and the S&P 500, a U.S. stock index. In addition, the Natural Language Processing (NLP) model DistilBERT is incorporated into the GAN to test Twitter data as a predictor.
The GAN model performs very well when predicting unemployment and beats the ARIMA benchmark on all horizons. The S&P 500 index proves more difficult for the model. With Twitter data and NLP, the model does not beat the benchmark on the unemployment data, but the results are promising and indicate that the Twitter data carries predictive power. When applied to S&P 500 data, the Twitter data improves accuracy and lets the model beat the benchmark on one horizon. This shows the potential of social media data for predicting a more erratic, less seasonal index that is more responsive to current trends in public discourse. The results also show that Twitter data can be used to predict trends in both unemployment and the S&P 500 index. This sets the stage for further research into NLP-GAN models for macroeconomic prediction using social media data.