Traditional movie gross predictions are based on numerical and categorical movie data from the Internet Movie Database (IMDB). In this paper, we use quantitative news data generated by Lydia, our system for large-scale news analysis, to help predict movie grosses. Comparing two kinds of models (regression and k-nearest neighbor), we find that models using only news data achieve performance similar to that of models using IMDB data. Moreover, combining IMDB data with news data yields better performance, and the improvement is statistically significant.
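To make the k-nearest neighbor idea concrete, the following is a minimal sketch, not the models evaluated in the paper. It assumes a gross is predicted as the mean gross of the k most similar movies under Euclidean distance, and that "combining" IMDB and news data means concatenating the two feature vectors. The function names (knn_predict, combine), the feature values, and the grosses are all hypothetical and chosen only for illustration.

```python
import math


def combine(imdb_features, news_features):
    """Concatenate per-movie IMDB and news feature vectors (an assumed way of combining them)."""
    return [a + b for a, b in zip(imdb_features, news_features)]


def knn_predict(train_X, train_y, query, k=3):
    """Predict a gross as the mean gross of the k nearest training movies."""
    # Pair each training movie's Euclidean distance to the query with its gross.
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_X, train_y))
    nearest = dists[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Hypothetical toy data: two made-up IMDB-style features and two made-up
# news-style features per movie, already on comparable scales.
train_imdb = [[1.0, 2.0], [2.0, 3.0], [8.0, 9.0], [9.0, 8.0]]
train_news = [[0.5, 0.1], [0.6, 0.2], [3.0, 0.8], [2.8, 0.9]]
train_gross = [10.0, 14.0, 90.0, 110.0]  # made-up grosses

query_imdb = [8.5, 8.5]
query_news = [2.9, 0.85]

# Prediction from news features alone.
news_only = knn_predict(train_news, train_gross, query_news, k=2)

# Prediction from the concatenated IMDB + news features.
train_both = combine(train_imdb, train_news)
query_both = query_imdb + query_news
combined = knn_predict(train_both, train_gross, query_both, k=2)

print(news_only, combined)
```

On real data the features would need to be normalized before computing distances, and the feature sets would be compared on held-out movies rather than on a single query.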