Language models for financial news recommendation

  • Lavrenko V
  • Schmill M
  • Lawrie D
 et al. 


We present a unique approach to identifying news stories that influence the behavior of financial markets. Specifically, we describe the design and implementation of e-analyst, a system that can recommend interesting news stories, that is, stories that are likely to affect market behavior. e-analyst operates by correlating the content of news stories with trends in financial time series. We identify trends in time series using piecewise linear fitting and then assign labels to the trends according to an automated binning procedure. We use language models to represent patterns of language that are highly associated with particular labeled trends. e-analyst can then identify and recommend news stories that are highly indicative of future trends. We evaluate the system in terms of its ability to recommend the stories that will affect the behavior of the stock market. We demonstrate that stories recommended by e-analyst could be used to profitably predict forthcoming trends in stock prices.
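The pipeline described in the abstract (fit a line to a price segment, bin its slope into a trend label, then score a story against one language model per label) can be sketched roughly as follows. This is an illustrative sketch only, not the e-analyst implementation: the function names, the fixed slope thresholds, the "flat" label, and the add-alpha smoothed unigram model are all assumptions made for the example. The abstract says the binning is automated and does not specify the form of the language models.

```python
import math
from collections import Counter


def slope(prices):
    """Least-squares slope of one segment, with time steps 0, 1, 2, ..."""
    n = len(prices)
    mean_x = (n - 1) / 2
    mean_y = sum(prices) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(prices))
    var = sum((x - mean_x) ** 2 for x in range(n))
    return cov / var


def label_trend(s, surge=0.5, slight=0.1):
    """Bin a slope into a trend label. Thresholds are hypothetical constants."""
    if s >= surge:
        return "surge"
    if s >= slight:
        return "slight+"
    if s <= -surge:
        return "plunge"
    if s <= -slight:
        return "slight-"
    return "flat"  # assumed label for segments with no clear trend


def train_unigram(docs):
    """Count words over all stories aligned with one trend label."""
    counts = Counter(w for d in docs for w in d.lower().split())
    return counts, sum(counts.values())


def log_likelihood(story, model, vocab_size, alpha=1.0):
    """Log-probability of a story under an add-alpha smoothed unigram model."""
    counts, total = model
    return sum(
        math.log((counts[w] + alpha) / (total + alpha * vocab_size))
        for w in story.lower().split()
    )


def most_likely_trend(story, models, vocab_size):
    """Return the label whose language model gives the story the highest score."""
    scores = {lab: log_likelihood(story, m, vocab_size) for lab, m in models.items()}
    return max(scores, key=scores.get)
```

In use, one would train a model per label from stories aligned with segments of that label, then recommend a new story when its best-scoring label is a strong trend such as "surge" or "plunge". How stories are aligned with trends in time is a separate design choice that this sketch does not cover.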

Author-supplied keywords

  • 10-fold randomization (aka cross-validation?)
  • Bayesian language modeling
  • Detection Error Tradeoff (DET) curves
  • IR techniques
  • Okapi TF IDF weighting
  • Recall Precision curves
  • TF-IDF
  • Window
  • activity monitoring
  • affect market behaviour
  • alignment methods
  • application
  • bag-of-words approach
  • binning procedure
  • clustering
  • correlate the content of news stories
  • definition of a trend
  • distance based agglomerative clustering algorithm
  • e-analyst
  • evaluation
  • false alarm rate
  • financial markets
  • five-hour alignment (length of the window)
  • identifying trends
  • indicative of future trends
  • labeled set of pairs
  • labels
  • language models
  • low recall
  • news stories
  • piecewise linear regression
  • piecewise segmentation
  • profitably predict forthcoming trends
  • randomized tests to ensure statistical significance
  • recommender system
  • recommend interesting news stories
  • relevant assignments
  • relevant documents
  • richness
  • scoring function
  • stock-specific (language) models
  • surges, slight +, plunges, slight-
  • t-test
  • test set
  • text mining
  • time series
  • trading strategy
  • traditional vector-space model
  • training set
  • trends
  • trends in financial time series
  • universal (language) models
  • useless stories



  • Victor Lavrenko

  • Matt Schmill

  • Dawn Lawrie

  • Paul Ogilvie

  • David Jensen

  • James Allan
