Characterizing mammography reports for health analytics

  • Carlos C. Rojas
  • Robert M. Patton
  • Barbara G. Beckerman

As massive collections of digital health data become available, the opportunities for large-scale automated analysis increase. In particular, the widespread collection of detailed health information is expected to help realize a vision of evidence-based public health and patient-centric health care. Within such a framework for large-scale health analytics, we describe the transformation of a large data set of mostly unlabeled, free-text mammography data into a searchable and accessible collection usable for analytics. We also describe several methods to characterize and analyze the data, including their temporal aspects, using information retrieval, supervised learning, and classical statistical techniques. We present experimental results that demonstrate the validity and usefulness of the approach: the results are consistent with the known features of the data, provide novel insights about them, and can be used in specific applications. Additionally, based on the process of going from raw data to analysis results, we present the architecture of a generic system for health analytics from clinical notes.

Author-supplied keywords

  • Clinical notes
  • Mammography reports
  • Temporal analysis
  • Text analysis


