Sensitive unclassified information is defined as any unclassified information that may cause adverse consequences for government facilities. In this chapter, we explore the use of categorization techniques and information extraction to discover this kind of information in scanned documents. We show that the combined use of a K-Dependence Bayesian categorization engine and a semi-automated review application reduces the number of man-hours required to redact sensitive unclassified information by nearly 95%. We also discuss, and provide statistics on, how OCR errors can affect information extraction tasks. © 2009 Springer-Verlag Berlin Heidelberg.
Taghva, K. (2009). Identification of sensitive unclassified information. In Computational Methods for Counterterrorism (pp. 89–108). Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-642-01141-2_6