Abstract
This paper demonstrates how an author recognition system could be benchmarked, as a prerequisite for admission in court. The system used in the demonstration is the FEDERALES system, and the experimental data were taken from the British National Corpus. The system was given three tasks: attributing a text sample to a specific text, verifying that a text sample was taken from a specific text, and verifying that a text sample was produced by a specific author. For the first two tasks, 1,099 texts with at least 10,000 words were used; for the third, 1,366 texts with known authors were used, which were verified against models for the 28 known authors for whom there were three or more texts. The experimental tasks were performed with different sampling methods (sequential samples or samples of concatenated random sentences), different sample sizes (1,000, 500, 250 or 125 words), varying amounts of training material (between 2 and 20 samples) and varying amounts of test material (1 or 3 samples). Under the best conditions, the system performed very well: with 7 training and 3 test samples of 1,000 words of randomly selected sentences, text attribution had an equal error rate of 0.06% and text verification an equal error rate of 1.3%; with 20 training and 3 test samples of 1,000 words of randomly selected sentences, author verification had an equal error rate of 7.5%. Under the worst conditions, with 2 training samples and 1 test sample of 125 words of sequential text, equal error rates for text attribution and text verification were 26.6% and 42.2% respectively, and author verification did not perform better than chance. Furthermore, the quality degradation curves under gradually worsening conditions were not smooth, but contained steep drops.
All in all, the results show the importance of having a benchmark which is as similar as possible to the actual court material for which the system is to be used, since the measured system quality differed greatly between evaluation scenarios and system degradation could not be predicted easily on the basis of the chosen scenario parameters.
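The results above are reported as equal error rates, the operating point at which the false accept rate and the false reject rate coincide. As a minimal sketch of how such a figure could be computed from verification scores, the Python below sweeps candidate thresholds and returns the error rate where the two rates are closest. The function name, the convention that higher scores indicate a match, and the example scores are all assumptions for illustration; this is not the scoring procedure of the FEDERALES system, and the numbers are not taken from the paper.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the equal error rate (EER) from two lists of scores.

    Assumption: a higher score means the system is more confident that
    the sample matches the claimed text or author.
    genuine  -- scores for trials where the claim is true
    impostor -- scores for trials where the claim is false
    """
    # Every observed score is a candidate decision threshold.
    thresholds = sorted(set(genuine) | set(impostor))
    best_gap, best_eer = None, None
    for t in thresholds:
        # False reject: a true claim scoring below the threshold.
        frr = sum(s < t for s in genuine) / len(genuine)
        # False accept: a false claim scoring at or above the threshold.
        far = sum(s >= t for s in impostor) / len(impostor)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            # Report the midpoint of the two rates at the closest crossing.
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


# Made-up scores, purely for illustration.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.3, 0.2, 0.1]
print(equal_error_rate(genuine_scores, impostor_scores))  # prints 0.25
```

With only a handful of trials the threshold sweep is coarse, so the two rates may not meet exactly; the sketch then reports the midpoint at the nearest threshold.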
Cite
Van Halteren, H. (2019). Benchmarking Author Recognition Systems for Forensic Application. Linguistic Evidence in Security, Law and Intelligence, 3. https://doi.org/10.5195/lesli.2019.20