Abstract
This paper shows how machine learning can help in analyzing and understanding historical change. Using data from the Canadian census of 1901, we discover the influences on bilingualism in Canada at the beginning of the twentieth century. The discovered theories partly agree with, and partly complement, the existing views of historians on this question. Our approach, based around a decision tree, not only infers theories directly from data, but also evaluates existing theories and revises them to improve their consistency with the data. One novel aspect of this work is the use of confidence intervals to determine which factors are both statistically and practically significant, and thus contribute appreciably to the overall accuracy of the theory. When inducing a decision tree directly from data, confidence intervals determine when new tests should be added. If an existing theory is being evaluated, confidence intervals also determine when old tests should be replaced, or deleted, to improve the theory. Our aim is to minimize the changes made to an existing theory to accommodate the new data. To this end, we propose a semantic measure of similarity between trees and demonstrate how this can be used to limit the changes made.
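The role of confidence intervals in deciding whether to add a test can be illustrated with a minimal sketch. The abstract does not specify how the intervals are constructed, so everything below is an assumption made for illustration: the normal-approximation interval, the function names `accuracy_gain_interval` and `should_add_test`, and the `min_gain` threshold standing in for "practical significance" are hypothetical and are not taken from the paper.

```python
import math


def accuracy_gain_interval(correct_before, correct_after, n, z=1.96):
    """Approximate confidence interval for the accuracy gained by a new test.

    Assumption: a normal approximation that treats the two accuracies as
    independent proportions measured on n examples. This is a simplification
    chosen for illustration, not the interval used in the paper.
    """
    p_before = correct_before / n
    p_after = correct_after / n
    gain = p_after - p_before
    std_err = math.sqrt(
        p_before * (1 - p_before) / n + p_after * (1 - p_after) / n
    )
    return gain - z * std_err, gain + z * std_err


def should_add_test(correct_before, correct_after, n, min_gain=0.01, z=1.96):
    """Accept a candidate test only if the whole interval lies above min_gain.

    A lower bound above 0 would indicate statistical significance alone;
    requiring it to exceed min_gain (a hypothetical threshold) also asks
    for a practically meaningful improvement.
    """
    lower, _ = accuracy_gain_interval(correct_before, correct_after, n, z)
    return lower > min_gain


if __name__ == "__main__":
    # A gain of 6 points on 1,000 examples clears the threshold.
    print(should_add_test(700, 760, 1000))
    # A gain of 0.5 points on 100,000 examples has a lower bound above zero
    # but below min_gain, so the test is rejected.
    print(should_add_test(70000, 70500, 100000))
```

Under this sketch, a very large sample can make a tiny improvement statistically detectable, yet the candidate test is still rejected because the improvement is too small to matter in practice.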
Cite
Drummond, C., Matwin, S., & Gaffield, C. (2006). Inferring and revising theories with confidence: Analyzing bilingualism in the 1901 Canadian census. Applied Artificial Intelligence, 20(1), 1–33. https://doi.org/10.1080/08839510500313711