In this chapter, we discuss several schemes for estimating the parameters (i.e., the m- and u-probabilities) of the Fellegi-Sunter model discussed in Chapter 8.

9.1. Basic Estimation of Parameters Under Simple Agreement/Disagreement Patterns

For each $\gamma \in \Gamma$, Fellegi and Sunter [1969] considered

$$P[\gamma] = P[\gamma \mid r \in M]\,P[r \in M] + P[\gamma \mid r \in U]\,P[r \in U]$$

and noted that the proportion of record pairs, $r$, having each possible agreement/disagreement pattern, $\gamma \in \Gamma$, could be computed directly from the available data. For example, if $\gamma = (\gamma_1, \gamma_2, \gamma_3)$ consists of a simple agree/disagree (zero/one) pattern associated with three variables, then a typical value for $\gamma$ would be $(1, 1, 0)$. Then, by our usual conditional independence assumption, there exist vector constants (marginal probabilities) $m = (m_1, m_2, \ldots, m_n)$ and $u = (u_1, u_2, \ldots, u_n)$ such that, for all $2^n$ possible values of $(\gamma_1, \gamma_2, \ldots, \gamma_n)$,

$$P[(\gamma_1, \gamma_2, \ldots, \gamma_n) \mid r \in M] = \prod_{i=1}^{n} m_i^{\gamma_i} (1 - m_i)^{1 - \gamma_i}$$

and

$$P[(\gamma_1, \gamma_2, \ldots, \gamma_n) \mid r \in U] = \prod_{i=1}^{n} u_i^{\gamma_i} (1 - u_i)^{1 - \gamma_i}.$$

For the case in which $n \ge 3$, Fellegi and Sunter [1969] showed how to use the equations above to find solutions for the $2n + 1$ independent parameters: $m_1, m_2, \ldots, m_n$, $u_1, u_2, \ldots, u_n$, and $P[M]$. (We obtain $P[U]$ as $P[U] = 1 - P[M]$.) The reader can obtain further details from Fellegi and Sunter [1969].
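To make the conditional independence formulas concrete, the following is a minimal Python sketch that evaluates the probability of every agreement/disagreement pattern for three variables. The values of `m`, `u`, and `p_match` are hypothetical, chosen only for illustration, and the function names are not from the source. The sketch runs the model in the forward direction (parameters to pattern probabilities); it does not implement the Fellegi and Sunter solution for the parameters.

```python
from itertools import product


def pattern_prob(gamma, probs):
    """Probability of a 0/1 pattern given one class (M or U), assuming
    conditional independence: product of p_i^g_i * (1 - p_i)^(1 - g_i)."""
    result = 1.0
    for g, p in zip(gamma, probs):
        result *= p if g == 1 else (1.0 - p)
    return result


def mixture_prob(gamma, m, u, p_match):
    """P[gamma] = P[gamma | M] * P[M] + P[gamma | U] * P[U],
    with P[U] = 1 - P[M]."""
    return (pattern_prob(gamma, m) * p_match
            + pattern_prob(gamma, u) * (1.0 - p_match))


# Hypothetical parameter values (assumptions, not taken from the source).
m = [0.9, 0.8, 0.7]   # m_i: probability of agreement on variable i among matches
u = [0.1, 0.2, 0.3]   # u_i: probability of agreement on variable i among nonmatches
p_match = 0.05        # P[M]

# Enumerate all 2^n agreement/disagreement patterns for n = 3.
patterns = list(product([0, 1], repeat=len(m)))
table = {g: mixture_prob(g, m, u, p_match) for g in patterns}

for g, p in table.items():
    print(g, round(p, 6))
```

With n = 3 there are 2^3 = 8 patterns whose probabilities sum to one, leaving 7 independent observed proportions, which equals the 2n + 1 = 7 unknown parameters. This counting is consistent with the requirement n ≥ 3 stated above.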
Herzog, T. N., Scheuren, F. J., & Winkler, W. E. (2007). Estimating the Parameters of the Fellegi–Sunter Record Linkage Model. In Data Quality and Record Linkage Techniques (pp. 93–106). Springer New York. https://doi.org/10.1007/0-387-69505-2_9