Abstract
The population recovery problem asks one to recover an unknown distribution over $n$-bit strings given access to independent noisy samples of strings drawn from the distribution. Recently, Ban et al. [BCF+19] studied the problem where the noise is induced through the deletion channel. This problem generalizes the famous trace reconstruction problem, where one wishes to learn a single string under the deletion channel. Ban et al. showed how to learn $\ell$-sparse distributions over strings using $\exp\big(n^{1/2} \cdot (\log n)^{O(\ell)}\big)$ samples. In this work, we learn the distribution using only $\exp\big(\tilde{O}(n^{1/3}) \cdot \ell^2\big)$ samples, by developing a higher-moment analog of the algorithms of [DOS17a, NP17], which solve trace reconstruction in $\exp\big(\tilde{O}(n^{1/3})\big)$ samples. We also give the first algorithm with a runtime subexponential in $n$, solving population recovery in $\exp\big(\tilde{O}(n^{1/3}) \cdot \ell^3\big)$ samples and time. Notably, our dependence on $n$ nearly matches the upper bound of [DOS17a, NP17] when $\ell = O(1)$, and we reduce the dependence on $\ell$ from doubly to singly exponential. Therefore, we are able to learn large mixtures of strings: while Ban et al.'s algorithm can only learn a mixture of $O(\log n / \log \log n)$ strings with a subexponential number of samples, we are able to learn a mixture of $n^{o(1)}$ strings in $\exp\big(n^{1/3 + o(1)}\big)$ samples and time.
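To make the sampling model concrete, the following is a minimal sketch of how one noisy sample (a "trace") could be generated: a string is drawn from a sparse mixture and each of its bits is then deleted independently. The deletion probability `q`, the example strings and weights, and the function names are illustrative assumptions, not taken from the paper, and the sketch only simulates the input to the problem; it does not implement the recovery algorithm.

```python
import random


def deletion_channel(x, q, rng):
    """Delete each character of x independently with probability q (assumed parameter)."""
    return "".join(ch for ch in x if rng.random() >= q)


def sample_trace(strings, weights, q, rng):
    """Draw one string from the mixture, then pass it through the deletion channel."""
    x = rng.choices(strings, weights=weights, k=1)[0]
    return deletion_channel(x, q, rng)


# Hypothetical 3-sparse distribution over 10-bit strings (l = 3, n = 10).
strings = ["0110100110", "1111100000", "0101010101"]
weights = [0.5, 0.3, 0.2]

rng = random.Random(0)
traces = [sample_trace(strings, weights, 0.3, rng) for _ in range(5)]
for t in traces:
    print(t)
```

Each printed trace is a subsequence of one of the hidden strings; the learner sees only the traces, with no indication of which string produced each one or which positions were deleted.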
Cite
Narayanan, S. (2021). Improved algorithms for population recovery from the deletion channel. In Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms (pp. 1259–1278). Association for Computing Machinery. https://doi.org/10.1137/1.9781611976465.77