Abstract
We investigate the computational complexity of the task of detecting dense regions of an unknown distribution from unlabeled samples of this distribution. We introduce a formal learning model for this task that uses a hypothesis class as it "anti-overfitting" mechanism. The learning task in our model can be reduced to a combinatorial optimization problem. We can show that for some constants, depending on the hypothesis class, these problems are NP-hard to approximate to within these constant factors. We go on and introduce a new criterion for the success of approximate optimization geometric problems. The new criterion requires that the algorithm competes with hypotheses only on the points that are separated by some margin μ from their boundaries. Quite surprisingly, we discover that for each of the two hypothesis classes that we investigate, there is a "critical value" of the margin parameter μ. For any value below the critical value the problems are NP-hard to approximate, while, once this value is exceeded, the problems become poly-time solvable. © 2002 Elsevier Science (USA).
Cite
CITATION STYLE
Ben-David, S., Eiron, N., & Simon, H. U. (2002). The computational complexity of densest region detection. Journal of Computer and System Sciences, 64(1), 22–47. https://doi.org/10.1006/jcss.2001.1797
Register to see more suggestions
Mendeley helps you to discover research relevant for your work.