Kernelized sorting is an approach for matching objects from two sources (or domains) that does not require any prior notion of similarity between objects across the two sources. Unfortunately, this technique is highly sensitive to initialization and to high-dimensional data. We present variants of kernelized sorting that increase its robustness and performance on several Natural Language Processing (NLP) tasks, namely document matching from parallel and comparable corpora and machine transliteration, and even on image processing. Empirically, we show that on these tasks a semi-supervised variant of kernelized sorting outperforms matching canonical correlation analysis.