The development of Information Retrieval (IR) techniques depends heavily on empirical studies over real-world data collections. Unfortunately, such data sets are often unavailable to researchers because of privacy concerns. In fact, the lack of publicly available industry data sets has become a serious bottleneck for IR research. To address this problem, we propose to bridge the gap between academic research and industry data sets through a privacy-preserving evaluation platform. The novelty of the platform lies in its “data-centric” mechanism: the data sit on a secure server, and the IR algorithms to be evaluated are uploaded to that server. The platform runs the algorithms' code and returns only the evaluation results. Preliminary experiments reveal new observations and insights about state-of-the-art retrieval models, demonstrating the value of an industry data set.
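The “data-centric” mechanism described in the abstract can be illustrated with a minimal sketch. This is not the authors' platform: the class `EvaluationServer`, the `Ranker` signature, the toy `term_overlap_ranker`, and the choice of precision at k as the metric are all hypothetical names and assumptions made for illustration. The sketch only shows the direction of data flow (code goes in, aggregate scores come out); a real deployment would also need to sandbox the uploaded code so that it cannot leak the data it reads.

```python
from typing import Callable, Dict, List, Set

# Hypothetical interface: an uploaded ranker maps (query text, documents)
# to a list of document IDs ordered from most to least relevant.
Ranker = Callable[[str, Dict[str, str]], List[str]]


class EvaluationServer:
    """Holds the private collection; only aggregate scores are returned."""

    def __init__(
        self,
        docs: Dict[str, str],
        queries: Dict[str, str],
        qrels: Dict[str, Set[str]],
    ) -> None:
        self._docs = docs        # doc_id -> text, never returned to the caller
        self._queries = queries  # query_id -> text, never returned to the caller
        self._qrels = qrels      # query_id -> set of relevant doc_ids

    def evaluate(self, ranker: Ranker, k: int = 10) -> Dict[str, float]:
        """Run the uploaded ranker on every query and return mean P@k."""
        precisions = []
        for qid, text in self._queries.items():
            ranked = ranker(text, self._docs)[:k]
            relevant = self._qrels.get(qid, set())
            hits = sum(1 for doc_id in ranked if doc_id in relevant)
            precisions.append(hits / k)
        mean_p = sum(precisions) / len(precisions) if precisions else 0.0
        return {f"P@{k}": mean_p}


def term_overlap_ranker(query: str, docs: Dict[str, str]) -> List[str]:
    """Toy stand-in for an uploaded model: rank by shared query terms."""
    q_terms = set(query.lower().split())
    scores = {d: len(q_terms & set(t.lower().split())) for d, t in docs.items()}
    # Sort by descending score, breaking ties by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


if __name__ == "__main__":
    server = EvaluationServer(
        docs={"d1": "cheap flights to paris", "d2": "paris hotel deals"},
        queries={"q1": "paris flights"},
        qrels={"q1": {"d1"}},
    )
    print(server.evaluate(term_overlap_ranker, k=2))
```

The researcher in this sketch never receives `docs`, `queries`, or `qrels`; the only value that crosses the boundary is the dictionary of aggregate metrics returned by `evaluate`.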
Yang, P., Zhou, M., Chang, Y., Zhai, C., & Fang, H. (2017). Towards Privacy-Preserving Evaluation for Information Retrieval Models Over Industry Data Sets. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10648 LNCS, pp. 210–221). Springer Verlag. https://doi.org/10.1007/978-3-319-70145-5_16