User geocoding queries in map applications often contain noisy tokens, such as typos in street or city names, wrong postal codes, and redundant words introduced by copy-paste. The problem grows with the rapid spread of mobile devices, where input errors are inevitable. If such noisy tokens are passed as-is to the downstream query processing components, the search may fail: the user may receive no results, or irrelevant ones. Noisy tokens in geocoding queries should therefore be recognized and handled properly before the search is run. This paper proposes a deep learning based noise prediction model for geocoding queries. It combines a novel Word Geospatial Embedding (WGE) with a Bidirectional LSTM based sequence tagging model. The proposed WGE is the first language model that allows geospatial semantics to be encoded into the vector representations, which enables geo-related machine learning and deep learning models to make spatially aware predictions.
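To make the task framing concrete, the sketch below treats noise prediction as per-token tagging followed by a cleanup step before search. It is only an illustration under stated assumptions: the label names `NOISE` and `KEEP`, the function `clean_query`, and the rule-based `toy_tagger` are all hypothetical and are not taken from the paper. The toy tagger stands in for the trained WGE and Bidirectional LSTM model, and dropping flagged tokens is just one possible way of handling them.

```python
from typing import Callable, List, Sequence

# Hypothetical label names; the paper's actual tag set is not given here.
NOISE = "NOISE"
KEEP = "KEEP"

# A tagger maps a token sequence to one label per token.
Tagger = Callable[[Sequence[str]], List[str]]


def clean_query(query: str, tagger: Tagger) -> str:
    """Drop tokens the tagger marks as noise before the query reaches search."""
    tokens = query.split()
    tags = tagger(tokens)
    if len(tags) != len(tokens):
        raise ValueError("tagger must return exactly one tag per token")
    return " ".join(tok for tok, tag in zip(tokens, tags) if tag == KEEP)


def toy_tagger(tokens: Sequence[str]) -> List[str]:
    """Stand-in for a trained model: flags a fixed set of filler words."""
    filler = {"address:", "directions", "to"}  # assumed examples of copy-paste residue
    return [NOISE if tok.lower() in filler else KEEP for tok in tokens]


if __name__ == "__main__":
    print(clean_query("Address: 1 Main St Springfield", toy_tagger))
```

Keeping the tagger behind a plain callable means a learned sequence model could replace `toy_tagger` without changing the cleanup step.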
CITATION STYLE
Vu, T., Liu, S., Wang, R., & Valegerepura, K. (2020). Noise Prediction for Geocoding Queries using Word Geospatial Embedding and Bidirectional LSTM. In GIS: Proceedings of the ACM International Symposium on Advances in Geographic Information Systems (pp. 127–130). Association for Computing Machinery. https://doi.org/10.1145/3397536.3422201