Although BERT has proven effective in a number of IR-related tasks, especially document ranking, its internal mechanism remains poorly understood. To make the ranking process performed by BERT more explainable, we investigate a state-of-the-art BERT-based ranking model, focusing on its attention mechanism and its query-document interaction behavior. First, we examine how the attention distribution evolves. We find that at each step, BERT dumps redundant attention weights on tokens with high document frequency (such as periods). This may pose a threat to model robustness and should be considered in future studies. Second, we study how BERT models interactions between the query and the document. We find that BERT aggregates document information into the query token representations through these interactions, but extracts query-independent representations for document tokens. This suggests that BERT could be transformed into a more efficient representation-focused model. These findings help us better understand the ranking process performed by BERT and may inspire future improvements.
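As a rough illustration of the kind of analysis the abstract describes, the sketch below inspects where a BERT cross-encoder's attention mass lands, using the Hugging Face transformers library. This is not the authors' code: the paper analyzes a fine-tuned ranking model, whereas this sketch uses the generic bert-base-uncased checkpoint, and the query/document pair is a made-up placeholder.

```python
# Minimal sketch (assumptions: bert-base-uncased stands in for the
# fine-tuned ranking model studied in the paper; the query/document
# pair is a toy example).
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

query = "what causes tides"
document = ("Tides are caused by the gravitational pull of the moon. "
            "The sun also contributes.")

# Encode as a [CLS] query [SEP] document [SEP] pair, the usual
# cross-encoder input for BERT-based ranking.
inputs = tokenizer(query, document, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
# Tokens the paper flags as attention sinks: high-document-frequency
# tokens such as periods (plus the [SEP] separators).
sink_positions = [i for i, t in enumerate(tokens) if t in {".", "[SEP]"}]

# outputs.attentions is a tuple with one (batch, heads, seq, seq)
# tensor per layer.
for layer, att in enumerate(outputs.attentions):
    # Average over heads, then over "from" positions, to get the
    # average attention mass each token receives; sum over the sinks.
    received = att[0].mean(dim=0).mean(dim=0)  # shape: (seq,)
    sink_mass = received[sink_positions].sum().item()
    print(f"layer {layer:2d}: {sink_mass:.1%} of attention on periods/[SEP]")
```

Under the paper's finding, the printed share for periods and separators would be disproportionately large relative to their informational content, which is what motivates the robustness concern.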
Zhan, J., Mao, J., Liu, Y., Zhang, M., & Ma, S. (2020). An Analysis of BERT in Document Ranking. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2020) (pp. 1941–1944). Association for Computing Machinery. https://doi.org/10.1145/3397271.3401325