Absolute visual localization is important for unmanned aerial vehicles when satellite-based positioning is unavailable. Rapid advances in deep learning now make it feasible to run real-time visual detection and tracking of landmarks onboard an unmanned aerial vehicle. This study demonstrates a landmark-based visual localization framework for unmanned aerial vehicles flying at low altitudes. YOLOv5 is used for multi-object detection and DeepSORT for multi-object tracking. The vehicle is localized from the geometric similarity between the geotagged transmission towers and their annotated detections in images captured by a monocular camera. The framework is validated both in Rflysim-based simulation and in real flights with a quadrotor. The localization precision is about 10 m, and the location update frequency reaches 5 Hz on a commercially available entry-level edge artificial intelligence platform. The proposed strategy needs no satellite image as a reference map, which saves a significant amount of GPU memory and makes an end-to-end implementation possible on small unmanned aerial vehicles.
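To make the geometric idea concrete, the following is a minimal sketch of how a single geotagged landmark detection could be turned into a position estimate. It is not the authors' algorithm: it assumes a pinhole camera pointing straight down with the image top aligned to the vehicle heading, flat ground, a known height above the landmark base, and a known heading. The function names, the local east/north frame, and all parameters are hypothetical and introduced only for illustration.

```python
import math
from typing import Iterable, Tuple

Point = Tuple[float, float]


def uav_position_from_landmark(
    pixel: Point,          # (u, v) of the landmark base in the image, in pixels
    landmark_en: Point,    # geotagged landmark position (east, north), in metres
    altitude_m: float,     # camera height above the landmark base (assumed known)
    yaw_rad: float,        # heading, clockwise from north (assumed known)
    fx: float, fy: float,  # focal lengths in pixels (assumed calibrated)
    cx: float, cy: float,  # principal point in pixels (assumed calibrated)
) -> Point:
    """Estimate the UAV (east, north) position from one landmark detection."""
    u, v = pixel
    # Similar triangles: pixel offset / focal length = ground offset / altitude.
    right = (u - cx) / fx * altitude_m
    forward = -(v - cy) / fy * altitude_m  # image v grows downward
    # Rotate the body-frame offset (forward, right) into east/north.
    east_off = forward * math.sin(yaw_rad) + right * math.cos(yaw_rad)
    north_off = forward * math.cos(yaw_rad) - right * math.sin(yaw_rad)
    # The landmark lies at UAV + offset, so the UAV lies at landmark - offset.
    return landmark_en[0] - east_off, landmark_en[1] - north_off


def fuse_estimates(estimates: Iterable[Point]) -> Point:
    """Average the per-landmark estimates (a deliberately simple fusion rule)."""
    pts = list(estimates)
    n = len(pts)
    return sum(e for e, _ in pts) / n, sum(m for _, m in pts) / n


if __name__ == "__main__":
    est = uav_position_from_landmark(
        pixel=(740.0, 260.0), landmark_en=(500.0, 200.0),
        altitude_m=100.0, yaw_rad=0.0,
        fx=1000.0, fy=1000.0, cx=640.0, cy=360.0,
    )
    print(est)
```

In this example the landmark appears 100 pixels right of and 100 pixels above the image centre, which at 100 m height and a 1000-pixel focal length corresponds to 10 m east and 10 m north of the vehicle, so the estimate is 10 m west and 10 m south of the landmark. A practical system would also have to handle camera tilt, terrain height, and the association of tracked detections with specific geotagged towers, none of which this sketch covers.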
CITATION STYLE
Ma, L., Meng, D., Zhao, S., & An, B. (2023). Visual localization with a monocular camera for unmanned aerial vehicle based on landmark detection and tracking using YOLOv5 and DeepSORT. International Journal of Advanced Robotic Systems, 20(3). https://doi.org/10.1177/17298806231164831