Hierarchical Loop Closure Detection for Long-Term Visual SLAM with Semantic-Geometric Descriptors

- Improving an autonomous agent’s ability to recognize its location by leveraging visual contextual information about the environment

Autonomous agents such as mobile robots and unmanned aerial vehicles need to estimate their position and build a map of their environment to navigate safely and smoothly. This is achieved through simultaneous localization and mapping (SLAM). A core component of SLAM is place recognition, which allows the agent to recognize places it has visited before. However, existing place recognition methods fail in environments where the scene changes over time, for example because of dynamic objects or scene variations. This research overcomes the problem by using visual semantic information to improve the accuracy and efficiency of place recognition in SLAM systems, so that the agent can operate long-term in large-scale dynamic environments.

We propose a viewpoint-invariant global semantic-geometric descriptor that captures contextual information about the visited places. To reduce search time, we group locations with similar semantic-geometric structures. Our hierarchical loop detection method first finds semantically similar places and then refines place recognition using locally learned visual words. A new location is created dynamically when the appearance of the environment, as determined from the semantic descriptor, differs notably from past locations. Our method has the following advantages:

  1. High accuracy.
  2. Low query time.
  3. Adaptivity to new scenes.
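
The coarse-to-fine search described above can be illustrated with a minimal sketch. It is not the authors' implementation: the class and parameter names (`Location`, `HierarchicalLoopDetector`, `new_location_threshold`, `match_threshold`) are hypothetical, the global semantic-geometric descriptor is assumed to be a plain vector, and the locally learned visual words are stood in for by a per-frame histogram compared with cosine similarity.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / denom) if denom > 0 else 0.0


class Location:
    """Hypothetical group of frames with similar global descriptors."""

    def __init__(self, global_desc, frame_id, local_hist):
        self.centroid = np.asarray(global_desc, dtype=float).copy()
        self.frames = [(frame_id, np.asarray(local_hist, dtype=float))]

    def add(self, global_desc, frame_id, local_hist):
        # Update the running mean of the global descriptors in this group.
        n = len(self.frames)
        self.centroid = (self.centroid * n + global_desc) / (n + 1)
        self.frames.append((frame_id, np.asarray(local_hist, dtype=float)))


class HierarchicalLoopDetector:
    """Two-stage search: pick a location, then compare frames inside it."""

    def __init__(self, new_location_threshold=0.8, match_threshold=0.9):
        # Both thresholds are illustrative values, not taken from the paper.
        self.new_location_threshold = new_location_threshold
        self.match_threshold = match_threshold
        self.locations = []

    def query_and_insert(self, frame_id, global_desc, local_hist):
        """Return (location_index, matched_frame_id or None)."""
        global_desc = np.asarray(global_desc, dtype=float)
        local_hist = np.asarray(local_hist, dtype=float)

        # Stage 1 (coarse): find the most similar existing location.
        best_idx, best_sim = None, -1.0
        for idx, loc in enumerate(self.locations):
            sim = cosine_similarity(global_desc, loc.centroid)
            if sim > best_sim:
                best_idx, best_sim = idx, sim

        # Create a new location when nothing seen so far is similar enough.
        if best_idx is None or best_sim < self.new_location_threshold:
            self.locations.append(Location(global_desc, frame_id, local_hist))
            return len(self.locations) - 1, None

        # Stage 2 (fine): compare local histograms within that location only.
        # A real system would also skip temporally adjacent frames.
        loc = self.locations[best_idx]
        match_id, match_sim = None, self.match_threshold
        for past_id, past_hist in loc.frames:
            sim = cosine_similarity(local_hist, past_hist)
            if sim >= match_sim:
                match_id, match_sim = past_id, sim

        loc.add(global_desc, frame_id, local_hist)
        return best_idx, match_id


# Toy usage with made-up descriptors.
detector = HierarchicalLoopDetector()
print(detector.query_and_insert(0, [1.0, 0.0, 0.0], [1.0, 0.0]))    # (0, None)
print(detector.query_and_insert(1, [0.0, 1.0, 0.0], [0.0, 1.0]))    # (1, None)
print(detector.query_and_insert(2, [0.95, 0.05, 0.0], [0.9, 0.1]))  # (0, 0)
```

Because the fine comparison only touches frames in the selected location, the query cost in this sketch depends on the number of locations plus the size of one group, rather than on the total number of stored frames.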

Existing approaches describe places with local feature descriptors, which lack contextual information, and search for places using an offline vocabulary, which limits their adaptivity to new scenes. As a result, they fail in the presence of dynamic objects and scene variations (as the offline vocabulary becomes outdated), and the large vocabulary often leads to long query times that grow in proportion to the search space.

We evaluated our method on popular public datasets that contain dynamic objects (e.g., KITTI, CBD) and compared it with the state-of-the-art loop closure detection methods FABMAP 2.0, SeqSLAM, iBOW-LCD, and HTMap. The proposed method obtains the highest accuracy and is the fastest among all methods compared.
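
This summary does not state which accuracy metric was used. Loop closure detectors are commonly scored by precision and recall against ground-truth loops, so the sketch below shows that computation as an illustration only. The function name `precision_recall` and the frame pairs are hypothetical toy values, not results from the evaluation, and exact pair matching is a simplification (benchmarks usually accept a match within some distance of the true place).

```python
def precision_recall(detections, ground_truth):
    """Score detected loops against ground-truth loops.

    Both arguments are sets of (query_frame, matched_frame) pairs.
    """
    true_positives = len(detections & ground_truth)
    precision = true_positives / len(detections) if detections else 1.0
    recall = true_positives / len(ground_truth) if ground_truth else 1.0
    return precision, recall


# Toy example: 3 reported loops, 2 of them correct, 4 true loops in total.
detections = {(10, 1), (11, 2), (12, 7)}
ground_truth = {(10, 1), (11, 2), (13, 4), (14, 5)}
precision, recall = precision_recall(detections, ground_truth)
print(f"precision={precision:.2f}, recall={recall:.2f}")  # precision=0.67, recall=0.50
```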

Our method can be applied to visual SLAM systems in autonomous robots, unmanned aerial vehicles, and augmented and virtual reality systems.

Some references on these use cases can be found at the following links:

https://addverb.com/types-of-mobile-robots-what-to-use-where/

https://www.geospatialworld.net/blogs/indoor-positioning-indoors-gps-stops-working/

https://www.viewar.com/blog/augmented-reality-indoor-navigation-positioning/

Gaurav Singh, Meiqing Wu, Siew-Kei Lam, and Do Van Minh, “Hierarchical Loop Closure Detection for Long-term Visual SLAM with Semantic-Geometric Descriptors,” 24th IEEE International Conference on Intelligent Transportation Systems (ITSC), September 2021.

Tags: 
Computer vision
Deep learning 

Contact:
For more details on the above research and its applications, please contact
Singtel Cognitive and Artificial Intelligence Lab for Enterprises@NTU 
(SCALE@NTU)

Nanyang Technological University
School of Computer Science and Engineering

Nanyang Avenue, Block N4 #B3A-02, Singapore 639798

Email: [email protected]