Traditional SLAM (Simultaneous Localization and Mapping) produces maps that contain only geometric information, often in the form of 3D points. While such a map tells us where the obstacles are and is valuable for simple missions such as navigation and path planning, it does not provide object-level semantic information and falls short when we want the robot to perform high-level tasks such as “pick up the cup on the table” or “avoid the people and shopping cart until reaching the storage room.” Semantic SLAM, on the other hand, incorporates semantic information into the map, thereby increasing the range and sophistication of the interactions a robot can have with the world. The goal of this project is to design integrated, task-oriented planning and mapping with high-level task specifications expressed in temporal logic.
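As an illustration of what a temporal-logic task specification could look like, the shopping-cart task above might be written in linear temporal logic (LTL) as shown here. This is only a sketch: LTL is assumed as one common choice of temporal logic, and the atomic propositions person, cart, and storage_room are hypothetical labels that a semantic map would have to supply.

```latex
\[
\varphi \;=\; \bigl(\neg\,\mathit{person} \,\land\, \neg\,\mathit{cart}\bigr)\;\mathcal{U}\;\mathit{storage\_room}
\]
```

Here \mathcal{U} is the "until" operator: the robot must stay clear of people and the cart at every step until it reaches the storage room, and it must eventually reach it. Evaluating such a formula requires knowing which regions of the map are people, carts, or rooms, and a purely geometric map does not contain that information.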
Our research:
Our initial effort focuses on building a complete framework for Semantic SLAM. Our algorithm takes monocular camera and odometry measurements as input and outputs estimates of the sensor state trajectory and of the positions, shapes, and classes of the objects in the environment. Like traditional SLAM, Semantic SLAM is composed of a front-end and a back-end. Unlike traditional SLAM, however, an object detector is integrated into the front-end to provide object classes and bounding boxes, and data association and loop closure in the front-end are performed on semantic objects instead of 2D image features. In the back-end, MAP (maximum a posteriori) estimation is performed to output quadrics enclosing the 3D semantic objects, in addition to the sensor state trajectory. A video showing the algorithm running in real time can be found below. This project is at an early stage; please check back later for updates!
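To make the link between detector bounding boxes and quadric landmarks concrete, here is a minimal sketch of how an ellipsoid stored as a dual quadric can be projected into an image and compared with a detected box. It uses the standard projective-geometry relation C* = P Q* Pᵀ and is not the project's actual implementation. The function names, the axis-aligned ellipsoid, and the camera parameters are all assumptions made for illustration, and the object is assumed to lie entirely in front of the camera.

```python
import numpy as np

def ellipsoid_dual_quadric(center, semi_axes):
    """Return the 4x4 dual quadric Q* of an axis-aligned ellipsoid (hypothetical helper)."""
    a, b, c = semi_axes
    T = np.eye(4)
    T[:3, 3] = center                      # translate the ellipsoid to its center
    Q_canonical = np.diag([a * a, b * b, c * c, -1.0])
    return T @ Q_canonical @ T.T

def quadric_to_bbox(Q_dual, P):
    """Project Q* with the 3x4 camera matrix P; return (u_min, v_min, u_max, v_max)."""
    C = P @ Q_dual @ P.T                   # 3x3 dual conic in the image
    # A vertical line l = (1, 0, -u) is tangent to the conic when l^T C l = 0,
    # i.e. C00 - 2*u*C02 + u^2*C22 = 0; the horizontal case is analogous.
    du = np.sqrt(C[0, 2] ** 2 - C[0, 0] * C[2, 2])
    dv = np.sqrt(C[1, 2] ** 2 - C[1, 1] * C[2, 2])
    u = sorted([(C[0, 2] - du) / C[2, 2], (C[0, 2] + du) / C[2, 2]])
    v = sorted([(C[1, 2] - dv) / C[2, 2], (C[1, 2] + dv) / C[2, 2]])
    return u[0], v[0], u[1], v[1]

# Assumed pinhole camera at the origin looking down +z (illustrative values only).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
P = K @ np.hstack([np.eye(3), np.zeros((3, 1))])

# A unit sphere 5 m in front of the camera, standing in for a detected object.
Q = ellipsoid_dual_quadric(center=[0.0, 0.0, 5.0], semi_axes=[1.0, 1.0, 1.0])
bbox = quadric_to_bbox(Q, P)
print(bbox)
```

In a back-end of this kind, the difference between the predicted box and the detector's box could serve as a measurement residual in the MAP estimation, alongside the odometry terms for the sensor trajectory.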