Building 3D Texture Model from Video and CT for Endoscopic Sinus Surgery

Dec 2017 - Present
Linhao Jin

This project contains the following five modules:

1. Preprocessing of CT (3D scene) and endoscopic (video) data.
2. CT model texture mapping.
3. Front end optical flow-based tracking and registration.
4. Back end motion-only bundle adjustment.
5. Loop closing.
CT Model Texture Mapping

Given the camera poses and RGB images, the color of the 3D model is retrieved by reprojection. Each 3D vertex is reprojected back onto the image plane to retrieve its color. A Z-buffer (depth map) is created at the same time to guarantee that only the nearest points (the front surface) are colored. All visible points on the surface of the model are processed iteratively over a sequence of camera poses, and sensor messages including the camera pose /tf, /pointcloud, /image_raw and /camera_info are published in real time. The output is viewed in RViz and saved as .pcd and .ply files (unsure why the recording is so laggy).


Output polygon mesh viewed in MeshLab.
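A minimal sketch of this per-vertex coloring step for a single frame, assuming NumPy arrays and a pinhole camera; the function and argument names (`color_vertices`, `K`, `R`, `t`) are mine, not from the project code:

```python
import numpy as np

def color_vertices(vertices, image, K, R, t, eps=1e-3):
    """Reproject 3D vertices into one RGB frame and color only the
    nearest (front-surface) vertex at each pixel via a Z-buffer."""
    h, w = image.shape[:2]
    pts_cam = (R @ vertices.T + t.reshape(3, 1)).T   # world -> camera frame
    z = pts_cam[:, 2]
    uv = (K @ pts_cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]                      # pinhole projection
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)
    in_view = (z > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)

    # Pass 1: build the Z-buffer (depth map) of nearest depths per pixel.
    zbuf = np.full((h, w), np.inf)
    for i in np.flatnonzero(in_view):
        zbuf[v[i], u[i]] = min(zbuf[v[i], u[i]], z[i])

    # Pass 2: color only vertices lying on the front surface,
    # i.e. whose depth matches the Z-buffer within a tolerance.
    colors = np.zeros((len(vertices), 3), dtype=np.uint8)
    visible = np.zeros(len(vertices), dtype=bool)
    for i in np.flatnonzero(in_view):
        if z[i] <= zbuf[v[i], u[i]] + eps:
            colors[i] = image[v[i], u[i]]
            visible[i] = True
    return colors, visible
```

Running this over the whole pose sequence, accumulating colors for vertices as they become visible, yields the textured model described above.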



Front End Motion Estimation

The image registration algorithm is similar to the sparse direct approach, which uses Lucas-Kanade optical flow to track feature points. Given a 3D model, the first step is to project its 3D vertices onto the image plane; the projected points are referred to as reprojected points. Due to the noise and uncertainty introduced by the motion prior, a minor deviation occurs that causes the reprojected points to fall at pixel locations away from their correct positions. This typically happens in regions with sharp intensity contrast, such as corners and edges.
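Concretely, under the standard pinhole model (the intrinsic matrix $K$ and pose prior $(R, t)$ are symbols assumed here, not notation from the project), a model vertex $X$ is reprojected as

$$\tilde{u} = K\,(R\,X + t), \qquad u = \left(\tfrac{\tilde{u}_1}{\tilde{u}_3},\; \tfrac{\tilde{u}_2}{\tilde{u}_3}\right),$$

so any error in the prior $(R, t)$ displaces $u$ from the vertex's true pixel location.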


To map the reprojected points to their correct corresponding pixels, while avoiding the complexity of feature extraction and description, we applied Lucas-Kanade optical flow to track each reprojected point by determining a search direction and magnitude. Starting from the reprojected point, if a point along that direction is found to have the same pixel intensity, it is selected as the corresponding point and registered to the reprojected point. This means the reprojected point "flows" or "shifts" from that location to its current location due to noise. If no intensity match is found within a certain search distance, the reprojected point is considered void and discarded.


Optical flow-based tracking of back-projected points.
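A sketch of the tracking step using OpenCV's pyramidal Lucas-Kanade implementation. This is one plausible realization of the scheme above, not the project's exact code: it tracks the reprojected locations between consecutive frames, then enforces the intensity-match and maximum-search-distance checks; the threshold values (`max_shift`, `intensity_tol`) and window size are assumptions.

```python
import cv2
import numpy as np

def track_reprojected_points(prev_gray, curr_gray, reproj_pts,
                             max_shift=15.0, intensity_tol=10):
    """Shift reprojected points to matching pixels with pyramidal
    Lucas-Kanade flow; void points that fail the flow, distance,
    or intensity-match checks."""
    p0 = reproj_pts.astype(np.float32).reshape(-1, 1, 2)
    p1, status, _ = cv2.calcOpticalFlowPyrLK(
        prev_gray, curr_gray, p0, None,
        winSize=(21, 21), maxLevel=3,
        criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 30, 0.01))

    p0 = p0.reshape(-1, 2)
    p1 = p1.reshape(-1, 2)
    ok = status.ravel() == 1
    # Void points that "flowed" farther than the allowed search distance.
    ok &= np.linalg.norm(p1 - p0, axis=1) <= max_shift
    h, w = curr_gray.shape[:2]
    for i in np.flatnonzero(ok):
        x0, y0 = np.round(p0[i]).astype(int)
        x1, y1 = np.round(p1[i]).astype(int)
        if not (0 <= x1 < w and 0 <= y1 < h and 0 <= x0 < w and 0 <= y0 < h):
            ok[i] = False
        # Keep only matches whose pixel intensity agrees with the source.
        elif abs(int(curr_gray[y1, x1]) - int(prev_gray[y0, x0])) > intensity_tol:
            ok[i] = False
    return p1, ok
```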



In each iteration, the pose estimated in the previous iteration is used for back-projection, which in turn yields a new pose estimate. As shown in the image below, for a single frame, the edge formed by the back-projected vertices moves towards the corresponding edge in the image as the number of iterations increases. The process stops when the convergence criteria are met and the two edges are aligned.


Visualization of back-projected vertices with estimated poses for different iterations.
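A hedged sketch of this outer refinement loop. Here the per-iteration pose update uses OpenCV's PnP solver on the registered 2D-3D pairs, which stands in for whatever estimator the project actually uses; `track_fn` is a hypothetical closure over the current images, in the spirit of the tracking sketch above.

```python
import cv2
import numpy as np

def refine_pose(vertices, K, rvec, tvec, track_fn, max_iters=10, tol=0.5):
    """Alternate back-projection and tracking until the back-projected
    edge aligns with the image edge (mean shift below tol pixels)."""
    for _ in range(max_iters):
        # Back-project model vertices with the current pose estimate.
        proj, _ = cv2.projectPoints(vertices, rvec, tvec, K, None)
        proj = proj.reshape(-1, 2)
        # Track each reprojected point to its corresponding pixel.
        matched, ok = track_fn(proj)
        if ok.sum() < 6:
            break  # too few correspondences to constrain the pose
        # Re-estimate the pose from the registered 2D-3D pairs.
        _, rvec, tvec = cv2.solvePnP(
            vertices[ok], matched[ok].astype(np.float32), K, None,
            rvec=rvec, tvec=tvec, useExtrinsicGuess=True)
        # Stop once the correction becomes sub-pixel on average.
        if np.linalg.norm(matched[ok] - proj[ok], axis=1).mean() < tol:
            break
    return rvec, tvec
```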



Back End Optimization

I will update this part after thesis submission.
