# DeepV2D

**Repository Path**: itking666/DeepV2D

## Basic Information

- **Project Name**: DeepV2D
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2020-03-29
- **Last Updated**: 2020-12-19

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# DeepV2D

This repository contains the source code for our paper:

[DeepV2D: Video to Depth with Differentiable Structure from Motion](https://arxiv.org/abs/1812.04605)
Zachary Teed and Jia Deng
International Conference on Learning Representations (ICLR) 2020
## Requirements

Our code was tested with TensorFlow 1.12.0 and Python 3. To use the code, first install the following Python packages.

First create a clean virtualenv

```Shell
virtualenv --no-site-packages -p python3 deepv2d_env
source deepv2d_env/bin/activate
```

```Shell
pip install tensorflow-gpu==1.12.0
pip install h5py
pip install easydict
pip install scipy
pip install opencv-python
pip install pyyaml
pip install toposort
pip install vtk
```

You can optionally compile our CUDA backprojection operator by running

```Shell
cd deepv2d/special_ops && ./make.sh && cd ../..
```

This will reduce peak GPU memory usage. You may need to change CUDALIB to the directory where CUDA is installed on your system.

## Demos

### Video to Depth (V2D)

Try it out on one of the provided test sequences. First download our pretrained models

```Shell
./data/download_models.sh
```

The demo code will output a depth map and display a point cloud for visualization. Once the depth map has appeared, press any key to open the point cloud visualization.

[NYUv2](https://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html):
```Shell
python demos/demo_v2d.py --model=models/nyu.ckpt --sequence=data/demos/nyu_0
```

[ScanNet](http://www.scan-net.org/):
```Shell
python demos/demo_v2d.py --model=models/scannet.ckpt --sequence=data/demos/scannet_0
```

[KITTI](http://www.cvlibs.net/datasets/kitti/):
```Shell
python demos/demo_v2d.py --model=models/kitti.ckpt --sequence=data/demos/kitti_0
```

You can also run motion estimation in `global` mode, which updates all the poses jointly as a single optimization problem

```Shell
python demos/demo_v2d.py --model=models/nyu.ckpt --sequence=data/demos/nyu_0 --mode=global
```

### Uncalibrated Video to Depth (V2D-Uncalibrated)

If you do not know the camera intrinsics, you can run DeepV2D in uncalibrated mode. In the uncalibrated setting, the motion module estimates the focal length during inference.

```Shell
python demos/demo_uncalibrated.py --video=data/demos/golf.mov
```

### SLAM / VO

DeepV2D can also be used for tracking and mapping on longer videos. First, download some test sequences

```Shell
./data/download_slam_sequences.sh
```

Try it out on [NYU-Depth](https://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html), [ScanNet](http://www.scan-net.org/), [TUM-RGBD](https://vision.in.tum.de/data/datasets/rgbd-dataset), or [KITTI](http://www.cvlibs.net/datasets/kitti/). Using more keyframes (`--n_keyframes`) reduces drift but results in slower tracking.

```Shell
python demos/demo_slam.py --dataset=kitti --n_keyframes=2
```

```Shell
python demos/demo_slam.py --dataset=scannet --n_keyframes=3
```

The `--cinematic` flag forces the visualization to follow the camera

```Shell
python demos/demo_slam.py --dataset=nyu --n_keyframes=3 --cinematic
```

The `--clear_points` flag can be used so that only the point cloud of the current depth is plotted.

```Shell
python demos/demo_slam.py --dataset=tum --n_keyframes=3 --clear_points
```

## Evaluation

You can evaluate the trained models on one of the datasets.

#### [NYUv2](https://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html):

```Shell
./data/download_nyu_data.sh
python evaluation/eval_nyu.py --model=models/nyu.ckpt
```
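The evaluation scripts report the standard monocular depth error and accuracy metrics (abs rel, RMSE, threshold accuracy, etc.). As a reference, below is a minimal numpy sketch of how such metrics are commonly computed; it is illustrative only, and the function name, validity mask, and depth cutoffs are assumptions rather than the repository's exact evaluation code.

```python
# Illustrative sketch of standard depth evaluation metrics;
# not the exact code used by evaluation/eval_nyu.py.
import numpy as np

def depth_metrics(pred, gt, min_depth=0.1, max_depth=10.0):
    # pred, gt: numpy arrays of predicted and ground-truth depth (same shape).
    valid = (gt > min_depth) & (gt < max_depth)
    pred, gt = pred[valid], gt[valid]

    thresh = np.maximum(gt / pred, pred / gt)
    delta1 = np.mean(thresh < 1.25)                      # accuracy under threshold 1.25
    abs_rel = np.mean(np.abs(pred - gt) / gt)            # absolute relative error
    sq_rel = np.mean(((pred - gt) ** 2) / gt)            # squared relative error
    rmse = np.sqrt(np.mean((pred - gt) ** 2))            # root mean squared error
    rmse_log = np.sqrt(np.mean((np.log(pred) - np.log(gt)) ** 2))

    return dict(abs_rel=abs_rel, sq_rel=sq_rel, rmse=rmse,
                rmse_log=rmse_log, delta1=delta1)
```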
#### [KITTI](http://www.cvlibs.net/datasets/kitti/):

First download the dataset using this [script](http://www.cvlibs.net/download.php?file=raw_data_downloader.zip) provided on the official website.
Then run the evaluation script, where KITTI_PATH is the location where the dataset was downloaded

```Shell
./data/download_kitti_data.sh
python evaluation/eval_kitti.py --model=models/kitti.ckpt --dataset_dir=KITTI_PATH
```

#### [ScanNet](http://www.scan-net.org/):

First download the [ScanNet](https://github.com/ScanNet/ScanNet) dataset. Then run the evaluation script, where SCANNET_PATH is the location where you downloaded ScanNet

```Shell
python evaluation/eval_scannet.py --model=models/scannet.ckpt --dataset_dir=SCANNET_PATH
```

## Training

You can train a model on one of the datasets.

#### [NYUv2](https://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html):

First download the training tfrecords file [here](https://drive.google.com/file/d/1-kfW55tpwxFVfv9AL76IFXWuNMBE3b7Y/view?usp=sharing) (143 GB) containing the NYU data. Once the data has been downloaded, train the model by running the command (training takes about 1 week on an Nvidia 1080Ti GPU)

```Shell
python training/train_nyu.py --cfg=cfgs/nyu.yaml --name=nyu_model --tfrecords=nyu_train.tfrecords
```

Note: this creates a temporary directory which is used to store intermediate depth predictions. You can specify the location of the temporary directory using the `--tmp` flag. You can use multiple GPUs with the `--num_gpus` flag. If you train with multiple GPUs, you can reduce the number of training iterations in cfgs/nyu.yaml.

#### [KITTI](http://www.cvlibs.net/datasets/kitti/):

First download the dataset using this [script](http://www.cvlibs.net/download.php?file=raw_data_downloader.zip) provided on the official website. Once the dataset has been downloaded, write the training sequences to a tfrecords file

```Shell
python training/write_tfrecords.py --dataset=kitti --dataset_dir=KITTI_DIR --records_file=kitti_train.tfrecords
```

You can now train the model (training takes about 1 week on an Nvidia 1080Ti GPU). Note: this creates a temporary directory which is used to store intermediate depth predictions. You can specify the location of the temporary directory using the `--tmp` flag. You can use multiple GPUs with the `--num_gpus` flag.

```Shell
python training/train_kitti.py --cfg=cfgs/kitti.yaml --name=kitti_model --tfrecords=kitti_train.tfrecords
```
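For reference, the sketch below shows one way an image sequence, camera intrinsics, and a keyframe depth map could be packed into a tfrecords file with the TensorFlow 1.x API. The feature names, array shapes, and sample intrinsics are illustrative assumptions and do not necessarily match the format written by `training/write_tfrecords.py`.

```python
# Illustrative sketch: serializing one training sample into a tfrecords file
# with TensorFlow 1.x. The feature schema here is an assumption, not the
# repository's actual format.
import numpy as np
import tensorflow as tf


def _bytes_feature(array):
    # Store a numpy array as raw bytes.
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[array.tobytes()]))


def write_example(writer, images, intrinsics, depth):
    # images: (num_frames, H, W, 3) uint8 frames
    # intrinsics: (4,) float32 [fx, fy, cx, cy]
    # depth: (H, W) float32 keyframe depth map
    example = tf.train.Example(features=tf.train.Features(feature={
        'images': _bytes_feature(images.astype(np.uint8)),
        'intrinsics': _bytes_feature(intrinsics.astype(np.float32)),
        'depth': _bytes_feature(depth.astype(np.float32)),
    }))
    writer.write(example.SerializeToString())


if __name__ == '__main__':
    with tf.python_io.TFRecordWriter('example_train.tfrecords') as writer:
        images = np.zeros((5, 192, 1088, 3), dtype=np.uint8)             # dummy 5-frame sequence
        intrinsics = np.array([718.9, 718.9, 607.1, 185.2], np.float32)  # example fx, fy, cx, cy
        depth = np.zeros((192, 1088), dtype=np.float32)                  # dummy depth map
        write_example(writer, images, intrinsics, depth)
```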