# DeepV2D
This repository contains the source code for our paper:
[DeepV2D: Video to Depth with Differentiable Structure from Motion](https://arxiv.org/abs/1812.04605)
Zachary Teed and Jia Deng
International Conference on Learning Representations (ICLR) 2020
## Requirements
Our code was tested with TensorFlow 1.12.0 and Python 3. Before using the code, install the following Python packages.
First, create a clean virtualenv:
```Shell
virtualenv --no-site-packages -p python3 deepv2d_env
source deepv2d_env/bin/activate
```
Then install the required packages inside the environment:
```Shell
pip install tensorflow-gpu==1.12.0
pip install h5py
pip install easydict
pip install scipy
pip install opencv-python
pip install pyyaml
pip install toposort
pip install vtk
```
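If the environment is set up correctly, a short sanity check such as the following (illustrative only, not part of the repository) should report TensorFlow 1.12.0 and a visible GPU:
```python
# Quick environment sanity check (not part of the repository).
import tensorflow as tf
import cv2
print("TensorFlow:", tf.__version__)                 # expected: 1.12.0
print("OpenCV:", cv2.__version__)
print("GPU available:", tf.test.is_gpu_available())
```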
You can optionally compile our CUDA backprojection operator by running:
```Shell
cd deepv2d/special_ops && ./make.sh && cd ../..
```
This will reduce peak GPU memory usage. You may need to change CUDALIB in the script to point to where CUDA is installed.
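For reference, the backprojection this operator accelerates is the standard pinhole lift from a depth map to a 3D point cloud; a plain NumPy sketch of the same operation (illustrative only, not the repository's CUDA kernel) is:
```python
import numpy as np

def backproject(depth, K):
    """Lift an (H, W) depth map to an (H*W, 3) point cloud using pinhole intrinsics K (3x3)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pixels = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(np.float64)
    rays = pixels @ np.linalg.inv(K).T   # unit-depth viewing rays
    return rays * depth.reshape(-1, 1)   # scale each ray by its depth
```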
## Demos
### Video to Depth (V2D)
Try it out on one of the provided test sequences. First, download our pretrained models:
```Shell
./data/download_models.sh
```
The demo code will output a depth map and display a point cloud for visualization. Once the depth map has appeared, press any key to open the point cloud visualization.
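The depth display and key wait follow the usual OpenCV pattern; roughly (a sketch of the interaction, not the demo's exact code):
```python
import cv2
import numpy as np

def show_depth(depth, window="depth"):
    # Normalize the predicted depth to 8 bits and colorize it for display.
    d = (255 * (depth - depth.min()) / (depth.ptp() + 1e-8)).astype(np.uint8)
    cv2.imshow(window, cv2.applyColorMap(d, cv2.COLORMAP_JET))
    cv2.waitKey(0)  # press any key to continue to the point-cloud visualization
```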
[NYUv2](https://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html):
```Shell
python demos/demo_v2d.py --model=models/nyu.ckpt --sequence=data/demos/nyu_0
```
[ScanNet](http://www.scan-net.org/):
```Shell
python demos/demo_v2d.py --model=models/scannet.ckpt --sequence=data/demos/scannet_0
```
[KITTI](http://www.cvlibs.net/datasets/kitti/):
```Shell
python demos/demo_v2d.py --model=models/kitti.ckpt --sequence=data/demos/kitti_0
```
You can also run motion estimation in `global` mode, which updates all of the poses jointly as a single optimization problem:
```Shell
python demos/demo_v2d.py --model=models/nyu.ckpt --sequence=data/demos/nyu_0 --mode=global
```
### Uncalibrated Video to Depth (V2D-Uncalibrated)
If you do not know the camera intrinsics, you can run DeepV2D in uncalibrated mode. In the uncalibrated setting, the motion module estimates the focal length during inference.
```Shell
python demos/demo_uncalibrated.py --video=data/demos/golf.mov
```
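The motion module still needs an initial guess of the intrinsics before it refines the focal length; a common heuristic (shown here only as an illustration, not necessarily the exact initialization DeepV2D uses) is to derive it from the image size:
```python
import numpy as np

def default_intrinsics(height, width):
    """Rough pinhole intrinsics guess for an uncalibrated camera."""
    f = float(max(height, width))          # focal length on the order of the image size
    cx, cy = width / 2.0, height / 2.0     # principal point at the image center
    return np.array([[f, 0.0, cx],
                     [0.0, f, cy],
                     [0.0, 0.0, 1.0]])
```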
### SLAM / VO
DeepV2D can also be used for tracking and mapping on longer videos. First, download some test sequences:
```Shell
./data/download_slam_sequences.sh
```
Try it out on [NYU-Depth](https://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html), [ScanNet](http://www.scan-net.org/), [TUM-RGBD](https://vision.in.tum.de/data/datasets/rgbd-dataset), or [KITTI](http://www.cvlibs.net/datasets/kitti/). Using more keyframes (set with `--n_keyframes`) reduces drift but results in slower tracking.
```Shell
python demos/demo_slam.py --dataset=kitti --n_keyframes=2
```
```Shell
python demos/demo_slam.py --dataset=scannet --n_keyframes=3
```
The `--cinematic` flag forces the visualization to follow the camera:
```Shell
python demos/demo_slam.py --dataset=nyu --n_keyframes=3 --cinematic
```
The `--clear_points` flag can be used so that only the point cloud of the current depth is plotted.
```Shell
python demos/demo_slam.py --dataset=tum --n_keyframes=3 --clear_points
```
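Conceptually, the SLAM/VO demo builds its map by backprojecting each keyframe's depth and moving the points into a shared world frame with the estimated camera-to-world pose. A minimal NumPy version of that fusion step (reusing the `backproject` helper sketched above; illustrative, not the repository's implementation) is:
```python
import numpy as np

def fuse_keyframes(depths, K, poses):
    """Accumulate keyframe point clouds into one map.
    depths: list of (H, W) depth maps
    K:      3x3 camera intrinsics
    poses:  list of 4x4 camera-to-world transforms
    """
    world_points = []
    for depth, T in zip(depths, poses):
        pts = backproject(depth, K)                        # (N, 3) points in the camera frame
        pts_h = np.hstack([pts, np.ones((len(pts), 1))])   # homogeneous coordinates
        world_points.append((pts_h @ T.T)[:, :3])          # transform into the world frame
    return np.concatenate(world_points, axis=0)
```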
## Evaluation
You can evaluate the trained models on any of the following datasets.
#### [NYUv2](https://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html):
```Shell
./data/download_nyu_data.sh
python evaluation/eval_nyu.py --model=models/nyu.ckpt
```
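The evaluation scripts report the standard monocular-depth error metrics (absolute relative error, RMSE, and threshold accuracies). For reference, these are typically computed as follows (a generic sketch; the script's exact masking and scaling conventions may differ):
```python
import numpy as np

def depth_metrics(pred, gt):
    """Standard depth-estimation metrics over valid ground-truth pixels."""
    mask = gt > 0
    pred, gt = pred[mask], gt[mask]
    thresh = np.maximum(gt / pred, pred / gt)
    return {
        "abs_rel": np.mean(np.abs(pred - gt) / gt),
        "sq_rel":  np.mean((pred - gt) ** 2 / gt),
        "rmse":    np.sqrt(np.mean((pred - gt) ** 2)),
        "a1":      np.mean(thresh < 1.25),
        "a2":      np.mean(thresh < 1.25 ** 2),
        "a3":      np.mean(thresh < 1.25 ** 3),
    }
```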
#### [KITTI](http://www.cvlibs.net/datasets/kitti/):
First download the dataset using this [script](http://www.cvlibs.net/download.php?file=raw_data_downloader.zip) provided on the official website. Then run the evaluation script, where KITTI_PATH is the directory the dataset was downloaded to:
```Shell
./data/download_kitti_data.sh
python evaluation/eval_kitti.py --model=models/kitti.ckpt --dataset_dir=KITTI_PATH
```
#### [ScanNet](http://www.scan-net.org/):
First download the [ScanNet](https://github.com/ScanNet/ScanNet) dataset.
Then run the evaluation script, where SCANNET_PATH is the directory you downloaded ScanNet to:
```Shell
python evaluation/eval_scannet.py --model=models/scannet.ckpt --dataset_dir=SCANNET_PATH
```
## Training
You can train a model on one of the following datasets.
#### [NYUv2](https://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html):
First download the training tfrecords file [here](https://drive.google.com/file/d/1-kfW55tpwxFVfv9AL76IFXWuNMBE3b7Y/view?usp=sharing) (143 GB) containing the NYU data. Once the data has been downloaded, train the model by running the following command (training takes about one week on an Nvidia 1080Ti GPU):
```Shell
python training/train_nyu.py --cfg=cfgs/nyu.yaml --name=nyu_model --tfrecords=nyu_train.tfrecords
```
Note: training creates a temporary directory used to store intermediate depth predictions; you can specify its location with the `--tmp` flag. You can train on multiple GPUs with the `--num_gpus` flag; if you do, you can reduce the number of training iterations in cfgs/nyu.yaml.
#### [KITTI](http://www.cvlibs.net/datasets/kitti/):
First download the dataset using this [script](http://www.cvlibs.net/download.php?file=raw_data_downloader.zip) provided on the official website. Once the dataset has been downloaded, write the training sequences to a tfrecords file:
```Shell
python training/write_tfrecords.py --dataset=kitti --dataset_dir=KITTI_DIR --records_file=kitti_train.tfrecords
```
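To confirm the conversion produced a usable file, you can count the serialized examples with the TensorFlow 1.x record iterator (just a sanity check, not part of the training pipeline):
```python
import tensorflow as tf

# Count the serialized examples written by write_tfrecords.py.
count = sum(1 for _ in tf.python_io.tf_record_iterator("kitti_train.tfrecords"))
print("examples written:", count)
```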
You can now train the model (training takes about one week on an Nvidia 1080Ti GPU). Note: training creates a temporary directory used to store intermediate depth predictions; you can specify its location with the `--tmp` flag and train on multiple GPUs with the `--num_gpus` flag.
```Shell
python training/train_kitti.py --cfg=cfgs/kitti.yaml --name=kitti_model --tfrecords=kitti_train.tfrecords
```