# SfM
**Repository Path**: itking666/SfM
## Basic Information
- **Project Name**: SfM
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2020-12-09
- **Last Updated**: 2024-10-12
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# RidgeSfM: Structure from Motion via Robust Pairwise Matching Under Depth Uncertainty
_Benjamin Graham, David Novotny_
[3DV 2020](http://3dv2020.dgcv.nii.ac.jp/)
This is the official implementation of **RidgeSfM: Structure from Motion via Robust Pairwise Matching Under Depth Uncertainty** in PyTorch.
[Link to paper](https://arxiv.org/abs/2011.10359) | [Poster](ridgesfm_poster.pdf)
## RidgeSfM applied to the ScanNet test set
[**Scene 0707_00, frame skip rate _k=1_**](output/scannet/0707/README.md) | [**Scene 0708_00, frame skip rate _k=3_**](output/scannet/0708/README.md) | [**Scene 0709_00, frame skip rate _k=10_**](output/scannet/0709/README.md) | [**Scene 0710_00, frame skip rate _k=30_**](output/scannet/0710/README.md)
Below we illustrate the depth uncertainty factors of variation for a frame from scene 0708.

Top left: the input image.
Bottom left: the predicted depth.
Middle and right: we use SVD to reduce the 32 factor-of-variation planes down to 12 planes and display them as 4 RGB images; each of the 4×3 color planes represents one factor of variation.
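This reduction can be pictured with a short PyTorch sketch. This is only an illustration under assumed tensor shapes, not the repository's code:

```
# Minimal sketch: compress 32 per-pixel factor-of-variation planes to 12
# via a truncated SVD, then pack them into 4 RGB images for display.
import torch

H, W = 480, 640
fov = torch.randn(32, H, W)          # stand-in for the 32 predicted planes

A = fov.reshape(32, -1)              # 32 x (H*W) matrix, one row per plane
U, S, Vh = torch.linalg.svd(A, full_matrices=False)
planes12 = (S[:12, None] * Vh[:12]).reshape(12, H, W)  # 12 strongest components

# Pack into 4 RGB images (4 x 3 colour planes), normalised to [0, 1].
rgb = planes12.reshape(4, 3, H, W)
rgb = (rgb - rgb.amin()) / (rgb.amax() - rgb.amin())
```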
## RidgeSfM applied to a video taken on a mobile phone
We applied RidgeSfM to a short video taken using a mobile phone camera.
There is no ground-truth pose, so the bottom-right corner of the video is left blank.
[**Living room - skip rate _k=3_**](output/cubot/scene0/README.md)
## RidgeSfM applied to the KITTI odometry dataset
We trained a depth prediction network on the [KITTI depth prediction](http://www.cvlibs.net/datasets/kitti/eval_depth.php?benchmark=depth_prediction) training set. We then processed videos from the [KITTI Visual Odometry dataset](http://www.cvlibs.net/datasets/kitti/eval_odometry.php). We used the 'camera 2' image sequences, cropping the input to RGB images of size 1216×320, with R2D2 as the keypoint detector and a frame skip rate of _k_=3. The scenes are larger spatially, so for visualization we increased the number of K-Means centroids to one million.
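Since Faiss is already a dependency (see below), point-cloud compression of this kind can be sketched as follows; the array names and sizes are illustrative, not the repository's actual pipeline:

```
# Rough sketch: compress a dense reconstructed point cloud into K-Means
# centroids with Faiss, so only the centroids need to be rendered.
import faiss
import numpy as np

points = np.random.rand(1_000_000, 3).astype('float32')  # stand-in cloud

k = 10_000  # the KITTI visualizations above use one million centroids
kmeans = faiss.Kmeans(d=3, k=k, niter=10, verbose=True)
kmeans.train(points)

centroids = kmeans.centroids  # (k, 3) array rendered in place of the cloud
```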
[**Scene 6 - skip rate _k=3_**](output/kitti/scene6/README.md) | [**Scene 7 - skip rate _k=3_**](output/kitti/scene7/README.md)
# Setup
- Download the [ScanNet dataset](http://www.scan-net.org/) to `ridgesfm/data/scannet_sens/[train|test]/`
- Download [SensReader](https://github.com/ScanNet/ScanNet/tree/master/SensReader/python) to `ridgesfm/data`
- Download the [SuperPoint](https://openaccess.thecvf.com/content_cvpr_2018_workshops/papers/w9/DeTone_SuperPoint_Self-Supervised_Interest_CVPR_2018_paper.pdf) weights and network definition to `ridgesfm/` (a quick load check is sketched after this list):
```
wget https://github.com/magicleap/SuperGluePretrainedNetwork/blob/master/models/weights/superpoint_v1.pth?raw=true -O ridgesfm/weights/superpoint_v1.pth
wget https://raw.githubusercontent.com/magicleap/SuperGluePretrainedNetwork/master/models/superpoint.py -O ridgesfm/superpoint.py
```
- Run `bash prepare_scannet.sh` in `ridgesfm/data/`
- Run `python ridgesfm.py scene.n=0 scene.frameskip=10`
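As an optional sanity check (not a step from the repository), the SuperPoint checkpoint downloaded above can be opened as a plain PyTorch state dict:

```
# Confirm the SuperPoint checkpoint downloaded correctly.
import torch

state_dict = torch.load('ridgesfm/weights/superpoint_v1.pth', map_location='cpu')
print(len(state_dict), 'tensors; first key:', next(iter(state_dict)))
```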
To process your own video:
- calibrate your camera using `calibrate/calibrate.ipynb`
- then run `python ridgesfm.py scenes=calibrate/ scene.n=0 scene.frameskip=10`
Videos are scaled and/or cropped to a resolution of 640×480. The notebook calculates a camera intrinsic matrix for the rescaled video. RidgeSfM works best when the intrinsic matrix is similar to that of the depth prediction network's training data, i.e. [[578, 0, 319.5], [0, 578, 239.5], [0, 0, 1]].
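For reference, the standard OpenCV chessboard calibration that such a notebook typically performs looks roughly like this; the board size and frame directory are hypothetical:

```
# Sketch of standard OpenCV chessboard calibration (the actual steps live
# in calibrate/calibrate.ipynb; paths and sizes here are illustrative).
import cv2
import glob
import numpy as np

pattern = (9, 6)  # inner-corner count of a hypothetical chessboard
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

obj_points, img_points, img_size = [], [], None
for path in glob.glob('calibrate/frames/*.png'):  # hypothetical frame folder
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    img_size = gray.shape[::-1]  # (width, height)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# K is the 3x3 intrinsic matrix; RidgeSfM prefers it close to
# [[578, 0, 319.5], [0, 578, 239.5], [0, 0, 1]].
_, K, dist, _, _ = cv2.calibrateCamera(obj_points, img_points, img_size, None, None)
print(K)
```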
## Dependencies
- Python 3.7+
- PyTorch and TorchVision
- [Faiss](https://github.com/facebookresearch/faiss)
- [PyKeOps](https://pypi.org/project/pykeops/)
- OpenCV
- imageio and imageio-ffmpeg
- [PyTorch3D](https://github.com/facebookresearch/pytorch3d)
## License
RidgeSfM is CC-BY-NC licensed, as found in the LICENSE file. [Terms of use](https://opensource.facebook.com/legal/terms). [Privacy](https://opensource.facebook.com/legal/privacy)
## Citations
If you find this code useful in your research then please cite:
```
@InProceedings{ridgesfm2020,
  author    = "Benjamin Graham and David Novotny",
  title     = "Ridge{S}f{M}: Structure from Motion via Robust Pairwise Matching Under Depth Uncertainty",
  booktitle = "International Conference on 3D Vision (3DV)",
  year      = "2020",
}
```