# HorizonNet

**Repository Path**: itking666/HorizonNet

## Basic Information

- **Project Name**: HorizonNet
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2020-12-19
- **Last Updated**: 2020-12-19

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# HorizonNet

This is the implementation of our CVPR'19 "[
HorizonNet: Learning Room Layout with 1D Representation and Pano Stretch Data Augmentation](https://arxiv.org/abs/1901.03861)" ([project page](https://sunset1995.github.io/HorizonNet/)).

**News, June 15, 2019** - Critical bug fix for general layout (`dataset.py`, `inference.py` and `misc/post_proc.py`)\
**News, Aug 19, 2019** - Report results on [Structured3D dataset](https://structured3d-dataset.org/). (See [the report :clipboard: on ST3D](README_ST3D.md)).

![](assets/teaser.jpg)

This repo is a **pure python** implementation that you can:
- **Inference on your images** to get cuboid or general shaped room layout
- **3D layout viewer**
- **Correct pose** for your panorama images
- **Pano Stretch Augmentation** copy and paste to apply on your own task
- **Quantitative evaluatation** (3D IoU, Corner Error, Pixel Error)
    - cuboid shape
    - general shape
- **Your own dataset** preparation and training

**Method Pipeline overview**:
![](assets/pipeline.jpg)

## Requirements
- Python 3
- pytorch>=1.0.0
- numpy
- scipy
- sklearn
- Pillow
- tqdm
- tensorboardX
- opencv-python>=3.1 (for pre-processing) (also can't be too new, the latest opencv removed a key algorithm due to patent, 3.1.0.5 works. )
- open3d>=0.7 (for layout 3D viewer)
- shapely
- torchvision


### Download
#### Dataset
- PanoContext/Stanford2D3D Dataset
    - [Download preprocessed pano/s2d3d](https://drive.google.com/open?id=1e-MuWRx3T4LJ8Bu4Dc0tKcSHF9Lk_66C) for training/validation/testing
        - Put all of them under `data` directory so you should get:
            ```
            HorizonNet/
            ├──data/
            |  ├──layoutnet_dataset/
            |  |  |--finetune_general/
            |  |  |--test/
            |  |  |--train/
            |  |  |--valid/
            ```
        - `test`, `train`, `valid` are processed from [LayoutNet's cuboid dataset](https://github.com/zouchuhang/LayoutNet).
        - `finetune_general` is re-annotated by us from `train` and `valid`. It contains  65 general shaped rooms.
- Structured3D Dataset
    - See [the tutorial](https://github.com/sunset1995/HorizonNet/blob/master/README_ST3D.md#dataset-preparation) to prepare training/validation/testing for HorizonNet.


#### Pretrained Models
- [resnet50_rnn__panos2d3d.pth](https://drive.google.com/open?id=1aieMd61b-3BoOeTRv2pKu9yTk5zEinn0)
    - Trained on PanoContext/Stanford2d3d 817 pano images.
    - Trained for 300 epoch
- [resnet50_rnn__st3d.pth](https://drive.google.com/open?id=16v1nhL9C2VZX-qQpikCsS6LiMJn3q6gO)
    - Trained on Structured3D 18362 pano images with setting of original furniture and lighting.
    - Trained for 50 epoch.
    - Select 50th epoch according to loss function on validation set.


## Inference on your images

In below explaination, I will use `assets/demo.png` for example.
- ![](assets/demo.png) (modified from PanoContext dataset)


### 1. Pre-processing (Align camera rotation pose)
- **Execution**: Pre-process the above `assets/demo.png` by firing below command.
    ```bash
    python preprocess.py --img_glob assets/demo.png --output_dir assets/preprocessed/
    ```
    - `--img_glob` telling the path to your 360 room image(s).
        - support shell-style wildcards with quote (e.g. `"my_fasinated_img_dir/*png"`).
    - `--output_dir` telling the path to the directory for dumping the results.
    - See `python preprocess.py -h` for more detailed script usage help.
- **Outputs**: Under the given `--output_dir`, you will get results like below and prefix with source image basename.
    - The aligned rgb images `[SOURCE BASENAME]_aligned_rgb.png` and line segments images `[SOURCE BASENAME]_aligned_line.png`
        - `demo_aligned_rgb.png` | `demo_aligned_line.png`
          :--------------------: | :---------------------:
          ![](assets/preprocessed/demo_aligned_rgb.png) | ![](assets/preprocessed/demo_aligned_line.png)
    - The detected vanishing points `[SOURCE BASENAME]_VP.txt` (Here `demo_VP.txt`)
        ```
        -0.002278 -0.500449 0.865763
        0.000895 0.865764 0.500452
        0.999999 -0.001137 0.000178
        ```


### 2. Estimating layout with HorizonNet
- **Execution**: Predict the layout from above aligned image and line segments by firing below command.
    ```bash
    python inference.py --pth ckpt/resnet50_rnn__mp3d.pth --img_glob assets/preprocessed/demo_aligned_rgb.png --output_dir assets/inferenced --visualize
    ```
    - `--pth` path to the trained model.
    - `--img_glob` path to the preprocessed image.
    - `--output_dir` path to the directory to dump results.
    - `--visualize` optinoal for visualizing model raw outputs.
    - `--force_cuboid` add this option if you want to estimate cuboid layout (4 walls).
- **Outputs**: You will get results like below and prefix with source image basename.
    - The 1d representation are visualized under file name `[SOURCE BASENAME].raw.png`
    - The extracted corners of the layout `[SOURCE BASENAME].json`
        ```
        {"z0": 50.0, "z1": -59.03114700317383, "uv": [[0.029913906008005142, 0.2996523082256317], [0.029913906008005142, 0.7240479588508606], [0.015625, 0.3819984495639801], [0.015625, 0.6348703503608704], [0.056027885526418686, 0.3881891965866089], [0.056027885526418686, 0.6278984546661377], [0.4480381906032562, 0.3970482349395752], [0.4480381906032562, 0.6178648471832275], [0.5995567440986633, 0.41122356057167053], [0.5995567440986633, 0.601679801940918], [0.8094607591629028, 0.36505699157714844], [0.8094607591629028, 0.6537724137306213], [0.8815288543701172, 0.2661873996257782], [0.8815288543701172, 0.7582473754882812], [0.9189453125, 0.31678876280784607], [0.9189453125, 0.7060701847076416]]}
        ```


### 3. Layout 3D Viewer
- **Execution**: Visualizing the predicted layout in 3D using points cloud.
    ```bash
    python layout_viewer.py --img assets/preprocessed/demo_aligned_rgb.png --layout assets/inferenced/demo_aligned_rgb.json --ignore_ceiling
    ```
    - `--img` path to preprocessed image
    - `--layout` path to the json output from `inference.py`
    - `--ignore_ceiling` prevent showing ceiling
    - See `python layout_viewer.py -h` for usage help.
- **Outputs**: In the window, you can use mouse and scroll wheel to change the viewport
    - ![](assets/demo_3d_layout.jpg)


## Your own dataset
See [tutorial](README_PREPARE_DATASET.md) on how to prepare it.


## Training
To train on a dataset, see `python train.py -h` for detailed options explaination.\
Example:
```bash
python train.py --id resnet50_rnn
```
- Important arguments:
    - `--id` required. experiment id to name checkpoints and logs
    - `--ckpt` folder to output checkpoints (default: ./ckpt)
    - `--logs` folder to logging (default: ./logs)
    - `--pth` finetune mode if given. path to load saved checkpoint.
    - `--backbone` backbone of the network (default: resnet50)
        - other options: `{resnet18,resnet34,resnet50,resnet101,resnet152,resnext50_32x4d,resnext101_32x8d,densenet121,densenet169,densenet161,densenet201}`
    - `--no_rnn` whether to remove rnn (default: False)
    - `--train_root_dir` root directory to training dataset. (default: `data/layoutnet_dataset/train`)
    - `--valid_root_dir` root directory to validation dataset. (default: `data/layoutnet_dataset/valid/`)
        - If giveng, the epoch with best 3DIoU on validation set will be saved as `{ckpt}/{id}/best_valid.pth`
    - `--batch_size_train` training mini-batch size (default: 4)
    - `--epochs` epochs to train (default: 300)
    - `--lr` learning rate (default: 0.0001)


## Quantitative Evaluation - Cuboid Layout
To evaluate on PanoContext/Stanford2d3d dataset, first running the cuboid trained model for all testing images:
```bash
python inference.py --pth ckpt/resnet50_rnn__panos2d3d.pth --img_glob "data/layoutnet_dataset/test/img/*" --output_dir output/panos2d3d/resnet50_rnn/ --force_cuboid
```
- `--img_glob` shell-style wildcards for all testing images.
- `--output_dir` path to the directory to dump results.
- `--force_cuboid` enfoce output cuboid layout (4 walls) or the PE and CE can't be evaluated.

To get the quantitative result:
```bash
python eval_cuboid.py --dt_glob "output/panos2d3d/resnet50_rnn/*json" --gt_glob "data/layoutnet_dataset/test/label_cor/*txt"
```
- `--dt_glob` shell-style wildcards for all the model estimation.
- `--gt_glob` shell-style wildcards for all the ground truth.

If you want to:
- just evaluate PanoContext only `python eval_cuboid.py --dt_glob "output/panos2d3d/resnet50_rnn/*json" --gt_glob "data/layoutnet_dataset/test/label_cor/pano*txt"`
- just evaluate Stanford2d3d only `python eval_cuboid.py --dt_glob "output/panos2d3d/resnet50_rnn/*json" --gt_glob "data/layoutnet_dataset/test/label_cor/camera*txt"`

:clipboard: The quantitative result for the released `resnet50_rnn__panos2d3d.pth` is shown below:

| Testing Dataset | 3D IoU(%) | Corner error(%) | Pixel error(%) |
| :-------------: | :-------: | :------: | :--------------: |
| PanoContext     | `83.39` | `0.76` | `2.13` |
| Stanford2D3D    | `84.09` | `0.63` | `2.06` |
| All             | `83.87` | `0.67` | `2.08` |


## Quantitative Evaluation - General Layout
- See [the report :clipboard: on ST3D](README_ST3D.md) for more detail.
- See [the report :clipboard: on MP3D](README_MP3D.md) for more detail.


## TODO
- Faster pre-processing script (top-fron alignment) (maybe cython implementation or [fernandez2018layouts](https://github.com/cfernandezlab/Lines-and-Vanishing-Points-directly-on-Panoramas))


## Acknowledgement
- Credit of this repo is shared with [ChiWeiHsiao](https://github.com/ChiWeiHsiao).
- Thanks [limchaos](https://github.com/limchaos) for the suggestion about the potential boost by fixing the non-expected behaviour of Pytorch dataloader. (See [Issue#4](https://github.com/sunset1995/HorizonNet/issues/4))


## Citation
Please cite our paper for any purpose of usage.
```
@inproceedings{SunHSC19,
  author    = {Cheng Sun and
               Chi{-}Wei Hsiao and
               Min Sun and
               Hwann{-}Tzong Chen},
  title     = {HorizonNet: Learning Room Layout With 1D Representation and Pano Stretch
               Data Augmentation},
  booktitle = {{IEEE} Conference on Computer Vision and Pattern Recognition, {CVPR}
               2019, Long Beach, CA, USA, June 16-20, 2019},
  pages     = {1047--1056},
  year      = {2019},
}
```