# FAQ
________________________________________________________________________________
**Q1: What should I do if I encounter OOM (out-of-memory) errors while training
the models?**

**A1**: To avoid OOM, you could try (a rough memory sketch follows this list):

1. reducing the training crop size (i.e., the flag `crop_size` in
   `train_dataset_options`; see Q2 for more details), which reduces the
   input size during training,
2. using a larger output stride (e.g., 32) in the backbone (i.e., the flag
   `output_stride` in `model_options`; see Q3 for more details), which
   reduces the use of atrous convolution,
3. using a smaller backbone, such as ResNet-50.
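
As a back-of-the-envelope illustration of why tips 1 and 2 help (a rough
heuristic of our own, not DeepLab2 code): the memory consumed by the densest
backbone feature maps grows roughly with the square of
`crop_size / output_stride`.

```python
# Rough heuristic (an illustrative assumption, not part of DeepLab2):
# activation memory of the densest feature maps scales roughly with
# (crop_size / output_stride) ** 2.
def relative_feature_memory(crop_size, output_stride):
  return (crop_size / output_stride) ** 2

baseline = relative_feature_memory(1025, 16)
smaller_crop = relative_feature_memory(513, 16)    # tip 1: smaller crop
larger_stride = relative_feature_memory(1025, 32)  # tip 2: larger stride
print(smaller_crop / baseline)   # ~0.25: halving the crop roughly quarters it
print(larger_stride / baseline)  # 0.25: doubling the stride does the same
```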
________________________________________________________________________________
**Q2: What `crop_size` do I need to set?**

**A2**: The DeepLab framework always uses `crop_size` equal to
`output_stride` * k + 1, where k is an integer.

* During inference/evaluation, since the DeepLab framework uses whole-image
  inference, we need to set k so that the resulting `crop_size` (in
  `eval_dataset_options`) is slightly larger than the largest image dimension
  in the dataset. For example, we set eval_crop_size = 1025x2049 for
  Cityscapes, whose images all have dimension 1024x2048.
* During training, you could set k to any integer as long as the resulting
  crop fits in your device memory. However, we notice better performance when
  the same `crop_size` is used during training and evaluation (i.e., also
  using the whole-image crop size during training); a small helper below this
  list illustrates the crop-size rule.
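
As a minimal sketch of this rule (the helper name below is ours, not a
DeepLab2 API), the smallest valid `crop_size` covering a given image dimension
can be computed as:

```python
import math

def smallest_valid_crop_size(image_dim, output_stride):
  """Smallest crop_size of the form output_stride * k + 1 covering image_dim."""
  k = math.ceil(image_dim / output_stride)
  return output_stride * k + 1

# Cityscapes images are 1024x2048, so with output_stride = 32:
print(smallest_valid_crop_size(1024, 32))  # 1025
print(smallest_valid_crop_size(2048, 32))  # 2049
```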
________________________________________________________________________________
**Q3: What output stride should I use in the backbone?**

**A3**: Using a different output stride leads to a different
accuracy-and-memory trade-off. For example, DeepLabv1 uses output stride = 8,
but it requires a lot of device memory. In the DeepLabv3+ paper, we found that
output stride = 16 strikes the best accuracy-and-memory trade-off, which is
therefore our default setting. If you wish to further reduce the memory usage,
you could set the output stride to 32. Additionally, we suggest adjusting the
`atrous_rates` in the ASPP module as follows.

* If `backbone.output_stride` = 32, use `atrous_rates` = [3, 6, 9].
* If `backbone.output_stride` = 16, use `atrous_rates` = [6, 12, 18].
* If `backbone.output_stride` = 8, use `atrous_rates` = [12, 24, 36].

Note that these settings may not be optimal. You may need to adjust them to
better fit your dataset.
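
These pairings can be kept in a small lookup table (a sketch that simply
restates the list above):

```python
# Suggested ASPP atrous_rates per backbone output stride (starting points
# from the list above; not guaranteed optimal for every dataset).
SUGGESTED_ATROUS_RATES = {
    32: [3, 6, 9],
    16: [6, 12, 18],
    8: [12, 24, 36],
}

output_stride = 16
atrous_rates = SUGGESTED_ATROUS_RATES[output_stride]  # [6, 12, 18]
```

Note that the rates scale inversely with the output stride, which keeps the
effective receptive field of the ASPP branches roughly constant in input
pixels.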
________________________________________________________________________________
**Q4: Why are the results reported by the provided evaluation code slightly
different from those of the official evaluation code (e.g.,
[Cityscapes](https://github.com/mcordts/cityscapesScripts))?**

**A4**: In order to run everything end-to-end in the TensorFlow system (e.g.,
the on-line evaluation during training), we re-implemented the evaluation code
in TensorFlow. Additionally, our whole system, including the training and
evaluation pipelines, uses the panoptic label format (i.e., `panoptic_label =
semantic_label * label_divisor + instance_id`, where the `label_divisor`
should be larger than the maximum number of instances per image), instead of
the JSON [COCO format](https://cocodataset.org/#format-data). These two
changes, along with rounding and similar issues, result in some minor
differences. Therefore, our re-implemented evaluation code is mainly intended
for TensorFlow integration (e.g., the support of on-line evaluation in
TensorBoard). Users should run the corresponding official evaluation code in
order to compare with other published papers. Note that all the numbers
reported in our papers are evaluated with the official evaluation code.
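
A minimal sketch of this label format (the `label_divisor` value of 1000 below
is only an example; it must exceed the maximum number of instances per image
in your dataset):

```python
label_divisor = 1000  # Example value; must exceed the max instances per image.

def encode_panoptic(semantic_label, instance_id):
  return semantic_label * label_divisor + instance_id

def decode_panoptic(panoptic_label):
  return divmod(panoptic_label, label_divisor)  # (semantic_label, instance_id)

panoptic = encode_panoptic(semantic_label=11, instance_id=23)  # 11023
assert decode_panoptic(panoptic) == (11, 23)
```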
To facilitate the conversion between prediction formats, we also provide
instructions for running the official evaluation code on
[Cityscapes](setup/cityscapes_test_server_evaluation.md) and
[COCO](setup/coco_test_server_evaluation.md).
________________________________________________________________________________
**Q5: What should I do if I cannot compile TensorFlow with the provided
efficient merging operation `merge_semantic_and_instance_maps`?**

**A5**: In this case, we provide a fallback solution that implements the
merging operation with pure TensorFlow functions. This fallback does not
require any TensorFlow compilation. However, note that compared to our
provided TensorFlow merging operation `merge_semantic_and_instance_maps`, its
inference speed is slower and the resulting segmentation performance may also
be slightly lower.

To use the pure-TF-function version of `merge_semantic_and_instance_maps`, set
`merge_semantic_instance_with_tf_op` to `false` in your config's
`evaluator_options`, for example as sketched below.
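
For instance, assuming a textproto experiment config and generated proto
bindings (the module path `deeplab2.config_pb2`, the message name
`ExperimentOptions`, and the file name `my_experiment.textproto` are
assumptions for illustration; adjust them to your checkout), the flag could be
flipped along these lines:

```python
# Sketch only: the proto module/message names below are assumptions, and
# 'my_experiment.textproto' is a placeholder for your own config file.
from google.protobuf import text_format
from deeplab2 import config_pb2

with open('my_experiment.textproto') as f:
  config = text_format.Parse(f.read(), config_pb2.ExperimentOptions())

# Fall back to the pure-TF merging implementation (no custom-op compilation
# needed, at the cost of slower inference).
config.evaluator_options.merge_semantic_instance_with_tf_op = False

with open('my_experiment.textproto', 'w') as f:
  f.write(text_format.MessageToString(config))
```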