# Training

The prebuilt models will always be improved by adding data from the target area. In our work, we have found that even an hour's worth of carefully chosen hand-annotation can yield enormous improvements in accuracy and precision. 5-10 epochs of fine-tuning with the prebuilt model are often adequate at a low learning rate.

Consider an annotations file, `testfile_deepforest.csv`, in the following format:

```
image_path,xmin,ymin,xmax,ymax,label
OSBS_029.jpg,256,99,288,140,Tree
OSBS_029.jpg,166,253,225,304,Tree
OSBS_029.jpg,365,2,400,27,Tree
OSBS_029.jpg,312,13,349,47,Tree
OSBS_029.jpg,365,21,400,70,Tree
OSBS_029.jpg,278,1,312,37,Tree
OSBS_029.jpg,364,204,400,246,Tree
OSBS_029.jpg,90,117,121,145,Tree
OSBS_029.jpg,115,109,150,152,Tree
OSBS_029.jpg,161,155,199,191,Tree
```
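Before training, it can help to sanity-check an annotations file against the column layout shown above. The following is a minimal sketch using only the Python standard library; `check_annotations` is a hypothetical helper written for this page, not part of DeepForest. It treats an all-zero box as valid, matching the negative-sample convention described later in this document.

```python
import csv
import io

REQUIRED_COLUMNS = ["image_path", "xmin", "ymin", "xmax", "ymax", "label"]


def check_annotations(handle):
    """Return a list of problems found in an annotations CSV (empty if none)."""
    # skipinitialspace tolerates headers written as "image_path, xmin, ..."
    reader = csv.DictReader(handle, skipinitialspace=True)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return ["missing columns: {}".format(", ".join(missing))]

    problems = []
    for line_number, row in enumerate(reader, start=2):
        xmin, ymin, xmax, ymax = (float(row[c]) for c in REQUIRED_COLUMNS[1:5])
        is_negative = xmin == ymin == xmax == ymax == 0
        if not is_negative and (xmax <= xmin or ymax <= ymin):
            problems.append("line {}: box has no area".format(line_number))
    return problems


sample = io.StringIO(
    "image_path,xmin,ymin,xmax,ymax,label\n"
    "OSBS_029.jpg,256,99,288,140,Tree\n"
    "OSBS_029.jpg,166,253,225,304,Tree\n"
)
print(check_annotations(sample))
```

To check a file on disk, pass an open file handle instead of the `io.StringIO` sample.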

The config file specifies the path to the CSV file that we want to use when training. The images are located in the working directory by default, and a user can provide a path to a different image directory.

```python
import os
from deepforest import main
from deepforest import get_data

# Example run with short training
annotations_file = get_data("testfile_deepforest.csv")

# Load the default model
m = main.deepforest()

m.config.train.epochs = 1
m.config.train.csv_file = annotations_file
m.config.train.root_dir = os.path.dirname(annotations_file)

m.create_trainer()
```

For debugging, it's often useful to set [`fast_dev_run = True` from PyTorch Lightning](https://pytorch-lightning.readthedocs.io/en/latest/common/trainer.html#fast-dev-run):

```python
m.config.train.fast_dev_run = True
```

See [config](https://deepforest.readthedocs.io/en/latest/ConfigurationFile.html) for the full set of available arguments. You can also pass any [additional](https://pytorch-lightning.readthedocs.io/en/latest/common/trainer.html) PyTorch Lightning argument to the trainer.

To begin training, we create a PyTorch Lightning trainer and call `trainer.fit`, passing the model object to its own trainer.
While this might look a touch awkward, it exposes the full PyTorch Lightning functionality.

```python
m.trainer.fit(m)
```

[For more, see Google colab demo on model training](https://colab.research.google.com/drive/1gKUiocwfCvcvVfiKzAaf6voiUVL2KK_r?usp=sharing)

## Fine-tuning vs from-scratch training

Depending on your task, you might want to fine-tune an existing DeepForest model. This is probably the case if you want to detect trees in a region where the default model performs poorly. If your detection task is very different, such as detecting wildlife or working with non-aerial images, then you may wish to train "from scratch". Within DeepForest, this means starting from a generic pretrained model, typically trained on a large dataset like MS-COCO (and usually on ImageNet as well). This is almost always more efficient than truly training from random weights.

To specify that you don't want to use a prebuilt model, set `model.name = None`. For example:

```python
m = main.deepforest(config_args={"num_classes": 3,
                                 "label_dict": {
                                     "Tree": 0,
                                     "Bird": 1,
                                     "Animal": 2
                                 },
                                 "model": {"name": None}})
```

which will create an initialized RetinaNet model with 3 classes, ready for training. You must always specify your class count and label map.

## Custom datasets with other classes

If you need to re-train a "tree" detection model to work in your specific survey area or ecosystem, you don't need to do anything. Models themselves have no understanding of the label "tree"; they output predictions corresponding to numerical class IDs (starting at 0). The default config sets `{"Tree": 0}`.

However, if you want to train on multiple classes or detect something that isn't a tree (for example we host a model for multiple Everglades bird species), you need to specify:

1. The number of classes you want to train on (this may be 1, unchanged)
2. The label_dict that specifies what your classes are called.

For example:

```python
config_args = {
    "num_classes": 2,
    "label_dict": {
        "Alive": 0,
        "Dead": 1
    }
}

m = main.deepforest(config_args=config_args)
```

Under the hood, the following steps are taken:

1. DeepForest loads a model from a checkpoint (for fine-tuning), or it initializes a model ready for training.
2. If your class count and label dict are the same, nothing happens
3. If the class count is the same, but your label dict differs (e.g. you set `{"Bird": 0}` over a tree model), then the model's label mapping is updated to reflect the new class list. At this point the model weights are unchanged, but the assumption is that you will re-train it.
4. If the class count differs, then the model is modified with the desired number of classes (and it will not provide good predictions until re-trained).

If you modify the default configuration, we assume that you intend to train on your own data, and we will respect your overrides.
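The cases listed above can be summarised as a small decision function. This is only an illustrative sketch: `describe_label_update` and its return values are hypothetical names chosen for this page and do not reflect DeepForest's internal code.

```python
def describe_label_update(checkpoint_labels, requested_labels):
    """Say which case applies for a checkpoint label dict and a requested one."""
    if checkpoint_labels == requested_labels:
        return "unchanged"       # same classes, same names: nothing happens
    if len(checkpoint_labels) == len(requested_labels):
        return "relabel"         # same class count, new names
    return "resize_head"         # class count differs: re-training is required


tree_model = {"Tree": 0}
print(describe_label_update(tree_model, {"Tree": 0}))
print(describe_label_update(tree_model, {"Bird": 0}))
print(describe_label_update(tree_model, {"Alive": 0, "Dead": 1}))
```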

## Disable the progress bar

If you want to disable the progress bar while training, change the `create_trainer` call to:

```python
from deepforest import main

m = main.deepforest()
m.create_trainer(enable_progress_bar=False)
```

## Loggers

DeepForest logs the training loss, validation loss and class metrics (for multi-class models) during each epoch. To view the training curves, we *highly* recommend using a PyTorch Lightning logger, which is the proper way of handling the many outputs during training. See the [pytorch-lightning docs](https://lightning.ai/docs/pytorch/stable/extensions/logging.html) for all available loggers.

```python
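from pytorch_lightning.loggers import CSVLogger
from deepforest import main

m = main.deepforest()
# CSVLogger is one option; any supported PyTorch Lightning logger works here
logger = CSVLogger(save_dir="./logs")
m.create_trainer(logger=logger)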
```

### Video walkthrough of colab

<div style="position: relative; padding-bottom: 56.25%; height: 0;"><iframe src="https://www.loom.com/embed/99c55129d5a34f3dbf7053dde9c7d97e" frameborder="0" webkitallowfullscreen mozallowfullscreen allowfullscreen style="position: absolute; top: 0; left: 0; width: 100%; height: 100%;"></iframe></div>

## Reducing tile size

High resolution tiles may exceed GPU or CPU memory during training, especially when many target objects are present. To reduce the size of each tile, use `preprocess.split_raster` to divide the original tile into smaller pieces and create a corresponding annotations file.

For example, this sample data raster is 2472 x 2299 pixels.

```python
import rasterio
from deepforest import get_data

# Open the sample raster that will be split into overlapping crops
raster = get_data("2019_YELL_2_528000_4978000_image_crop2.png")
src = rasterio.open(raster)  # may emit a NotGeoreferencedWarning: the PNG has no geotransform
src.read().shape
# (3, 2472, 2299)
```

The raster has 574 tree annotations:

```python
from deepforest import utilities
from deepforest import get_data

annotations = utilities.read_pascal_voc(get_data("2019_YELL_2_528000_4978000_image_crop2.xml"))
annotations.shape
# (574, 6)
```

```python
import tempfile
from deepforest import preprocess

#Write csv to file and crop
tmpdir = tempfile.gettempdir()
annotations.to_csv("{}/example.csv".format(tmpdir), index=False)
annotations_file = preprocess.split_raster(path_to_raster=raster,
                                           annotations_file="{}/example.csv".format(tmpdir),
                                           base_dir=tmpdir,
                                           patch_size=500,
                                           patch_overlap=0.25)

# split_raster returns a pandas DataFrame with 6 columns
assert annotations_file.shape[1] == 6
```

Now we have crops and annotations in 500 px patches for training.
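To get a feel for how `patch_size` and `patch_overlap` interact, the sketch below computes patch start offsets along each axis for the raster above. It assumes a stride of `patch_size * (1 - patch_overlap)` and a final patch placed flush with the image edge; `window_origins` is a hypothetical helper, and `split_raster` may lay out its windows differently.

```python
def window_origins(length, patch_size, patch_overlap):
    """Start offsets of patches along one axis, keeping every patch inside the image."""
    stride = int(patch_size * (1 - patch_overlap))
    if length <= patch_size:
        return [0]
    origins = list(range(0, length - patch_size + 1, stride))
    if origins[-1] + patch_size < length:
        origins.append(length - patch_size)  # final patch flush with the edge
    return origins


# Raster shape from the example above: 2472 rows by 2299 columns
rows = window_origins(2472, patch_size=500, patch_overlap=0.25)
cols = window_origins(2299, patch_size=500, patch_overlap=0.25)
print(rows)
print(cols)
print(len(rows) * len(cols), "patches under these assumptions")
```

Larger overlaps reduce the chance that an object is cut by a patch boundary, at the cost of more patches to train on.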

### Negative samples

To include images with no annotations from the target classes, create a dummy row that specifies the image_path and sets all bounding box coordinates to 0:

```
image_path,xmin,ymin,xmax,ymax,label
myimage.png,0,0,0,0,Tree
```

Excessive use of negative samples may have a negative impact on model performance, but when used sparingly, they can increase precision.
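As a sketch of how such rows might be generated, the snippet below appends one all-zero row per empty image using the standard library `csv` module. The helper name `negative_rows` and the file names are hypothetical, chosen only for illustration.

```python
import csv
import io


def negative_rows(image_paths, label="Tree"):
    """Build one all-zero annotation row per image that contains no target objects."""
    return [[path, 0, 0, 0, 0, label] for path in image_paths]


buffer = io.StringIO()  # stands in for an open annotations file
writer = csv.writer(buffer)
writer.writerow(["image_path", "xmin", "ymin", "xmax", "ymax", "label"])
writer.writerow(["OSBS_029.jpg", 256, 99, 288, 140, "Tree"])
writer.writerows(negative_rows(["empty_tile_1.png", "empty_tile_2.png"]))
print(buffer.getvalue())
```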

### Model checkpoints

PyTorch Lightning allows you to [save a model](https://pytorch-lightning.readthedocs.io/en/stable/common/trainer.html#checkpoint-callback) at the end of each epoch. By default this behavior is turned off, since it slows down training and quickly fills up storage. To enable model checkpointing, pass a `ModelCheckpoint` callback to the trainer:

```python
from pytorch_lightning.callbacks import ModelCheckpoint
from pytorch_lightning.loggers import TensorBoardLogger

# Keep the three checkpoints with the highest box recall
callback = ModelCheckpoint(dirpath='temp/dir',
                           monitor='box_recall',
                           mode="max",
                           save_top_k=3,
                           filename="box_recall-{epoch:02d}-{box_recall:.2f}")
m.create_trainer(logger=TensorBoardLogger(save_dir='logdir/'),
                 callbacks=[callback])
m.trainer.fit(m)
```

### Saving and loading models

```python
import tempfile
import pandas as pd
from deepforest import main
from deepforest import get_data

tmpdir = tempfile.TemporaryDirectory()

# Create a deepforest model and load the latest release
m = main.deepforest()
m.load_model("weecology/deepforest-tree")

# Train (assumes m.config.train.csv_file and root_dir are set as shown earlier),
# then keep the predictions to compare against the reloaded checkpoint
img_path = get_data("OSBS_029.png")
m.create_trainer()
m.trainer.fit(m)
pred_after_train = m.predict_image(path=img_path)

# Save a checkpoint via the trainer
checkpoint_path = "{}/checkpoint.pl".format(tmpdir.name)
m.trainer.save_checkpoint(checkpoint_path)

# Reload the checkpoint into a new model object
after = main.deepforest.load_from_checkpoint(checkpoint_path)
pred_after_reload = after.predict_image(path=img_path)

assert not pred_after_train.empty
assert not pred_after_reload.empty
pd.testing.assert_frame_equal(pred_after_train, pred_after_reload)
```

---

Note that when reloading models, you should carefully inspect the model parameters, such as the score_thresh and nms_thresh. These parameters are updated during model creation and the config file is not read when loading from checkpoint!

It is best to set these values explicitly after loading a checkpoint. If you want to save hyperparameters, edit the deepforest_config.yml directly. This will allow the hyperparameters to be reloaded on deepforest.save_model().

---

```
after.model.score_thresh = 0.3
```

Some users have reported a PyTorch Lightning module error on save. In this case, saving just the torch model's state dict is an easy fix.

```
import torch

model_path = "deepforest_weights.pt"  # hypothetical output path
torch.save(m.model.state_dict(), model_path)
```

and restore

```
model = main.deepforest()
model.model.load_state_dict(torch.load(model_path))
```

Note that if you trained on a GPU and restore on a CPU, you will need the `map_location` argument in `torch.load`, for example `torch.load(model_path, map_location="cpu")`.


### Data Augmentations

DeepForest supports configurable data augmentations using [Albumentations](https://albumentations.ai/docs/3-basic-usage/bounding-boxes-augmentations/) to improve model generalization across different sensors and acquisition conditions. Augmentations can be specified through the configuration file or passed directly to the model.

#### Configuration-based Augmentations

The easiest way to specify augmentations is through the config file:

```yaml
train:
  # Simple list of augmentation names
  augmentations: ["HorizontalFlip", "Downscale", "RandomBrightnessContrast"]

  # Or as a list of custom parameters
  augmentations:
    - HorizontalFlip: {p: 0.5}
    - Downscale: {scale_range: [0.25, 0.75], p: 0.5}
    - RandomSizedBBoxSafeCrop: {height: 400, width: 400, p: 0.3}
    - PadIfNeeded: {min_height: 400, min_width: 400, p: 1.0}
```

Note that augmentations are provided as a list (prepended with a `-` in YAML). If you omit this, the parameter will be interpreted as a dictionary and the config parser may fail. If you provide only the augmentation name, default settings will be used. These have been chosen to reflect sensible parameters for different transformations, as it's quite easy to "over augment" which can make models harder to train. By default, if you enable augmentation and do not specify a transform explicitly, only `HorizontalFlip` will be used.

You can also specify augmentations programmatically:

```python
from deepforest import main

# Using config_args
config_args = {
    "train": {
        "augmentations": ["HorizontalFlip", "Downscale", "RandomBrightnessContrast"]
    }
}
model = main.deepforest(config_args=config_args)

# Or with parameters: each entry is a single-key dict, mirroring the YAML list
config_args = {
    "train": {
        "augmentations": [
            {"HorizontalFlip": {"p": 0.8}},
            {"Downscale": {"scale_range": (0.5, 0.9), "p": 0.3}}
        ]
    }
}
model = main.deepforest(config_args=config_args)
```
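The two accepted entry shapes described above (a bare name, or a name mapped to its parameters) can be reduced to a single form before use. The sketch below shows one way to do that; `normalize_augmentations` is a hypothetical helper for illustration and is not DeepForest's actual config parser.

```python
def normalize_augmentations(spec):
    """Turn a list of names or single-key dicts into (name, params) pairs."""
    if isinstance(spec, dict):
        # Mirrors the YAML pitfall: entries written without the leading "-"
        raise TypeError("augmentations must be a list, not a dictionary")

    pairs = []
    for item in spec:
        if isinstance(item, str):
            pairs.append((item, {}))  # name only: library defaults apply
        elif isinstance(item, dict) and len(item) == 1:
            name, params = next(iter(item.items()))
            pairs.append((name, dict(params)))
        else:
            raise ValueError("unsupported augmentation entry: {!r}".format(item))
    return pairs


print(normalize_augmentations(["HorizontalFlip", {"Downscale": {"p": 0.3}}]))
```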

#### Available Augmentations

DeepForest supports the following augmentations optimized for object detection:

- **[HorizontalFlip](https://albumentations.ai/docs/api-reference/albumentations/augmentations/geometric/flip/#HorizontalFlip)**: Randomly flip images horizontally
- **[VerticalFlip](https://albumentations.ai/docs/api-reference/albumentations/augmentations/geometric/flip/#VerticalFlip)**: Randomly flip images vertically
- **[Downscale](https://albumentations.ai/docs/api-reference/albumentations/augmentations/pixel/transforms/#Downscale)**: Randomly downscale images to simulate different resolutions
- **[RandomSizedBBoxSafeCrop](https://albumentations.ai/docs/api-reference/albumentations/augmentations/crops/transforms/#RandomSizedBBoxSafeCrop)**: Crop image while preserving bounding boxes
- **[PadIfNeeded](https://albumentations.ai/docs/api-reference/albumentations/augmentations/geometric/pad/#PadIfNeeded)**: Pad images to minimum size
- **[Rotate](https://albumentations.ai/docs/api-reference/albumentations/augmentations/geometric/rotate/#Rotate)**: Rotate images by small angles
- **[RandomBrightnessContrast](https://albumentations.ai/docs/api-reference/albumentations/augmentations/pixel/transforms/#RandomBrightnessContrast)**: Adjust brightness and contrast
- **[HueSaturationValue](https://albumentations.ai/docs/api-reference/albumentations/augmentations/pixel/transforms/#HueSaturationValue)**: Adjust color properties
- **[GaussNoise](https://albumentations.ai/docs/api-reference/albumentations/augmentations/pixel/transforms/#GaussNoise)**: Add gaussian noise
- **[Blur](https://albumentations.ai/docs/api-reference/albumentations/augmentations/blur/transforms/#Blur)**: Apply blur effect

#### Zoom Augmentations for Multi-Resolution Training

For improved generalization across different image resolutions and scales:

```python
from deepforest import main

# Configuration for zoom/scale augmentations
config_args = {
    "train": {
        "augmentations": [
            # Simulate different acquisition heights/resolutions
            {"Downscale": {"scale_range": (0.25, 0.75), "p": 0.5}},

            # Crop at different scales while preserving objects
            {"RandomSizedBBoxSafeCrop": {"height": 400, "width": 400, "p": 0.3}},

            # Ensure minimum image size
            {"PadIfNeeded": {"min_height": 400, "min_width": 400, "p": 1.0}},

            # Basic data augmentation
            {"HorizontalFlip": {"p": 0.5}}
        ]
    }
}

model = main.deepforest(config_args=config_args)
```

#### Custom Transforms (Advanced)

For complete control over augmentations, you can still provide custom transforms:

```python
import albumentations as A
from albumentations.pytorch import ToTensorV2
from deepforest import main

def get_transform(augment):
    """Custom transform function"""
    if augment:
        transform = A.Compose([
            A.HorizontalFlip(p=0.5),
            A.Downscale(scale_range=(0.25, 0.75), p=0.5),
            ToTensorV2()
        ], bbox_params=A.BboxParams(format='pascal_voc', label_fields=["category_ids"]))
    else:
        transform = A.Compose([ToTensorV2()],
                             bbox_params=A.BboxParams(format='pascal_voc', label_fields=["category_ids"]))
    return transform

model = main.deepforest(transforms=get_transform)
```

**Note**: When creating custom transforms, always include `ToTensorV2()` and properly configure `bbox_params` for object detection. If your augmentation pipeline does not contain any geometric transformations, `bbox_params` is not required. Otherwise it's important that you keep the format as `pascal_voc` so that the boxes are correctly interpreted by Albumentations.

### How do I make training faster?

While it is impossible to anticipate the setup for all users, there are a few guidelines. First, a GPU is key. Training on a CPU can be done, but it will take much longer (on the order of 100x), so it is only worthwhile when no GPU is available. Using Google Colab can be beneficial but is prone to errors. Once on the GPU, the configuration includes a "workers" argument, which is passed to PyTorch's dataloader. As the number of workers increases, data is fed to the GPU in parallel. Increase the number of workers slowly; we have found that the optimal number varies by system.

```
m.config.workers = 5
```

It is not foolproof, and occasionally 0 workers, in which data loading is run on the main thread, is optimal: https://stackoverflow.com/questions/73331758/can-ideal-num-workers-for-a-large-dataset-in-pytorch-be-0.

For large training runs, setting preload_images to True can be helpful.

```
m.config.train.preload_images = True
```

This will load all data into GPU memory once, at the beginning of the run. This is great, but it requires you to have enough memory space to do so.
Similarly, increasing the batch size can speed up training. As with both of the options above, we have seen examples where performance (and accuracy) improves and decreases depending on batch size. Track experiment results carefully when altering batch size, since it directly [affects the speed of learning](https://www.baeldung.com/cs/learning-rate-batch-size).

```
m.config.batch_size = 10
```

Remember to call m.create_trainer() after updating the config dictionary.
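One reason batch size interacts with learning speed is simple arithmetic: a larger batch means fewer weight updates per epoch. The sketch below makes that concrete for a hypothetical dataset of 1000 images, assuming the final partial batch is kept rather than dropped; `steps_per_epoch` is an illustrative helper, not a DeepForest function.

```python
import math


def steps_per_epoch(num_images, batch_size):
    """Number of optimizer steps needed to see every image once."""
    return math.ceil(num_images / batch_size)


for batch_size in (1, 5, 10):
    print("batch_size={}: {} steps per epoch".format(
        batch_size, steps_per_epoch(1000, batch_size)))
```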

### Avoiding **Weakly referenced objects** errors

On some devices and systems we have found an error referencing the `model.trainer` object that was created in `m.create_trainer()`.
We welcome a reproducible issue to address this error, as it appears highly variable and relates to upstream issues. It appears more common on Google Colab and GitHub Actions.

In most cases, this error appears when running multiple calls to model.predict or model.train. We believe this occurs because garbage collection has deleted the model.trainer object; see:

- https://github.com/Lightning-AI/lightning/issues/12233
- https://github.com/weecology/DeepForest/issues/338

If you run into this error, users (e.g. https://github.com/weecology/DeepForest/issues/443) have found that creating the trainer object within the loop can resolve it.

```python
for tile in tiles_to_predict:
    m.create_trainer()
    m.predict_tile(tile)
```

Usually creating this object does not cost too much computational time.
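For readers unfamiliar with the underlying Python mechanism, the sketch below reproduces the same class of error with the standard library alone. It is not a reproduction of the DeepForest or Lightning internals; `FakeTrainer` is a hypothetical stand-in. A weak reference does not keep its target alive, so once the last strong reference is dropped, any access through the proxy fails.

```python
import gc
import weakref


class FakeTrainer:
    """Stand-in for an object that other code only holds weakly."""

    def fit(self):
        return "fitting"


trainer = FakeTrainer()
proxy = weakref.proxy(trainer)
print(proxy.fit())  # works while a strong reference exists

del trainer         # drop the only strong reference
gc.collect()
try:
    proxy.fit()
except ReferenceError as error:
    print("ReferenceError:", error)
```

Recreating the object before each use, as in the loop above, guarantees that a live strong reference exists at the time it is needed.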

#### Training across multiple nodes on an HPC system

We have heard that this error can appear when trying to deep copy the pytorch lightning module. The trainer object is not pickleable.
For example, on multi-gpu environments when trying to scale the deepforest model the entire module is copied leading to this error.
Setting the trainer object to None and directly using the pytorch object is a reasonable workaround.

Replace

```python
m = main.deepforest()
m.create_trainer()
m.trainer.fit(m)
```

with

```python
from pytorch_lightning import Trainer

m = main.deepforest()
m.trainer = None

# comet_logger is assumed to be a logger you have already created
trainer = Trainer(
    accelerator="gpu",
    strategy="ddp",
    devices=m.config.devices,
    enable_checkpointing=False,
    max_epochs=m.config.train.epochs,
    logger=comet_logger
)
trainer.fit(m)
```

The added benefit of this is more control over the trainer object.
The downside is that it doesn't align with the .config pattern, since a user now has to look into the config to create the trainer.
We are open to changing this to be the default pattern in the future and welcome input from users.
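The copy failure described in this section can be illustrated without any GPU or Lightning code. In the sketch below, `FakeModule` is a hypothetical stand-in whose `trainer` attribute is a `threading.Lock`, used here only because locks are a well-known example of an object that cannot be pickled or deep-copied. Detaching that attribute before copying is the same idea as setting `m.trainer = None`.

```python
import copy
import threading


class FakeModule:
    """Stand-in for a model that holds a reference to a non-copyable helper."""

    def __init__(self):
        self.weights = [0.1, 0.2]
        self.trainer = threading.Lock()  # hypothetical unpicklable attribute


module = FakeModule()
try:
    copy.deepcopy(module)
except TypeError as error:
    print("copy failed:", error)

module.trainer = None  # detach the helper before copying
clone = copy.deepcopy(module)
print(clone.weights)
```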

#### Visualization during training

Visualizing images during training can be valuable to spot augmentation that isn't working as you expected, label issues and to see if the model is learning anything. To make this easy, we provide a Lightning callback that can be used with the trainer: `deepforest.callbacks.ImagesCallback`. You need to provide a directory path where the images will be saved, which can be a temporary path if you don't want to keep the images. To use, create the callback object and pass it to `create_trainer` along with any other callbacks you need.

```python
from pathlib import Path
import tempfile
from pytorch_lightning.loggers import CSVLogger
from deepforest import callbacks

logger = CSVLogger(save_dir="./logs")

# You can access the log directory from other loggers you've set up, to keep everything in the same place.
# the log_dir attribute is the resolved path that includes name and version number.


im_callback = callbacks.ImagesCallback(save_dir=Path(logger.log_dir) / "images", every_n_epochs=2)
m.create_trainer(callbacks=[im_callback], logger=logger)
```

The callback will, by default, log images to disk. When training starts, it will save images for the training and validation dataset (if available). Then at a user-specified interval (`every_n_epochs`), predictions will be logged with ground truth. If you have Comet or Tensorboard loggers (loggers which accept `add_image` or `log_image`) then the callback will attempt to log to those. Due to auto-discovery behavior with Comet, the callback will preferentially log to Tensorboard if present, to avoid images being pushed to Comet twice. To adjust the number of samples saved, modify `dataset_samples` and `prediction_samples` (set to 0 to disable).

If you don't want to keep images on disk, for example if you're using a cloud-based logger like Comet, you can pass a temporary directory to the callback.

```python
with tempfile.TemporaryDirectory() as tmpdir:
    im_callback = callbacks.ImagesCallback(save_dir=tmpdir, every_n_epochs=2)
    m.create_trainer(callbacks=[im_callback], logger=logger)
    m.trainer.fit(m)
```

#### Training via command line

We provide a basic script to trigger a training run via the CLI. This script is installed as part of the standard DeepForest installation and is called with `deepforest train`. We use [Hydra](https://hydra.cc/docs/intro/) for configuration management, and you can pass configuration parameters as command line arguments as follows:

```{note}
If you are using `uv` to manage your Python environment, remember to prefix these commands with `uv run`, for example: `uv run deepforest predict`.
```

```bash
deepforest train batch_size=8 train.csv_file=your_labels.csv train.root_dir=some/path
```

Under the hood, this simply sets up a standard DeepForest model, creates a trainer and runs `fit`. You can have a look at the script in `src/deepforest/scripts/cli.py` to see exactly what's being run. However, for most users, configuring the dataset paths (`train.csv_file` and `train.root_dir`) and perhaps modifying `batch_size` should be sufficient to start with.

To check what configuration options are available, run:

```bash
deepforest config
```

which will show you where Hydra is looking and what the current (default) config is. You can then pick which parameters you want to override with the syntax `<key>=<value>`. For nested configuration, like the `train` option we passed above, separate levels of nesting with periods (`.`), such as `train.scheduler.params.gamma=0.1`.
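To make the dotted `<key>=<value>` syntax concrete, the sketch below applies a few overrides to a nested dictionary. It is a simplified illustration, not Hydra's parser: `apply_overrides` is a hypothetical helper, the starting config is made up, and values are kept as strings, whereas Hydra also converts types.

```python
import copy


def apply_overrides(config, overrides):
    """Apply "a.b.c=value" strings to a nested dict, returning a new dict."""
    result = copy.deepcopy(config)
    for override in overrides:
        key, _, value = override.partition("=")
        *parents, leaf = key.split(".")
        node = result
        for part in parents:
            node = node.setdefault(part, {})  # descend, creating levels as needed
        node[leaf] = value
    return result


defaults = {"batch_size": "1", "train": {"csv_file": None}}
print(apply_overrides(defaults, [
    "batch_size=8",
    "train.csv_file=your_labels.csv",
    "train.scheduler.params.gamma=0.1",
]))
```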

For more complex training requirements, such as adding additional transforms or other aspects which are not directly modifiable via the config file, we suggest using the script as a reference.

To override settings using your own configuration file, we recommend copying the default configuration and placing it in your working/training directory. This is the **preferred method** for repeatable training because it will provide you with a full record of the configuration of your training run. Then run:

```bash
deepforest --config-dir /your/config/folder --config-name config_file_name train
```

Note you don't need to pass the `yaml` extension. This method uses Hydra's [standard flags](https://hydra.cc/docs/advanced/hydra-command-line-flags/). Otherwise you can save a config file using any valid subset of the options (for example just the CSV location and root directory) and Hydra will overlay those on top of the default config.
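The overlay behaviour described above amounts to a recursive merge, where values in your file win and everything else falls back to the defaults. The sketch below shows the idea with plain dictionaries; `overlay` is a hypothetical helper and the default values are made up for illustration. Hydra performs the real merge itself.

```python
def overlay(defaults, overrides):
    """Recursively merge overrides onto defaults without mutating either."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = overlay(merged[key], value)
        else:
            merged[key] = value
    return merged


default_config = {"batch_size": 1, "train": {"csv_file": None, "root_dir": None, "epochs": 1}}
partial_config = {"train": {"csv_file": "your_labels.csv", "root_dir": "some/path"}}
print(overlay(default_config, partial_config))
```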