deepforest package

Submodules

deepforest.IoU module

IoU Module, with help from https://github.com/SpaceNetChallenge/utilities/blob/spacenetV3/spacenetutilities/evalTools.py

deepforest.IoU.compute_IoU(ground_truth, submission)[source]
Parameters:
  • ground_truth – a projected geopandas dataframe with geometry
  • submission – a projected geopandas dataframe with geometry
Returns:

dataframe of IoU scores

Return type:

iou_df
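
A minimal usage sketch; the shapefile paths are hypothetical, and both dataframes must be projected and carry box geometries.

import geopandas as gpd
from deepforest import IoU

ground_truth = gpd.read_file("ground_truth.shp")  # projected annotation boxes
submission = gpd.read_file("submission.shp")      # projected predicted boxes
iou_df = IoU.compute_IoU(ground_truth, submission)
print(iou_df.head())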

deepforest.IoU.create_rtree_from_poly(poly_list)[source]

deepforest.callbacks module

A deepforest callback. Callbacks must have on_epoch_begin, on_epoch_end, on_fit_begin, and on_fit_end methods and inject model and epoch kwargs.

class deepforest.callbacks.images_callback(csv_file, root_dir, savedir, n=2, every_n_epochs=5)[source]

Bases: pytorch_lightning.callbacks.base.Callback

Run evaluation on a file of annotations during training.

Parameters:
  • model – pytorch model
  • csv_file – path to csv with columns image_path, xmin, ymin, xmax, ymax, label
  • epoch – integer, the current epoch
  • experiment – optional comet_ml experiment
  • savedir – optional directory to save predicted images
  • project – whether to project image coordinates into geographic coordinates, see deepforest.evaluate
  • root_dir – root directory of images to search for 'image_path' values from the csv file
  • iou_threshold – intersection-over-union threshold, see deepforest.evaluate
  • probability_threshold – minimum probability for inclusion, see deepforest.evaluate
  • n – number of images to upload
  • every_n_epochs – epoch interval at which to run

Returns: either prints validation scores or logs them to a comet experiment
Return type: None
log_images(pl_module)[source]
on_epoch_end(trainer, pl_module)[source]

Called when either of train/val/test epoch ends.
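
A hedged sketch attaching the callback when creating a trainer; the paths are hypothetical, and create_trainer is documented under deepforest.main below.

from deepforest import main
from deepforest.callbacks import images_callback

m = main.deepforest()
cb = images_callback(csv_file="annotations.csv", root_dir="images/",
                     savedir="eval_images/", n=2, every_n_epochs=5)
m.create_trainer(callbacks=[cb])  # predicted images are saved every 5 epochs during training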

deepforest.dataset module

Dataset module

https://pytorch.org/docs/stable/torchvision/models.html#object-detection-instance-segmentation-and-person-keypoint-detection

During training, the model expects both the input tensors and targets (a list of dictionaries), containing:

boxes (FloatTensor[N, 4]): the ground-truth boxes in [x1, y1, x2, y2] format, with values between 0 and H and 0 and W

labels (Int64Tensor[N]): the class label for each ground-truth box

https://colab.research.google.com/github/benihime91/pytorch_retinanet/blob/master/demo.ipynb#scrollTo=0zNGhr6D7xGN

class deepforest.dataset.TreeDataset(csv_file, root_dir, transforms=None, label_dict={'Tree': 0}, train=True)[source]

Bases: torch.utils.data.dataset.Dataset

Parameters:
  • csv_file (string) – Path to a single csv file with annotations.
  • root_dir (string) – Directory with all the images.
  • transforms (callable, optional) – Optional transform to be applied on a sample.
  • label_dict – a dictionary where keys are labels from the csv column and values are numeric labels, e.g. "Tree" -> 0
Returns:

If train is True, a tuple of (path, image, targets); otherwise image
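
A minimal construction sketch; the csv path and image directory are hypothetical.

from deepforest import dataset

ds = dataset.TreeDataset(csv_file="annotations.csv",
                         root_dir="images/",
                         transforms=dataset.get_transform(augment=True),
                         label_dict={"Tree": 0},
                         train=True)
path, image, targets = ds[0]  # with train=True each item is (path, image, targets)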

deepforest.dataset.get_transform(augment)[source]

Albumentations transformation of bounding boxes

deepforest.evaluate module

Evaluation module

deepforest.evaluate.evaluate(predictions, ground_df, root_dir, iou_threshold=0.4, savedir=None)[source]

Image-annotated crown evaluation routine. The submission can be a .shp file, an existing pandas dataframe, or a .csv path.

Parameters:
  • predictions – a pandas dataframe; if supplied, a root_dir is needed to resolve the relative paths in df.name. The labels in ground truth and predictions must match; if one is numeric, the other must be numeric.
  • ground_df – a pandas dataframe; if supplied, a root_dir is needed to resolve the relative paths in df.name
  • root_dir – location of files in the dataframe ‘name’ column.
Returns:

  • results – a dataframe of matched bounding boxes
  • box_recall – proportion of true positives of box position, regardless of class
  • box_precision – proportion of predictions that are true positives, regardless of class
  • class_recall – a pandas dataframe of class-level recall and precision with class sizes
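
A hedged sketch of the evaluation call; the file paths are hypothetical, and indexing the result by the names above assumes a dict-like return.

import pandas as pd
from deepforest import evaluate as evaluate_module

# Both dataframes follow the image_path, xmin, ymin, xmax, ymax, label layout
predictions = pd.read_csv("predictions.csv")
ground_df = pd.read_csv("ground_truth.csv")

results = evaluate_module.evaluate(predictions=predictions, ground_df=ground_df,
                                   root_dir="images/", iou_threshold=0.4)
print(results["box_precision"], results["box_recall"])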

deepforest.evaluate.evaluate_image(predictions, ground_df, root_dir, savedir=None)[source]

Compute intersection-over-union matching among prediction and ground truth boxes for one image.

Parameters:
  • df – a pandas dataframe with columns name, xmin, xmax, ymin, ymax, label. The 'name' column should be the path relative to the location of the file.
  • summarize – whether to group statistics by plot and overall score
  • image_coordinates – whether the current boxes are in the coordinate system of the image, e.g. origin (0,0) upper left
  • root_dir – where to search for image names in df
  • savedir – optional directory to save images with overlaid predictions and annotations

Returns: pandas dataframe with crown ids of prediction and ground truth and the IoU score.
Return type: result

deepforest.main module

class deepforest.main.deepforest(num_classes=1, label_dict={'Tree': 0}, transforms=None)[source]

Bases: pytorch_lightning.core.lightning.LightningModule

Class for training and predicting tree crowns in RGB images

Parameters: num_classes (int) – number of classes in the model
Returns: a deepforest pytorch lightning module
Return type: self
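
A minimal quickstart sketch using the methods documented below; the bundled sample filename ("OSBS_029.png") is an assumption.

from deepforest import main, get_data

# Load the current release model (downloads on first use)
m = main.deepforest()
m.use_release()

# Predict a sample image; the filename is assumed to ship with the package
sample_image = get_data("OSBS_029.png")
boxes = m.predict_image(path=sample_image)  # pandas dataframe of boxes, labels, scores
print(boxes.head())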
configure_optimizers()[source]

Choose what optimizers and learning-rate schedulers to use in your optimization. Normally you’d need one. But in the case of GANs or similar you might have multiple.

Returns: Any of these 6 options.
  • Single optimizer.
  • List or Tuple of optimizers.
  • Two lists - The first list has multiple optimizers, and the second has multiple LR schedulers (or multiple lr_dict).
  • Dictionary, with an "optimizer" key, and (optionally) a "lr_scheduler" key whose value is a single LR scheduler or lr_dict.
  • Tuple of dictionaries as described above, with an optional "frequency" key.
  • None - Fit will run without any optimizer.

The lr_dict is a dictionary which contains the scheduler and its associated configuration. The default configuration is shown below.

lr_dict = {
    # REQUIRED: The scheduler instance
    "scheduler": lr_scheduler,
    # The unit of the scheduler's step size, could also be 'step'.
    # 'epoch' updates the scheduler on epoch end whereas 'step'
    # updates it after an optimizer update.
    "interval": "epoch",
    # How many epochs/steps should pass between calls to
    # `scheduler.step()`. 1 corresponds to updating the learning
    # rate after every epoch/step.
    "frequency": 1,
    # Metric to monitor for schedulers like `ReduceLROnPlateau`
    "monitor": "val_loss",
    # If set to `True`, will enforce that the value specified in 'monitor'
    # is available when the scheduler is updated, thus stopping
    # training if not found. If set to `False`, it will only produce a warning
    "strict": True,
    # If using the `LearningRateMonitor` callback to monitor the
    # learning rate progress, this keyword can be used to specify
    # a custom logged name
    "name": None,
}

When there are schedulers in which the .step() method is conditioned on a value, such as the torch.optim.lr_scheduler.ReduceLROnPlateau scheduler, Lightning requires that the lr_dict contains the keyword "monitor" set to the metric name that the scheduler should be conditioned on.

# The ReduceLROnPlateau scheduler requires a monitor
def configure_optimizers(self):
    optimizer = Adam(...)
    return {
        "optimizer": optimizer,
        "lr_scheduler": {
            "scheduler": ReduceLROnPlateau(optimizer, ...),
            "monitor": "metric_to_track",
        },
    }


# In the case of two optimizers, only one using the ReduceLROnPlateau scheduler
def configure_optimizers(self):
    optimizer1 = Adam(...)
    optimizer2 = SGD(...)
    scheduler1 = ReduceLROnPlateau(optimizer1, ...)
    scheduler2 = LambdaLR(optimizer2, ...)
    return (
        {
            "optimizer": optimizer1,
            "lr_scheduler": {
                "scheduler": scheduler1,
                "monitor": "metric_to_track",
            },
        },
        {"optimizer": optimizer2, "lr_scheduler": scheduler2},
    )

Metrics can be made available to monitor by simply logging them using self.log('metric_to_track', metric_val) in your LightningModule.

Note

The frequency value specified in a dict along with the optimizer key is an int corresponding to the number of sequential batches optimized with the specific optimizer. It should be given to none or to all of the optimizers. There is a difference between passing multiple optimizers in a list, and passing multiple optimizers in dictionaries with a frequency of 1:

  • In the former case, all optimizers will operate on the given batch in each optimization step.
  • In the latter, only one optimizer will operate on the given batch at every step.

This is different from the frequency value specified in the lr_dict mentioned above.

def configure_optimizers(self):
    optimizer_one = torch.optim.SGD(self.model.parameters(), lr=0.01)
    optimizer_two = torch.optim.SGD(self.model.parameters(), lr=0.01)
    return [
        {"optimizer": optimizer_one, "frequency": 5},
        {"optimizer": optimizer_two, "frequency": 10},
    ]

In this example, the first optimizer will be used for the first 5 steps, the second optimizer for the next 10 steps and that cycle will continue. If an LR scheduler is specified for an optimizer using the lr_scheduler key in the above dict, the scheduler will only be updated when its optimizer is being used.

Examples:

# most cases. no learning rate scheduler
def configure_optimizers(self):
    return Adam(self.parameters(), lr=1e-3)

# multiple optimizer case (e.g.: GAN)
def configure_optimizers(self):
    gen_opt = Adam(self.model_gen.parameters(), lr=0.01)
    dis_opt = Adam(self.model_dis.parameters(), lr=0.02)
    return gen_opt, dis_opt

# example with learning rate schedulers
def configure_optimizers(self):
    gen_opt = Adam(self.model_gen.parameters(), lr=0.01)
    dis_opt = Adam(self.model_dis.parameters(), lr=0.02)
    dis_sch = CosineAnnealing(dis_opt, T_max=10)
    return [gen_opt, dis_opt], [dis_sch]

# example with step-based learning rate schedulers
# each optimizer has its own scheduler
def configure_optimizers(self):
    gen_opt = Adam(self.model_gen.parameters(), lr=0.01)
    dis_opt = Adam(self.model_dis.parameters(), lr=0.02)
    gen_sch = {
        'scheduler': ExponentialLR(gen_opt, 0.99),
        'interval': 'step'  # called after each training step
    }
    dis_sch = CosineAnnealing(dis_opt, T_max=10) # called every epoch
    return [gen_opt, dis_opt], [gen_sch, dis_sch]

# example with optimizer frequencies
# see training procedure in `Improved Training of Wasserstein GANs`, Algorithm 1
# https://arxiv.org/abs/1704.00028
def configure_optimizers(self):
    gen_opt = Adam(self.model_gen.parameters(), lr=0.01)
    dis_opt = Adam(self.model_dis.parameters(), lr=0.02)
    n_critic = 5
    return (
        {'optimizer': dis_opt, 'frequency': n_critic},
        {'optimizer': gen_opt, 'frequency': 1}
    )

Note

Some things to know:

  • Lightning calls .backward() and .step() on each optimizer and learning rate scheduler as needed.
  • If you use 16-bit precision (precision=16), Lightning will automatically handle the optimizers.
  • If you use multiple optimizers, training_step() will have an additional optimizer_idx parameter.
  • If you use torch.optim.LBFGS, Lightning handles the closure function automatically for you.
  • If you use multiple optimizers, gradients will be calculated only for the parameters of current optimizer at each training step.
  • If you need to control how often those optimizers step or override the default .step() schedule, override the optimizer_step() hook.
create_model()[source]

Define a deepforest retinanet architecture

create_trainer(logger=None, callbacks=[], **kwargs)[source]

Create a pytorch lightning trainer by reading config files.

Parameters: callbacks (list) – a list of pytorch-lightning callback classes

evaluate(csv_file, root_dir, iou_threshold=None, savedir=None)[source]

Compute intersection-over-union and precision/recall for a given iou_threshold

Parameters:
  • csv_file – location of a csv file with columns “name”,”xmin”,”ymin”,”xmax”,”ymax”,”label”, each box in a row
  • root_dir – location of files in the dataframe ‘name’ column.
  • iou_threshold – float [0,1], intersection-over-union between annotation and prediction required to score a true positive
  • savedir – optional directory path to save evaluation images
Returns:

dict of (“results”, “precision”, “recall”) for a given threshold

Return type:

results

load_dataset(csv_file, root_dir=None, augment=False, shuffle=True, batch_size=1)[source]

Create a tree dataset for inference. The csv file format is a .csv with the columns "image_path", "xmin", "ymin", "xmax", "ymax" giving the image name and bounding box position. image_path is the relative filename within the root_dir directory, not an absolute path. One bounding box per line.

Parameters:
  • csv_file – path to csv file
  • root_dir – directory of images. If None, uses "image_dir" in config
  • augment – Whether to create a training dataset, this activates data augmentations
Returns:

a pytorch dataset

Return type:

ds

predict_file(csv_file, root_dir, savedir=None, color=None, thickness=1)[source]

Create a dataset and predict entire annotation file

The csv file format is a .csv with the columns "image_path", "xmin", "ymin", "xmax", "ymax" giving the image name and bounding box position. image_path is the relative filename within the root_dir directory, not an absolute path. One bounding box per line.

Parameters:
  • csv_file – path to csv file
  • root_dir – directory of images. If None, uses "image_dir" in config
  • savedir – Optional. Directory to save image plots.
  • color – color of the bounding box as a tuple of BGR color, e.g. orange is (0, 165, 255)
  • thickness – thickness of the rectangle border line in px
Returns:

pandas dataframe with bounding boxes, label and scores for each image in the csv file

Return type:

df

predict_image(image=None, path=None, return_plot=False, color=None, thickness=1)[source]

Predict a single image with a deepforest model

Parameters:
  • image – a float32 numpy array of an RGB image in channels-last format
  • path – optional path to read image from disk instead of passing image arg
  • return_plot – Return image with plotted detections
  • color – color of the bounding box as a tuple of BGR color, e.g. orange is (0, 165, 255)
  • thickness – thickness of the rectangle border line in px
Returns:

a pandas dataframe of predictions (default); if return_plot, the input image with predictions overlaid

Return type:

boxes

predict_tile(raster_path=None, image=None, patch_size=400, patch_overlap=0.05, iou_threshold=0.15, return_plot=False, use_soft_nms=False, sigma=0.5, thresh=0.001, color=None, thickness=1)[source]

For images too large to input into the model, predict_tile cuts the image into overlapping windows, predicts trees on each window, and reassembles the results into a single array.

Parameters:
  • raster_path – Path to image on disk
  • image (array) – Numpy image array in BGR channel order following openCV convention
  • patch_size – patch size. Default 400.
  • patch_overlap – patch overlap. Default 0.05.
  • iou_threshold – minimum IoU overlap among predictions between windows to be suppressed. Defaults to 0.15. Lower values suppress more boxes at edges.
  • return_plot – should the image be returned with the predictions drawn?
  • use_soft_nms – whether to perform Gaussian soft-NMS instead of standard NMS. Default False.
  • sigma – variance of the Gaussian function used in Gaussian soft-NMS
  • thresh – the score threshold used to filter boxes after soft-NMS is performed
  • color – color of the bounding box as a tuple of BGR color, e.g. orange is (0, 165, 255)
  • thickness – thickness of the rectangle border line in px
Returns:

if return_plot, an image. Otherwise a numpy array of predicted bounding boxes, scores and labels

Return type:

boxes (array)
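
A hedged sketch of tiled prediction; the raster path is hypothetical and the parameter values mirror the documented defaults.

from deepforest import main

m = main.deepforest()
m.use_release()
# Cut the raster into 400 px windows with 5% overlap, then stitch predictions
boxes = m.predict_tile(raster_path="large_tile.tif",
                       patch_size=400,
                       patch_overlap=0.05,
                       iou_threshold=0.15)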

save_model(path)[source]

Save the trainer checkpoint to a user-defined path for future access.

Parameters: path – the path to save the model checkpoint

train_dataloader()[source]

Train loader using the configurations.

Returns: loader

training_step(batch, batch_idx)[source]

Train on a loaded dataset

use_bird_release(check_release=True)[source]

Use the latest DeepForest bird model release from github and load the model. Optionally download if the release doesn't exist.

Parameters: check_release (logical) – whether to check github for a recent model release. In cases where you are hitting the github API rate limit, set to False and any local model will be used. If no model has been downloaded an error will raise.

Returns: A trained PyTorch model
Return type: model (object)
use_release(check_release=True)[source]

Use the latest DeepForest model release from github and load the model. Optionally download if the release doesn't exist.

Parameters: check_release (logical) – whether to check github for a recent model release. In cases where you are hitting the github API rate limit, set to False and any local model will be used. If no model has been downloaded an error will raise.

Returns: A trained PyTorch model
Return type: model (object)
val_dataloader()[source]

Create a val data loader only if specified in config.

Returns: loader or None

validation_step(batch, batch_idx)[source]

Evaluate a batch from a loaded dataset

deepforest.model module

deepforest.model.create_anchor_generator(sizes=((8, 16, 32, 64, 128, 256, 400), ), aspect_ratios=((0.5, 1.0, 2.0), ))[source]

Create an anchor box generator as a function of sizes and aspect ratios. Documented at https://github.com/pytorch/vision/blob/67b25288ca202d027e8b06e17111f1bcebd2046c/torchvision/models/detection/anchor_utils.py#L9. This makes the network generate 5 x 3 anchors per spatial location, with 5 different sizes and 3 different aspect ratios. We have a Tuple[Tuple[int]] because each feature map could potentially have different sizes and aspect ratios.

Parameters:
  • sizes – anchor sizes per feature map
  • aspect_ratios – anchor aspect ratios per feature map

Returns: anchor_generator, a pytorch module

deepforest.model.create_model(num_classes, nms_thresh, score_thresh, backbone=None)[source]

Create a retinanet model.

Parameters:
  • num_classes (int) – number of classes in the model
  • nms_thresh (float) – non-max suppression threshold for intersection-over-union [0,1]
  • score_thresh (float) – minimum prediction score to keep during prediction [0,1]

Returns: a pytorch nn module
Return type: model
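
A minimal sketch from the documented signature; the threshold values are illustrative, not recommended settings.

from deepforest import model

# Single-class retinanet with illustrative NMS and score thresholds
retinanet = model.create_model(num_classes=1, nms_thresh=0.05, score_thresh=0.1)
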
deepforest.model.load_backbone()[source]

A torch vision retinanet model

deepforest.predict module

deepforest.predict.across_class_nms(predicted_boxes, iou_threshold=0.15)[source]

Perform non-max suppression on a dataframe of results (see visualize.format_boxes) to remove boxes whose IoU overlap exceeds iou_threshold.

deepforest.predict.predict_file(model, csv_file, root_dir, savedir, device, iou_threshold=0.1, color=(0, 165, 255), thickness=1)[source]

Create a dataset and predict entire annotation file

The csv file format is a .csv with the columns "image_path", "xmin", "ymin", "xmax", "ymax" giving the image name and bounding box position. image_path is the relative filename within the root_dir directory, not an absolute path. One bounding box per line. If a "label" column is present, these are assumed to be annotations and will be plotted in a different color than predictions.

Parameters:
  • csv_file – path to csv file
  • root_dir – directory of images. If None, uses "image_dir" in config
  • savedir – Optional. Directory to save image plots.
  • device – pytorch device of 'cuda' or 'cpu' for gpu prediction. Set internally.
  • color – color of the bounding box as a tuple of BGR color, e.g. orange is (0, 165, 255)
  • thickness – thickness of the rectangle border line in px
Returns:

pandas dataframe with bounding boxes, label and scores for each image in the csv file

Return type:

df

deepforest.predict.predict_image(model, image, return_plot, device, iou_threshold=0.1, color=None, thickness=1)[source]

Predict an image with a deepforest model

Parameters:
  • image – a numpy array of an RGB image with values in the range 0-255
  • path – optional path to read image from disk instead of passing image arg
  • return_plot – Return image with plotted detections
  • device – pytorch device of 'cuda' or 'cpu' for gpu prediction. Set internally.
  • color – color of the bounding box as a tuple of BGR color, e.g. orange is (0, 165, 255)
  • thickness – thickness of the rectangle border line in px
Returns:

a pandas dataframe of predictions (default); if return_plot, the input image with predictions overlaid

Return type:

boxes

deepforest.predict.predict_tile(model, device, raster_path=None, image=None, patch_size=400, patch_overlap=0.05, iou_threshold=0.15, return_plot=False, use_soft_nms=False, sigma=0.5, thresh=0.001, color=None, thickness=1)[source]

For images too large to input into the model, predict_tile cuts the image into overlapping windows, predicts trees on each window, and reassembles the results into a single array.

Parameters:
  • model – pytorch model
  • device – pytorch device of ‘cuda’ or ‘cpu’ for gpu prediction. Set internally.
  • numeric_to_label_dict – dictionary in which keys are numeric integers and values are character labels
  • raster_path – Path to image on disk
  • image (array) – Numpy image array in BGR channel order following openCV convention
  • patch_size – patch size. Default 400.
  • patch_overlap – patch overlap. Default 0.05.
  • iou_threshold – minimum IoU overlap among predictions between windows to be suppressed. Defaults to 0.15. Lower values suppress more boxes at edges.
  • return_plot – should the image be returned with the predictions drawn?
  • use_soft_nms – whether to perform Gaussian soft-NMS instead of standard NMS. Default False.
  • sigma – variance of the Gaussian function used in Gaussian soft-NMS
  • thresh – the score threshold used to filter boxes after soft-NMS is performed
Returns:

if return_plot, an image. Otherwise a numpy array of predicted bounding boxes, scores and labels

Return type:

boxes (array)

deepforest.predict.soft_nms(boxes, scores, sigma=0.5, thresh=0.001)[source]

Perform python soft-NMS to reduce the confidences of the proposals proportionally to their IoU value. Paper: Improving Object Detection With One Line of Code. Code: https://github.com/DocF/Soft-NMS/blob/master/softnms_pytorch.py

Parameters:
  • boxes – prediction bounding boxes, tensor in [x1, y1, x2, y2] format
  • scores – tensor of scores corresponding to each box
  • sigma – variance of the Gaussian function
  • thresh – score threshold

Returns: the index list of the selected boxes
Return type: idxs_keep
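
A hedged sketch on toy tensors; the box coordinates and scores are illustrative.

import torch
from deepforest.predict import soft_nms

boxes = torch.tensor([[0.0, 0.0, 10.0, 10.0],
                      [1.0, 1.0, 11.0, 11.0]])  # [x1, y1, x2, y2] per row
scores = torch.tensor([0.9, 0.8])
idxs_keep = soft_nms(boxes, scores, sigma=0.5, thresh=0.001)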

deepforest.preprocess module

The preprocessing module is used to reshape data into a format suitable for training or prediction.

For example, cutting large tiles into smaller images.

deepforest.preprocess.compute_windows(numpy_image, patch_size, patch_overlap)[source]

Create a sliding window object from a raster tile.

Parameters: numpy_image (array) – Raster object as numpy array to cut into crops
Returns: a sliding windows object
Return type: windows (list)
deepforest.preprocess.image_name_from_path(image_path)[source]

Convert path to image name for use in indexing.

deepforest.preprocess.preprocess_image(image, device)[source]

Preprocess a single RGB numpy array for prediction, converting from channels-last to channels-first.

deepforest.preprocess.save_crop(base_dir, image_name, index, crop)[source]

Save window crop as image file to be read by PIL.

Filename should match the image_name + window index

deepforest.preprocess.select_annotations(annotations, windows, index, allow_empty=False)[source]

Select annotations that overlap with selected image crop.

Parameters:
  • image_name (str) – Name of the image in the annotations file to lookup.
  • annotations_file – path to annotations file in the format -> image_path, xmin, ymin, xmax, ymax, label
  • windows – A sliding window object (see compute_windows)
  • index – The index in the windows object to use as crop bounds
  • allow_empty (bool) – If True, allow window crops that have no annotations to be included
Returns:

a pandas dataframe of annotations

Return type:

selected_annotations

deepforest.preprocess.split_raster(annotations_file, path_to_raster=None, numpy_image=None, base_dir='.', patch_size=400, patch_overlap=0.05, allow_empty=False, image_name=None)[source]

Divide a large tile into smaller arrays. Each crop will be saved to file.

Parameters:
  • numpy_image – a numpy object to be used as a raster, usually opened from rasterio.open(...).read()
  • path_to_raster – (str): Path to a tile that can be read by rasterio on disk
  • annotations_file (str) – Path to annotations file (with column names) data in the format -> image_path, xmin, ymin, xmax, ymax, label
  • base_dir (str) – Where to save the annotations and image crops relative to current working dir
  • patch_size (int) – Maximum dimensions of square window
  • patch_overlap (float) – Percent of overlap among windows 0->1
  • allow_empty – If True, include images with no annotations in the dataset
  • image_name (str) – If numpy_image arg is used, what name to give the raster?
Returns:

A pandas dataframe with annotations file for training.
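
A minimal sketch cutting a large tile into training crops; the paths are hypothetical.

from deepforest import preprocess

crop_annotations = preprocess.split_raster(
    annotations_file="full_tile_annotations.csv",
    path_to_raster="large_tile.tif",
    base_dir="crops/",  # crops and their annotations csv are written here
    patch_size=400,
    patch_overlap=0.05)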

deepforest.utilities module

Utilities module

class deepforest.utilities.DownloadProgressBar(iterable=None, desc=None, total=None, leave=True, file=None, ncols=None, mininterval=0.1, maxinterval=10.0, miniters=None, ascii=None, disable=False, unit='it', unit_scale=False, dynamic_ncols=False, smoothing=0.3, bar_format=None, initial=0, position=None, postfix=None, unit_divisor=1000, write_bytes=None, lock_args=None, nrows=None, colour=None, delay=0, gui=False, **kwargs)[source]

Bases: tqdm.std.tqdm

Download progress bar class.

Parameters:
  • iterable (iterable, optional) – Iterable to decorate with a progressbar. Leave blank to manually manage the updates.
  • desc (str, optional) – Prefix for the progressbar.
  • total (int or float, optional) – The number of expected iterations. If unspecified, len(iterable) is used if possible. If float(“inf”) or as a last resort, only basic progress statistics are displayed (no ETA, no progressbar). If gui is True and this parameter needs subsequent updating, specify an initial arbitrary large positive number, e.g. 9e9.
  • leave (bool, optional) – If [default: True], keeps all traces of the progressbar upon termination of iteration. If None, will leave only if position is 0.
  • file (io.TextIOWrapper or io.StringIO, optional) – Specifies where to output the progress messages (default: sys.stderr). Uses file.write(str) and file.flush() methods. For encoding, see write_bytes.
  • ncols (int, optional) – The width of the entire output message. If specified, dynamically resizes the progressbar to stay within this bound. If unspecified, attempts to use environment width. The fallback is a meter width of 10 and no limit for the counter and statistics. If 0, will not print any meter (only stats).
  • mininterval (float, optional) – Minimum progress display update interval [default: 0.1] seconds.
  • maxinterval (float, optional) – Maximum progress display update interval [default: 10] seconds. Automatically adjusts miniters to correspond to mininterval after long display update lag. Only works if dynamic_miniters or monitor thread is enabled.
  • miniters (int or float, optional) – Minimum progress display update interval, in iterations. If 0 and dynamic_miniters, will automatically adjust to equal mininterval (more CPU efficient, good for tight loops). If > 0, will skip display of specified number of iterations. Tweak this and mininterval to get very efficient loops. If your progress is erratic with both fast and slow iterations (network, skipping items, etc) you should set miniters=1.
  • ascii (bool or str, optional) – If unspecified or False, use unicode (smooth blocks) to fill the meter. The fallback is to use ASCII characters ” 123456789#”.
  • disable (bool, optional) – Whether to disable the entire progressbar wrapper [default: False]. If set to None, disable on non-TTY.
  • unit (str, optional) – String that will be used to define the unit of each iteration [default: it].
  • unit_scale (bool or int or float, optional) – If 1 or True, the number of iterations will be reduced/scaled automatically and a metric prefix following the International System of Units standard will be added (kilo, mega, etc.) [default: False]. If any other non-zero number, will scale total and n.
  • dynamic_ncols (bool, optional) – If set, constantly alters ncols and nrows to the environment (allowing for window resizes) [default: False].
  • smoothing (float, optional) – Exponential moving average smoothing factor for speed estimates (ignored in GUI mode). Ranges from 0 (average speed) to 1 (current/instantaneous speed) [default: 0.3].
  • bar_format (str, optional) –

    Specify a custom bar string formatting. May impact performance. [default: '{l_bar}{bar}{r_bar}'], where l_bar='{desc}: {percentage:3.0f}%|' and r_bar='| {n_fmt}/{total_fmt} [{elapsed}<{remaining}, {rate_fmt}{postfix}]'.

    Possible vars: l_bar, bar, r_bar, n, n_fmt, total, total_fmt, percentage, elapsed, elapsed_s, ncols, nrows, desc, unit, rate, rate_fmt, rate_noinv, rate_noinv_fmt, rate_inv, rate_inv_fmt, postfix, unit_divisor, remaining, remaining_s, eta.

    Note that a trailing ": " is automatically removed after {desc} if the latter is empty.

  • initial (int or float, optional) – The initial counter value. Useful when restarting a progress bar [default: 0]. If using float, consider specifying {n:.3f} or similar in bar_format, or specifying unit_scale.
  • position (int, optional) – Specify the line offset to print this bar (starting from 0) Automatic if unspecified. Useful to manage multiple bars at once (eg, from threads).
  • postfix (dict or *, optional) – Specify additional stats to display at the end of the bar. Calls set_postfix(**postfix) if possible (dict).
  • unit_divisor (float, optional) – [default: 1000], ignored unless unit_scale is True.
  • write_bytes (bool, optional) – If (default: None) and file is unspecified, bytes will be written in Python 2. If True will also write bytes. In all other cases will default to unicode.
  • lock_args (tuple, optional) – Passed to refresh for intermediate output (initialisation, iterating, and updating).
  • nrows (int, optional) – The screen height. If specified, hides nested bars outside this bound. If unspecified, attempts to use environment height. The fallback is 20.
  • colour (str, optional) – Bar colour (e.g. ‘green’, ‘#00ff00’).
  • delay (float, optional) – Don’t display until [default: 0] seconds have elapsed.
  • gui (bool, optional) – WARNING: internal parameter - do not use. Use tqdm.gui.tqdm(…) instead. If set, will attempt to use matplotlib animations for a graphical output [default: False].
Returns:

out

Return type:

decorated iterator.

update_to(b=1, bsize=1, tsize=None)[source]

Update class attributes.

Parameters:
  • b – number of blocks transferred so far
  • bsize – size of each block
  • tsize – total size, if known

deepforest.utilities.annotations_to_shapefile(df, transform, crs)[source]

Convert output from predict_image and predict_tile to a geopandas dataframe

Parameters:
  • df – prediction data.frame with columns [‘xmin’,’ymin’,’xmax’,’ymax’,’label’,’score’]
  • transform – A rasterio affine transform object
  • crs – A rasterio crs object
Returns:

a geopandas dataframe where every entry is the bounding box for a detected tree.

Return type:

results
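
A hedged sketch projecting predictions with the source raster's transform and CRS; the raster path is hypothetical.

import rasterio
from deepforest import main, utilities

m = main.deepforest()
m.use_release()
df = m.predict_tile(raster_path="tile.tif")

# rasterio exposes the affine transform and crs on an opened dataset
with rasterio.open("tile.tif") as src:
    gdf = utilities.annotations_to_shapefile(df, transform=src.transform, crs=src.crs)
gdf.to_file("predictions.shp")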

deepforest.utilities.check_file(df)[source]

Check a file format for correct column names and structure

deepforest.utilities.check_image(image)[source]

Check that an image is three-channel, channels-last format.

Parameters: image – numpy array

Returns: None, raises an error on a failed assertion

deepforest.utilities.collate_fn(batch)[source]
deepforest.utilities.project_boxes(df, root_dir)[source]

Convert output from predict_file into a geopandas dataframe. Note that this assumes df is a single plot being passed to this function.

Parameters:
  • df – a pandas dataframe with columns: image_path, xmin, ymin, xmax, ymax. image_path is the relative path within the root_dir arg.
  • root_dir – directory of images

Returns: a geodataframe with transformed boxes as geometry
Return type: geodf
deepforest.utilities.read_config(config_path)[source]

Read config yaml file

deepforest.utilities.round_with_floats(x)[source]

Check if string x is float or int, return int, rounded if needed.

deepforest.utilities.shapefile_to_annotations(shapefile, rgb, savedir='.')[source]

Convert a shapefile of annotations into an annotations csv file for DeepForest training and evaluation.

Parameters:
  • shapefile – Path to a shapefile on disk. If a label column is present, it will be used, else all labels are assumed to be "Tree"
  • rgb – Path to the RGB image on disk
  • savedir – Directory to save csv files

Returns: a pandas dataframe
Return type: results
deepforest.utilities.use_bird_release(save_dir='/home/docs/checkouts/readthedocs.org/user_builds/deepforest/checkouts/latest/deepforest/data/', prebuilt_model='bird', check_release=True)[source]

Check the existence of, or download, the latest model release from github.

Parameters:
  • save_dir – Directory to save filepath, defaults to "data" in the deepforest repo
  • prebuilt_model – Currently only accepts "bird", but could be expanded to include other prebuilt models. The local model will be called prebuilt_model.h5 on disk.
  • check_release (logical) – whether to check github for a recent model release. In cases where you are hitting the github API rate limit, set to False and any local model will be used. If no model has been downloaded an error will raise.

Returns: release_tag, output_path (str): path to downloaded model

deepforest.utilities.use_release(save_dir='/home/docs/checkouts/readthedocs.org/user_builds/deepforest/checkouts/latest/deepforest/data/', prebuilt_model='NEON', check_release=True)[source]

Check the existence of, or download, the latest model release from github.

Parameters:
  • save_dir – Directory to save filepath, defaults to "data" in the deepforest repo
  • prebuilt_model – Currently only accepts "NEON", but could be expanded to include other prebuilt models. The local model will be called prebuilt_model.h5 on disk.
  • check_release (logical) – whether to check github for a recent model release. In cases where you are hitting the github API rate limit, set to False and any local model will be used. If no model has been downloaded an error will raise.

Returns: release_tag, output_path (str): path to downloaded model

deepforest.utilities.xml_to_annotations(xml_path)[source]

Load annotations from xml format (e.g. RectLabel editor) and convert them into retinanet annotations format.

Parameters: xml_path (str) – Path to the annotations xml, formatted by RectLabel

Returns: annotations in the format -> path-to-image.png,x1,y1,x2,y2,class_name
Return type: Annotations (pandas dataframe)
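
A minimal conversion sketch; the xml path is hypothetical.

from deepforest import utilities

annotations = utilities.xml_to_annotations("annotations.xml")
annotations.to_csv("annotations.csv", index=False)  # csv ready for training/evaluation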

deepforest.visualize module

deepforest.visualize.format_boxes(prediction, scores=True)[source]

Format a retinanet prediction into a pandas dataframe for a single image.

Parameters:
  • prediction – a dictionary with keys 'boxes' and 'labels' coming from a retinanet
  • scores – whether boxes come with scores (during prediction) or without scores (as in training)

Returns: df, a pandas dataframe
deepforest.visualize.label_to_color(label)[source]
deepforest.visualize.plot_prediction_and_targets(image, predictions, targets, image_name, savedir)[source]

Plot an image, its predictions, and its ground truth targets for debugging.

Parameters:
  • image – torch tensor, RGB color order
  • targets – torch tensor

Returns: path on disk with saved figure
Return type: figure_path
deepforest.visualize.plot_prediction_dataframe(df, root_dir, ground_truth=None, savedir=None)[source]

For each row in the dataframe, call plot predictions. For multi-class labels, boxes will be colored by label. Ground truth boxes will all be the same color, regardless of class.

Parameters:
  • df – a pandas dataframe with image_path, xmin, xmax, ymin, ymax and label columns. The image_path column should be the relative path from root_dir, not the full path.
  • root_dir – relative dir to look for image names from df.image_path
  • ground_truth – an optional pandas dataframe in the same format as df holding ground truth boxes
  • savedir – optional directory path to save the plot to

Returns: list of filenames written
Return type: written_figures
deepforest.visualize.plot_predictions(image, df, color=None, thickness=1)[source]

Plot a set of boxes on an image. By default this function does not show, but only plots an axis. The label column must be numeric! The image must be in BGR color order!

Parameters:
  • image – a numpy array in BGR color order
  • df – a pandas dataframe with xmin, xmax, ymin, ymax and label columns
  • color – color of the bounding box as a tuple of BGR color, e.g. orange is (0, 165, 255)
  • thickness – thickness of the rectangle border line in px

Returns: a numpy array with drawn annotations
Return type: image
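
A hedged sketch drawing a toy box; the image path is hypothetical, and cv2.imread returns the BGR array the function expects.

import cv2
import pandas as pd
from deepforest import visualize

image = cv2.imread("images/sample.png")  # BGR numpy array
# Toy single-box dataframe with a numeric label, per the docstring above
df = pd.DataFrame({"xmin": [10], "ymin": [10], "xmax": [50], "ymax": [50], "label": [0]})
drawn = visualize.plot_predictions(image, df)
cv2.imwrite("annotated.png", drawn)
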
deepforest.visualize.view_dataset(ds, savedir=None, color=None, thickness=1)[source]

Plot annotations on images for debugging purposes.

Parameters:
  • ds – a deepforest pytorch dataset, see deepforest.dataset or deepforest.load_dataset() to start from a csv file
  • savedir – optional path to save figures. If None (default), images will be interactively plotted
  • color – color of the bounding box as a tuple of BGR color, e.g. orange is (0, 165, 255)
  • thickness – thickness of the rectangle border line in px

Module contents

Top-level package for DeepForest.

deepforest.get_data(path)[source]

Helper function to get package sample data.