deepforest package#

Subpackages#

Submodules#

deepforest.IoU module#

IoU Module, with help from https://github.com/SpaceNetChallenge/utilities/blob/spacenetV3/spacenetutilities/evalTools.py

deepforest.IoU.compute_IoU(ground_truth, submission)[source]#
Parameters
  • ground_truth – a projected geopandas dataframe with geometry

  • submission – a projected geopandas dataframe with geometry

Returns

dataframe of IoU scores

Return type

iou_df
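
For illustration, a minimal sketch of calling compute_IoU on two toy GeoDataFrames; the single-box geometries below are hypothetical stand-ins for real, projected data:

import geopandas as gpd
from shapely.geometry import box
from deepforest import IoU

# Two toy dataframes; real inputs would be projected and carry a CRS
ground_truth = gpd.GeoDataFrame(geometry=[box(0, 0, 10, 10)])
submission = gpd.GeoDataFrame(geometry=[box(2, 2, 12, 12)])

iou_df = IoU.compute_IoU(ground_truth, submission)
print(iou_df)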

deepforest.IoU.create_rtree_from_poly(poly_list)[source]#

deepforest.callbacks module#

A deepforest callback. Callbacks must implement the on_epoch_begin, on_epoch_end, on_fit_begin, and on_fit_end methods and inject model and epoch kwargs.

class deepforest.callbacks.images_callback(csv_file, root_dir, savedir, n=2, every_n_epochs=5)[source]#

Bases: Callback

Run evaluation on a file of annotations during training.

Parameters
  • model – pytorch model

  • csv_file – path to csv with columns image_path, xmin, ymin, xmax, ymax, label

  • epoch – integer. current epoch

  • experiment – optional comet_ml experiment

  • savedir – optional, directory to save predicted images

  • project – whether to project image coordinates into geographic coordinates, see deepforest.evaluate

  • root_dir – root directory of images to search for ‘image_path’ values from the csv file

  • iou_threshold – intersection-over-union threshold, see deepforest.evaluate

  • probability_threshold – minimum probability for inclusion, see deepforest.evaluate

  • n – number of images to upload

  • every_n_epochs – run epoch interval

Returns

either prints validation scores or logs them to a comet experiment

Return type

None

log_images(pl_module)[source]#
on_validation_epoch_end(trainer, pl_module)[source]#

Called when the val epoch ends.

class deepforest.callbacks.iou_callback(config, every_n_epochs=5)[source]#

Bases: Callback

Run evaluation on a file of annotations during training.

Parameters
  • model – pytorch model

  • csv_file – path to csv with columns image_path, xmin, ymin, xmax, ymax, label

  • epoch – integer. current epoch

  • experiment – optional comet_ml experiment

  • savedir – optional, directory to save predicted images

  • project – whether to project image coordinates into geographic coordinates, see deepforest.evaluate

  • root_dir – root directory of images to search for ‘image_path’ values from the csv file

  • iou_threshold – intersection-over-union threshold, see deepforest.evaluate

  • probability_threshold – minimum probability for inclusion, see deepforest.evaluate

  • n – number of images to upload

  • every_n_epochs – run epoch interval

Returns

either prints validation scores or logs them to a comet experiment

Return type

None

on_validation_epoch_end(trainer, pl_module)[source]#

Called when the val epoch ends.

deepforest.dataset module#

Dataset module

https://pytorch.org/docs/stable/torchvision/models.html#object-detection-instance-segmentation-and-person-keypoint-detection

During training, the model expects both the input tensors and targets (a list of dictionaries), containing:

boxes (FloatTensor[N, 4]): the ground-truth boxes in [x1, y1, x2, y2] format, with x values between 0 and W and y values between 0 and H

labels (Int64Tensor[N]): the class label for each ground-truth box

https://colab.research.google.com/github/benihime91/pytorch_retinanet/blob/master/demo.ipynb#scrollTo=0zNGhr6D7xGN

class deepforest.dataset.TileDataset(tile: Optional[ndarray], preload_images: bool = False, patch_size: int = 400, patch_overlap: float = 0.05)[source]#

Bases: Dataset

Parameters
  • tile – an in-memory numpy array.

  • patch_size (int) – The size for the crops used to cut the input raster into smaller pieces. This is given in pixels, not any geographic unit.

  • patch_overlap (float) – The horizontal and vertical overlap among patches

Returns

a pytorch dataset

Return type

ds
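
A minimal sketch of building a TileDataset from an in-memory array; the random image below is a stand-in for a real raster, and exact dtype expectations should be checked against your version:

import numpy as np
from deepforest.dataset import TileDataset

# Channels-last stand-in for a raster read into memory
tile = np.random.randint(0, 255, size=(1000, 1000, 3)).astype("float32")
ds = TileDataset(tile=tile, patch_size=400, patch_overlap=0.05)
print(len(ds))  # number of overlapping crops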

class deepforest.dataset.TreeDataset(csv_file, root_dir, transforms=None, label_dict={'Tree': 0}, train=True, preload_images=False)[source]#

Bases: Dataset

Parameters
  • csv_file (string) – Path to a single csv file with annotations.

  • root_dir (string) – Directory with all the images.

  • transforms (callable, optional) – Optional transform to be applied on a sample.

  • label_dict – a dictionary where keys are labels from the csv column and values are numeric labels, e.g. “Tree” -> 0

Returns

If train: (path, image, targets); otherwise: image
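
A minimal sketch using the OSBS_029 sample annotations, assuming that sample file is shipped with your install:

import os
from deepforest import get_data
from deepforest.dataset import TreeDataset, get_transform

csv_file = get_data("OSBS_029.csv")  # sample annotations shipped with deepforest
ds = TreeDataset(csv_file=csv_file,
                 root_dir=os.path.dirname(csv_file),
                 transforms=get_transform(augment=True))
path, image, targets = ds[0]  # train=True (default) returns path, image, targets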

deepforest.dataset.get_transform(augment)[source]#

Albumentations transformation of bounding boxes

deepforest.evaluate module#

Evaluation module

deepforest.evaluate.compute_class_recall(results)[source]#

Given a set of evaluations, what proportion of predicted boxes match. True boxes which are not matched to predictions do not count against accuracy.

deepforest.evaluate.evaluate(predictions, ground_df, root_dir, iou_threshold=0.4, savedir=None)[source]#

Image-annotated crown evaluation routine. Submissions can be provided as a .shp file, an existing pandas dataframe, or a .csv path.

Parameters
  • predictions – a pandas dataframe, if supplied a root dir is needed to give the relative path of files in df.name. The labels in ground truth and predictions must match. If one is numeric, the other must be numeric.

  • ground_df – a pandas dataframe, if supplied a root dir is needed to give the relative path of files in df.name

  • root_dir – location of files in the dataframe ‘name’ column.

Returns

a dataframe of matched bounding boxes

box_recall: proportion of true positives of box position, regardless of class

box_precision: proportion of predictions that are true positives, regardless of class

class_recall: a pandas dataframe of class-level recall and precision with class sizes

Return type

results
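
A minimal sketch wiring predict_file output into evaluate, assuming the OSBS_029 sample data and the prebuilt release are available:

import os
import pandas as pd
from deepforest import main, get_data
from deepforest.evaluate import evaluate

m = main.deepforest()
m.use_release()  # prebuilt tree release weights

csv_file = get_data("OSBS_029.csv")
root_dir = os.path.dirname(csv_file)
predictions = m.predict_file(csv_file=csv_file, root_dir=root_dir)
ground_df = pd.read_csv(csv_file)

results = evaluate(predictions=predictions, ground_df=ground_df,
                   root_dir=root_dir, iou_threshold=0.4)
print(results["box_recall"], results["box_precision"])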

deepforest.evaluate.evaluate_image(predictions, ground_df, root_dir, savedir=None)[source]#

Compute intersection-over-union matching among prediction and ground truth boxes for one image.

Parameters
  • df – a pandas dataframe with columns name, xmin, xmax, ymin, ymax, label. The ‘name’ column should be the path relative to the location of the file.

  • summarize – Whether to group statistics by plot and overall score

  • image_coordinates – Whether the current boxes are in the coordinate system of the image, e.g. origin (0, 0) upper left.

  • root_dir – Where to search for image names in df

  • savedir – optional directory to save images with overlaid predictions and annotations

Returns

pandas dataframe with crown ids of prediction and ground truth and the IoU score.

Return type

result

deepforest.main module#

class deepforest.main.deepforest(num_classes: int = 1, label_dict: dict = {'Tree': 0}, transforms=None, config_file: str = 'deepforest_config.yml')[source]#

Bases: LightningModule

Class for training and predicting tree crowns in RGB images

Parameters
  • num_classes (int) – number of classes in the model

  • config_file (str) – path to deepforest config file

Returns

a deepforest pytorch lightning module

Return type

self
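
A minimal usage sketch: instantiate, load the prebuilt release, and fit on the sample annotations. The config keys follow the deepforest_config.yml schema and the sample csv is assumed present in your install:

import os
from deepforest import main, get_data

m = main.deepforest()
m.use_release()  # download/load the prebuilt NEON tree release

# Point the training config at an annotations csv and fit
csv_file = get_data("OSBS_029.csv")
m.config["train"]["csv_file"] = csv_file
m.config["train"]["root_dir"] = os.path.dirname(csv_file)
m.create_trainer()
m.trainer.fit(m)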

configure_optimizers()[source]#

Choose what optimizers and learning-rate schedulers to use in your optimization. Normally you’d need one. But in the case of GANs or similar you might have multiple.

Returns

Any of these 6 options.

  • Single optimizer.

  • List or Tuple of optimizers.

  • Two lists - The first list has multiple optimizers, and the second has multiple LR schedulers (or multiple lr_scheduler_config).

  • Dictionary, with an "optimizer" key, and (optionally) a "lr_scheduler" key whose value is a single LR scheduler or lr_scheduler_config.

  • Tuple of dictionaries as described above, with an optional "frequency" key.

  • None - Fit will run without any optimizer.

The lr_scheduler_config is a dictionary which contains the scheduler and its associated configuration. The default configuration is shown below.

lr_scheduler_config = {
    # REQUIRED: The scheduler instance
    "scheduler": lr_scheduler,
    # The unit of the scheduler's step size, could also be 'step'.
    # 'epoch' updates the scheduler on epoch end whereas 'step'
    # updates it after an optimizer update.
    "interval": "epoch",
    # How many epochs/steps should pass between calls to
    # `scheduler.step()`. 1 corresponds to updating the learning
    # rate after every epoch/step.
    "frequency": 1,
    # Metric to monitor for schedulers like `ReduceLROnPlateau`
    "monitor": "val_loss",
    # If set to `True`, will enforce that the value specified in 'monitor'
    # is available when the scheduler is updated, thus stopping
    # training if not found. If set to `False`, it will only produce a warning
    "strict": True,
    # If using the `LearningRateMonitor` callback to monitor the
    # learning rate progress, this keyword can be used to specify
    # a custom logged name
    "name": None,
}

When there are schedulers in which the .step() method is conditioned on a value, such as the torch.optim.lr_scheduler.ReduceLROnPlateau scheduler, Lightning requires that the lr_scheduler_config contains the keyword "monitor" set to the metric name that the scheduler should be conditioned on.

# The ReduceLROnPlateau scheduler requires a monitor
def configure_optimizers(self):
    optimizer = Adam(...)
    return {
        "optimizer": optimizer,
        "lr_scheduler": {
            "scheduler": ReduceLROnPlateau(optimizer, ...),
            "monitor": "metric_to_track",
            "frequency": "indicates how often the metric is updated"
            # If "monitor" references validation metrics, then "frequency" should be set to a
            # multiple of "trainer.check_val_every_n_epoch".
        },
    }


# In the case of two optimizers, only one using the ReduceLROnPlateau scheduler
def configure_optimizers(self):
    optimizer1 = Adam(...)
    optimizer2 = SGD(...)
    scheduler1 = ReduceLROnPlateau(optimizer1, ...)
    scheduler2 = LambdaLR(optimizer2, ...)
    return (
        {
            "optimizer": optimizer1,
            "lr_scheduler": {
                "scheduler": scheduler1,
                "monitor": "metric_to_track",
            },
        },
        {"optimizer": optimizer2, "lr_scheduler": scheduler2},
    )

Metrics can be made available to monitor by simply logging it using self.log('metric_to_track', metric_val) in your LightningModule.

Note

The frequency value specified in a dict along with the optimizer key is an int corresponding to the number of sequential batches optimized with the specific optimizer. It should be given to none or to all of the optimizers. There is a difference between passing multiple optimizers in a list, and passing multiple optimizers in dictionaries with a frequency of 1:

  • In the former case, all optimizers will operate on the given batch in each optimization step.

  • In the latter, only one optimizer will operate on the given batch at every step.

This is different from the frequency value specified in the lr_scheduler_config mentioned above.

def configure_optimizers(self):
    optimizer_one = torch.optim.SGD(self.model.parameters(), lr=0.01)
    optimizer_two = torch.optim.SGD(self.model.parameters(), lr=0.01)
    return [
        {"optimizer": optimizer_one, "frequency": 5},
        {"optimizer": optimizer_two, "frequency": 10},
    ]

In this example, the first optimizer will be used for the first 5 steps, the second optimizer for the next 10 steps and that cycle will continue. If an LR scheduler is specified for an optimizer using the lr_scheduler key in the above dict, the scheduler will only be updated when its optimizer is being used.

Examples:

# most cases. no learning rate scheduler
def configure_optimizers(self):
    return Adam(self.parameters(), lr=1e-3)

# multiple optimizer case (e.g.: GAN)
def configure_optimizers(self):
    gen_opt = Adam(self.model_gen.parameters(), lr=0.01)
    dis_opt = Adam(self.model_dis.parameters(), lr=0.02)
    return gen_opt, dis_opt

# example with learning rate schedulers
def configure_optimizers(self):
    gen_opt = Adam(self.model_gen.parameters(), lr=0.01)
    dis_opt = Adam(self.model_dis.parameters(), lr=0.02)
    dis_sch = CosineAnnealing(dis_opt, T_max=10)
    return [gen_opt, dis_opt], [dis_sch]

# example with step-based learning rate schedulers
# each optimizer has its own scheduler
def configure_optimizers(self):
    gen_opt = Adam(self.model_gen.parameters(), lr=0.01)
    dis_opt = Adam(self.model_dis.parameters(), lr=0.02)
    gen_sch = {
        'scheduler': ExponentialLR(gen_opt, 0.99),
        'interval': 'step'  # called after each training step
    }
    dis_sch = CosineAnnealing(dis_opt, T_max=10) # called every epoch
    return [gen_opt, dis_opt], [gen_sch, dis_sch]

# example with optimizer frequencies
# see training procedure in `Improved Training of Wasserstein GANs`, Algorithm 1
# https://arxiv.org/abs/1704.00028
def configure_optimizers(self):
    gen_opt = Adam(self.model_gen.parameters(), lr=0.01)
    dis_opt = Adam(self.model_dis.parameters(), lr=0.02)
    n_critic = 5
    return (
        {'optimizer': dis_opt, 'frequency': n_critic},
        {'optimizer': gen_opt, 'frequency': 1}
    )

Note

Some things to know:

  • Lightning calls .backward() and .step() on each optimizer as needed.

  • If a learning rate scheduler is specified in configure_optimizers() with key "interval" (default “epoch”) in the scheduler configuration, Lightning will call the scheduler’s .step() method automatically in case of automatic optimization.

  • If you use 16-bit precision (precision=16), Lightning will automatically handle the optimizers.

  • If you use multiple optimizers, training_step() will have an additional optimizer_idx parameter.

  • If you use torch.optim.LBFGS, Lightning handles the closure function automatically for you.

  • If you use multiple optimizers, gradients will be calculated only for the parameters of current optimizer at each training step.

  • If you need to control how often those optimizers step or override the default .step() schedule, override the optimizer_step() hook.

create_model()[source]#

Define a deepforest retinanet architecture

create_trainer(logger=None, callbacks=[], **kwargs)[source]#

Create a pytorch lightning trainer by reading config files.

Parameters
  • callbacks (list) – a list of pytorch-lightning callback classes

evaluate(csv_file, root_dir, iou_threshold=None, savedir=None)[source]#

Compute intersection-over-union and precision/recall for a given iou_threshold

Parameters
  • csv_file – location of a csv file with columns “name”,”xmin”,”ymin”,”xmax”,”ymax”,”label”, each box in a row

  • root_dir – location of files in the dataframe ‘name’ column.

  • iou_threshold – float [0,1] intersection-over-union between annotation and prediction to be scored true positive

  • savedir – optional path dir to save evaluation images

Returns

dict of (“results”, “precision”, “recall”) for a given threshold

Return type

results
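
For example, a short sketch against the sample annotations, assuming the sample data and release weights are available:

import os
from deepforest import main, get_data

m = main.deepforest()
m.use_release()

csv_file = get_data("OSBS_029.csv")
results = m.evaluate(csv_file=csv_file,
                     root_dir=os.path.dirname(csv_file),
                     iou_threshold=0.4)
print(results["results"].head())  # per-box matches; dict also holds summary scores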

load_dataset(csv_file, root_dir=None, augment=False, shuffle=True, batch_size=1, train=False)[source]#

Create a tree dataset for inference. The csv file format has the columns “image_path”, “xmin”, “ymin”, “xmax”, “ymax” for the image name and bounding box position. image_path is the relative filename, not the absolute path, and is looked up in the root_dir directory. One bounding box per line.

Parameters
  • csv_file – path to csv file

  • root_dir – directory of images. If None, uses “image_dir” in config

  • augment – Whether to create a training dataset; this activates data augmentations

Returns

a pytorch dataset

Return type

ds

on_fit_start()[source]#

Called at the very beginning of fit.

If on DDP it is called on every process

predict_dataloader(ds)[source]#

Create a pytorch dataloader for prediction.

predict_file(csv_file, root_dir, savedir=None, color=None, thickness=1)[source]#

Create a dataset and predict entire annotation file

The csv file format has the columns “image_path”, “xmin”, “ymin”, “xmax”, “ymax” for the image name and bounding box position. image_path is the relative filename, not the absolute path, and is looked up in the root_dir directory. One bounding box per line.

Parameters
  • csv_file – path to csv file

  • root_dir – directory of images. If None, uses “image_dir” in config

  • savedir – Optional. Directory to save image plots.

  • color – color of the bounding box as a tuple of BGR color, e.g. orange is (0, 165, 255)

  • thickness – thickness of the rectangle border line in px

Returns

pandas dataframe with bounding boxes, label and scores for each image in the csv file

Return type

df
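
A short sketch, assuming the sample csv is present; savedir is a hypothetical, pre-existing output directory:

import os
from deepforest import main, get_data

m = main.deepforest()
m.use_release()

csv_file = get_data("OSBS_029.csv")
df = m.predict_file(csv_file=csv_file,
                    root_dir=os.path.dirname(csv_file),
                    savedir="predictions")  # hypothetical output directory
print(df.head())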

predict_image(image: Optional[ndarray] = None, path: Optional[str] = None, return_plot: bool = False, thickness: int = 1, color: Optional[tuple] = (0, 165, 255))[source]#

Predict a single image with a deepforest model

Parameters
  • image – a float32 numpy array of an RGB image in channels-last format

  • path – optional path to read image from disk instead of passing image arg

  • return_plot – Return image with plotted detections

  • color – color of the bounding box as a tuple of BGR color, e.g. orange is (0, 165, 255)

  • thickness – thickness of the rectangle border line in px

Returns

A pandas dataframe of predictions (default)

img: The input with predictions overlaid (optional, when return_plot=True)

Return type

boxes
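
A minimal sketch using the sample image assumed to ship with the package:

from deepforest import main, get_data

m = main.deepforest()
m.use_release()

sample_image = get_data("OSBS_029.png")  # sample image shipped with deepforest
boxes = m.predict_image(path=sample_image)
print(boxes.head())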

predict_step(batch, batch_idx)[source]#

Step function called during predict(). By default, it calls forward(). Override to add any processing logic.

The predict_step() is used to scale inference on multi-devices.

To prevent an OOM error, it is possible to use BasePredictionWriter callback to write the predictions to disk or database after each batch or on epoch end.

The BasePredictionWriter should be used while using a spawn based accelerator. This happens for Trainer(strategy="ddp_spawn") or training on 8 TPU cores with Trainer(accelerator="tpu", devices=8) as predictions won’t be returned.

Example

class MyModel(LightningModule):

    def predict_step(self, batch, batch_idx, dataloader_idx=0):
        return self(batch)

dm = ...
model = MyModel()
trainer = Trainer(accelerator="gpu", devices=2)
predictions = trainer.predict(model, dm)
Parameters
  • batch – Current batch.

  • batch_idx – Index of current batch.

  • dataloader_idx – Index of the current dataloader.

Returns

Predicted output

predict_tile(raster_path=None, image=None, patch_size=400, patch_overlap=0.05, iou_threshold=0.15, return_plot=False, mosaic=True, use_soft_nms=False, sigma=0.5, thresh=0.001, color=None, thickness=1)[source]#

For images too large to input into the model, predict_tile cuts the image into overlapping windows, predicts trees on each window, and reassembles the results into a single array.

Parameters
  • raster_path – Path to image on disk

  • image (array) – Numpy image array in BGR channel order following openCV convention

  • patch_size – patch size, default 400

  • patch_overlap – patch overlap, default 0.05

  • iou_threshold – Minimum iou overlap among predictions between windows to be suppressed. Defaults to 0.15. Lower values suppress more boxes at edges.

  • return_plot – Should the image be returned with the predictions drawn?

  • mosaic – Return a single prediction dataframe (True) or a tuple of image crops and predictions (False)

  • use_soft_nms – whether to perform Gaussian Soft-NMS; if False (default), standard NMS is performed.

  • sigma – variance of Gaussian function used in Gaussian Soft NMS

  • thresh – the score threshold used to filter bboxes after soft-nms is performed

  • color – color of the bounding box as a tuple of BGR color, e.g. orange is (0, 165, 255)

  • thickness – thickness of the rectangle border line in px

Returns

if return_plot, an image. Otherwise a numpy array of predicted bounding boxes, scores and labels

Return type

boxes (array)
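
A minimal sketch on the sample raster; the patch settings here are illustrative, and the sample .tif is assumed present in your install:

from deepforest import main, get_data

m = main.deepforest()
m.use_release()

raster_path = get_data("OSBS_029.tif")  # sample raster shipped with deepforest
boxes = m.predict_tile(raster_path=raster_path,
                       patch_size=300,
                       patch_overlap=0.25)
print(boxes.head())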

save_model(path)[source]#

Save the trainer checkpoint to a user-defined path for future access.

Parameters
  • path – the path to save the model checkpoint

train_dataloader()[source]#

Train loader using the configurations.

Returns: loader

training_step(batch, batch_idx)[source]#

Train on a loaded dataset

use_bird_release(check_release=True)[source]#

Use the latest DeepForest bird model release from github and load the model. Optionally download if the release doesn’t exist.

Parameters
  • check_release (logical) – whether to check github for a recent model release. In cases where you are hitting the github API rate limit, set to False and any local model will be loaded. If no model has been downloaded an error will be raised.

Returns

A trained pytorch model

Return type

model (object)

use_release(check_release=True)[source]#

Use the latest DeepForest model release from github and load the model. Optionally download if the release doesn’t exist.

Parameters
  • check_release (logical) – whether to check github for a recent model release. In cases where you are hitting the github API rate limit, set to False and any local model will be loaded. If no model has been downloaded an error will be raised.

Returns

A trained PyTorch model

Return type

model (object)

val_dataloader()[source]#

Create a val data loader only if specified in the config.

Returns: a dataloader or an empty iterable.

validation_step(batch, batch_idx)[source]#

Validate on a loaded dataset

deepforest.model module#

deepforest.model.create_anchor_generator(sizes=((8, 16, 32, 64, 128, 256, 400),), aspect_ratios=((0.5, 1.0, 2.0),))[source]#

Create an anchor box generator as a function of sizes and aspect ratios. Documented at https://github.com/pytorch/vision/blob/67b25288ca202d027e8b06e17111f1bcebd2046c/torchvision/models/detection/anchor_utils.py#L9. For example, to make the network generate 5 x 3 anchors per spatial location, pass 5 different sizes and 3 different aspect ratios. We have a Tuple[Tuple[int]] because each feature map could potentially have different sizes and aspect ratios.

Parameters
  • sizes – anchor sizes, as a tuple of tuples (one per feature map)

  • aspect_ratios – anchor aspect ratios, as a tuple of tuples (one per feature map)

Returns: anchor_generator, a pytorch module

deepforest.model.create_model(num_classes, nms_thresh, score_thresh, backbone=None)[source]#

Create a retinanet model.

Parameters
  • num_classes (int) – number of classes in the model

  • nms_thresh (float) – non-max suppression threshold for intersection-over-union [0,1]

  • score_thresh (float) – minimum prediction score to keep during prediction [0,1]

Returns

a pytorch nn module

Return type

model

deepforest.model.load_backbone()[source]#

A torch vision retinanet model

deepforest.predict module#

deepforest.predict.across_class_nms(predicted_boxes, iou_threshold=0.15)[source]#

Perform non-max suppression on a dataframe of results (see visualize.format_boxes) to remove boxes that overlap by more than iou_threshold.

deepforest.predict.mosiac(boxes, windows, use_soft_nms=False, sigma=0.5, thresh=0.001, iou_threshold=0.1)[source]#
deepforest.predict.predict_image(model, image, return_plot, device, iou_threshold=0.1, color=None, thickness=1)[source]#

Predict an image with a deepforest model

Parameters
  • image – a numpy array of an RGB image with values in the range 0-255

  • return_plot – Return image with plotted detections

  • device – pytorch device of ‘cuda’ or ‘cpu’ for gpu prediction. Set internally.

  • color – color of the bounding box as a tuple of BGR color, e.g. orange is (0, 165, 255)

  • thickness – thickness of the rectangle border line in px

Returns

A pandas dataframe of predictions (default)

img: The input with predictions overlaid (optional, when return_plot=True)

Return type

boxes

deepforest.predict.soft_nms(boxes, scores, sigma=0.5, thresh=0.001)[source]#

Perform python soft_nms to reduce the confidences of the proposals proportional to their IoU value.
Paper: Improving Object Detection With One Line of Code
Code: https://github.com/DocF/Soft-NMS/blob/master/softnms_pytorch.py

Parameters
  • boxes – predicted bounding boxes, tensor in [x1, y1, x2, y2] format

  • scores – tensor of scores corresponding to each box

  • sigma – variance of the Gaussian function

  • thresh – score threshold

Returns

the index list of the selected boxes

Return type

idxs_keep

deepforest.preprocess module#

The preprocessing module is used to reshape data into format suitable for training or prediction.

For example cutting large tiles into smaller images.

deepforest.preprocess.compute_windows(numpy_image, patch_size, patch_overlap)[source]#

Create a sliding window object from a raster tile.

Parameters

numpy_image (array) – Raster object as numpy array to cut into crops

Returns

a sliding windows object

Return type

windows (list)
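
A minimal sketch on a random stand-in array; the indices() call assumes the returned window objects behave like those of the slidingwindow package used elsewhere in preprocess, which is worth verifying against your version:

import numpy as np
from deepforest import preprocess

image = np.random.randint(0, 255, size=(1000, 1000, 3)).astype("uint8")
windows = preprocess.compute_windows(image, patch_size=400, patch_overlap=0.05)
print(len(windows))

# Each window carries slice indices back into the source array (assumption)
crop = image[windows[0].indices()]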

deepforest.preprocess.image_name_from_path(image_path)[source]#

Convert path to image name for use in indexing.

deepforest.preprocess.preprocess_image(image)[source]#

Preprocess a single RGB numpy array for prediction, converting from channels-last to channels-first.

deepforest.preprocess.save_crop(base_dir, image_name, index, crop)[source]#

Save window crop as image file to be read by PIL.

Filename should match the image_name + window index

deepforest.preprocess.select_annotations(annotations, windows, index, allow_empty=False)[source]#

Select annotations that overlap with selected image crop.

Parameters
  • image_name (str) – Name of the image in the annotations file to lookup.

  • annotations_file – path to annotations file in the format -> image_path, xmin, ymin, xmax, ymax, label

  • windows – A sliding window object (see compute_windows)

  • index – The index in the windows object to use as crop bounds

  • allow_empty (bool) – If True, allow window crops that have no annotations to be included

Returns

a pandas dataframe of annotations

Return type

selected_annotations

deepforest.preprocess.split_raster(annotations_file, path_to_raster=None, numpy_image=None, base_dir='.', patch_size=400, patch_overlap=0.05, allow_empty=False, image_name=None)[source]#

Divide a large tile into smaller arrays. Each crop will be saved to file.

Parameters
  • numpy_image – a numpy object to be used as a raster, usually opened from rasterio.open(...).read()

  • path_to_raster (str) – Path to a tile that can be read by rasterio on disk

  • annotations_file (str) – Path to annotations file (with column names) data in the format -> image_path, xmin, ymin, xmax, ymax, label

  • base_dir (str) – Where to save the annotations and image crops relative to current working dir

  • patch_size (int) – Maximum dimensions of square window

  • patch_overlap (float) – Percent of overlap among windows 0->1

  • allow_empty – If True, include images with no annotations in the dataset

  • image_name (str) – If numpy_image arg is used, what name to give the raster?

Returns

A pandas dataframe with annotations file for training.
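
A short sketch with the sample raster and annotations; base_dir is a hypothetical output directory that should already exist, and the sample files are assumed present:

from deepforest import get_data, preprocess

train_annotations = preprocess.split_raster(
    annotations_file=get_data("OSBS_029.csv"),
    path_to_raster=get_data("OSBS_029.tif"),
    base_dir="crops",  # hypothetical output directory
    patch_size=400,
    patch_overlap=0.05)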

deepforest.utilities module#

Utilities module

class deepforest.utilities.DownloadProgressBar(*_, **__)[source]#

Bases: tqdm

Download progress bar class.

update_to(b=1, bsize=1, tsize=None)[source]#

Update class attributes b, bsize, and tsize.

deepforest.utilities.annotations_to_shapefile(df, transform, crs)[source]#

Convert output from predict_image and predict_tile to a geopandas dataframe.

Parameters
  • df – prediction dataframe with columns [‘xmin’,’ymin’,’xmax’,’ymax’,’label’,’score’]

  • transform – A rasterio affine transform object

  • crs – A rasterio crs object

Returns

a geopandas dataframe where every entry is the bounding box for a detected tree.

Return type

results

deepforest.utilities.boxes_to_shapefile(df, root_dir, projected=True, flip_y_axis=False)[source]#

Convert from image coordinates to geographic coordinates. Note that this assumes df is a single plot being passed to this function.

Parameters
  • df – a pandas dataframe with columns: name, xmin, ymin, xmax, ymax. Name is the relative path to the root_dir arg.

  • root_dir – directory of images to lookup image_path column

  • projected – If True, convert from image to geographic coordinates; if False, keep in the image coordinate system

  • flip_y_axis – If True, reflect predictions over the y axis to align with raster data in QGIS, which uses a negative y origin compared to numpy. See https://gis.stackexchange.com/questions/306684/why-does-qgis-use-negative-y-spacing-in-the-default-raster-geotransform

Returns

a geospatial dataframe with the boxes optionally transformed to the target crs

Return type

df

deepforest.utilities.check_file(df)[source]#

Check a file format for correct column names and structure

deepforest.utilities.check_image(image)[source]#

Check that an image is in three-channel, channels-last format.

Parameters
  • image – numpy array

Returns: None, throws error on assert

deepforest.utilities.collate_fn(batch)[source]#
deepforest.utilities.project_boxes(df, root_dir, transform=True)[source]#

Convert from image coordinates to geographic coordinates. Note that this assumes df is a single plot being passed to this function.

Parameters
  • df – a pandas dataframe with columns: name, xmin, ymin, xmax, ymax. Name is the relative path to the root_dir arg.

  • root_dir – directory of images to lookup image_path column

  • transform – If True, convert from image to geographic coordinates

deepforest.utilities.read_config(config_path)[source]#

Read config yaml file

deepforest.utilities.round_with_floats(x)[source]#

Check if string x is float or int, return int, rounded if needed.

deepforest.utilities.shapefile_to_annotations(shapefile, rgb, buffer_size=0.5, convert_to_boxes=False, savedir='.')[source]#

Convert a shapefile of annotations into an annotations csv file for DeepForest training and evaluation.

Parameters
  • shapefile – Path to a shapefile on disk. If a label column is present, it will be used, else all labels are assumed to be “Tree”

  • rgb – Path to the RGB image on disk

  • savedir – Directory to save csv files

  • buffer_size – size of the point-to-box expansion in map units of the target object, meters for projected data, pixels for unprojected data. The buffer_size is added to each side of the x,y point to create the box.

  • convert_to_boxes (bool) – If True, convert the point objects in the shapefile into bounding boxes with size ‘buffer_size’.

Returns

a pandas dataframe

Return type

results

deepforest.utilities.use_bird_release(save_dir='/home/docs/checkouts/readthedocs.org/user_builds/deepforest/checkouts/latest/deepforest/data/', prebuilt_model='bird', check_release=True)[source]#

Check the existence of, or download, the latest model release from github.

Parameters
  • save_dir – Directory to save the filepath, defaults to “data” in the deepforest repo

  • prebuilt_model – Currently only accepts “NEON”, but could be expanded to include other prebuilt models. The local model will be called prebuilt_model.h5 on disk.

  • check_release (logical) – whether to check github for a recent model release. In cases where you are hitting the github API rate limit, set to False and any local model will be loaded. If no model has been downloaded an error will be raised.

Returns: release_tag, output_path (str): path to downloaded model

deepforest.utilities.use_release(save_dir='/home/docs/checkouts/readthedocs.org/user_builds/deepforest/checkouts/latest/deepforest/data/', prebuilt_model='NEON', check_release=True)[source]#

Check the existence of, or download, the latest model release from github.

Parameters
  • save_dir – Directory to save the filepath, defaults to “data” in the deepforest repo

  • prebuilt_model – Currently only accepts “NEON”, but could be expanded to include other prebuilt models. The local model will be called prebuilt_model.h5 on disk.

  • check_release (logical) – whether to check github for a recent model release. In cases where you are hitting the github API rate limit, set to False and any local model will be loaded. If no model has been downloaded an error will be raised.

Returns: release_tag, output_path (str): path to downloaded model

deepforest.utilities.xml_to_annotations(xml_path)[source]#

Load annotations from xml format (e.g. the RectLabel editor) and convert them into retinanet annotations format.

Parameters
  • xml_path (str) – Path to the annotations xml, formatted by RectLabel

Returns

annotations in the format -> path-to-image.png,x1,y1,x2,y2,class_name

Return type

Annotations (pandas dataframe)
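
A short sketch, assuming the OSBS_029.xml sample is present in your install:

from deepforest import get_data, utilities

xml_path = get_data("OSBS_029.xml")  # sample RectLabel export shipped with deepforest
annotations = utilities.xml_to_annotations(xml_path)
annotations.to_csv("train.csv", index=False)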

deepforest.visualize module#

deepforest.visualize.format_boxes(prediction, scores=True)[source]#

Format a retinanet prediction into a pandas dataframe for a single image.

Parameters
  • prediction – a dictionary with keys ‘boxes’ and ‘labels’ coming from a retinanet

  • scores – Whether boxes come with scores (during prediction) or without scores (as during training)

Returns:

df: a pandas dataframe

deepforest.visualize.label_to_color(label)[source]#
deepforest.visualize.plot_prediction_and_targets(image, predictions, targets, image_name, savedir)[source]#

Plot an image, its predictions, and its ground truth targets for debugging.

Parameters
  • image – torch tensor, RGB color order

  • targets – torch tensor

Returns

path on disk with saved figure

Return type

figure_path

deepforest.visualize.plot_prediction_dataframe(df, root_dir, savedir, ground_truth=None)[source]#

For each row in the dataframe, call plot_predictions and save plot files to disk. For multi-class labels, boxes will be colored by label. Ground truth boxes will all be the same color, regardless of class.

Parameters
  • df – a pandas dataframe with image_path, xmin, xmax, ymin, ymax and label columns. The image_path column should be the relative path from root_dir, not the full path.

  • root_dir – relative dir to look for image names from df.image_path

  • ground_truth – an optional pandas dataframe in the same format as df holding ground truth boxes

  • savedir – save the plot to an optional directory path.

Returns

list of filenames written

Return type

written_figures

deepforest.visualize.plot_predictions(image, df, color=None, thickness=1)[source]#

Plot a set of boxes on an image. By default this function does not show, but only plots an axis. The label column must be numeric! The image must be in BGR color order!

Parameters
  • image – a numpy array in BGR color order! Channel order is channels first

  • df – a pandas dataframe with xmin, xmax, ymin, ymax and label columns

  • color – color of the bounding box as a tuple of BGR color, e.g. orange is (0, 165, 255)

  • thickness – thickness of the rectangle border line in px

Returns

a numpy array with drawn annotations

Return type

image
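
A minimal sketch that draws predictions on the sample image. Mapping labels through m.label_dict (an attribute assumed from the constructor’s label_dict argument) satisfies the numeric-label requirement, and cv2.imread supplies BGR order; channel layout expectations should be verified against your version, since the docstring above also mentions channels first:

import cv2
from deepforest import main, get_data, visualize

m = main.deepforest()
m.use_release()

sample_image = get_data("OSBS_029.png")
df = m.predict_image(path=sample_image)

# plot_predictions requires numeric labels; map class names back if needed
if df["label"].dtype == object:
    df["label"] = df["label"].map(m.label_dict)

image = cv2.imread(sample_image)  # BGR color order
annotated = visualize.plot_predictions(image, df)
cv2.imwrite("annotated_OSBS_029.png", annotated)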

deepforest.visualize.view_dataset(ds, savedir=None, color=None, thickness=1)[source]#

Plot annotations on images for debugging purposes.

Parameters
  • ds – a deepforest pytorch dataset, see deepforest.dataset or deepforest.load_dataset() to start from a csv file

  • savedir – optional path to save figures. If None (default), images will be interactively plotted

  • color – color of the bounding box as a tuple of BGR color, e.g. orange is (0, 165, 255)

  • thickness – thickness of the rectangle border line in px

Module contents#

Top-level package for DeepForest.

deepforest.get_data(path)[source]#

Helper function to get package sample data.