deepforest package#
Subpackages#
Submodules#
deepforest.IoU module#
IoU Module, with help from SpaceNetChallenge/utilities
deepforest.callbacks module#
A deepforest callback. Callbacks must implement on_epoch_begin, on_epoch_end, on_fit_begin, and on_fit_end methods, and inject model and epoch kwargs.
- class deepforest.callbacks.images_callback(savedir, n=2, every_n_epochs=5, select_random=False, color=None, thickness=1)[source]#
Bases:
Callback
Run evaluation on a file of annotations during training.
- Parameters:
savedir – optional, directory to save predicted images
probability_threshold – minimum probability for inclusion, see deepforest.evaluate
n – number of images to upload
select_random (False) – whether to select random images or the first n images
every_n_epochs – interval, in epochs, at which to run the callback
color – color of the bounding box as a tuple of BGR color, e.g. orange is (0, 165, 255)
thickness – thickness of the rectangle border line in px
- Returns:
either prints validation scores or logs them to the pytorch-lightning logger
- Return type:
None
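Example (a minimal sketch; the save directory and training setup are illustrative assumptions, not defaults from the source):

from deepforest import main
from deepforest.callbacks import images_callback

m = main.deepforest()
m.load_model("weecology/deepforest-tree")

# Save 2 predicted validation images every 5 epochs during training
cb = images_callback(savedir="eval_images", n=2, every_n_epochs=5)
m.create_trainer(callbacks=[cb])
m.trainer.fit(m)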
deepforest.dataset module#
Dataset model.
During training, the model expects both the input tensors, as well as a targets (list of dictionary), containing:
boxes (FloatTensor[N, 4]): the ground-truth boxes in [x1, y1, x2, y2] format, with x values between 0 and W and y values between 0 and H
labels (Int64Tensor[N]): the class label for each ground-truth box
- class deepforest.dataset.BoundingBoxDataset(df, root_dir, transform=None, augment=False)[source]#
Bases:
Dataset
An in memory dataset for bounding box predictions.
- Parameters:
df – a pandas dataframe with image_path and xmin,xmax,ymin,ymax columns
transform – a function to apply to the image
root_dir – the directory where the image is stored
- Returns:
a tensor of shape (3, height, width)
- Return type:
rgb
- class deepforest.dataset.RasterDataset(raster_path, patch_size, patch_overlap)[source]#
Bases:
object
Dataset for predicting on raster windows.
- Parameters:
raster_path (str) – Path to raster file
patch_size (int) – Size of windows to predict on
patch_overlap (float) – Overlap between windows as fraction (0-1)
- Returns:
A dataset of raster windows
- class deepforest.dataset.TileDataset(tile: ndarray | None, preload_images: bool = False, patch_size: int = 400, patch_overlap: float = 0.05)[source]#
Bases:
Dataset
- Parameters:
tile – an in memory numpy array.
patch_size (int) – The size for the crops used to cut the input raster into smaller pieces. This is given in pixels, not any geographic unit.
patch_overlap (float) – The horizontal and vertical overlap among patches
preload_images (bool) – If true, the entire dataset is loaded into memory. This is useful for small datasets, but not recommended for large datasets since both the tile and the crops are stored in memory.
- Returns:
a pytorch dataset
- Return type:
ds
- class deepforest.dataset.TreeDataset(csv_file, root_dir, transforms=None, label_dict={'Tree': 0}, train=True, preload_images=False)[source]#
Bases:
Dataset
- Parameters:
csv_file (string) – Path to a single csv file with annotations.
root_dir (string) – Directory with all the images.
transform (callable, optional) – Optional transform to be applied on a sample.
label_dict – a dictionary where keys are labels from the csv column and values are numeric indices, e.g. {"Tree": 0}
- Returns:
If train=True, a tuple of (path, image, targets); otherwise the image.
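Example (a minimal sketch; the csv and image directory are placeholders):

from deepforest.dataset import TreeDataset

ds = TreeDataset(csv_file="annotations.csv",
                 root_dir="images/",
                 label_dict={"Tree": 0},
                 train=True)
# With train=True, each item is a (path, image, targets) tuple
path, image, targets = ds[0]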
deepforest.evaluate module#
Evaluation module.
- deepforest.evaluate.compute_class_recall(results)[source]#
Given a set of evaluations, what proportion of predicted boxes match.
True boxes which are not matched to predictions do not count against accuracy.
- deepforest.evaluate.evaluate_boxes(predictions, ground_df, root_dir, iou_threshold=0.4, savedir=None)[source]#
Image-annotated crown evaluation routine. Submissions can be a .shp, an existing pandas dataframe, or a .csv path.
- Parameters:
predictions – a pandas dataframe, if supplied a root dir is needed to give the relative path of files in df.name. The labels in ground truth and predictions must match. If one is numeric, the other must be numeric.
ground_df – a pandas dataframe, if supplied a root dir is needed to give the relative path of files in df.name
root_dir – location of files in the dataframe ‘name’ column.
savedir – optional directory to save image with overlaid predictions and annotations
- Returns:
a dataframe of matched bounding boxes
box_recall: proportion of true positives of box position, regardless of class
box_precision: proportion of predictions that are true positives, regardless of class
class_recall: a pandas dataframe of class-level recall and precision with class sizes
- Return type:
results
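Example (a minimal sketch; file paths are placeholders, and reading the ground truth with utilities.read_file is an assumption):

from deepforest import main
from deepforest.utilities import read_file
from deepforest.evaluate import evaluate_boxes

m = main.deepforest()
m.load_model("weecology/deepforest-tree")
predictions = m.predict_file(csv_file="annotations.csv", root_dir="images/")
ground_df = read_file("annotations.csv", root_dir="images/")
results = evaluate_boxes(predictions, ground_df, root_dir="images/", iou_threshold=0.4)
print(results["box_recall"], results["box_precision"])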
- deepforest.evaluate.evaluate_image_boxes(predictions, ground_df, root_dir, savedir=None)[source]#
Compute intersection-over-union matching among prediction and ground truth boxes for one image.
- Parameters:
predictions – a geopandas dataframe with geometry columns
ground_df – a geopandas dataframe with geometry columns
root_dir – Where to search for image names in df
savedir – optional directory to save image with overlaid predictions and annotations
- Returns:
pandas dataframe with crown ids of prediction and ground truth and the IoU score.
- Return type:
result
- deepforest.evaluate.point_recall(predictions, ground_df, root_dir=None, savedir=None)[source]#
Evaluate the proportion of ground truth points that overlap with predictions. Submissions can be a .shp, an existing pandas dataframe, or a .csv path. For bounding box recall, see evaluate().
- Parameters:
predictions – a pandas dataframe, if supplied a root dir is needed to give the relative path of files in df.name. The labels in ground truth and predictions must match. If one is numeric, the other must be numeric.
ground_df – a pandas dataframe, if supplied a root dir is needed to give the relative path of files in df.name
root_dir – location of files in the dataframe ‘name’ column.
savedir – optional directory to save image with overlaid predictions and annotations
- Returns:
a dataframe of matched bounding boxes and ground truth labels
box_recall: proportion of true positives between predicted boxes and ground truth points, regardless of class
class_recall: a pandas dataframe of class-level recall and precision with class sizes
- Return type:
results
deepforest.main module#
- class deepforest.main.deepforest(*args, **kwargs)[source]#
Bases:
LightningModule, PyTorchModelHubMixin
Class for training and predicting tree crowns in RGB images.
- Parameters:
num_classes (int) – number of classes in the model
config_file (str) – path to deepforest config file
model (model.Model()) – a deepforest model object, see model.Model()
config_args (dict) – a dictionary of key->value to update config file at run time. e.g. {“batch_size”:10}. This is useful for iterating over arguments during model testing.
existing_train_dataloader – a Pytorch dataloader that yields a tuple path, images, targets
existing_val_dataloader – a Pytorch dataloader that yields a tuple path, images, targets
- Returns:
a deepforest pytorch lightning module
- Return type:
self
Create a new instance of the class and handle config.
Three cases:
- If self._hub_mixin_config is already set, do nothing.
- If config is passed as a dataclass, set it as self._hub_mixin_config.
- Otherwise, build self._hub_mixin_config from default values and passed values.
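Example (a minimal sketch; the config override is the one given in the parameter documentation above):

from deepforest import main

# Override config values at construction time
m = main.deepforest(config_args={"batch_size": 10})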
- calculate_empty_frame_accuracy(ground_df, predictions_df)[source]#
Calculate accuracy for empty frames (frames with no objects).
- Parameters:
ground_df (pd.DataFrame) – Ground truth dataframe containing image paths and bounding boxes. Must have columns ‘image_path’, ‘xmin’, ‘ymin’, ‘xmax’, ‘ymax’.
predictions_df (pd.DataFrame) – Model predictions dataframe containing image paths and predicted boxes. Must have column ‘image_path’.
- Returns:
Accuracy score for empty frame detection. A score of 1.0 means the model correctly identified all empty frames (no false positives), while 0.0 means it predicted objects in all empty frames (all false positives). Returns None if there are no empty frames.
- Return type:
float or None
- configure_optimizers()[source]#
Choose what optimizers and learning-rate schedulers to use in your optimization. Normally you’d need one. But in the case of GANs or similar you might have multiple. Optimization with multiple optimizers only works in the manual optimization mode.
- Returns:
Any of these 6 options.
- Single optimizer.
- List or Tuple of optimizers.
- Two lists - the first list has multiple optimizers, and the second has multiple LR schedulers (or multiple lr_scheduler_config).
- Dictionary, with an "optimizer" key, and (optionally) a "lr_scheduler" key whose value is a single LR scheduler or lr_scheduler_config.
- None - Fit will run without any optimizer.

The lr_scheduler_config is a dictionary which contains the scheduler and its associated configuration. The default configuration is shown below.

lr_scheduler_config = {
    # REQUIRED: The scheduler instance
    "scheduler": lr_scheduler,
    # The unit of the scheduler's step size, could also be 'step'.
    # 'epoch' updates the scheduler on epoch end whereas 'step'
    # updates it after an optimizer update.
    "interval": "epoch",
    # How many epochs/steps should pass between calls to
    # `scheduler.step()`. 1 corresponds to updating the learning
    # rate after every epoch/step.
    "frequency": 1,
    # Metric to monitor for schedulers like `ReduceLROnPlateau`
    "monitor": "val_loss",
    # If set to `True`, will enforce that the value specified 'monitor'
    # is available when the scheduler is updated, thus stopping
    # training if not found. If set to `False`, it will only produce a warning
    "strict": True,
    # If using the `LearningRateMonitor` callback to monitor the
    # learning rate progress, this keyword can be used to specify
    # a custom logged name
    "name": None,
}

When there are schedulers in which the .step() method is conditioned on a value, such as the torch.optim.lr_scheduler.ReduceLROnPlateau scheduler, Lightning requires that the lr_scheduler_config contains the keyword "monitor" set to the metric name that the scheduler should be conditioned on.

# The ReduceLROnPlateau scheduler requires a monitor
def configure_optimizers(self):
    optimizer = Adam(...)
    return {
        "optimizer": optimizer,
        "lr_scheduler": {
            "scheduler": ReduceLROnPlateau(optimizer, ...),
            "monitor": "metric_to_track",
            "frequency": "indicates how often the metric is updated",
            # If "monitor" references validation metrics, then "frequency" should be set to a
            # multiple of "trainer.check_val_every_n_epoch".
        },
    }

# In the case of two optimizers, only one using the ReduceLROnPlateau scheduler
def configure_optimizers(self):
    optimizer1 = Adam(...)
    optimizer2 = SGD(...)
    scheduler1 = ReduceLROnPlateau(optimizer1, ...)
    scheduler2 = LambdaLR(optimizer2, ...)
    return (
        {
            "optimizer": optimizer1,
            "lr_scheduler": {
                "scheduler": scheduler1,
                "monitor": "metric_to_track",
            },
        },
        {"optimizer": optimizer2, "lr_scheduler": scheduler2},
    )

Metrics can be made available to monitor by simply logging them using self.log('metric_to_track', metric_val) in your LightningModule.

Note

Some things to know:
- Lightning calls .backward() and .step() automatically in case of automatic optimization.
- If a learning rate scheduler is specified in configure_optimizers() with key "interval" (default "epoch") in the scheduler configuration, Lightning will call the scheduler's .step() method automatically in case of automatic optimization.
- If you use 16-bit precision (precision=16), Lightning will automatically handle the optimizer.
- If you use torch.optim.LBFGS, Lightning handles the closure function automatically for you.
- If you use multiple optimizers, you will have to switch to 'manual optimization' mode and step them yourself.
- If you need to control how often the optimizer steps, override the optimizer_step() hook.
- create_model()[source]#
Define a deepforest architecture. This can be done in two ways: passed as the model argument to deepforest __init__(), or as a named architecture in config["architecture"], which corresponds to a file in models/ containing a subclass of model.Model(). Config args are specified in the .yaml.
- Returns:
None
- create_trainer(logger=None, callbacks=[], **kwargs)[source]#
Create a pytorch lightning trainer by reading config files.
- Parameters:
logger – A pytorch lightning logger
callbacks (list) – a list of pytorch-lightning callback classes
**kwargs – Additional arguments to pass to the trainer
- Returns:
None
- evaluate(csv_file, root_dir, iou_threshold=None, savedir=None)[source]#
Compute intersection-over-union and precision/recall for a given iou_threshold.
- Parameters:
csv_file – location of a csv file with columns “name”,”xmin”,”ymin”,”xmax”,”ymax”,”label”
root_dir – location of files in the dataframe ‘name’ column
iou_threshold – float [0,1] intersection-over-union threshold for true positive
savedir – location to save images with bounding boxes
- Returns:
Results dictionary containing precision, recall and other metrics
- Return type:
dict
- load_dataset(csv_file, root_dir=None, augment=False, shuffle=True, batch_size=1, train=False)[source]#
Create a tree dataset for inference. The csv file format is a .csv with the columns "image_path", "xmin", "ymin", "xmax", "ymax" giving the image name and bounding box position. image_path is the relative filename within root_dir, not an absolute path. One bounding box per line.
- Parameters:
csv_file – path to csv file
root_dir – directory of images. If none, uses “image_dir” in config
augment – Whether to create a training dataset; this activates data augmentations
- Returns:
a pytorch dataset
- Return type:
ds
- load_model(model_name='weecology/deepforest-tree', revision='main')[source]#
Loads a model that has already been pretrained for a specific task, like tree crown detection.
Models (technically model weights) are distributed via Hugging Face and designated by the Hugging Face repository ID (model_name), which is in the form: ‘organization/repository’. For a list of models distributed by the DeepForest team (and the associated model names) see the documentation: https://deepforest.readthedocs.io/en/latest/installation_and_setup/prebuilt.html
- Parameters:
model_name (str) – A repository ID for huggingface in the form of organization/repository
revision (str) – The model version (‘main’, ‘v1.0.0’, etc.).
- Returns:
None
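Example (using the default tree model and revision named above):

from deepforest import main

m = main.deepforest()
# Repository ID is in the form organization/repository; 'main' is the default revision
m.load_model(model_name="weecology/deepforest-tree", revision="main")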
- on_fit_start()[source]#
Called at the very beginning of fit.
If on DDP, it is called on every process.
- on_validation_epoch_start()[source]#
Called in the validation loop at the very beginning of the epoch.
- predict_batch(images, preprocess_fn=None)[source]#
Predict a batch of images with the deepforest model.
- Parameters:
images (torch.Tensor or np.ndarray) – A batch of images with shape (B, C, H, W) or (B, H, W, C).
preprocess_fn (callable, optional) – A function to preprocess images before prediction. If None, assumes images are preprocessed.
- Returns:
A list of dataframes with predictions for each image.
- Return type:
List[pd.DataFrame]
- predict_dataloader(ds)[source]#
Create a PyTorch dataloader for prediction.
- Parameters:
ds (torchvision.datasets.Dataset) – A torchvision dataset to be wrapped into a dataloader using config args.
- Returns:
A dataloader object that can be used for prediction.
- Return type:
torch.utils.data.DataLoader
- predict_file(csv_file, root_dir, savedir=None, color=None, thickness=1)[source]#
Create a dataset and predict an entire annotation file. The csv file format is a .csv with the columns "image_path", "xmin", "ymin", "xmax", "ymax" giving the image name and bounding box position. image_path is the relative filename within root_dir, not an absolute path. One bounding box per line.
Deprecation warning: The return_plot argument is deprecated and will be removed in 2.0. Use visualize.plot_results on the result instead.
- Parameters:
csv_file – path to csv file
root_dir – directory of images. If none, uses “image_dir” in config
savedir ((deprecated)) – directory to save images with bounding boxes
color ((deprecated)) – color of the bounding box as a tuple of BGR color, e.g. orange is (0, 165, 255)
thickness ((deprecated)) – thickness of the rectangle border line in px
- Returns:
pandas dataframe with bounding boxes, label and scores for each image in the csv file
- Return type:
df
- predict_image(image: ndarray | None = None, path: str | None = None, return_plot: bool = False, color: tuple | None = (0, 165, 255), thickness: int = 1)[source]#
Predict a single image with a deepforest model.
Deprecation warning: The ‘return_plot’, and related ‘color’ and ‘thickness’ arguments are deprecated and will be removed in 2.0. Use visualize.plot_results on the result instead.
- Parameters:
image – a float32 numpy array of an RGB image in channels-last format
path – optional path to read image from disk instead of passing image arg
return_plot – return a plot of the image with predictions overlaid (deprecated)
color – color of the bounding box as a tuple of BGR color (deprecated)
thickness – thickness of the rectangle border line in px (deprecated)
- Returns:
A pandas dataframe of predictions (default)
img: The input with predictions overlaid (optional)
- Return type:
result
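Example (a minimal sketch; the image path is a placeholder):

from deepforest import main
from deepforest.visualize import plot_results

m = main.deepforest()
m.load_model("weecology/deepforest-tree")
# Read the image directly from disk via the path argument
result = m.predict_image(path="tile.png")
plot_results(result)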
- predict_step(batch, batch_idx)[source]#
Step function called during predict(). By default, it calls forward(). Override to add any processing logic.

The predict_step() is used to scale inference on multi-devices.

To prevent an OOM error, it is possible to use BasePredictionWriter callback to write the predictions to disk or database after each batch or on epoch end.

The BasePredictionWriter should be used while using a spawn based accelerator. This happens for Trainer(strategy="ddp_spawn") or training on 8 TPU cores with Trainer(accelerator="tpu", devices=8) as predictions won't be returned.
- Parameters:
batch – The output of your data iterable, normally a DataLoader.
batch_idx – The index of this batch.
dataloader_idx – The index of the dataloader that produced this batch. (only if multiple dataloaders used)
- Returns:
Predicted output (optional).
Example
class MyModel(LightningModule):
    def predict_step(self, batch, batch_idx, dataloader_idx=0):
        return self(batch)

dm = ...
model = MyModel()
trainer = Trainer(accelerator="gpu", devices=2)
predictions = trainer.predict(model, dm)
- predict_tile(raster_path=None, image=None, patch_size=400, patch_overlap=0.05, iou_threshold=0.15, in_memory=True, return_plot=False, mosaic=True, sigma=0.5, thresh=0.001, color=None, thickness=1, crop_model=None, crop_transform=None, crop_augment=False)[source]#
For images too large to input into the model, predict_tile cuts the image into overlapping windows, predicts trees on each window, and reassembles the results into a single array.
- Parameters:
raster_path – Path to image on disk
image (array) – Numpy image array in BGR channel order following openCV convention
patch_size – patch size for each window
patch_overlap – patch overlap among windows
iou_threshold – Minimum iou overlap among predictions between windows to be suppressed
in_memory – If true, the entire dataset is loaded into memory
mosaic – Return a single prediction dataframe (True) or a tuple of image crops and predictions (False)
sigma – variance of Gaussian function used in Gaussian Soft NMS
thresh – the score threshold used to filter bboxes after soft-nms is performed
crop_model – a deepforest.model.CropModel object to predict on crops
crop_transform – a torchvision.transforms object to apply to crops
crop_augment – a boolean to apply augmentations to crops
return_plot – return a plot of the image with predictions overlaid (deprecated)
color – color of the bounding box as a tuple of BGR color (deprecated)
thickness – thickness of the rectangle border line in px (deprecated)
- Returns:
Predictions dataframe or (predictions, crops) tuple
- Return type:
pd.DataFrame or tuple
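Example (a minimal sketch; the raster path is a placeholder and the other values are the documented defaults):

from deepforest import main

m = main.deepforest()
m.load_model("weecology/deepforest-tree")
predictions = m.predict_tile(raster_path="large_tile.tif",
                             patch_size=400,
                             patch_overlap=0.05)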
- save_model(path)[source]#
Save the trainer checkpoint to a user-defined path for future access.
- Parameters:
path – the path to save the model checkpoint
- set_labels(label_dict)[source]#
Set new label mapping, updating both the label dictionary (str -> int) and its inverse (int -> str).
- Parameters:
label_dict (dict) – Dictionary mapping class names to numeric IDs.
- use_bird_release(check_release=True)[source]#
Use the latest DeepForest bird model release from Hugging Face, downloading if necessary.
- Parameters:
check_release (logical) – Deprecated, not in use.
- Returns:
A trained pytorch model
- Return type:
model (object)
- use_release(check_release=True)[source]#
Use the latest DeepForest model release from Hugging Face, downloading if necessary.
- Parameters:
check_release (logical) – Deprecated, not in use.
- Returns:
A trained PyTorch model
- Return type:
model (object)
deepforest.model module#
- class deepforest.model.CropModel(num_classes, batch_size=4, num_workers=0, lr=0.0001, model=None, label_dict=None)[source]#
Bases:
LightningModule
A PyTorch Lightning module for classifying image crops from object detection models.
This class provides a flexible architecture for training classification models on cropped regions identified by object detection models. It supports using either a default ResNet-50 model or a custom provided model.
- Parameters:
num_classes (int) – Number of classes for classification
batch_size (int, optional) – Batch size for training. Defaults to 4.
num_workers (int, optional) – Number of worker processes for data loading. Defaults to 0.
lr (float, optional) – Learning rate for optimization. Defaults to 0.0001.
model (nn.Module, optional) – Custom PyTorch model to use. If None, uses ResNet-50. Defaults to None.
label_dict (dict, optional) – Mapping of class labels to numeric indices. Defaults to None.
- model#
The classification model (ResNet-50 or custom)
- Type:
nn.Module
- accuracy#
Per-class accuracy metric
- Type:
torchmetrics.Accuracy
- total_accuracy#
Overall accuracy metric
- Type:
torchmetrics.Accuracy
- precision_metric#
Precision metric
- Type:
torchmetrics.Precision
- metrics#
Collection of all metrics
- Type:
torchmetrics.MetricCollection
- batch_size#
Batch size for training
- Type:
int
- num_workers#
Number of data loading workers
- Type:
int
- lr#
Learning rate
- Type:
float
- label_dict#
Label to index mapping, e.g. {“Bird”: 0, “Mammal”: 1}
- Type:
dict
- numeric_to_label_dict#
Index to label mapping, e.g. {0: “Bird”, 1: “Mammal”}
- Type:
dict
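Example (a minimal sketch; the label mapping mirrors the attribute documentation above):

from deepforest.model import CropModel

crop_model = CropModel(num_classes=2,
                       batch_size=4,
                       lr=0.0001,
                       label_dict={"Bird": 0, "Mammal": 1})
# A CropModel can also be passed to deepforest.predict_tile via crop_model=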
- configure_optimizers()[source]#
Choose what optimizers and learning-rate schedulers to use in your optimization. Normally you’d need one. But in the case of GANs or similar you might have multiple. Optimization with multiple optimizers only works in the manual optimization mode.
- Returns:
Any of these 6 options.
- Single optimizer.
- List or Tuple of optimizers.
- Two lists - the first list has multiple optimizers, and the second has multiple LR schedulers (or multiple lr_scheduler_config).
- Dictionary, with an "optimizer" key, and (optionally) a "lr_scheduler" key whose value is a single LR scheduler or lr_scheduler_config.
- None - Fit will run without any optimizer.

The lr_scheduler_config is a dictionary which contains the scheduler and its associated configuration. The default configuration is shown below.

lr_scheduler_config = {
    # REQUIRED: The scheduler instance
    "scheduler": lr_scheduler,
    # The unit of the scheduler's step size, could also be 'step'.
    # 'epoch' updates the scheduler on epoch end whereas 'step'
    # updates it after an optimizer update.
    "interval": "epoch",
    # How many epochs/steps should pass between calls to
    # `scheduler.step()`. 1 corresponds to updating the learning
    # rate after every epoch/step.
    "frequency": 1,
    # Metric to monitor for schedulers like `ReduceLROnPlateau`
    "monitor": "val_loss",
    # If set to `True`, will enforce that the value specified 'monitor'
    # is available when the scheduler is updated, thus stopping
    # training if not found. If set to `False`, it will only produce a warning
    "strict": True,
    # If using the `LearningRateMonitor` callback to monitor the
    # learning rate progress, this keyword can be used to specify
    # a custom logged name
    "name": None,
}

When there are schedulers in which the .step() method is conditioned on a value, such as the torch.optim.lr_scheduler.ReduceLROnPlateau scheduler, Lightning requires that the lr_scheduler_config contains the keyword "monitor" set to the metric name that the scheduler should be conditioned on.

# The ReduceLROnPlateau scheduler requires a monitor
def configure_optimizers(self):
    optimizer = Adam(...)
    return {
        "optimizer": optimizer,
        "lr_scheduler": {
            "scheduler": ReduceLROnPlateau(optimizer, ...),
            "monitor": "metric_to_track",
            "frequency": "indicates how often the metric is updated",
            # If "monitor" references validation metrics, then "frequency" should be set to a
            # multiple of "trainer.check_val_every_n_epoch".
        },
    }

# In the case of two optimizers, only one using the ReduceLROnPlateau scheduler
def configure_optimizers(self):
    optimizer1 = Adam(...)
    optimizer2 = SGD(...)
    scheduler1 = ReduceLROnPlateau(optimizer1, ...)
    scheduler2 = LambdaLR(optimizer2, ...)
    return (
        {
            "optimizer": optimizer1,
            "lr_scheduler": {
                "scheduler": scheduler1,
                "monitor": "metric_to_track",
            },
        },
        {"optimizer": optimizer2, "lr_scheduler": scheduler2},
    )

Metrics can be made available to monitor by simply logging them using self.log('metric_to_track', metric_val) in your LightningModule.

Note

Some things to know:
- Lightning calls .backward() and .step() automatically in case of automatic optimization.
- If a learning rate scheduler is specified in configure_optimizers() with key "interval" (default "epoch") in the scheduler configuration, Lightning will call the scheduler's .step() method automatically in case of automatic optimization.
- If you use 16-bit precision (precision=16), Lightning will automatically handle the optimizer.
- If you use torch.optim.LBFGS, Lightning handles the closure function automatically for you.
- If you use multiple optimizers, you will have to switch to 'manual optimization' mode and step them yourself.
- If you need to control how often the optimizer steps, override the optimizer_step() hook.
- forward(x)[source]#
Same as torch.nn.Module.forward().
- Parameters:
*args – Whatever you decide to pass into the forward method.
**kwargs – Keyword arguments are also possible.
- Returns:
Your model’s output
- get_transform(augment)[source]#
Returns the data transformation pipeline for the model.
- Parameters:
augment (bool) – Flag indicating whether to apply data augmentation.
- Returns:
The composed data transformation pipeline.
- Return type:
torchvision.transforms.Compose
- predict_step(batch, batch_idx)[source]#
Step function called during predict(). By default, it calls forward(). Override to add any processing logic.

The predict_step() is used to scale inference on multi-devices.

To prevent an OOM error, it is possible to use BasePredictionWriter callback to write the predictions to disk or database after each batch or on epoch end.

The BasePredictionWriter should be used while using a spawn based accelerator. This happens for Trainer(strategy="ddp_spawn") or training on 8 TPU cores with Trainer(accelerator="tpu", devices=8) as predictions won't be returned.
- Parameters:
batch – The output of your data iterable, normally a DataLoader.
batch_idx – The index of this batch.
dataloader_idx – The index of the dataloader that produced this batch. (only if multiple dataloaders used)
- Returns:
Predicted output (optional).
Example
class MyModel(LightningModule):
    def predict_step(self, batch, batch_idx, dataloader_idx=0):
        return self(batch)

dm = ...
model = MyModel()
trainer = Trainer(accelerator="gpu", devices=2)
predictions = trainer.predict(model, dm)
- training_step(batch, batch_idx)[source]#
Here you compute and return the training loss and some additional metrics for e.g. the progress bar or logger.
- Parameters:
batch – The output of your data iterable, normally a DataLoader.
batch_idx – The index of this batch.
dataloader_idx – The index of the dataloader that produced this batch. (only if multiple dataloaders used)
- Returns:
- Tensor - The loss tensor
- dict - A dictionary which can include any keys, but must include the key 'loss' in the case of automatic optimization.
- None - In automatic optimization, this will skip to the next batch (but is not supported for multi-GPU, TPU, or DeepSpeed). For manual optimization, this has no special meaning, as returning the loss is not required.
In this step you’d normally do the forward pass and calculate the loss for a batch. You can also do fancier things like multiple forward passes or something model specific.
Example:
def training_step(self, batch, batch_idx):
    x, y, z = batch
    out = self.encoder(x)
    loss = self.loss(out, x)
    return loss
To use multiple optimizers, you can switch to ‘manual optimization’ and control their stepping:
def __init__(self):
    super().__init__()
    self.automatic_optimization = False

# Multiple optimizers (e.g.: GANs)
def training_step(self, batch, batch_idx):
    opt1, opt2 = self.optimizers()

    # do training_step with encoder
    ...
    opt1.step()

    # do training_step with decoder
    ...
    opt2.step()
Note
When accumulate_grad_batches > 1, the loss returned here will be automatically normalized by accumulate_grad_batches internally.
- validation_step(batch, batch_idx)[source]#
Operates on a single batch of data from the validation set. In this step you might generate examples or calculate anything of interest like accuracy.
- Parameters:
batch – The output of your data iterable, normally a DataLoader.
batch_idx – The index of this batch.
dataloader_idx – The index of the dataloader that produced this batch. (only if multiple dataloaders used)
- Returns:
- Tensor - The loss tensor
- dict - A dictionary. Can include any keys, but must include the key 'loss'.
- None - Skip to the next batch.

# if you have one val dataloader:
def validation_step(self, batch, batch_idx): ...

# if you have multiple val dataloaders:
def validation_step(self, batch, batch_idx, dataloader_idx=0): ...
Examples:
# CASE 1: A single validation dataset
def validation_step(self, batch, batch_idx):
    x, y = batch

    # implement your own
    out = self(x)
    loss = self.loss(out, y)

    # log 6 example images
    # or generated text... or whatever
    sample_imgs = x[:6]
    grid = torchvision.utils.make_grid(sample_imgs)
    self.logger.experiment.add_image('example_images', grid, 0)

    # calculate acc
    labels_hat = torch.argmax(out, dim=1)
    val_acc = torch.sum(y == labels_hat).item() / (len(y) * 1.0)

    # log the outputs!
    self.log_dict({'val_loss': loss, 'val_acc': val_acc})
If you pass in multiple val dataloaders, validation_step() will have an additional argument. We recommend setting the default value of 0 so that you can quickly switch between single and multiple dataloaders.

# CASE 2: multiple validation dataloaders
def validation_step(self, batch, batch_idx, dataloader_idx=0):
    # dataloader_idx tells you which dataset this is.
    ...
Note
If you don’t need to validate you don’t need to implement this method.
Note
When the validation_step() is called, the model has been put in eval mode and PyTorch gradients have been disabled. At the end of validation, the model goes back to training mode and gradients are enabled.
- write_crops(root_dir, images, boxes, labels, savedir)[source]#
Write crops to disk.
- Parameters:
root_dir (str) – The root directory where the images are located.
images (list) – A list of image filenames.
boxes (list) – A list of bounding box coordinates in the format [xmin, ymin, xmax, ymax].
labels (list) – A list of labels corresponding to each bounding box.
savedir (str) – The directory where the cropped images will be saved.
- Returns:
None
- class deepforest.model.Model(config)[source]#
Bases:
object
An architecture-agnostic class that controls the basic train, eval and predict functions. A model should optionally allow a backbone for pretraining. To add new architectures, simply create a new module in models/ and write a create_model function, then add the result to the if-else statement below.
- Parameters:
num_classes (int) – number of classes in the model
nms_thresh (float) – non-max suppression threshold for intersection-over-union [0,1]
score_thresh (float) – minimum prediction score to keep during prediction [0,1]
- Returns:
a pytorch nn module
- Return type:
model
deepforest.predict module#
deepforest.preprocess module#
The preprocessing module is used to reshape data into a format suitable for training or prediction.
For example, cutting large tiles into smaller images.
- deepforest.preprocess.compute_windows(numpy_image, patch_size, patch_overlap)[source]#
Create a sliding window object from a raster tile.
- Parameters:
numpy_image (array) – Raster object as numpy array to cut into crops
patch_size (int) – Size of the windows in pixels
patch_overlap (float) – Overlap between windows as a fraction (0-1)
- Returns:
a sliding windows object
- Return type:
windows (list)
- deepforest.preprocess.image_name_from_path(image_path)[source]#
Convert path to image name for use in indexing.
- deepforest.preprocess.preprocess_image(image)[source]#
Preprocess a single RGB numpy array for prediction, converting from channels-last to channels-first format.
- deepforest.preprocess.save_crop(base_dir, image_name, index, crop)[source]#
Save window crop as an image file to be read by PIL.
- Parameters:
base_dir (str) – The base directory to save the image file.
image_name (str) – The name of the original image.
index (int) – The index of the window crop.
crop (numpy.ndarray) – The window crop as a NumPy array.
- Returns:
The filename of the saved image.
- Return type:
str
- deepforest.preprocess.select_annotations(annotations, window)[source]#
Select annotations that overlap with selected image crop.
- Parameters:
annotations – a geopandas dataframe of annotations with a geometry column
window – A sliding window object (see compute_windows)
- Returns:
a pandas dataframe of annotations
- Return type:
selected_annotations
- deepforest.preprocess.split_raster(annotations_file=None, path_to_raster=None, numpy_image=None, root_dir=None, base_dir=None, patch_size=400, patch_overlap=0.05, allow_empty=False, image_name=None, save_dir='.')[source]#
Divide a large tile into smaller arrays. Each crop will be saved to file.
- Parameters:
numpy_image – a numpy object to be used as a raster, usually opened from rasterio.open.read(), in order (height, width, channels)
root_dir (str) – Root directory of the annotations file; if not supplied, will be inferred from annotations_file
path_to_raster (str) – Path to a tile on disk that can be read by rasterio
annotations_file (str or pd.DataFrame) – A pandas dataframe or path to an annotations csv file to transform to cropped images. In the format -> image_path, xmin, ymin, xmax, ymax, label. If None, allow_empty is ignored and the function will only return the cropped images.
save_dir (str) – Directory to save images
base_dir (str) – Directory to save images
patch_size (int) – Maximum dimensions of square window
patch_overlap (float) – Percent of overlap among windows, 0 to 1
allow_empty – If True, include images with no annotations in the dataset. If annotations_file is None, this is ignored.
image_name (str) – If the numpy_image arg is used, the name to give the raster
Note
When allow_empty is True, the function will return 0’s for coordinates, following torchvision style. The label will be ignored; for continuity, the first label in the annotations_file will be used.
- Returns:
If annotations_file is provided, a pandas dataframe with annotations file for training. A copy of this file is written to save_dir as a side effect. If not, a list of filenames of the cropped images.
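Example (a minimal sketch; file names are placeholders):

from deepforest.preprocess import split_raster

annotations = split_raster(annotations_file="tile_annotations.csv",
                           path_to_raster="tile.tif",
                           patch_size=400,
                           patch_overlap=0.05,
                           save_dir="crops/")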
deepforest.utilities module#
- class deepforest.utilities.DownloadProgressBar(*_, **__)[source]#
Bases:
tqdm
Download progress bar class.
- deepforest.utilities.annotations_to_shapefile(df, transform, crs)[source]#
Convert output from predict_image and predict_tile to a geopandas dataframe.
- Parameters:
df – prediction dataframe with columns [‘xmin’,’ymin’,’xmax’,’ymax’,’label’,’score’]
transform – A rasterio affine transform object
crs – A rasterio crs object
- Returns:
a geopandas dataframe where every entry is the bounding box for a detected tree.
- Return type:
results
- deepforest.utilities.boxes_to_shapefile(df, root_dir, projected=True, flip_y_axis=False)[source]#
Convert from image coordinates to geographic coordinates. Note that this assumes df is just a single plot being passed to this function.
- Parameters:
df – a pandas type dataframe with columns: name, xmin, ymin, xmax, ymax. Name is the relative path to the root_dir arg.
root_dir – directory of images to lookup image_path column
projected – If True, convert from image to geographic coordinates; if False, keep in image coordinate system
flip_y_axis – If True, reflect predictions over y axis to align with raster data in QGIS, which uses a negative y origin compared to numpy. See https://gis.stackexchange.com/questions/306684/why-does-qgis-use-negative-y-spacing-in-the-default-raster-geotransform
- Returns:
a geospatial dataframe with the boxes optionally transformed to the target crs
- Return type:
df
- deepforest.utilities.check_image(image)[source]#
Check that an image is three channel, channel-last format.
- Parameters:
image – numpy array
- Returns:
None, throws error on assert
- deepforest.utilities.convert_point_to_bbox(gdf, buffer_size)[source]#
Convert an input point type annotation to a bounding box by buffering the point with a fixed size.
- Parameters:
gdf (GeoDataFrame) – The input point type annotation.
buffer_size (float) – The size of the buffer to be applied to the point.
- Returns:
The output bounding box type annotation.
- Return type:
gdf (GeoDataFrame)
- deepforest.utilities.crop_annotations_to_bounds(gdf, bounds)[source]#
Crop a geodataframe of annotations to a bounding box.
- Parameters:
gdf – a geodataframe of annotations
bounds – a tuple of (left, bottom, right, top) bounds
- Returns:
a geodataframe of annotations cropped to the bounds
- Return type:
gdf
- deepforest.utilities.crop_raster(bounds, rgb_path=None, savedir=None, filename=None, driver='GTiff')[source]#
Crop a raster to a bounding box, save as projected or unprojected crop.
- Parameters:
bounds – a tuple of (left, bottom, right, top) bounds
rgb_path – path to the rgb image
savedir – directory to save the crop
filename – filename to save the crop, as “{}.tif”.format(filename)
driver – rasterio driver to use, defaults to GTiff; use ‘GTiff’ for projected data or ‘PNG’ for unprojected data
- Returns:
path to the saved crop, if savedir specified
img: a numpy array of the crop, if savedir not specified
- Return type:
filename
- deepforest.utilities.determine_geometry_type(df)[source]#
Determine the geometry type of a prediction or annotation.
- Parameters:
df – a pandas dataframe
- Returns:
a string of the geometry type
- Return type:
geometry_type
- deepforest.utilities.geo_to_image_coordinates(gdf, image_bounds, image_resolution)[source]#
Convert from projected coordinates to image coordinates.
- Parameters:
gdf – a pandas type dataframe with columns: name, xmin, ymin, xmax, ymax. Name is the relative path to the root_dir arg.
image_bounds – bounds of the image
image_resolution – resolution of the image
- Returns:
a geopandas dataframe transformed to the image origin. CRS is removed
- Return type:
gdf
- deepforest.utilities.image_to_geo_coordinates(gdf, root_dir, flip_y_axis=False)[source]#
Convert from image coordinates to geographic coordinates.
- Parameters:
gdf – A geodataframe.
root_dir – Directory of images to lookup image_path column.
flip_y_axis – If True, reflect predictions over y axis to align with raster data in QGIS, which uses a negative y origin compared to numpy.
- Returns:
A geospatial dataframe with the boxes optionally transformed to the target crs.
- Return type:
transformed_gdf
- deepforest.utilities.project_boxes(df, root_dir, transform=True)[source]#
Convert from image coordinates to geographic coordinates. Note that this assumes df is just a single plot being passed to this function.
- Parameters:
df – a pandas type dataframe with columns: name, xmin, ymin, xmax, ymax. Name is the relative path to the root_dir arg.
root_dir – directory of images to lookup image_path column
transform – If true, convert from image to geographic coordinates
- deepforest.utilities.read_coco(json_file)[source]#
Read a COCO format JSON file and return a pandas dataframe.
- Parameters:
json_file – Path to the COCO segmentation JSON file
- Returns:
A pandas dataframe with image_path and geometry columns
- Return type:
df
- deepforest.utilities.read_file(input, root_dir=None)[source]#
Read a file and return a geopandas dataframe.
This is the main entry point for reading annotations into deepforest.
- Parameters:
input – a path to a file or a pandas dataframe
root_dir (str) – location of the image files, if not in the same directory as the annotations file
- Returns:
a geopandas dataframe with the properly formatted geometry column
df.root_dir: the root directory of the image files
- Return type:
df
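Example (a minimal sketch; the csv path and image directory are placeholders):

from deepforest.utilities import read_file

df = read_file("annotations.csv", root_dir="images/")
print(df.root_dir)
print(df.geometry.head())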
- deepforest.utilities.read_pascal_voc(xml_path)[source]#
Load annotations from xml format (e.g. RectLabel editor) and convert them into retinanet annotations format.
- Parameters:
xml_path (str) – Path to the annotations xml, formatted by RectLabel
- Returns:
in the format -> path-to-image.png,x1,y1,x2,y2,class_name
- Return type:
Annotations (pandas dataframe)
- deepforest.utilities.round_with_floats(x)[source]#
Check if string x is float or int, return int, rounded if needed.
- deepforest.utilities.shapefile_to_annotations(shapefile, rgb=None, root_dir=None, buffer_size=None, convert_point=False, geometry_type=None, save_dir=None)[source]#
Convert a shapefile of annotations into annotations csv file for DeepForest training and evaluation.
- Parameters:
shapefile – Path to a shapefile on disk. If a label column is present, it will be used, else all labels are assumed to be “Tree”
rgb – Path to the RGB image on disk
root_dir – Optional directory to prepend to the image_path column
- Returns:
a pandas dataframe
- Return type:
results
deepforest.visualize module#
- deepforest.visualize.convert_to_sv_format(df, width=None, height=None)[source]#
Convert DeepForest prediction results to a supervision Detections object.
- Parameters:
df (pd.DataFrame) – The results from predict_image or predict_tile. Expected columns include: [‘geometry’, ‘label’, ‘score’, ‘image_path’] for bounding boxes
width (int) – The width of the image in pixels. Only required if the geometry type is ‘polygon’.
height (int) – The height of the image in pixels. Only required if the geometry type is ‘polygon’.
- Returns:
Depending on the geometry type, the function returns either a Detections or a KeyPoints object from the supervision library.
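Example (a minimal sketch; assumes df holds bounding box results from predict_image, and a recent supervision release where BoxAnnotator.annotate accepts scene and detections):

import cv2
import supervision as sv
from deepforest.visualize import convert_to_sv_format

detections = convert_to_sv_format(df)
image = cv2.imread("tile.png")
# Draw the converted detections on a copy of the image
annotated = sv.BoxAnnotator().annotate(scene=image.copy(), detections=detections)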
- deepforest.visualize.format_boxes(prediction, scores=True)[source]#
Format a retinanet prediction into a pandas dataframe for a single image.
- Parameters:
prediction – a dictionary with keys ‘boxes’ and ‘labels’ coming from a retinanet
scores – Whether boxes come with scores, during prediction, or without scores, as in during training.
- Returns:
a pandas dataframe
- Return type:
df
- deepforest.visualize.format_geometry(predictions, scores=True)[source]#
Format a retinanet prediction into a pandas dataframe for a batch of images.
- Parameters:
predictions – a list of dictionaries with keys ‘boxes’ and ‘labels’ coming from a retinanet
scores – Whether boxes come with scores, during prediction, or without scores, as in during training.
- Returns:
a pandas dataframe
- Return type:
df
- deepforest.visualize.plot_annotations(annotations, savedir=None, height=None, width=None, color=[245, 135, 66], thickness=2, basename=None, root_dir=None, radius=3, image=None)[source]#
Plot the prediction results.
- Parameters:
annotations – a pandas dataframe with prediction results
savedir – optional path to save the figure. If None (default), the figure will be interactively plotted.
height – height of the image in pixels. Required if the geometry type is ‘polygon’.
width – width of the image in pixels. Required if the geometry type is ‘polygon’.
color (list or sv.ColorPalette) – color of the annotations as a tuple of RGB color (if a single color), e.g. orange is [245, 135, 66], or a supervision.ColorPalette if multiple labels and specifying colors for each label
thickness – thickness of the rectangle border line in px
basename – optional basename for the saved figure. If None (default), the basename will be extracted from the image path.
root_dir – optional path to the root directory of the images. If None (default), the root directory will be extracted from the annotations dataframe.root_dir attribute.
radius – radius of the points in px
- Returns:
None
- deepforest.visualize.plot_points(image, points, color=None, radius=5, thickness=1)[source]#
Plot points on an image.
- Parameters:
image – a numpy array in BGR color order! Channel order is channels first
points – a numpy array of shape (N, 2) representing the coordinates of the points
color – color of the points as a tuple of BGR color, e.g. orange is (0, 165, 255)
radius – radius of the points in px
thickness – thickness of the point border line in px
- Returns:
a numpy array with drawn points
- Return type:
image
- deepforest.visualize.plot_prediction_and_targets(image, predictions, targets, image_name, savedir)[source]#
Plot an image, its predictions, and its ground truth targets for debugging.
- Parameters:
image – torch tensor, RGB color order
targets – torch tensor
- Returns:
path on disk with saved figure
- Return type:
figure_path
- deepforest.visualize.plot_prediction_dataframe(df, root_dir, savedir, color=None, thickness=1, ground_truth=None)[source]#
For each row in the dataframe, call plot_predictions and save plot files to disk. For multi-class labels, boxes will be colored by labels. Ground truth boxes will all be the same color, regardless of class.
- Parameters:
df – a pandas dataframe with image_path, xmin, xmax, ymin, ymax and label columns. The image_path column should be the relative path from root_dir, not the full path.
root_dir – relative dir to look for image names from df.image_path
ground_truth – an optional pandas dataframe in same format as df holding ground_truth boxes
savedir – save the plot to an optional directory path.
- Returns:
list of filenames written
- Return type:
written_figures
- deepforest.visualize.plot_predictions(image, df, color=None, thickness=1)[source]#
Plot a set of boxes on an image. By default this function does not show, but only plots an axis. The label column must be numeric! The image must be in BGR color order!
- Parameters:
image – a numpy array in BGR color order! Channel order is channels first
df – a pandas dataframe with xmin, xmax, ymin, ymax and label column
color – color of the bounding box as a tuple of BGR color, e.g. orange is (0, 165, 255)
thickness – thickness of the rectangle border line in px
- Returns:
a numpy array with drawn annotations
- Return type:
image
- deepforest.visualize.plot_results(results, ground_truth=None, savedir=None, height=None, width=None, results_color=[245, 135, 66], ground_truth_color=[0, 165, 255], thickness=2, basename=None, radius=3, image=None, axes=False)[source]#
Plot the prediction results.
- Parameters:
results – a pandas dataframe with prediction results
ground_truth – an optional pandas dataframe with ground truth annotations
savedir – optional path to save the figure. If None (default), the figure will be interactively plotted.
height – height of the image in pixels. Required if the geometry type is ‘polygon’.
width – width of the image in pixels. Required if the geometry type is ‘polygon’.
results_color (list or sv.ColorPalette) – color of the results annotations as a tuple of RGB color (if a single color), e.g. orange is [245, 135, 66], or a supervision.ColorPalette if multiple labels and specifying colors for each label
ground_truth_color (list) – color of the ground truth annotations as a tuple of RGB color, e.g. blue is [0, 165, 255]
thickness – thickness of the rectangle border line in px
basename – optional basename for the saved figure. If None (default), the basename will be extracted from the image path.
radius – radius of the points in px
image – an optional numpy array of an image to annotate. If None (default), the image will be loaded from the results dataframe.
axes – returns matplotlib axes object if True
- Returns:
Matplotlib axes object if axes=True, otherwise None
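Example (a minimal sketch; results comes from predict_image or predict_tile, and the ground truth csv is a placeholder):

from deepforest.utilities import read_file
from deepforest.visualize import plot_results

ground_truth = read_file("annotations.csv", root_dir="images/")
plot_results(results, ground_truth=ground_truth, savedir="figures/")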
- deepforest.visualize.view_dataset(ds, savedir=None, color=None, thickness=1)[source]#
Plot annotations on images for debugging purposes.
- Parameters:
ds – a deepforest pytorch dataset, see deepforest.dataset or deepforest.load_dataset() to start from a csv file
savedir – optional path to save figures. If None (default), images will be interactively plotted
color – color of the bounding box as a tuple of BGR color, e.g. orange is (0, 165, 255)
thickness – thickness of the rectangle border line in px
Module contents#
Top-level package for DeepForest.