deepforest package#
Subpackages#
Submodules#
deepforest.IoU module#
IoU Module, with help from https://github.com/SpaceNetChallenge/utilities/blob/spacenetV3/spacenetutilities/evalTools.py
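As an illustration of the metric this module is named for, the following is a minimal sketch of intersection-over-union for two axis-aligned boxes. The function name `box_iou` and the `(xmin, ymin, xmax, ymax)` tuple layout are assumptions made for this example; it is not the module's actual implementation.

```python
def box_iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle; zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    # Union is the sum of both areas minus the part counted twice.
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0
```

Identical boxes give 1.0 and disjoint boxes give 0.0.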
deepforest.callbacks module#
A deepforest callback. Callbacks must have the following methods: on_epoch_begin, on_epoch_end, on_fit_end and on_fit_begin, and must inject model and epoch kwargs.
- class deepforest.callbacks.images_callback(savedir, n=2, every_n_epochs=5, select_random=False, color=None, thickness=1)[source]#
Bases:
Callback
Run evaluation on a file of annotations during training.
- Parameters
savedir – optional, directory to save predicted images
probability_threshold – minimum probability for inclusion, see deepforest.evaluate
n – number of images to upload
select_random (bool) – whether to select random images or the first n images
every_n_epochs – interval, in epochs, between runs
color – color of the bounding box as a tuple of BGR color, e.g. orange annotations is (0, 165, 255)
thickness – thickness of the rectangle border line in px
- Returns
either prints validation scores or logs them to the pytorch-lightning logger
- Return type
None
deepforest.dataset module#
Dataset module
During training, the model expects both the input tensors and targets (a list of dictionaries), each containing:
boxes (FloatTensor[N, 4]): the ground-truth boxes in [x1, y1, x2, y2] format, with values between 0 and H and 0 and W
labels (Int64Tensor[N]): the class label for each ground-truth box
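The target layout described above can be sketched with plain Python lists standing in for tensors. The helper `check_target` is hypothetical and not part of deepforest. The requirement that `x1 < x2` and `y1 < y2` is an added assumption.

```python
def check_target(target, height, width):
    """Return True if a target dict follows the boxes/labels layout."""
    boxes, labels = target["boxes"], target["labels"]
    # One label per ground-truth box.
    if len(boxes) != len(labels):
        return False
    for x1, y1, x2, y2 in boxes:
        # x values lie in [0, W] and y values in [0, H].
        # Assumption: boxes are non-degenerate (x1 < x2, y1 < y2).
        if not (0 <= x1 < x2 <= width and 0 <= y1 < y2 <= height):
            return False
    return all(isinstance(label, int) for label in labels)


# One ground-truth box with class label 0 (lists stand in for tensors).
target = {"boxes": [[10.0, 20.0, 110.0, 140.0]], "labels": [0]}
```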
- class deepforest.dataset.TileDataset(tile: Optional[ndarray], preload_images: bool = False, patch_size: int = 400, patch_overlap: float = 0.05)[source]#
Bases:
Dataset
- Parameters
tile – an in memory numpy array.
patch_size (int) – The size for the crops used to cut the input raster into smaller pieces. This is given in pixels, not any geographic unit.
patch_overlap (float) – The horizontal and vertical overlap among patches
- Returns
a pytorch dataset
- Return type
ds
- class deepforest.dataset.TreeDataset(csv_file, root_dir, transforms=None, label_dict={'Tree': 0}, train=True, preload_images=False)[source]#
Bases:
Dataset
- Parameters
csv_file (string) – Path to a single csv file with annotations.
root_dir (string) – Directory with all the images.
transforms (callable, optional) – Optional transform to be applied on a sample.
label_dict – a dictionary where keys are labels from the csv column and values are numeric labels “Tree” -> 0
- Returns
If train is True, returns path, image, targets; otherwise returns only image.
deepforest.evaluate module#
Evaluation module
- deepforest.evaluate.compute_class_recall(results)[source]#
Given a set of evaluations, compute the proportion of predicted boxes that match. True boxes which are not matched to predictions do not count against accuracy.
- deepforest.evaluate.evaluate(predictions, ground_df, root_dir, iou_threshold=0.4, savedir=None)[source]#
Image annotated crown evaluation routine. The submission can be a .shp, an existing pandas dataframe or a .csv path.
- Parameters
predictions – a pandas dataframe, if supplied a root dir is needed to give the relative path of files in df.name. The labels in ground truth and predictions must match. If one is numeric, the other must be numeric.
ground_df – a pandas dataframe, if supplied a root dir is needed to give the relative path of files in df.name
root_dir – location of files in the dataframe ‘name’ column.
savedir – optional directory to save image with overlaid predictions and annotations
- Returns
a dataframe of matched bounding boxes
box_recall: proportion of true positives of box position, regardless of class
box_precision: proportion of predictions that are true positive, regardless of class
class_recall: a pandas dataframe of class level recall and precision with class sizes
- Return type
results
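To make box_recall and box_precision concrete, here is a minimal sketch of how they could be derived from matched IoU scores. The function name, the input layout and the strict `>` comparison against iou_threshold are assumptions made for this example. The library's own matching and thresholding may differ.

```python
def box_recall_precision(matched_ious, n_predictions, iou_threshold=0.4):
    """Compute (recall, precision) from per-ground-truth IoU scores.

    matched_ious holds one IoU per ground-truth box: the IoU of the
    prediction matched to it, or 0.0 when nothing matched.
    """
    # A ground-truth box counts as a true positive when its match is good enough.
    true_positives = sum(1 for iou in matched_ious if iou > iou_threshold)
    recall = true_positives / len(matched_ious) if matched_ious else 0.0
    precision = true_positives / n_predictions if n_predictions else 0.0
    return recall, precision
```

With four ground-truth boxes, two of which are matched above the threshold, and five predictions, recall is 2/4 and precision is 2/5.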
- deepforest.evaluate.evaluate_image(predictions, ground_df, root_dir, savedir=None)[source]#
Compute intersection-over-union matching among prediction and ground truth boxes for one image.
- Parameters
df – a pandas dataframe with columns name, xmin, xmax, ymin, ymax, label. The ‘name’ column should be the path relative to the location of the file.
summarize – Whether to group statistics by plot and overall score
image_coordinates – Whether the current boxes are in the coordinate system of the image, e.g. origin (0,0) upper left.
root_dir – Where to search for image names in df
savedir – optional directory to save image with overlaid predictions and annotations
- Returns
pandas dataframe with crown ids of prediction and ground truth and the IoU score.
- Return type
result
- deepforest.evaluate.point_recall(predictions, ground_df, root_dir=None, savedir=None)[source]#
Evaluate the proportion of ground truth points that overlap with predictions. The submission can be a .shp, an existing pandas dataframe or a .csv path. For bounding box recall, see evaluate().
- Parameters
predictions – a pandas dataframe, if supplied a root dir is needed to give the relative path of files in df.name. The labels in ground truth and predictions must match. If one is numeric, the other must be numeric.
ground_df – a pandas dataframe, if supplied a root dir is needed to give the relative path of files in df.name
root_dir – location of files in the dataframe ‘name’ column.
savedir – optional directory to save image with overlaid predictions and annotations
- Returns
a dataframe of matched bounding boxes and ground truth labels
box_recall: proportion of true positives between predicted boxes and ground truth points, regardless of class
class_recall: a pandas dataframe of class level recall and precision with class sizes
- Return type
results
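The idea behind point recall can be sketched as a point-in-box test. The function below is a hypothetical illustration using plain tuples, not the library's implementation. Treating points on a box edge as inside is an assumption.

```python
def fraction_of_points_in_boxes(points, boxes):
    """Proportion of (x, y) points inside at least one (xmin, ymin, xmax, ymax) box."""
    if not points:
        return 0.0
    hits = 0
    for x, y in points:
        # A point is recalled if any predicted box contains it (edges inclusive).
        if any(xmin <= x <= xmax and ymin <= y <= ymax
               for xmin, ymin, xmax, ymax in boxes):
            hits += 1
    return hits / len(points)
```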
deepforest.main module#
deepforest.model module#
- class deepforest.model.Model(config)[source]#
Bases:
object
An architecture-agnostic class that controls the basic train, eval and predict functions. A model should optionally allow a backbone for pretraining. To add a new architecture, create a new module in models/ and write a create_model function, then add the result to the if/else statement below.
- Parameters
num_classes (int) – number of classes in the model
nms_thresh (float) – non-max suppression threshold for intersection-over-union [0,1]
score_thresh (float) – minimum prediction score to keep during prediction [0,1]
- Returns
a pytorch nn module
- Return type
model
deepforest.predict module#
deepforest.preprocess module#
The preprocessing module is used to reshape data into format suitable for training or prediction.
For example cutting large tiles into smaller images.
- deepforest.preprocess.compute_windows(numpy_image, patch_size, patch_overlap)[source]#
Create a sliding window object from a raster tile.
- Parameters
numpy_image (array) – Raster object as numpy array to cut into crops
- Returns
a sliding windows object
- Return type
windows (list)
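How patch_size and patch_overlap translate into crop windows can be sketched in pure Python. This is an illustrative approximation only. The function name, the `(x, y, w, h)` tuple layout and the clipping of edge windows are assumptions, and the library's sliding window object may handle edges differently.

```python
def sliding_windows(height, width, patch_size=400, patch_overlap=0.05):
    """Return (x, y, w, h) crop windows covering a height x width raster."""
    if not 0 <= patch_overlap < 1:
        raise ValueError("patch_overlap must be in [0, 1)")
    # Consecutive windows start one step apart, so they share
    # patch_size * patch_overlap pixels.
    step = max(1, round(patch_size * (1 - patch_overlap)))
    windows = []
    for y in range(0, height, step):
        for x in range(0, width, step):
            # Clip windows at the raster edge so they never extend past it.
            w = min(patch_size, width - x)
            h = min(patch_size, height - y)
            windows.append((x, y, w, h))
            if x + patch_size >= width:
                break  # this window already reaches the right edge
        if y + patch_size >= height:
            break  # this row already reaches the bottom edge
    return windows
```

Sizes are in pixels, as with patch_size elsewhere in this package.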
- deepforest.preprocess.image_name_from_path(image_path)[source]#
Convert path to image name for use in indexing.
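A minimal sketch of such a conversion, assuming the image name is the file name with directory and extension removed. That behavior is an assumption, and `name_from_path` is a hypothetical stand-in rather than the library function.

```python
import os


def name_from_path(image_path):
    """Strip the directory and the file extension from a path."""
    base = os.path.basename(image_path)
    return os.path.splitext(base)[0]
```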
- deepforest.preprocess.preprocess_image(image)[source]#
Preprocess a single RGB numpy array for prediction, converting it from channels-last to channels-first order.
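The reordering from channels last (H, W, C) to channels first (C, H, W) can be sketched with nested lists. This hypothetical helper shows only the axis reordering. With numpy, the same result comes from `np.transpose(image, (2, 0, 1))`. Any additional steps the library function performs are not shown.

```python
def channels_last_to_first(image):
    """Convert a nested list shaped (H, W, C) into one shaped (C, H, W)."""
    n_channels = len(image[0][0])
    # For each channel, rebuild the H x W grid from that channel's values.
    return [[[pixel[c] for pixel in row] for row in image]
            for c in range(n_channels)]
```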
- deepforest.preprocess.save_crop(base_dir, image_name, index, crop)[source]#
Save window crop as an image file to be read by PIL.
- Parameters
base_dir (str) – The base directory to save the image file.
image_name (str) – The name of the original image.
index (int) – The index of the window crop.
crop (numpy.ndarray) – The window crop as a NumPy array.
- Returns
The filename of the saved image.
- Return type
str
- deepforest.preprocess.select_annotations(annotations, windows, index, allow_empty=False)[source]#
Select annotations that overlap with selected image crop.
- Parameters
image_name (str) – Name of the image in the annotations file to lookup.
annotations_file – path to annotations file in the format -> image_path, xmin, ymin, xmax, ymax, label
windows – A sliding window object (see compute_windows)
index – The index in the windows object to use as crop bounds
allow_empty (bool) – If True, allow window crops that have no annotations to be included
- Returns
a pandas dataframe of annotations
- Return type
selected_annotations
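Selecting the annotations that fall in one crop can be sketched as an overlap test followed by a shift into window coordinates. The function name, the dict-based rows and the rule that any overlap is enough are assumptions made for this example. The library may apply a stricter overlap criterion.

```python
def boxes_in_window(annotations, window):
    """Select boxes overlapping a window and express them in window coordinates.

    annotations: list of dicts with xmin, ymin, xmax, ymax, label.
    window: (x, y, w, h) in pixel coordinates of the full tile.
    """
    wx, wy, ww, wh = window
    selected = []
    for box in annotations:
        overlaps = (box["xmin"] < wx + ww and box["xmax"] > wx
                    and box["ymin"] < wy + wh and box["ymax"] > wy)
        if not overlaps:
            continue
        # Shift to the window origin and clip to the window bounds.
        selected.append({
            "xmin": max(box["xmin"] - wx, 0),
            "ymin": max(box["ymin"] - wy, 0),
            "xmax": min(box["xmax"] - wx, ww),
            "ymax": min(box["ymax"] - wy, wh),
            "label": box["label"],
        })
    return selected
```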
- deepforest.preprocess.split_raster(annotations_file=None, path_to_raster=None, numpy_image=None, base_dir=None, patch_size=400, patch_overlap=0.05, allow_empty=False, image_name=None, save_dir='.')[source]#
Divide a large tile into smaller arrays. Each crop will be saved to file.
- Parameters
numpy_image – a numpy object to be used as a raster, usually opened from rasterio.open.read(), in order (height, width, channels)
path_to_raster (str) – Path to a tile on disk that can be read by rasterio
annotations_file (str or pd.DataFrame) – A pandas dataframe or path to annotations csv file to transform to cropped images. In the format -> image_path, xmin, ymin, xmax, ymax, label. If None, allow_empty is ignored and the function will only return the cropped images.
save_dir (str) – Directory to save images
base_dir (str) – Directory to save images
patch_size (int) – Maximum dimensions of square window
patch_overlap (float) – Percent of overlap among windows 0->1
allow_empty – If True, include images with no annotations in the dataset. If annotations_file is None, this is ignored.
image_name (str) – If numpy_image arg is used, what name to give the raster?
- Returns
If annotations_file is provided, a pandas dataframe with annotations file for training. A copy of this file is written to save_dir as a side effect. If not, a list of filenames of the cropped images.
deepforest.utilities module#
Utilities module
- class deepforest.utilities.DownloadProgressBar(*_, **__)[source]#
Bases:
tqdm
Download progress bar class.
- deepforest.utilities.annotations_to_shapefile(df, transform, crs)[source]#
Convert output from predict_image and predict_tile to a geopandas data.frame
- Parameters
df – prediction data.frame with columns [‘xmin’,’ymin’,’xmax’,’ymax’,’label’,’score’]
transform – A rasterio affine transform object
crs – A rasterio crs object
- Returns
a geopandas dataframe where every entry is the bounding box for a detected tree.
- Return type
results
- deepforest.utilities.boxes_to_shapefile(df, root_dir, projected=True, flip_y_axis=False)[source]#
Convert from image coordinates to geographic coordinates. Note that this assumes df is just a single plot being passed to this function.
- Parameters
df – a pandas type dataframe with columns: name, xmin, ymin, xmax, ymax. Name is the relative path to the root_dir arg.
root_dir – directory of images to lookup image_path column
projected – If True, convert from image to geographic coordinates, if False, keep in image coordinate system
flip_y_axis – If True, reflect predictions over y axis to align with raster data in QGIS, which uses a negative y origin compared to numpy. See https://gis.stackexchange.com/questions/306684/why-does-qgis-use-negative-y-spacing-in-the-default-raster-geotransform
- Returns
a geospatial dataframe with the boxes optionally transformed to the target crs
- Return type
df
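The conversion from image to geographic coordinates can be sketched for the simplest case: a north-up raster with square pixels. The function and its `left`, `top` and `resolution` arguments are hypothetical and stand in for values that would normally come from the raster's bounds and transform. This is not the library's implementation.

```python
def pixel_box_to_geo(box, left, top, resolution):
    """Map an image-space box (xmin, ymin, xmax, ymax) to map coordinates.

    Assumes a north-up raster whose upper-left corner is at (left, top)
    and whose pixels are `resolution` map units on each side.
    """
    xmin, ymin, xmax, ymax = box
    geo_xmin = left + xmin * resolution
    geo_xmax = left + xmax * resolution
    # Image rows grow downward while northing grows upward, so the
    # image-space ymin (top edge of the box) becomes the larger map y.
    geo_ymax = top - ymin * resolution
    geo_ymin = top - ymax * resolution
    return (geo_xmin, geo_ymin, geo_xmax, geo_ymax)
```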
- deepforest.utilities.check_file(df)[source]#
Check a file format for correct column names and structure
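A column-name check of this kind can be sketched with the standard library. The required column list below is taken from the annotation format given for deepforest.preprocess (image_path, xmin, ymin, xmax, ymax, label). Whether check_file enforces exactly these names is an assumption, and `missing_columns` is a hypothetical helper.

```python
import csv
import io

# Assumed required columns, following the annotation format
# image_path, xmin, ymin, xmax, ymax, label.
REQUIRED_COLUMNS = ["image_path", "xmin", "ymin", "xmax", "ymax", "label"]


def missing_columns(csv_text):
    """Return the required column names absent from the CSV header row."""
    header = next(csv.reader(io.StringIO(csv_text)))
    return [name for name in REQUIRED_COLUMNS if name not in header]
```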
- deepforest.utilities.check_image(image)[source]#
Check an image is three channel, channel last format.
- Parameters
image – numpy array
- Returns
None, throws error on assert
- deepforest.utilities.project_boxes(df, root_dir, transform=True)[source]#
Convert from image coordinates to geographic coordinates. Note that this assumes df is just a single plot being passed to this function.
- Parameters
df – a pandas type dataframe with columns: name, xmin, ymin, xmax, ymax. Name is the relative path to the root_dir arg.
root_dir – directory of images to lookup image_path column
transform – If true, convert from image to geographic coordinates
- deepforest.utilities.round_with_floats(x)[source]#
Check if string x is float or int, return int, rounded if needed.
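The described behavior can be sketched as follows. `to_rounded_int` is a hypothetical stand-in written for illustration. How the library rounds values that end in exactly .5 is not stated here, so that case is left out of the example.

```python
def to_rounded_int(x):
    """Parse a string holding an int or a float and return an int."""
    try:
        # Strings such as "12" parse directly.
        return int(x)
    except ValueError:
        # Strings such as "12.7" are parsed as floats, then rounded.
        return int(round(float(x)))
```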
- deepforest.utilities.shapefile_to_annotations(shapefile, rgb, buffer_size=0.5, geometry_type='bbox', savedir='.')[source]#
Convert a shapefile of annotations into annotations csv file for DeepForest training and evaluation
Geometry Handling: The geometry_type is the form of the objects in the given shapefile. It can be “bbox” or “point”. If geometry_type is set to “bbox” (default) then the bounding boxes in the shapefile will be used as is and transferred over to the annotations file. If the geometry_type is “point” then a bounding box will be created for each point that is centered on the point location and has an apothem equal to buffer_size, resulting in a bounding box with dimensions of 2 times the value of buffer_size.
- Parameters
shapefile – Path to a shapefile on disk. If a label column is present, it will be used, else all labels are assumed to be “Tree”
rgb – Path to the RGB image on disk
savedir – Directory to save csv files
buffer_size – size of point to box expansion in map units of the target object, meters for projected data, pixels for unprojected data. The buffer_size is added to each side of the x,y point to create the box.
geometry_type – Specifies the spatial representation used in the shapefile; can be “bbox” or “point”
- Returns
a pandas dataframe
- Return type
results
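The point-to-box expansion described under Geometry Handling can be sketched directly: buffer_size is added to each side of the point, so the box is 2 times buffer_size wide and tall. The helper name and tuple layout are assumptions made for this example.

```python
def point_to_box(x, y, buffer_size=0.5):
    """Return (xmin, ymin, xmax, ymax) for a box centered on the point (x, y)."""
    return (x - buffer_size, y - buffer_size, x + buffer_size, y + buffer_size)
```

The units follow the data: meters for projected data, pixels for unprojected data.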
- deepforest.utilities.use_bird_release(save_dir='/home/docs/checkouts/readthedocs.org/user_builds/deepforest/checkouts/latest/deepforest/data/', prebuilt_model='bird', check_release=True)[source]#
Check the existence of, or download the latest model release from github.
- Parameters
save_dir – Directory to save filepath, default to “data” in deepforest repo
prebuilt_model – Currently only accepts “NEON”, but could be expanded to include other prebuilt models. The local model will be called prebuilt_model.h5 on disk.
check_release (logical) – whether to check github for a recent model release. In cases where you are hitting the github API rate limit, set to False and any local model will be used. If no model has been downloaded an error will raise.
- Returns
release_tag, output_path (str): path to downloaded model
- deepforest.utilities.use_release(save_dir='/home/docs/checkouts/readthedocs.org/user_builds/deepforest/checkouts/latest/deepforest/data/', prebuilt_model='NEON', check_release=True)[source]#
Check the existence of, or download the latest model release from github.
- Parameters
save_dir – Directory to save filepath, default to “data” in deepforest repo
prebuilt_model – Currently only accepts “NEON”, but could be expanded to include other prebuilt models. The local model will be called prebuilt_model.h5 on disk.
check_release (logical) – whether to check github for a recent model release. In cases where you are hitting the github API rate limit, set to False and any local model will be used. If no model has been downloaded an error will raise.
- Returns
release_tag, output_path (str): path to downloaded model
- deepforest.utilities.xml_to_annotations(xml_path)[source]#
Load annotations from xml format (e.g. RectLabel editor) and convert them into retinanet annotations format.
- Parameters
xml_path (str) – Path to the annotations xml, formatted by RectLabel
- Returns
annotations in the format -> path-to-image.png,x1,y1,x2,y2,class_name
- Return type
Annotations (pandas dataframe)
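The conversion can be sketched with the standard library, assuming a Pascal VOC style layout (`filename`, `object/name`, `object/bndbox/xmin` and so on). That tag layout, the sample document and the function `voc_xml_to_rows` are all assumptions made for this example, and the sketch returns plain tuples instead of a pandas dataframe.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in an assumed Pascal VOC style layout.
SAMPLE_XML = """
<annotation>
  <filename>plot_1.png</filename>
  <object>
    <name>Tree</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>50</xmax><ymax>80</ymax></bndbox>
  </object>
</annotation>
"""


def voc_xml_to_rows(xml_text):
    """Return rows of (image_path, x1, y1, x2, y2, class_name)."""
    root = ET.fromstring(xml_text)
    image_path = root.findtext("filename")
    rows = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        # Coordinates may be written as floats, so parse them before casting.
        coords = [int(float(box.findtext(tag)))
                  for tag in ("xmin", "ymin", "xmax", "ymax")]
        rows.append((image_path, *coords, obj.findtext("name")))
    return rows
```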
deepforest.visualize module#
- deepforest.visualize.format_boxes(prediction, scores=True)[source]#
Format a retinanet prediction into a pandas dataframe for a single image.
- Parameters
prediction – a dictionary with keys ‘boxes’ and ‘labels’ coming from a retinanet
scores – Whether boxes come with scores, during prediction, or without scores, as in during training.
- Returns
a pandas dataframe
- Return type
df
- deepforest.visualize.plot_points(image, df, color=None, thickness=1)[source]#
Plot a set of points on an image. By default this function does not show, but only plots an axis. Label column must be numeric! Image must be BGR color order!
- Parameters
image – a numpy array in BGR color order! Channel order is channels first
df – a pandas dataframe with x,y and label column
color – color of the bounding box as a tuple of BGR color, e.g. orange annotations is (0, 165, 255)
thickness – thickness of the rectangle border line in px
- Returns
a numpy array with drawn annotations
- Return type
image
- deepforest.visualize.plot_prediction_and_targets(image, predictions, targets, image_name, savedir)[source]#
Plot an image, its predictions, and its ground truth targets for debugging.
- Parameters
image – torch tensor, RGB color order
targets – torch tensor
- Returns
path on disk with saved figure
- Return type
figure_path
- deepforest.visualize.plot_prediction_dataframe(df, root_dir, savedir, color=None, thickness=1, ground_truth=None)[source]#
For each row in dataframe, call plot predictions and save plot files to disk. For multi-class labels, boxes will be colored by labels. Ground truth boxes will all be same color, regardless of class.
- Parameters
df – a pandas dataframe with image_path, xmin, xmax, ymin, ymax and label columns. The image_path column should be the relative path from root_dir, not the full path.
root_dir – relative dir to look for image names from df.image_path
ground_truth – an optional pandas dataframe in same format as df holding ground_truth boxes
savedir – save the plot to an optional directory path.
- Returns
list of filenames written
- Return type
written_figures
- deepforest.visualize.plot_predictions(image, df, color=None, thickness=1)[source]#
Plot a set of boxes on an image. By default this function does not show, but only plots an axis. Label column must be numeric! Image must be BGR color order!
- Parameters
image – a numpy array in BGR color order! Channel order is channels first
df – a pandas dataframe with xmin, xmax, ymin, ymax and label column
color – color of the bounding box as a tuple of BGR color, e.g. orange annotations is (0, 165, 255)
thickness – thickness of the rectangle border line in px
- Returns
a numpy array with drawn annotations
- Return type
image
- deepforest.visualize.view_dataset(ds, savedir=None, color=None, thickness=1)[source]#
Plot annotations on images for debugging purposes.
- Parameters
ds – a deepforest pytorch dataset, see deepforest.dataset or deepforest.load_dataset() to start from a csv file
savedir – optional path to save figures. If none (default) images will be interactively plotted
color – color of the bounding box as a tuple of BGR color, e.g. orange annotations is (0, 165, 255)
thickness – thickness of the rectangle border line in px
Module contents#
Top-level package for DeepForest.