loss

An implementation of the loss function for training the YOLOX object detection model based on OpenMMLab’s implementation in the mmdetection library.

SamplingResult

 SamplingResult (positive_indices:numpy.ndarray,
                 negative_indices:numpy.ndarray, bboxes:numpy.ndarray,
                 ground_truth_bboxes:torch.Tensor,
                 assignment_result:cjm_yolox_pytorch.simota.AssignResult,
                 ground_truth_flags:numpy.ndarray)

Bounding box sampling result.

Based on OpenMMLab’s implementation in the mmdetection library.

YOLOXLoss

 YOLOXLoss (num_classes:int, bbox_loss_weight:float=5.0,
            class_loss_weight:float=1.0, objectness_loss_weight:float=1.0,
            l1_loss_weight:float=1.0, use_l1:bool=False,
            strides:List[int]=[8, 16, 32])

The callable YOLOXLoss class implements the loss function for training a YOLOX model.

A YOLOXLoss instance takes the class scores, predicted bounding boxes, objectness scores, ground truth bounding boxes, and ground truth labels. It then goes through the following steps:

  1. Generate box coordinates for the output grids based on the input dimensions and stride values.
  2. Flatten and concatenate class predictions, bounding box predictions, and objectness scores.
  3. Decode box predictions.
  4. Compute targets for each image in the batch.
  5. Concatenate all positive masks, class targets, objectness targets, and bounding box targets.
  6. Compute the bounding box loss, objectness loss, and classification loss, scale them by their respective weights, and normalize them by the total number of samples.
  7. If using L1 loss, concatenate the L1 targets, compute the L1 loss, scale it by its weight, and normalize it by the total number of samples.
  8. Return a dictionary containing the computed losses.

Based on OpenMMLab’s implementation in the mmdetection library.

| | Type | Default | Details |
|---|---|---|---|
| num_classes | int | | The number of target classes. |
| bbox_loss_weight | float | 5.0 | The weight for the loss function to calculate the bounding box regression loss. |
| class_loss_weight | float | 1.0 | The weight for the loss function to calculate the classification loss. |
| objectness_loss_weight | float | 1.0 | The weight for the loss function to calculate the objectness loss. |
| l1_loss_weight | float | 1.0 | The weight for the loss function to calculate the L1 loss. |
| use_l1 | bool | False | Whether to use L1 loss in the calculation. |
| strides | List | [8, 16, 32] | The list of strides. |
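
Steps 6 to 8 above amount to scaling each summed loss by its weight and dividing by the sample count. The following is a minimal sketch in plain Python, not the library's code: the function name `combine_losses`, the dictionary keys, and the clamp of the sample count to at least 1 (a convention borrowed from mmdetection) are assumptions made for illustration.

```python
from typing import Dict, Optional


def combine_losses(
    bbox_loss_sum: float,
    objectness_loss_sum: float,
    class_loss_sum: float,
    num_total_samples: float,
    bbox_loss_weight: float = 5.0,
    class_loss_weight: float = 1.0,
    objectness_loss_weight: float = 1.0,
    l1_loss_weight: float = 1.0,
    l1_loss_sum: Optional[float] = None,
) -> Dict[str, float]:
    """Scale summed losses by their weights and normalize by the sample count."""
    # Assumption: clamp so a batch with no positive samples does not divide by zero.
    normalizer = max(num_total_samples, 1.0)
    losses = {
        "loss_bbox": bbox_loss_weight * bbox_loss_sum / normalizer,
        "loss_obj": objectness_loss_weight * objectness_loss_sum / normalizer,
        "loss_cls": class_loss_weight * class_loss_sum / normalizer,
    }
    # The L1 term is only added when an L1 loss was computed (use_l1=True).
    if l1_loss_sum is not None:
        losses["loss_l1"] = l1_loss_weight * l1_loss_sum / normalizer
    return losses


example = combine_losses(
    bbox_loss_sum=2.0,
    objectness_loss_sum=4.0,
    class_loss_sum=1.0,
    num_total_samples=4,
)
print(example)
```

With the default weights, the bounding box term counts five times as much as the classification and objectness terms.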

YOLOXLoss.__init__

 YOLOXLoss.__init__ (num_classes:int, bbox_loss_weight:float=5.0,
                     class_loss_weight:float=1.0,
                     objectness_loss_weight:float=1.0,
                     l1_loss_weight:float=1.0, use_l1:bool=False,
                     strides:List[int]=[8, 16, 32])

The __init__ method stores the parameters used to compute the loss (class count, loss weights, and strides) and initializes the loss functions: Generalized IoU (GIoU) for the bounding box loss, binary cross entropy with logits for the classification and objectness losses, and L1 loss if applicable.

| | Type | Default | Details |
|---|---|---|---|
| num_classes | int | | The number of target classes. |
| bbox_loss_weight | float | 5.0 | The weight for the loss function to calculate the bounding box regression loss. |
| class_loss_weight | float | 1.0 | The weight for the loss function to calculate the classification loss. |
| objectness_loss_weight | float | 1.0 | The weight for the loss function to calculate the objectness loss. |
| l1_loss_weight | float | 1.0 | The weight for the loss function to calculate the L1 loss. |
| use_l1 | bool | False | Whether to use L1 loss in the calculation. |
| strides | List | [8, 16, 32] | The list of strides. |
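
To make the two main loss functions named above concrete, here is a minimal single-sample sketch in plain Python. It is an illustration rather than the library's implementation: the function names `giou_loss` and `bce_with_logits`, the `(x1, y1, x2, y2)` box format, and the small `eps` term are assumptions.

```python
import math
from typing import Sequence


def giou_loss(pred: Sequence[float], target: Sequence[float], eps: float = 1e-7) -> float:
    """Return 1 - GIoU for two boxes in (x1, y1, x2, y2) format."""
    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(pred[2], target[2]) - max(pred[0], target[0]))
    inter_h = max(0.0, min(pred[3], target[3]) - max(pred[1], target[1]))
    intersection = inter_w * inter_h

    area_pred = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_target = (target[2] - target[0]) * (target[3] - target[1])
    union = area_pred + area_target - intersection + eps
    iou = intersection / union

    # Area of the smallest box enclosing both inputs.
    enclose_w = max(pred[2], target[2]) - min(pred[0], target[0])
    enclose_h = max(pred[3], target[3]) - min(pred[1], target[1])
    enclose = enclose_w * enclose_h + eps

    giou = iou - (enclose - union) / enclose
    return 1.0 - giou


def bce_with_logits(logit: float, target: float) -> float:
    """Numerically stable binary cross entropy on a raw logit."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


print(giou_loss((0, 0, 2, 2), (0, 0, 2, 2)))  # close to 0 for identical boxes
print(giou_loss((0, 0, 1, 1), (2, 0, 3, 1)))  # above 1 for disjoint boxes
print(bce_with_logits(0.0, 1.0))              # log(2)
```

Unlike plain IoU, the GIoU loss keeps growing as disjoint boxes move farther apart, so it still provides a training signal when a prediction does not overlap its target.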

YOLOXLoss.bbox_decode

 YOLOXLoss.bbox_decode (output_grid_boxes:torch.Tensor,
                        predicted_boxes:torch.Tensor)

Decodes the predicted bounding boxes based on the output grid boxes. In the sampling step that follows (see YOLOXLoss.sample), positive indices are those where the assigned ground truth box index is greater than zero (indicating a match with a ground truth object), and negative indices are those where it is zero (indicating no match with a ground truth object).

| | Type | Details |
|---|---|---|
| output_grid_boxes | Tensor | The output grid boxes. |
| predicted_boxes | Tensor | The predicted bounding boxes. |
| Returns | Tensor | The decoded bounding boxes. |
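
The documentation does not spell out the decoding formula, so the sketch below assumes the mmdetection-style YOLOX convention: each output grid box is `(x, y, stride_w, stride_h)`, each prediction is `(dx, dy, log_w, log_h)`, and the result is a corner-format box `(x1, y1, x2, y2)`. The function name `decode_boxes` is hypothetical, and plain Python tuples stand in for tensors.

```python
import math
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]


def decode_boxes(
    output_grid_boxes: Sequence[Box],
    predicted_boxes: Sequence[Box],
) -> List[Box]:
    """Decode (dx, dy, log_w, log_h) offsets into (x1, y1, x2, y2) boxes."""
    decoded = []
    for (grid_x, grid_y, stride_w, stride_h), (dx, dy, log_w, log_h) in zip(
        output_grid_boxes, predicted_boxes
    ):
        # Offsets are expressed in units of the stride (assumed convention).
        center_x = dx * stride_w + grid_x
        center_y = dy * stride_h + grid_y
        # Width and height are predicted in log space, relative to the stride.
        width = math.exp(log_w) * stride_w
        height = math.exp(log_h) * stride_h
        decoded.append(
            (
                center_x - width / 2,
                center_y - height / 2,
                center_x + width / 2,
                center_y + height / 2,
            )
        )
    return decoded


boxes = decode_boxes([(16.0, 16.0, 8.0, 8.0)], [(0.5, 0.25, 0.0, 0.0)])
print(boxes)
```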

YOLOXLoss.sample

 YOLOXLoss.sample
                   (assignment_result:cjm_yolox_pytorch.simota.AssignResult,
                   bboxes:torch.Tensor, ground_truth_boxes:torch.Tensor)

Samples positive and negative indices based on the assignment result.

| | Type | Details |
|---|---|---|
| assignment_result | AssignResult | The assignment result obtained from assigner. |
| bboxes | Tensor | The predicted bounding boxes. |
| ground_truth_boxes | Tensor | The ground truth boxes. |
| Returns | SamplingResult | The sampling result containing positive and negative indices. |
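
The split into positive and negative indices can be sketched in a few lines. This is an illustration in plain Python, not the library's code: `sample_indices` is a hypothetical name, and the 1-based indexing of assigned ground truth boxes (with 0 meaning "unassigned") is assumed from the mmdetection convention.

```python
from typing import List, Tuple


def sample_indices(assigned_gt_indices: List[int]) -> Tuple[List[int], List[int], List[int]]:
    """Split grid boxes into positives and negatives from their assignments."""
    # Positive: the grid box was matched to some ground truth box.
    positive = [i for i, gt in enumerate(assigned_gt_indices) if gt > 0]
    # Negative: the grid box was not matched to any ground truth box.
    negative = [i for i, gt in enumerate(assigned_gt_indices) if gt == 0]
    # Assumption: assignments are 1-based, so subtract 1 to index the
    # list of ground truth boxes.
    matched_gt = [assigned_gt_indices[i] - 1 for i in positive]
    return positive, negative, matched_gt


positive, negative, matched_gt = sample_indices([0, 2, 0, 1])
print(positive, negative, matched_gt)
```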

YOLOXLoss.get_l1_target

 YOLOXLoss.get_l1_target (l1_target:torch.Tensor,
                          ground_truth_boxes:torch.Tensor,
                          output_grid_boxes:torch.Tensor,
                          epsilon:float=1e-08)

Calculates the L1 regression target from the ground truth boxes and the output grid boxes. The L1 loss then measures the absolute difference between the model’s box predictions and this target, indicating how well the predictions match the ground truth values.

| | Type | Default | Details |
|---|---|---|---|
| l1_target | Tensor | | The L1 target tensor. |
| ground_truth_boxes | Tensor | | The ground truth boxes. |
| output_grid_boxes | Tensor | | The output grid boxes. |
| epsilon | float | 1e-08 | A small value to prevent division by zero. |
| Returns | Tensor | | The updated L1 target. |
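
One plausible form of this computation, assumed here from the mmdetection YOLOX head, is the inverse of box decoding: the ground truth box is converted to center and size, then expressed relative to the grid position and stride. The sketch below uses plain Python and returns a new list instead of updating an `l1_target` tensor in place. The name `compute_l1_targets` is hypothetical, and `epsilon` is placed inside the logarithm as in the assumed convention.

```python
import math
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]


def compute_l1_targets(
    ground_truth_boxes: Sequence[Box],
    output_grid_boxes: Sequence[Box],
    epsilon: float = 1e-8,
) -> List[Box]:
    """Encode (x1, y1, x2, y2) boxes as (dx, dy, log_w, log_h) targets."""
    targets = []
    for (x1, y1, x2, y2), (grid_x, grid_y, stride_w, stride_h) in zip(
        ground_truth_boxes, output_grid_boxes
    ):
        center_x, center_y = (x1 + x2) / 2, (y1 + y2) / 2
        width, height = x2 - x1, y2 - y1
        targets.append(
            (
                (center_x - grid_x) / stride_w,
                (center_y - grid_y) / stride_h,
                # epsilon keeps the logarithm finite for degenerate boxes.
                math.log(width / stride_w + epsilon),
                math.log(height / stride_h + epsilon),
            )
        )
    return targets


targets = compute_l1_targets([(16.0, 14.0, 24.0, 22.0)], [(16.0, 16.0, 8.0, 8.0)])
print(targets)
```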

YOLOXLoss.get_target_single

 YOLOXLoss.get_target_single (class_preds:torch.Tensor,
                              objectness_score:torch.Tensor,
                              output_grid_boxes:torch.Tensor,
                              decoded_bboxes:torch.Tensor,
                              ground_truth_bboxes:torch.Tensor,
                              ground_truth_labels:torch.Tensor)

Calculates the targets for a single image. It assigns ground truth objects to output grid boxes and samples output grid boxes based on the assignment results. It then generates class targets, objectness targets, bounding box targets, and, optionally, L1 targets.

| | Type | Details |
|---|---|---|
| class_preds | Tensor | The predicted class probabilities. |
| objectness_score | Tensor | The predicted objectness scores. |
| output_grid_boxes | Tensor | The output grid boxes. |
| decoded_bboxes | Tensor | The decoded bounding boxes. |
| ground_truth_bboxes | Tensor | The ground truth boxes. |
| ground_truth_labels | Tensor | The ground truth labels. |
| Returns | Tuple | The targets for classification, objectness, bounding boxes, and L1 (if applicable), along with the foreground mask and the number of positive samples. |
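
Once assignment and sampling are done, building the targets is mostly bookkeeping. The sketch below shows one way to do it in plain Python. It assumes the mmdetection convention of IoU-weighted one-hot class targets, and the function name `build_targets` and its arguments are hypothetical stand-ins for the sampled results, not the library's API.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]


def build_targets(
    num_grid_boxes: int,
    num_classes: int,
    positive_indices: Sequence[int],
    positive_labels: Sequence[int],
    positive_ious: Sequence[float],
    positive_gt_boxes: Sequence[Box],
):
    """Build per-image targets from already sampled positive grid boxes."""
    foreground_mask = [False] * num_grid_boxes
    objectness_target = [0.0] * num_grid_boxes
    for index in positive_indices:
        foreground_mask[index] = True
        objectness_target[index] = 1.0

    # Assumption: one-hot class targets scaled by the IoU between the
    # predicted box and its matched ground truth box.
    class_target: List[List[float]] = []
    for label, iou in zip(positive_labels, positive_ious):
        row = [0.0] * num_classes
        row[label] = iou
        class_target.append(row)

    # Positive grid boxes regress toward their matched ground truth boxes.
    bbox_target = list(positive_gt_boxes)
    num_positive = len(positive_indices)
    return class_target, objectness_target, bbox_target, foreground_mask, num_positive


result = build_targets(
    num_grid_boxes=4,
    num_classes=3,
    positive_indices=[1, 3],
    positive_labels=[2, 0],
    positive_ious=[0.8, 0.5],
    positive_gt_boxes=[(0.0, 0.0, 4.0, 4.0), (2.0, 2.0, 6.0, 6.0)],
)
print(result)
```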

YOLOXLoss.flatten_and_concat

 YOLOXLoss.flatten_and_concat (tensors:List[torch.Tensor], batch_size:int,
                               reshape_dims:Optional[bool]=None)

Flatten and concatenate a list of tensors.

| | Type | Default | Details |
|---|---|---|---|
| tensors | List | | A list of tensors to flatten and concatenate. |
| batch_size | int | | The batch size used to reshape the concatenated tensor. |
| reshape_dims | Optional | None | |
| Returns | Tensor | | The concatenated tensor. |
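
As a rough picture of what flattening and concatenating means here, the sketch below uses nested Python lists in place of tensors. It assumes each per-scale input has the layout `(batch, channels, height, width)` and that the result has the layout `(batch, total_positions, channels)`; that layout, the function names, and the omission of `reshape_dims` are all assumptions. In PyTorch the same idea is usually written with `permute`, `reshape`, and `torch.cat` along dimension 1.

```python
from typing import List

Level = List[List[List[List[float]]]]  # indexed as [batch][channel][row][col]


def flatten_level(level: Level) -> List[List[List[float]]]:
    """Turn [batch][channel][row][col] into [batch][position][channel]."""
    flattened = []
    for image in level:
        channels = len(image)
        height = len(image[0])
        width = len(image[0][0])
        positions = [
            [image[c][row][col] for c in range(channels)]
            for row in range(height)
            for col in range(width)
        ]
        flattened.append(positions)
    return flattened


def flatten_and_concat(levels: List[Level]) -> List[List[List[float]]]:
    """Flatten every scale, then join the positions of all scales per image."""
    per_level = [flatten_level(level) for level in levels]
    batch_size = len(per_level[0])
    merged = []
    for b in range(batch_size):
        positions = []
        for level in per_level:
            positions.extend(level[b])
        merged.append(positions)
    return merged


# One image, two channels: a 1x2 map from one scale and a 1x1 map from another.
levels = [
    [[[[1, 2]], [[3, 4]]]],
    [[[[5]], [[6]]]],
]
merged = flatten_and_concat(levels)
print(merged)
```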

YOLOXLoss.__call__

 YOLOXLoss.__call__ (class_scores:List[torch.Tensor],
                     predicted_bboxes:List[torch.Tensor],
                     objectness_scores:List[torch.Tensor],
                     ground_truth_bboxes:List[torch.Tensor],
                     ground_truth_labels:List[torch.Tensor])

The __call__ method computes the loss values. It first generates box coordinates for the output grids based on the input dimensions and stride values. It then flattens and concatenates class predictions, bounding box predictions, and objectness scores. Next, it decodes the bounding box predictions, computes targets for each image in the batch, and finally computes the bounding box loss, objectness loss, and classification loss (and L1 loss, optionally). These losses are scaled by their respective weights and normalized by the total number of samples.

| | Type | Details |
|---|---|---|
| class_scores | List | A list of class scores for each scale. |
| predicted_bboxes | List | A list of predicted bounding boxes for each scale. |
| objectness_scores | List | A list of objectness scores for each scale. |
| ground_truth_bboxes | List | A list of ground truth bounding boxes for each image. |
| ground_truth_labels | List | A list of ground truth labels for each image. |
| Returns | Dict | A dictionary with the classification, bounding box, objectness, and optionally, L1 loss. |
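
The first step of `__call__`, generating box coordinates for the output grids, can be sketched as follows. The sketch assumes each grid box is stored as `(x, y, stride, stride)` with `x` and `y` at the top-left corner of the grid cell, as in mmdetection's point generator with zero offset. The function name `generate_output_grid_boxes` is hypothetical.

```python
from typing import List, Sequence, Tuple

GridBox = Tuple[int, int, int, int]


def generate_output_grid_boxes(
    height: int, width: int, strides: Sequence[int] = (8, 16, 32)
) -> List[GridBox]:
    """List one (x, y, stride, stride) entry per grid cell, scale by scale."""
    grid_boxes = []
    for stride in strides:
        # Each scale divides the input into cells of stride x stride pixels.
        for row in range(height // stride):
            for col in range(width // stride):
                grid_boxes.append((col * stride, row * stride, stride, stride))
    return grid_boxes


grid_boxes = generate_output_grid_boxes(32, 32)
print(len(grid_boxes))  # 16 cells at stride 8, 4 at stride 16, 1 at stride 32
```

The grid boxes are ordered scale by scale, which matches the order in which the per-scale predictions are concatenated.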