loss

An implementation of the loss function for training the YOLOX object detection model based on OpenMMLab’s implementation in the mmdetection library.


SamplingResult

 SamplingResult (positive_indices:numpy.ndarray,
                 negative_indices:numpy.ndarray, bboxes:numpy.ndarray,
                 ground_truth_bboxes:torch.Tensor,
                 assignment_result:cjm_yolox_pytorch.simota.AssignResult,
                 ground_truth_flags:numpy.ndarray)

Bounding box sampling result.

Based on OpenMMLab’s implementation in the mmdetection library.


YOLOXLoss

 YOLOXLoss (num_classes:int, bbox_loss_weight:float=5.0,
            class_loss_weight:float=1.0, objectness_loss_weight:float=1.0,
            l1_loss_weight:float=1.0, use_l1:bool=False,
            strides:List[int]=[8, 16, 32])

The callable YOLOXLoss class implements the loss function for training a YOLOX model.

A YOLOXLoss instance takes the class scores, predicted bounding boxes, objectness scores, ground truth bounding boxes, and ground truth labels. It then goes through the following steps:

1. Generate box coordinates for the output grids based on the input dimensions and stride values.
2. Flatten and concatenate class predictions, bounding box predictions, and objectness scores.
3. Decode box predictions.
4. Compute targets for each image in the batch.
5. Concatenate all positive masks, class targets, objectness targets, and bounding box targets.
6. Compute the bounding box loss, objectness loss, and classification loss, scale them by their respective weights, and normalize them by the total number of samples.
7. If using L1 loss, concatenate the L1 targets, compute the L1 loss, scale it by its weight, and normalize it by the total number of samples.
8. Return a dictionary containing the computed losses.
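
As a rough illustration of the first step, the sketch below builds one grid box per output cell for each stride. It assumes the mmdetection-style layout `(x, y, stride, stride)`, with coordinates taken at each cell's top-left corner. The helper name `generate_output_grids` is hypothetical and is not part of this library's documented API.

```python
import torch

def generate_output_grids(height, width, strides=(8, 16, 32)):
    """Hypothetical helper: one (x, y, stride, stride) row per grid cell."""
    grids = []
    for stride in strides:
        grid_h, grid_w = height // stride, width // stride
        ys, xs = torch.meshgrid(
            torch.arange(grid_h), torch.arange(grid_w), indexing="ij"
        )
        # Scale cell indices back to input-image pixel coordinates.
        xy = torch.stack([xs.flatten(), ys.flatten()], dim=1) * stride
        stride_col = torch.full((xy.shape[0], 2), stride)
        grids.append(torch.cat([xy, stride_col], dim=1).float())
    # Stack the grids of all scales into a single [total_cells, 4] tensor.
    return torch.cat(grids, dim=0)

# A 64x64 input yields 8x8 + 4x4 + 2x2 = 84 grid boxes.
output_grid_boxes = generate_output_grids(64, 64)
```

Under these assumptions, the number of grid boxes depends only on the input size and the strides, so the grids can be computed once per input resolution.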

Based on OpenMMLab’s implementation in the mmdetection library.

	Type	Default	Details
num_classes	int		The number of target classes.
bbox_loss_weight	float	5.0	The weight applied to the bounding box regression loss.
class_loss_weight	float	1.0	The weight applied to the classification loss.
objectness_loss_weight	float	1.0	The weight applied to the objectness loss.
l1_loss_weight	float	1.0	The weight applied to the L1 loss.
use_l1	bool	False	Whether to use L1 loss in the calculation.
strides	List	[8, 16, 32]	The strides of the output grids, one per prediction scale.


YOLOXLoss.__init__

 YOLOXLoss.__init__ (num_classes:int, bbox_loss_weight:float=5.0,
                     class_loss_weight:float=1.0,
                     objectness_loss_weight:float=1.0,
                     l1_loss_weight:float=1.0, use_l1:bool=False,
                     strides:List[int]=[8, 16, 32])

The __init__ method stores the parameters used to compute the loss and initializes the loss functions, including Generalized IoU for the bounding box loss, binary cross entropy with logits for the classification and objectness losses, and L1 loss if applicable.

	Type	Default	Details
num_classes	int		The number of target classes.
bbox_loss_weight	float	5.0	The weight applied to the bounding box regression loss.
class_loss_weight	float	1.0	The weight applied to the classification loss.
objectness_loss_weight	float	1.0	The weight applied to the objectness loss.
l1_loss_weight	float	1.0	The weight applied to the L1 loss.
use_l1	bool	False	Whether to use L1 loss in the calculation.
strides	List	[8, 16, 32]	The strides of the output grids, one per prediction scale.
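
To make the loss components more concrete, the sketch below sets up sum-reduced loss callables of the kinds named above. The hand-written `giou_loss` is an illustrative stand-in and not necessarily the function the library calls; the variable names are also assumptions. Binary cross entropy with logits and L1 loss come from `torch.nn.functional`.

```python
from functools import partial

import torch
import torch.nn.functional as F

def giou_loss(pred, target, eps=1e-7):
    """Illustrative sum-reduced GIoU loss for paired (x1, y1, x2, y2) boxes."""
    # Intersection area of each box pair.
    inter_wh = (torch.min(pred[:, 2:], target[:, 2:])
                - torch.max(pred[:, :2], target[:, :2])).clamp(min=0)
    inter = inter_wh[:, 0] * inter_wh[:, 1]
    area_pred = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_target = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    union = area_pred + area_target - inter
    iou = inter / (union + eps)
    # Area of the smallest box enclosing both boxes.
    enclose_wh = (torch.max(pred[:, 2:], target[:, 2:])
                  - torch.min(pred[:, :2], target[:, :2])).clamp(min=0)
    enclose = enclose_wh[:, 0] * enclose_wh[:, 1]
    giou = iou - (enclose - union) / (enclose + eps)
    return (1 - giou).sum()

# Assumed setup: every loss sums over its inputs and is normalized later.
bbox_loss = giou_loss
class_loss = partial(F.binary_cross_entropy_with_logits, reduction="sum")
objectness_loss = partial(F.binary_cross_entropy_with_logits, reduction="sum")
l1_loss = partial(F.l1_loss, reduction="sum")

boxes = torch.tensor([[0.0, 0.0, 10.0, 10.0]])
same = bbox_loss(boxes, boxes)  # close to 0 for identical boxes
far = bbox_loss(boxes, torch.tensor([[20.0, 20.0, 30.0, 30.0]]))  # above 1
```

Unlike plain IoU, the GIoU loss keeps growing as non-overlapping boxes move apart, which gives the regression a useful signal even when a prediction misses its target entirely.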


YOLOXLoss.bbox_decode

 YOLOXLoss.bbox_decode (output_grid_boxes:torch.Tensor,
                        predicted_boxes:torch.Tensor)

Decodes the predicted bounding boxes based on the output grid boxes.

	Type	Details
output_grid_boxes	Tensor	The output grid boxes.
predicted_boxes	Tensor	The predicted bounding boxes.
Returns	Tensor	The decoded bounding boxes.
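
The following is a minimal sketch of how such a decode step can work. It assumes the mmdetection convention, where each grid box is `(x, y, stride_w, stride_h)` and each prediction holds a center offset in stride units plus a log-scale width and height. The function name `decode_boxes` is hypothetical, and the library's own method may differ in detail.

```python
import torch

def decode_boxes(output_grid_boxes, predicted_boxes):
    """Sketch of a YOLOX-style decode, assuming (x, y, stride_w, stride_h) grids."""
    # Centers: predicted offsets are in stride units relative to the grid cell.
    centers = (predicted_boxes[..., :2] * output_grid_boxes[:, 2:]
               + output_grid_boxes[:, :2])
    # Sizes: predictions are log-scale multiples of the stride.
    sizes = predicted_boxes[..., 2:].exp() * output_grid_boxes[:, 2:]
    top_left = centers - sizes / 2
    bottom_right = centers + sizes / 2
    # Return corner-format boxes: (x1, y1, x2, y2).
    return torch.cat([top_left, bottom_right], dim=-1)

grid = torch.tensor([[16.0, 32.0, 8.0, 8.0]])
pred = torch.tensor([[[0.5, 0.5, 0.0, 0.0]]])  # shape [batch, boxes, 4]
decoded = decode_boxes(grid, pred)  # center (20, 36), size 8x8
```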


YOLOXLoss.sample

 YOLOXLoss.sample (assignment_result:cjm_yolox_pytorch.simota.AssignResult,
                   bboxes:torch.Tensor, ground_truth_boxes:torch.Tensor)

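Samples positive and negative indices based on the assignment result. Positive indices are those where the assigned ground truth box index is greater than zero (indicating a match with a ground truth object); negative indices are those where it is zero (no matching ground truth object).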

	Type	Details
assignment_result	AssignResult	The assignment result obtained from assigner.
bboxes	Tensor	The predicted bounding boxes.
ground_truth_boxes	Tensor	The ground truth boxes.
Returns	SamplingResult	The sampling result containing positive and negative indices.
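
The split into positive and negative indices can be illustrated with a few lines of PyTorch. This sketch assumes the mmdetection convention in which each box carries an assigned ground truth index, with 0 meaning unassigned and `k > 0` meaning a match with ground truth box `k - 1`. The variable names are hypothetical and do not mirror the fields of `AssignResult` or `SamplingResult`.

```python
import torch

# Hypothetical per-box assignment: 0 = unassigned, k > 0 = matched to ground truth k - 1.
assigned_gt_indices = torch.tensor([0, 2, 0, 1, 2])

# Boxes matched to a ground truth object are positives; the rest are negatives.
positive_indices = torch.nonzero(assigned_gt_indices > 0).squeeze(-1)
negative_indices = torch.nonzero(assigned_gt_indices == 0).squeeze(-1)

# Convert the 1-based assignment to 0-based ground truth indices.
positive_gt_indices = assigned_gt_indices[positive_indices] - 1

ground_truth_boxes = torch.tensor([[0.0, 0.0, 10.0, 10.0],
                                   [5.0, 5.0, 20.0, 20.0]])
# Look up the ground truth box matched to each positive prediction.
positive_gt_boxes = ground_truth_boxes[positive_gt_indices]
```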


YOLOXLoss.get_l1_target

 YOLOXLoss.get_l1_target (l1_target:torch.Tensor,
                          ground_truth_boxes:torch.Tensor,
                          output_grid_boxes:torch.Tensor,
                          epsilon:float=1e-08)

Calculates the L1 target for the bounding box regression. The L1 loss then measures the absolute difference between the model’s predicted box values and this target.

	Type	Default	Details
l1_target	Tensor		The L1 target tensor.
ground_truth_boxes	Tensor		The ground truth boxes.
output_grid_boxes	Tensor		The output grid boxes.
epsilon	float	1e-08	A small value to prevent division by zero.
Returns	Tensor		The updated L1 target.
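
One plausible form of this target, following the mmdetection implementation, encodes each ground truth box relative to its grid box: the center offset in stride units and the logarithm of the size in strides. The sketch below assumes `(x1, y1, x2, y2)` ground truth boxes and `(x, y, stride_w, stride_h)` grid boxes. The function name and exact formula are assumptions rather than the library's confirmed behavior, and here epsilon is added inside the logarithm.

```python
import torch

def l1_target_sketch(ground_truth_boxes, output_grid_boxes, epsilon=1e-8):
    """Encode (x1, y1, x2, y2) boxes relative to (x, y, stride_w, stride_h) grids."""
    centers = (ground_truth_boxes[:, :2] + ground_truth_boxes[:, 2:]) / 2
    sizes = ground_truth_boxes[:, 2:] - ground_truth_boxes[:, :2]
    l1_target = torch.zeros_like(ground_truth_boxes)
    # Center offset from the grid cell, measured in strides.
    l1_target[:, :2] = (centers - output_grid_boxes[:, :2]) / output_grid_boxes[:, 2:]
    # Log of the box size in strides; epsilon guards the log against a zero-sized box.
    l1_target[:, 2:] = torch.log(sizes / output_grid_boxes[:, 2:] + epsilon)
    return l1_target

gt = torch.tensor([[16.0, 32.0, 24.0, 40.0]])
grid = torch.tensor([[16.0, 32.0, 8.0, 8.0]])
target = l1_target_sketch(gt, grid)  # approximately [0.5, 0.5, 0.0, 0.0]
```

Under these assumptions the encoding puts the target in the same space as the raw box predictions, so an L1 loss can compare the two directly.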


YOLOXLoss.get_target_single

 YOLOXLoss.get_target_single (class_preds:torch.Tensor,
                              objectness_score:torch.Tensor,
                              output_grid_boxes:torch.Tensor,
                              decoded_bboxes:torch.Tensor,
                              ground_truth_bboxes:torch.Tensor,
                              ground_truth_labels:torch.Tensor)

Calculates the targets for a single image. It assigns ground truth objects to output grid boxes and samples output grid boxes based on the assignment results. It then generates class targets, objectness targets, bounding box targets, and, optionally, L1 targets.

	Type	Details
class_preds	Tensor	The predicted class probabilities.
objectness_score	Tensor	The predicted objectness scores.
output_grid_boxes	Tensor	The output grid boxes.
decoded_bboxes	Tensor	The decoded bounding boxes.
ground_truth_bboxes	Tensor	The ground truth boxes.
ground_truth_labels	Tensor	The ground truth labels.
Returns	Tuple	The targets for classification, objectness, bounding boxes, and L1 (if applicable), along with the foreground mask and the number of positive samples.
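
As a hedged illustration of the target-building step, the sketch below constructs class targets, objectness targets, and a foreground mask from a set of sampled positives. Weighting the one-hot class targets by IoU follows the mmdetection implementation and is an assumption about this library; all variable names and values are hypothetical.

```python
import torch
import torch.nn.functional as F

num_classes, num_boxes = 3, 5
positive_indices = torch.tensor([1, 3])   # hypothetical sampled positives
positive_labels = torch.tensor([2, 0])    # class of each matched ground truth
positive_ious = torch.tensor([0.8, 0.5])  # IoU between prediction and its match

# Class targets: one-hot labels scaled by IoU, one row per positive box.
class_target = F.one_hot(positive_labels, num_classes) * positive_ious.unsqueeze(-1)

# Objectness targets: 1 for positives, 0 elsewhere.
objectness_target = torch.zeros(num_boxes, 1)
objectness_target[positive_indices] = 1.0

# Foreground mask marking which grid boxes are positive.
foreground_mask = torch.zeros(num_boxes, dtype=torch.bool)
foreground_mask[positive_indices] = True
num_positive = int(foreground_mask.sum())
```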


YOLOXLoss.flatten_and_concat

 YOLOXLoss.flatten_and_concat (tensors:List[torch.Tensor], batch_size:int,
                               reshape_dims:Optional[bool]=None)

Flatten and concatenate a list of tensors.

	Type	Default	Details
tensors	List		A list of tensors to flatten and concatenate.
batch_size	int		The batch size used to reshape the concatenated tensor.
reshape_dims	Optional	None
Returns	Tensor		The concatenated tensor.
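
A minimal sketch of the idea, assuming each per-scale tensor has shape `[batch, channels, height, width]`: move the channel dimension last, flatten the spatial dimensions, and concatenate the scales along the box dimension. The helper below is hypothetical; in particular, its `channels` argument is an illustration and is not the documented `reshape_dims` parameter.

```python
import torch

def flatten_and_concat_sketch(tensors, batch_size, channels):
    """Hypothetical version: [B, C, H, W] per scale -> [B, total_cells, C]."""
    flattened = [
        # Channels last, then merge the H and W dimensions into one.
        t.permute(0, 2, 3, 1).reshape(batch_size, -1, channels)
        for t in tensors
    ]
    return torch.cat(flattened, dim=1)

batch_size, num_classes = 2, 3
# Three scales with 8x8, 4x4, and 2x2 grids.
class_scores = [torch.randn(batch_size, num_classes, s, s) for s in (8, 4, 2)]
flat_scores = flatten_and_concat_sketch(class_scores, batch_size, num_classes)
# flat_scores has shape [2, 84, 3]: 64 + 16 + 4 cells per image.
```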


YOLOXLoss.__call__

 YOLOXLoss.__call__ (class_scores:List[torch.Tensor],
                     predicted_bboxes:List[torch.Tensor],
                     objectness_scores:List[torch.Tensor],
                     ground_truth_bboxes:List[torch.Tensor],
                     ground_truth_labels:List[torch.Tensor])

The __call__ method computes the loss values. It first generates box coordinates for the output grids based on the input dimensions and stride values. It then flattens and concatenates class predictions, bounding box predictions, and objectness scores. Next, it decodes the bounding box predictions, computes targets for each image in the batch, and finally computes the bounding box loss, objectness loss, and classification loss (and L1 loss, optionally). These losses are scaled by their respective weights and normalized by the total number of samples.

	Type	Details
class_scores	List	A list of class scores for each scale.
predicted_bboxes	List	A list of predicted bounding boxes for each scale.
objectness_scores	List	A list of objectness scores for each scale.
ground_truth_bboxes	List	A list of ground truth bounding boxes for each image.
ground_truth_labels	List	A list of ground truth labels for each image.
Returns	Dict	A dictionary with the classification, bounding box, objectness, and optionally, L1 loss.
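
The final scaling and normalization can be sketched as follows. The raw loss values, the dictionary keys, and the use of the positive-sample count as the normalizer are all assumptions made for illustration; the weights match the documented defaults.

```python
import torch

# Hypothetical sum-reduced raw losses for one batch.
raw_losses = {
    "loss_bbox": torch.tensor(12.0),
    "loss_cls": torch.tensor(30.0),
    "loss_obj": torch.tensor(45.0),
}
# Documented default weights for each loss term.
weights = {"loss_bbox": 5.0, "loss_cls": 1.0, "loss_obj": 1.0}

num_positive = 6
# Guard against division by zero when a batch has no positive samples.
num_total_samples = max(num_positive, 1)

# Scale each loss by its weight and normalize by the sample count.
loss_dict = {
    name: weights[name] * value / num_total_samples
    for name, value in raw_losses.items()
}
total_loss = sum(loss_dict.values())
```

A training loop would typically sum the entries of the returned dictionary, as in `total_loss` above, before calling `backward()`.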