loss
SamplingResult
SamplingResult (positive_indices:numpy.ndarray, negative_indices:numpy.ndarray, bboxes:numpy.ndarray, ground_truth_bboxes:torch.Tensor, assignment_result:cjm_yolox_pytorch.simota.AssignResult, ground_truth_flags:numpy.ndarray)
Bounding box sampling result.
Based on OpenMMLab’s implementation in the mmdetection library.
YOLOXLoss
YOLOXLoss (num_classes:int, bbox_loss_weight:float=5.0, class_loss_weight:float=1.0, objectness_loss_weight:float=1.0, l1_loss_weight:float=1.0, use_l1:bool=False, strides:List[int]=[8, 16, 32])
The callable YOLOXLoss class implements the loss function for training a YOLOX model.
A YOLOXLoss instance takes the class scores, predicted bounding boxes, objectness scores, ground truth bounding boxes, and ground truth labels. It then goes through the following steps:
- Generate box coordinates for the output grids based on the input dimensions and stride values.
- Flatten and concatenate class predictions, bounding box predictions, and objectness scores.
- Decode box predictions.
- Compute targets for each image in the batch.
- Concatenate all positive masks, class targets, objectness targets, and bounding box targets.
- Compute the bounding box loss, objectness loss, and classification loss, scale them by their respective weights, and normalize them by the total number of samples.
- If using L1 loss, concatenate the L1 targets, compute the L1 loss, scale it by its weight, and normalize it by the total number of samples.
- Return a dictionary containing the computed losses.
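As an illustration of the first step in the list above, the following sketch builds one grid box per output cell for each stride. The helper name `make_grid_boxes` and the `(x, y, stride, stride)` row layout are assumptions made for this example, not confirmed details of the library.

```python
from typing import List

import torch


def make_grid_boxes(height: int, width: int, strides: List[int]) -> torch.Tensor:
    """Hypothetical helper: one (x, y, stride, stride) row per output grid cell."""
    boxes = []
    for stride in strides:
        # Number of grid cells along each axis at this stride.
        grid_h, grid_w = height // stride, width // stride
        ys, xs = torch.meshgrid(
            torch.arange(grid_h), torch.arange(grid_w), indexing="ij"
        )
        # Top-left corner of each cell, in input-image pixels.
        xy = torch.stack([xs.flatten(), ys.flatten()], dim=1) * stride
        stride_columns = torch.full((xy.shape[0], 2), stride)
        boxes.append(torch.cat([xy, stride_columns], dim=1))
    return torch.cat(boxes, dim=0).float()


# A 64x64 input yields 8*8 + 4*4 + 2*2 = 84 grid boxes.
grid_boxes = make_grid_boxes(64, 64, [8, 16, 32])
print(grid_boxes.shape)
```

Smaller strides contribute more grid boxes, so most candidate locations come from the finest output grid.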
Based on OpenMMLab’s implementation in the mmdetection library.
Parameter | Type | Default | Details |
---|---|---|---|
num_classes | int | | The number of target classes. |
bbox_loss_weight | float | 5.0 | The weight applied to the bounding box regression loss. |
class_loss_weight | float | 1.0 | The weight applied to the classification loss. |
objectness_loss_weight | float | 1.0 | The weight applied to the objectness loss. |
l1_loss_weight | float | 1.0 | The weight applied to the L1 loss. |
use_l1 | bool | False | Whether to include the L1 loss in the calculation. |
strides | List | [8, 16, 32] | The list of stride values for the output grids. |
YOLOXLoss.__init__
YOLOXLoss.__init__ (num_classes:int, bbox_loss_weight:float=5.0, class_loss_weight:float=1.0, objectness_loss_weight:float=1.0, l1_loss_weight:float=1.0, use_l1:bool=False, strides:List[int]=[8, 16, 32])
The __init__ method stores the parameters used to compute the loss and initializes the loss functions: Generalized IoU for the bounding box loss, binary cross entropy with logits for the classification and objectness losses, and L1 loss if applicable.
Parameter | Type | Default | Details |
---|---|---|---|
num_classes | int | | The number of target classes. |
bbox_loss_weight | float | 5.0 | The weight applied to the bounding box regression loss. |
class_loss_weight | float | 1.0 | The weight applied to the classification loss. |
objectness_loss_weight | float | 1.0 | The weight applied to the objectness loss. |
l1_loss_weight | float | 1.0 | The weight applied to the L1 loss. |
use_l1 | bool | False | Whether to include the L1 loss in the calculation. |
strides | List | [8, 16, 32] | The list of stride values for the output grids. |
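For readers unfamiliar with Generalized IoU, here is a minimal hand-written sketch of a per-box GIoU loss for boxes in `(x1, y1, x2, y2)` format. The function name `giou_loss` and the `eps` parameter are hypothetical; the class may construct its loss objects differently. The binary cross entropy and L1 losses mentioned above correspond to standard PyTorch modules such as `torch.nn.BCEWithLogitsLoss` and `torch.nn.L1Loss`.

```python
import torch


def giou_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-7) -> torch.Tensor:
    """Hypothetical per-box GIoU loss for (N, 4) boxes in (x1, y1, x2, y2) format."""
    # Intersection rectangle (clamped to zero when the boxes do not overlap).
    inter_lt = torch.max(pred[:, :2], target[:, :2])
    inter_rb = torch.min(pred[:, 2:], target[:, 2:])
    inter = (inter_rb - inter_lt).clamp(min=0).prod(dim=1)

    area_pred = (pred[:, 2:] - pred[:, :2]).prod(dim=1)
    area_target = (target[:, 2:] - target[:, :2]).prod(dim=1)
    union = area_pred + area_target - inter
    iou = inter / (union + eps)

    # Smallest axis-aligned box enclosing both boxes.
    enclose_lt = torch.min(pred[:, :2], target[:, :2])
    enclose_rb = torch.max(pred[:, 2:], target[:, 2:])
    enclose_area = (enclose_rb - enclose_lt).prod(dim=1)

    giou = iou - (enclose_area - union) / (enclose_area + eps)
    return 1.0 - giou


same = giou_loss(torch.tensor([[0.0, 0.0, 1.0, 1.0]]), torch.tensor([[0.0, 0.0, 1.0, 1.0]]))
apart = giou_loss(torch.tensor([[0.0, 0.0, 1.0, 1.0]]), torch.tensor([[2.0, 2.0, 3.0, 3.0]]))
print(same, apart)  # close to 0 for identical boxes, above 1 for disjoint boxes
```

Unlike plain IoU, the enclosing-box term still produces a gradient when the predicted and target boxes do not overlap.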
YOLOXLoss.bbox_decode
YOLOXLoss.bbox_decode (output_grid_boxes:torch.Tensor, predicted_boxes:torch.Tensor)
Decodes the predicted bounding boxes relative to the output grid boxes. In the sampling step that follows, positive indices are those where the assigned ground truth box index is greater than zero (a match with a ground truth object), and negative indices are those where it is zero (no match with a ground truth object).
Parameter | Type | Details |
---|---|---|
output_grid_boxes | Tensor | The output grid boxes. |
predicted_boxes | Tensor | The predicted bounding boxes. |
Returns | Tensor | The decoded bounding boxes. |
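The decoding step can be sketched as follows. This example follows the usual YOLOX convention and assumes each grid box row is `(x, y, stride_w, stride_h)` and each prediction is `(dx, dy, log_w, log_h)`; the function name `decode_boxes` is hypothetical and the library's internals may differ.

```python
import torch


def decode_boxes(grid_boxes: torch.Tensor, predicted_boxes: torch.Tensor) -> torch.Tensor:
    """Hypothetical decoder: (B, N, 4) raw predictions -> (B, N, 4) boxes as (x1, y1, x2, y2)."""
    # Centers: predicted offsets scaled by the stride, shifted by the grid origin.
    centers = predicted_boxes[..., :2] * grid_boxes[:, 2:] + grid_boxes[:, :2]
    # Sizes: exponentiated predictions scaled by the stride.
    sizes = predicted_boxes[..., 2:].exp() * grid_boxes[:, 2:]
    top_left = centers - sizes / 2
    bottom_right = centers + sizes / 2
    return torch.cat([top_left, bottom_right], dim=-1)


grid = torch.tensor([[0.0, 0.0, 8.0, 8.0], [8.0, 0.0, 8.0, 8.0]])
log2 = torch.tensor(2.0).log().item()
raw = torch.tensor([[[0.5, 0.5, 0.0, 0.0], [0.5, 0.5, log2, log2]]])
decoded = decode_boxes(grid, raw)
print(decoded)
```

The first prediction decodes to a box exactly covering its 8x8 grid cell; the second is centered in the neighboring cell with twice the stride as its width and height.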
YOLOXLoss.sample
YOLOXLoss.sample (assignment_result:cjm_yolox_pytorch.simota.AssignResult, bboxes:torch.Tensor, ground_truth_boxes:torch.Tensor)
Samples positive and negative indices based on the assignment result.
Parameter | Type | Details |
---|---|---|
assignment_result | AssignResult | The assignment result obtained from assigner. |
bboxes | Tensor | The predicted bounding boxes. |
ground_truth_boxes | Tensor | The ground truth boxes. |
Returns | SamplingResult | The sampling result containing positive and negative indices. |
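A minimal sketch of the sampling idea, assuming the assignment result exposes a per-box tensor of ground truth indices in which zero means "unassigned" and positive values are 1-based indices. The name `sample_indices` and the 1-based convention are assumptions for illustration; the sketch also keeps tensors, whereas the `SamplingResult` signature lists NumPy arrays.

```python
import torch


def sample_indices(gt_indices: torch.Tensor):
    """Hypothetical helper: split grid box indices into positives and negatives."""
    # Positive: the box was matched with some ground truth object.
    positive = torch.nonzero(gt_indices > 0, as_tuple=False).squeeze(-1)
    # Negative: the box was not matched with any ground truth object.
    negative = torch.nonzero(gt_indices == 0, as_tuple=False).squeeze(-1)
    return positive, negative


gt_indices = torch.tensor([0, 2, 0, 1, 1])
positive, negative = sample_indices(gt_indices)
# Assumed 1-based assignment, converted to a 0-based ground truth index.
matched_gt = gt_indices[positive] - 1
print(positive.tolist(), negative.tolist(), matched_gt.tolist())
```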
YOLOXLoss.get_l1_target
YOLOXLoss.get_l1_target (l1_target:torch.Tensor, ground_truth_boxes:torch.Tensor, output_grid_boxes:torch.Tensor, epsilon:float=1e-08)
Calculates the regression target for the optional L1 loss. The L1 loss measures the absolute differences between the model’s box predictions and these target values.
Parameter | Type | Default | Details |
---|---|---|---|
l1_target | Tensor | | The L1 target tensor. |
ground_truth_boxes | Tensor | | The ground truth boxes. |
output_grid_boxes | Tensor | | The output grid boxes. |
epsilon | float | 1e-08 | A small value to prevent division by zero. |
Returns | Tensor | | The updated L1 target. |
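One plausible way to build such a target is to invert the usual YOLOX decoding: center offsets expressed in units of stride, and log-scaled sizes. The sketch below assumes ground truth boxes in `(x1, y1, x2, y2)` format and grid boxes as `(x, y, stride_w, stride_h)`. The name `encode_l1_target` is hypothetical, and where exactly the library applies `epsilon` is an assumption (here it guards the logarithm).

```python
import torch


def encode_l1_target(
    ground_truth_boxes: torch.Tensor, grid_boxes: torch.Tensor, epsilon: float = 1e-8
) -> torch.Tensor:
    """Hypothetical encoder: matched (N, 4) boxes -> (dx, dy, log_w, log_h) targets."""
    # Convert (x1, y1, x2, y2) to centers and sizes.
    centers = (ground_truth_boxes[:, :2] + ground_truth_boxes[:, 2:]) / 2
    sizes = ground_truth_boxes[:, 2:] - ground_truth_boxes[:, :2]
    # Center offsets from the grid origin, in units of stride.
    offsets = (centers - grid_boxes[:, :2]) / grid_boxes[:, 2:]
    # Log of the size relative to the stride; epsilon avoids log(0).
    log_sizes = torch.log(sizes / grid_boxes[:, 2:] + epsilon)
    return torch.cat([offsets, log_sizes], dim=1)


gt = torch.tensor([[0.0, 0.0, 8.0, 8.0]])
grid = torch.tensor([[0.0, 0.0, 8.0, 8.0]])
l1_target = encode_l1_target(gt, grid)
print(l1_target)  # approximately [[0.5, 0.5, 0.0, 0.0]]
```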
YOLOXLoss.get_target_single
YOLOXLoss.get_target_single (class_preds:torch.Tensor, objectness_score:torch.Tensor, output_grid_boxes:torch.Tensor, decoded_bboxes:torch.Tensor, ground_truth_bboxes:torch.Tensor, ground_truth_labels:torch.Tensor)
Calculates the targets for a single image. It assigns ground truth objects to output grid boxes and samples output grid boxes based on the assignment results. It then generates class targets, objectness targets, bounding box targets, and, optionally, L1 targets.
Parameter | Type | Details |
---|---|---|
class_preds | Tensor | The predicted class probabilities. |
objectness_score | Tensor | The predicted objectness scores. |
output_grid_boxes | Tensor | The output grid boxes. |
decoded_bboxes | Tensor | The decoded bounding boxes. |
ground_truth_bboxes | Tensor | The ground truth boxes. |
ground_truth_labels | Tensor | The ground truth labels. |
Returns | Tuple | The targets for classification, objectness, bounding boxes, and L1 (if applicable), along with the foreground mask and the number of positive samples. |
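To make the shape of these targets concrete, the sketch below builds class targets, objectness targets, and a foreground mask from a set of positive indices. Scaling the one-hot class targets by IoU follows common YOLOX practice and is an assumption here, as are the function name `build_targets` and its parameters.

```python
import torch
import torch.nn.functional as F


def build_targets(
    num_grid_boxes: int,
    num_classes: int,
    positive_indices: torch.Tensor,
    positive_labels: torch.Tensor,
    positive_ious: torch.Tensor,
):
    """Hypothetical helper: per-image targets derived from the sampled positives."""
    # One-hot class targets for positives, scaled by IoU (assumed convention).
    class_target = F.one_hot(positive_labels, num_classes).float() * positive_ious.unsqueeze(-1)
    # Objectness is 1 for positive grid boxes and 0 everywhere else.
    objectness_target = torch.zeros(num_grid_boxes, 1)
    objectness_target[positive_indices] = 1.0
    # Boolean mask marking which grid boxes are foreground.
    foreground_mask = torch.zeros(num_grid_boxes, dtype=torch.bool)
    foreground_mask[positive_indices] = True
    return class_target, objectness_target, foreground_mask


class_target, objectness_target, foreground_mask = build_targets(
    num_grid_boxes=5,
    num_classes=3,
    positive_indices=torch.tensor([1, 3]),
    positive_labels=torch.tensor([2, 0]),
    positive_ious=torch.tensor([0.5, 1.0]),
)
print(class_target, objectness_target.flatten(), foreground_mask)
```

Note that the class targets have one row per positive sample, while the objectness targets and the mask have one entry per grid box.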
YOLOXLoss.flatten_and_concat
YOLOXLoss.flatten_and_concat (tensors:List[torch.Tensor], batch_size:int, reshape_dims:Optional[bool]=None)
Flatten and concatenate a list of tensors.
Parameter | Type | Default | Details |
---|---|---|---|
tensors | List | | A list of tensors to flatten and concatenate. |
batch_size | int | | The batch size used to reshape the concatenated tensor. |
reshape_dims | Optional | None | |
Returns | Tensor | | The concatenated tensor. |
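A rough sketch of what flattening and concatenating per-scale outputs can look like, assuming each tensor has the layout `(batch, channels, height, width)`. The helper `flatten_levels` is hypothetical and omits the `reshape_dims` parameter, whose behavior is not described here.

```python
from typing import List

import torch


def flatten_levels(tensors: List[torch.Tensor], batch_size: int) -> torch.Tensor:
    """Hypothetical helper: merge per-scale (B, C, H, W) maps into (B, total_cells, C)."""
    flattened = [
        # (B, C, H, W) -> (B, H, W, C) -> (B, H*W, C)
        tensor.permute(0, 2, 3, 1).reshape(batch_size, -1, tensor.shape[1])
        for tensor in tensors
    ]
    return torch.cat(flattened, dim=1)


levels = [torch.zeros(2, 4, 8, 8), torch.zeros(2, 4, 4, 4), torch.zeros(2, 4, 2, 2)]
merged = flatten_levels(levels, batch_size=2)
print(merged.shape)  # 64 + 16 + 4 = 84 positions per image
```

Moving the channel dimension last before reshaping keeps each grid cell's values together in one row.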
YOLOXLoss.__call__
YOLOXLoss.__call__ (class_scores:List[torch.Tensor], predicted_bboxes:List[torch.Tensor], objectness_scores:List[torch.Tensor], ground_truth_bboxes:List[torch.Tensor], ground_truth_labels:List[torch.Tensor])
The __call__ method computes the loss values. It first generates box coordinates for the output grids based on the input dimensions and stride values. It then flattens and concatenates the class predictions, bounding box predictions, and objectness scores. Next, it decodes the bounding box predictions, computes targets for each image in the batch, and finally computes the bounding box loss, objectness loss, classification loss, and, optionally, the L1 loss. These losses are scaled by their respective weights and normalized by the total number of samples.
Parameter | Type | Details |
---|---|---|
class_scores | List | A list of class scores for each scale. |
predicted_bboxes | List | A list of predicted bounding boxes for each scale. |
objectness_scores | List | A list of objectness scores for each scale. |
ground_truth_bboxes | List | A list of ground truth bounding boxes for each image. |
ground_truth_labels | List | A list of ground truth labels for each image. |
Returns | Dict | A dictionary with the classification, bounding box, objectness, and optionally, L1 loss. |
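The final weighting and normalization can be sketched as below. The function `weighted_losses`, the dictionary keys, and the guard against zero samples are assumptions for illustration; the bounding box loss is passed in as an already summed value to keep the example short.

```python
import torch
import torch.nn.functional as F


def weighted_losses(
    objectness_logits: torch.Tensor,
    objectness_target: torch.Tensor,
    class_logits: torch.Tensor,
    class_target: torch.Tensor,
    bbox_loss_sum: torch.Tensor,
    num_total_samples: int,
    bbox_loss_weight: float = 5.0,
    class_loss_weight: float = 1.0,
    objectness_loss_weight: float = 1.0,
) -> dict:
    """Hypothetical helper: scale each summed loss by its weight and normalize."""
    # Guard against a batch with no positive samples (assumption).
    normalizer = max(num_total_samples, 1)
    objectness_sum = F.binary_cross_entropy_with_logits(
        objectness_logits, objectness_target, reduction="sum"
    )
    class_sum = F.binary_cross_entropy_with_logits(
        class_logits, class_target, reduction="sum"
    )
    return {
        "loss_bbox": bbox_loss_weight * bbox_loss_sum / normalizer,
        "loss_objectness": objectness_loss_weight * objectness_sum / normalizer,
        "loss_class": class_loss_weight * class_sum / normalizer,
    }


losses = weighted_losses(
    objectness_logits=torch.zeros(4, 1),
    objectness_target=torch.tensor([[0.0], [1.0], [0.0], [1.0]]),
    class_logits=torch.zeros(2, 3),
    class_target=torch.tensor([[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]),
    bbox_loss_sum=torch.tensor(1.0),
    num_total_samples=2,
)
print({name: round(value.item(), 4) for name, value in losses.items()})
```

Objectness is evaluated over every grid box, while the class and bounding box terms only involve positive samples, so all three are divided by the same sample count to keep their scales comparable.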