Terminology and Methods in Object Detection

May 23, 2024

Fundamentals

  • Backbone (network): the feature-extracting network that processes input data into a feature representation. [1][2]
  • Region-based approach (region proposal algorithm)
    • The region-based approach is also known as the region proposal algorithm, which extracts an adequate number of regions from the image. … Region proposal is used as a preprocessing stage or even the key step in many computer vision issues. The algorithm extracts a pool of appropriate regions, which are likely to contain objects, from an image. The extracted regions are shown as a bounding box or a segmented candidate. [3]
    • In the region-based approach, all pixels that correspond to an object are grouped together and are marked to indicate that they belong to one region. This process is called segmentation. Pixels are assigned to regions using some criterion that distinguishes them from the rest of the image. [4]
    • Region-Based Convolutional Neural Networks (R-CNN): The original goal of R-CNN was to take an input image and produce a set of bounding boxes as output, where each bounding box contains an object and is labeled with the category (e.g. car or pedestrian) of that object. [5]
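
The pixel-grouping idea described above can be illustrated with a small sketch. It is not taken from any of the cited sources: it assumes a simple intensity threshold as the grouping criterion and 4-connectivity between pixels, and the names `label_regions` and `region_boxes` are hypothetical. Each labeled region is then summarized as a bounding box, roughly the kind of candidate a region proposal stage would hand on.

```python
from collections import deque


def label_regions(image, threshold):
    """Group pixels brighter than `threshold` into 4-connected regions.

    Returns (labels, count); label 0 marks background pixels.
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] <= threshold or labels[r][c] != 0:
                continue  # background, or already assigned to a region
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill from the seed pixel
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and image[ny][nx] > threshold and labels[ny][nx] == 0:
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


def region_boxes(labels):
    """Return {label: (x_min, y_min, x_max, y_max)} for every labeled region."""
    boxes = {}
    for y, row in enumerate(labels):
        for x, lab in enumerate(row):
            if lab == 0:
                continue
            x0, y0, x1, y1 = boxes.get(lab, (x, y, x, y))
            boxes[lab] = (min(x0, x), min(y0, y), max(x1, x), max(y1, y))
    return boxes


# Toy grayscale image with two bright blobs (values are made up).
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0, 0],
    [0, 9, 9, 0, 0, 7],
    [0, 0, 0, 0, 7, 7],
]
labels, count = label_regions(image, threshold=5)
boxes = region_boxes(labels)
print(count, boxes)
```

Real region proposal methods use much richer criteria than a single threshold; the sketch only shows how "pixels grouped by a criterion" turns into candidate boxes.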


Annotation methods (annotation styles)

  • HBB (Horizontal Bounding Box) vs. OBB (Oriented Bounding Box) [6]

    [Figure: HBB vs. OBB]

  • Bounding Box, Diamond Polygon, and Full Instance Segmentation [7]

    [Figure: Bounding Box, Diamond Polygon, and Full Instance Segmentation]
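
To make the relationship between the two box styles concrete, here is a minimal sketch that computes the horizontal box enclosing an oriented one. It assumes the oriented box is given as center, width, height, and rotation angle in degrees, which is only one of several parameterizations in use (datasets may instead store the four corner points). The function names are hypothetical.

```python
import math


def obb_corners(cx, cy, w, h, angle_deg):
    """Corner points of an oriented box given as (center, size, rotation)."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    half_extents = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Rotate each corner offset about the center, then translate.
    return [
        (cx + dx * cos_t - dy * sin_t, cy + dx * sin_t + dy * cos_t)
        for dx, dy in half_extents
    ]


def enclosing_hbb(corners):
    """Smallest axis-aligned box (x_min, y_min, x_max, y_max) around the corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return (min(xs), min(ys), max(xs), max(ys))


# A 4 x 2 box rotated by 90 degrees becomes 2 wide and 4 tall:
# approximately (9, 3, 11, 7), up to floating-point rounding.
hbb = enclosing_hbb(obb_corners(10.0, 5.0, 4.0, 2.0, 90.0))
print(hbb)
```

The enclosing HBB is never smaller than the OBB, and the extra area it includes grows as the object becomes more elongated and more rotated.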


Methods

  • Few-Shot Learning (FSL) [8]
    • Few-Shot Learning is a machine learning framework in which an AI model learns to make accurate predictions by training on a very small number of labeled examples. It’s typically used to train models for classification tasks when suitable training data is scarce.
    • One-Shot Learning: where there is only one labeled example of each class to be learned.
    • Zero-Shot Learning: where there are no labeled examples at all.
    • While one-shot learning is essentially just a challenging variant of FSL, zero-shot learning is a distinct learning problem that necessitates its own unique methodologies.
  • Linear Warmup
    • Linear Warmup is a learning rate schedule in which the learning rate increases linearly from a low initial value to a target value and is then held constant. This reduces volatility in the early stages of training. [9]
    • “… Additionally, we used a linear warmup period over 1K iterations.” [7]
  • Cross-dataset validation [10][11]
  • Synthetic data
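
The linear warmup schedule from the list above can be sketched as a plain function of the training step. The 1,000-step warmup matches the quoted "1K iterations"; the base learning rate of 0.01, the starting rate of 0, and the function name `linear_warmup_lr` are assumptions made for illustration, not values from the cited paper.

```python
def linear_warmup_lr(step, base_lr, warmup_steps, start_lr=0.0):
    """Learning rate at `step`: linear ramp to `base_lr`, then constant."""
    if warmup_steps <= 0 or step >= warmup_steps:
        return base_lr  # warmup finished (or disabled)
    fraction = step / warmup_steps  # progress through warmup, in [0, 1)
    return start_lr + fraction * (base_lr - start_lr)


# Assumed values: base_lr=0.01 is hypothetical; 1000 steps mirrors "1K iterations".
for step in (0, 250, 500, 1000, 5000):
    print(step, linear_warmup_lr(step, base_lr=0.01, warmup_steps=1000))
```

In practice the constant phase is often replaced by a decay schedule; the sketch covers only the warmup-then-constant behavior described in the list.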


References

  1. What Does Backbone Mean in Neural Networks?

  2. deep learning - What does backbone mean in a neural network?

  3. A comprehensive and systematic review on classical and deep learning based region proposal algorithms

  4. MachineVision_Chapter3.pdf

  5. Region Based Convolutional Neural Networks

  6. Ding, Jian, et al. “Object detection in aerial images: A large-scale benchmark and challenges.” IEEE Transactions on Pattern Analysis and Machine Intelligence 44.11 (2021): 7778-7796.

  7. Shermeyer, Jacob, et al. “RarePlanes: Synthetic data takes flight.” Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 2021. Available at: WACV 2021 Open Access Repository.

  8. What Is Few-Shot Learning?

  9. Linear Warmup Explained

  10. Xia, Gui-Song, et al. “DOTA: A large-scale dataset for object detection in aerial images.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018. Available at: CVPR 2018 Open Access Repository.

  11. Torralba, Antonio, and Alexei A. Efros. “Unbiased look at dataset bias.” CVPR 2011. IEEE, 2011.