CNN 009

April 22, 2026 · about 2 min read

Predicting Bounding Boxes

Check the probability of each detection and keep only those with a score above a certain threshold (e.g. 0.7).
For the remaining boxes:
- a. The box with the highest score is taken as a detection result.
- b. Discard any remaining box whose IoU with that detected box is greater than 0.5, i.e. any box that overlaps heavily with the highest-scoring box.
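
The filtering steps above can be sketched as follows. This is a minimal sketch rather than a reference implementation: it assumes boxes are given as `(x_min, y_min, x_max, y_max)` corners, and the names `iou` and `non_max_suppression` are hypothetical. It also repeats the pick-and-discard step until no boxes are left, which is how non-max suppression is usually run when an image may contain several objects; the notes above describe a single pass.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, score_threshold=0.7, iou_threshold=0.5):
    """Return the indices of the boxes that survive suppression."""
    # Step 1: keep only detections scoring above the threshold.
    candidates = [i for i, s in enumerate(scores) if s > score_threshold]
    # Highest score first.
    candidates.sort(key=lambda i: scores[i], reverse=True)

    kept = []
    while candidates:
        # Step a: the highest-scoring remaining box is a detection result.
        best = candidates.pop(0)
        kept.append(best)
        # Step b: drop boxes that overlap it with IoU above the threshold.
        candidates = [
            i for i in candidates if iou(boxes[best], boxes[i]) <= iou_threshold
        ]
    return kept
```

With the made-up boxes `[(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 10, 10)]` and scores `[0.9, 0.8, 0.75, 0.6]`, the last box is removed by the score threshold, the second is suppressed by the first (IoU of about 0.68), and the function returns `[0, 2]`.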

Associate each object to:
- A cell which contains its mid-point and
- Anchor box for the cell with highest IoU
Calculate the IoU of the anchor boxes and the predicted bounding boxes.
- $IoU(P_{bb}, A_{bb}) = \frac{\text{Area of Overlap}}{\text{Area of Union}}$
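As a quick worked example of the formula (the box sizes are made up for illustration): take two $2 \times 2$ boxes, one shifted by 1 in both $x$ and $y$ relative to the other. They overlap in a $1 \times 1$ square, and the union is the sum of the two areas minus the overlap, so
$IoU = \frac{1}{4 + 4 - 1} = \frac{1}{7} \approx 0.14$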
$\hat{y} = \{P_0, x, y, h, w, C1, C2, \quad P_0, x, y, h, w, C1, C2\}$
- $P_0$ is objectness score
- $x, y$ are the coordinates of the center of the bounding box relative to
- $h, w$ are the height and width of the bounding box
- $C1, C2$ are the class information for the object in the bounding box
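
The association rule and the layout of $\hat{y}$ can be sketched together for a single object. This is only an illustration under stated assumptions: coordinates are normalized to $[0, 1]$, the grid is $3 \times 3$, $x, y$ are taken as offsets within the cell, anchors are compared to the object by shape only (as if they shared a center), and the names `shape_iou` and `encode_object` are hypothetical.

```python
def shape_iou(hw_a, hw_b):
    """IoU of two boxes sharing the same center, each given as (h, w)."""
    inter = min(hw_a[0], hw_b[0]) * min(hw_a[1], hw_b[1])
    union = hw_a[0] * hw_a[1] + hw_b[0] * hw_b[1] - inter
    return inter / union


def encode_object(cx, cy, h, w, class_id, anchors, grid_size=3, num_classes=2):
    """Return (row, col, y) for one object; y holds one slot per anchor box."""
    # The cell that contains the object's mid-point.
    col = min(int(cx * grid_size), grid_size - 1)
    row = min(int(cy * grid_size), grid_size - 1)

    # The anchor box whose shape has the highest IoU with the object.
    best = max(range(len(anchors)), key=lambda k: shape_iou((h, w), anchors[k]))

    # Each slot is [P0, x, y, h, w, C1, ..., Cn]; unused slots stay at zero.
    slot = 5 + num_classes
    y = [0.0] * (len(anchors) * slot)
    x_rel = cx * grid_size - col  # assumption: offset within the cell
    y_rel = cy * grid_size - row
    one_hot = [1.0 if c == class_id else 0.0 for c in range(num_classes)]
    y[best * slot:(best + 1) * slot] = [1.0, x_rel, y_rel, h, w] + one_hot
    return row, col, y
```

For example, with a tall anchor `(0.6, 0.2)` and a wide anchor `(0.2, 0.6)` (both as `(h, w)`), a tall object centered in the image is assigned to the middle cell and to the first anchor's slot, while the second slot stays all zeros.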

SSD is similar to YOLO and uses VGG16 convolutional layers as its base network
Takes advantage of anchor boxes with different aspect ratios
A large number of anchor boxes are used
Not well suited to small objects
About 3 times faster than Faster R-CNN
With a ResNet-101 base, SSD may detect small objects better, thanks to better features from the CONV layers
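
To make "anchor boxes with different aspect ratios" concrete, here is a small sketch of one common parameterisation: every anchor at a given scale has the same area, and the aspect ratio sets how that area is split between width and height. The function name `anchor_shapes`, the scale value, and the ratio set are illustrative assumptions, not values taken from these notes.

```python
import math


def anchor_shapes(scale, aspect_ratios=(1.0, 2.0, 0.5)):
    """Return (w, h) pairs with area scale**2 and w / h equal to each ratio."""
    return [
        (scale * math.sqrt(ratio), scale / math.sqrt(ratio))
        for ratio in aspect_ratios
    ]


# Three anchor shapes at a hypothetical scale of 0.2 (relative to image size):
# one square, one wide (ratio 2), one tall (ratio 0.5).
shapes = anchor_shapes(0.2)
```

Placing such a set of shapes at every location of several feature maps is what leads to the large number of anchor boxes mentioned above.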

SSD 300 architecture

Base Networks
- VGG16
- ResNet-101
- Inception-v2, v3
- ResNet
- MobileNet
- AlexNet
- ZFNet
Object Detection Framework
- R-CNN family
- YOLO family
- SSD family
- F-RCNN family
Faster R-CNN is more accurate but slower
YOLO/SSD are faster (real-time) but may be less accurate