bijeshsingha/Yolo-Implementation

YOLO Object Detection & Class Distribution Analysis

Author: Bijesh Singha

📌 Project Objective

The primary objective of this assignment was to implement a YOLO (You Only Look Once) object detection model and evaluate its performance across diverse video datasets. The analysis focuses on:

  • Identifying entities within the model's standard trained classes.
  • Observing model behavior when encountering "out-of-distribution" objects not present in the training set.
  • Testing the model's generalization capabilities across real-world, stylized (cartoon), and AI-generated content.

📊 Experimental Results & Analysis

Case 1: Real-World Traffic Surveillance

The model was tested on standard CCTV footage capturing road traffic.

  • Findings: The model correctly identified a high volume of standard entities, including cars (2,580 detections), persons (749), and motorcycles (369).
  • Interpretation: Accuracy remained high because these entities belong to the standard 80 COCO classes the model was originally trained on.
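Per-class counts like the ones above come from tallying every detection label across all frames of a video. The following is a minimal sketch of that tally, not the project's actual code. The `labels_from_video` helper assumes the Ultralytics package and a `yolov8n.pt` weights file; neither is confirmed by this README. The toy frames are made up to show the counting step only.

```python
from collections import Counter


def tally_detections(frames):
    """Count detections per class label across all frames.

    `frames` is an iterable of per-frame label lists,
    e.g. [["car", "person"], ["car"]].
    """
    counts = Counter()
    for labels in frames:
        counts.update(labels)
    return counts


def labels_from_video(video_path, weights="yolov8n.pt"):
    """Yield one list of class labels per frame.

    Assumption: inference runs through the Ultralytics package; the
    library and the weights file name are illustrative choices.
    """
    from ultralytics import YOLO  # lazy import: only needed for real inference

    model = YOLO(weights)
    for result in model.predict(source=video_path, stream=True, verbose=False):
        yield [result.names[int(c)] for c in result.boxes.cls.tolist()]


# Toy, made-up frames (not the project's data) to show the tally itself.
toy_frames = [["car", "car", "person"], ["car", "motorcycle"], []]
toy_counts = tally_detections(toy_frames)
print(toy_counts.most_common())
```

With a real video, `tally_detections(labels_from_video("traffic.mp4"))` would produce the same kind of table; the file name here is hypothetical.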

Case 2: Out-of-Distribution Animal Detection

Testing class limitations using footage of a person feeding lions.

  • Findings: While the model accurately detected the person (102 detections), it completely failed to identify the lions.
  • Feature Proximity Mapping: Because "lion" is not one of the 80 trained classes, the model forced a classification based on the closest visual features available, misidentifying them as dogs (86), bears (66), and cats (29).
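The forced classification described above follows from how a closed-set classifier picks a label: it takes the highest-scoring entry from a fixed list, and that list has no "unknown" option. The sketch below illustrates the idea with a hypothetical four-class subset and made-up scores. It is not the model's real output or its exact scoring function.

```python
# Hypothetical subset of the trained label set; "lion" is deliberately absent.
TRAINED_CLASSES = ["person", "dog", "bear", "cat"]


def closest_trained_class(scores):
    """Return the (label, score) pair with the highest score.

    `scores` holds one score per entry in TRAINED_CLASSES, in the same
    order. There is no "unknown" option, so the result is always one
    of the trained labels.
    """
    best = max(range(len(scores)), key=lambda i: scores[i])
    return TRAINED_CLASSES[best], scores[best]


# Made-up scores for a lion crop: no class fits well, yet one still wins.
lion_scores = [0.05, 0.46, 0.38, 0.11]
label, score = closest_trained_class(lion_scores)
print(label, score)
```

Small changes in pose or lighting can shift which near-miss class scores highest, which is one plausible reason the lions were split across several labels rather than always mapped to the same one.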

Case 3: Stylized Imagery (Cartoon Video)

Evaluating how the model handles non-realistic, stylized representations.

  • Findings: A cartoon child was correctly generalized as a "person" (325 detections). Abstractly drawn objects were harder for the model: a chocolate bar was misidentified as a "cell phone", and bees were classified as "birds" or "sports balls".
  • Interpretation: The model generalizes human-like features well but defaults to the most similar trained feature set for small or abstract objects.

Case 4: AI-Generated Content

Determining if YOLO can distinguish between real and synthetic (AI) media.

  • Findings: The model purely identifies objects regardless of whether the source is real or synthetic; it failed to classify the video as "AI-generated".
  • Interpretation: YOLO is built for object localization and classification, not provenance or deepfake detection.


💡 Key Findings & Model Logic

Based on these experiments, the following conclusions were drawn regarding YOLO's operational logic:

| Scenario | Primary Object | Top Detection Class | Accuracy Note |
|---|---|---|---|
| Road Video | Cars / Persons | Car / Person | High: Objects are within standard 80 classes. |
| Animal Video | Lions | Dog / Bear | Low: "Lion" class is missing from training. |
| Cartoon Video | Cartoon Child | Person | Moderate: Effectively generalizes features. |
| AI Video | Synthetic Objects | Standard Classes | N/A: Cannot detect "AI" as a class. |
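To make the "top detection class" idea concrete, the reported counts can be turned into shares. The sketch below uses only the counts quoted in Cases 1 and 2, so each share is relative to the listed classes rather than to every detection in the video. It assumes, as Case 2 states, that the dog, bear, and cat detections in the animal video all correspond to lions.

```python
def class_shares(counts):
    """Convert raw detection counts into fractions of the listed total."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Counts as reported in Case 1 (road video).
road_counts = {"car": 2580, "person": 749, "motorcycle": 369}

# Labels assigned to the lions in Case 2 (animal video).
lion_mislabels = {"dog": 86, "bear": 66, "cat": 29}

road_shares = class_shares(road_counts)
lion_shares = class_shares(lion_mislabels)

for name, shares in [("Road Video", road_shares), ("Lion mislabels", lion_shares)]:
    top = max(shares, key=shares.get)
    print(f"{name}: top class = {top} ({shares[top]:.1%} of listed detections)")
```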

Summary of Model Behavior

Fixed Class Constraint: The model is strictly limited to its 80 trained classes.

No Null Results: The model has no "unknown" class. When an unfamiliar object produces a detection above the confidence threshold, the model does not return a null result; it assigns the trained class with the most similar visual features (e.g., mapping lion features to a dog or bear).

Generalization vs. Specificity: While it can generalize (treating cartoon humans as "people"), it lacks the nuance to identify specific items outside its training set, such as specific animal species or unique consumer goods.
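One practical consequence of the fixed class constraint is that it is worth checking, before running a video, whether the objects of interest are in the label set at all. The sketch below does this with a partial list of COCO label names, not the full 80. The `split_by_coverage` helper and the target list are hypothetical names chosen for illustration.

```python
# A few label names from the 80-class COCO set (partial list, for illustration).
COCO_SUBSET = {
    "person", "car", "motorcycle", "bird", "cat", "dog",
    "bear", "sports ball", "cell phone",
}


def split_by_coverage(targets, known=COCO_SUBSET):
    """Split target object names into those in the label set and those not."""
    covered = [t for t in targets if t in known]
    missing = [t for t in targets if t not in known]
    return covered, missing


# Objects that appeared across the four test videos.
targets = ["person", "car", "lion", "bee", "chocolate bar"]
covered, missing = split_by_coverage(targets)
print("covered:", covered)
print("missing:", missing)
```

Any target in the `missing` list can only ever be reported under some other trained label, which matches the lion, bee, and chocolate bar results above.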

