This repository contains a set of tools for object detection and distance measurement using the BDD100K dataset annotations. The tools are designed to extract bounding boxes, calculate the distance of objects, and detect potential collisions. The entire workflow involves converting the BDD100K annotations from JSON to a custom format, then using that data to process images, calculate distances, and visualize results.
- Overview
- Dependencies
- Features
- How to Use
- Project Structure
- Class and Function Details
- Custom Annotation Format
- License
This repository includes two main components:
- BDD100K to Custom Annotation Converter: This script converts the BDD100K JSON annotations into a custom format containing bounding box coordinates and class IDs.
- Object Distance Measurement: This script calculates the distance of detected objects in an image, displays the distances, and detects if an object is within a specified polygon for collision detection.
The two tools can work together to process the BDD100K dataset: the converter creates the custom annotations, and the distance measurement tool uses them to estimate the real-world distance of each annotated object. The custom annotation files can also be used for training object detection models.
This repository is designed for cases where you already have detection results (e.g., bounding boxes) and want to estimate the distance of the detected objects without using a deep learning model. The distance is computed from the size of each bounding box in the image and a known reference size for that kind of object.
Because it relies only on simple geometry, this approach is a lightweight alternative to more computationally expensive methods such as training a deep learning model for distance estimation.
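The geometric idea above can be sketched with the pinhole camera relation, distance = focal length × real size / size in pixels. This is a minimal sketch, not the repository's C++ implementation: it assumes the bounding box width is the measured dimension, and the function name `estimate_distance` and the reference widths are illustrative assumptions.

```python
# Illustrative reference widths in metres. These values are assumptions,
# not the ones used by the repository's RefSizeDict.
REF_WIDTH_M = {"car": 1.8, "truck": 2.5, "bus": 2.5}


def estimate_distance(box, category, focal_length_px, ref_sizes=REF_WIDTH_M):
    """Estimate the distance in metres with the pinhole model.

    box is (x_min, y_min, x_max, y_max) in pixels and focal_length_px is the
    focal length expressed in pixels. Returns None when the category has no
    reference size or the box has no width.
    """
    x_min, _, x_max, _ = box
    width_px = x_max - x_min
    if width_px <= 0 or category not in ref_sizes:
        return None
    return focal_length_px * ref_sizes[category] / width_px


if __name__ == "__main__":
    # A 200-pixel-wide car seen with a 1000-pixel focal length.
    print(estimate_distance((100, 150, 300, 400), "car", focal_length_px=1000.0))
```

The estimate is only as good as the focal length and the reference size, so objects seen at an angle or partly occluded will appear farther away than they are.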
- Python 3.x: For the custom annotation converter script.
- OpenCV: For image processing and object detection in C++.
- C++: For compiling and running the distance measure code.
- JSON and OS Libraries: Standard Python libraries for handling JSON files and directories.
- Filesystem Library: For managing file paths and directories in Python.
- Bounding Box Extraction: Extracts bounding box coordinates from BDD100K annotations.
- Class ID Mapping: Converts object category names (e.g., "car", "person") to custom class IDs using a predefined mapping dictionary.
- Custom Annotation Format: Saves bounding box and class ID information in a custom `.txt` format suitable for object detection tasks.
- Output Directory Management: Automatically creates an output directory if it doesn't exist.
- Object Detection: Detects specific objects (e.g., cars, trucks) and calculates their distance from the camera based on the size of their bounding box.
- Collision Detection: Identifies whether detected objects are within a specific polygon, useful for applications like autonomous driving or object tracking.
- Visual Feedback: Annotates the image with detected object distances, confidence scores, and custom labels.
- Customizable Parameters: You can define the objects to track, set the focal length, and adjust the size reference for various objects.
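The collision check described in the list above comes down to a point-in-polygon test. The sketch below uses the standard ray-casting method in Python. It is an illustration only: the helper names are hypothetical, and taking the bottom-center of the bounding box as the point to test is an assumption, not necessarily what the C++ code does.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if point lies inside polygon.

    polygon is a list of (x, y) vertices in order. A horizontal ray is cast
    to the right of the point, and each edge crossing toggles the result.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the ray's y coordinate can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def box_bottom_center(box):
    """Return the bottom-center of (x_min, y_min, x_max, y_max) (assumed test point)."""
    x_min, _, x_max, y_max = box
    return ((x_min + x_max) / 2.0, y_max)


if __name__ == "__main__":
    zone = [(0, 0), (10, 0), (10, 10), (0, 10)]
    print(point_in_polygon(box_bottom_center((2, 1, 8, 5)), zone))
```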
- Download the BDD100K Dataset: Ensure you have the BDD100K dataset's annotation JSON files.
- Update the Class Mapping: Modify the `class_mapping` dictionary to ensure correct class IDs are assigned to the object categories.
- Set File Paths: Set the paths for the BDD100K JSON files and the output directory for custom annotations.
- Run the Script:
  - Save the script as `convert_bdd100k_to_custom.py`.
  - Run the script in your terminal:
    python convert_bdd100k_to_custom.py
The script will process the annotations and output `.txt` files in the specified output directory.
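As a rough illustration of what the conversion step involves, the sketch below reads one per-image JSON file and writes the matching `.txt` file. It is not the repository's script. It assumes the JSON has a top-level `labels` list whose entries carry a `category` and a `box2d` with `x1`, `y1`, `x2`, `y2` keys, and the mapping values are hypothetical. Since the documented output format shows the category name, `class_mapping` is used here only to decide which categories to keep.

```python
import json
import os

# Hypothetical mapping; replace it with the IDs your own pipeline expects.
CLASS_MAPPING = {"car": 0, "truck": 1, "bus": 2, "person": 3}


def convert_labels(labels, class_mapping):
    """Turn a list of label dicts into lines of the custom format."""
    lines = []
    for label in labels:
        category = label.get("category")
        box = label.get("box2d")  # assumed keys: x1, y1, x2, y2
        if box is None or category not in class_mapping:
            continue  # skip labels without a box and unmapped categories
        coords = [int(round(box[key])) for key in ("x1", "y1", "x2", "y2")]
        lines.append('{}, {}, {}, {}, "{}"'.format(*coords, category))
    return lines


def convert_file(json_file, output_dir, class_mapping=CLASS_MAPPING):
    """Convert one per-image JSON file and return the path of the .txt written."""
    os.makedirs(output_dir, exist_ok=True)  # create the output directory if missing
    with open(json_file, encoding="utf-8") as fh:
        data = json.load(fh)
    labels = data.get("labels", [])  # assumption: top-level "labels" list
    stem = os.path.splitext(os.path.basename(json_file))[0]
    out_path = os.path.join(output_dir, stem + ".txt")
    with open(out_path, "w", encoding="utf-8") as fh:
        fh.write("\n".join(convert_labels(labels, class_mapping)))
    return out_path
```

If your annotation files use a different layout, only the line that extracts `labels` should need to change.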
- Prepare the Image and Annotation Data: Ensure the custom annotations (from Step 1) are available for processing.
- Set Up the Distance Measure Code:
  - Ensure OpenCV and necessary libraries are installed.
  - Update the image path and bounding box data with the custom annotations in the C++ code.
- Compile and Run the C++ Code:
  - Compile the C++ code:
    g++ -o distance_measure distance_measure.cpp `pkg-config --cflags --libs opencv4`
  - Run the compiled program:
    ./distance_measure
The program will:
- Calculate the distance of detected objects.
- Draw the detected points and distances on the image.
- Save the processed image.
The directory structure of the project is as follows:
vehicle-distance-measurement
├── bdd100k
│ ├── custom_labels
│ │ └── test
│ │ ├── cabc30fc-eb673c5a.txt
│ │ ├── cb319c00-9206979b.txt
│ │ ├── cb97debb-12f48570.txt
│ │ ├── cbe73da0-461983da.txt
│ │ └── cbf2d780-06947287.txt
│ ├── images
│ │ └── test
│ │ ├── cabc30fc-eb673c5a.jpg
│ │ ├── cabc30fc-eb673c5a_processed.jpg
│ │ ├── cb319c00-9206979b.jpg
│ │ ├── cb319c00-9206979b_processed.jpg
│ │ ├── cb97debb-12f48570.jpg
│ │ ├── cb97debb-12f48570_processed.jpg
│ │ ├── cbe73da0-461983da.jpg
│ │ ├── cbe73da0-461983da_processed.jpg
│ │ ├── cbf2d780-06947287.jpg
│ │ └── cbf2d780-06947287_processed.jpg
│ └── labels
│ └── test
│ ├── cabc30fc-eb673c5a.json
│ ├── cb319c00-9206979b.json
│ ├── cb97debb-12f48570.json
│ ├── cbe73da0-461983da.json
│ └── cbf2d780-06947287.json
├── bdd100k_to_custom.py
├── distance_measure
├── distance_measure.cpp
├── LICENSE
└── README.md
Example results: each of the five test images (Input Image) shown alongside its annotated output (Processed Result).
Converts BDD100K annotations from JSON format to a custom format.
- json_file: Path to the BDD100K JSON file.
- output_dir: Directory to save the custom annotation `.txt` files.
- class_mapping: A dictionary that maps BDD100K class labels to custom class IDs.
The converted annotations are saved in `.txt` files with the following format:
x_min, y_min, x_max, y_max, "category"
For example:
100, 150, 300, 400, "car"
200, 250, 500, 600, "truck"
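Reading these files back is straightforward. The following minimal sketch parses one line of the format shown above. The function name `parse_annotation_line` is hypothetical, and coordinates are parsed as floats on the assumption that a file may contain non-integer values.

```python
def parse_annotation_line(line):
    """Parse 'x_min, y_min, x_max, y_max, "category"' into (box, category)."""
    parts = [part.strip() for part in line.split(",", 4)]
    if len(parts) != 5:
        raise ValueError("expected 5 comma-separated fields: {!r}".format(line))
    box = tuple(float(value) for value in parts[:4])
    category = parts[4].strip('"')  # drop the surrounding double quotes
    return box, category


if __name__ == "__main__":
    print(parse_annotation_line('100, 150, 300, 400, "car"'))
```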
The class that handles distance measurement and collision detection.
- Constructor: Initializes the object with a list of objects to track and the focal length.
- Methods:
  - `updateDistance(boxes)`: Updates the distance of detected objects based on their bounding box size.
  - `calcCollisionPoint(poly)`: Calculates the collision point for objects within a specified polygon.
  - `drawDetectedOnFrame(frame_show)`: Draws detected points and their distances on the image.
- f: The focal length of the camera.
- object_list: A list of object categories to detect.
- RefSizeDict: A dictionary containing the reference sizes of various objects (e.g., cars, buses).
The custom annotation format used for object detection tasks is:
x_min, y_min, x_max, y_max, "category"
Where:
- `x_min`, `y_min`: Top-left corner of the bounding box.
- `x_max`, `y_max`: Bottom-right corner of the bounding box.
- `"category"`: The object class name (e.g., "car", "person").
These annotations are saved in `.txt` files, one for each image.
This project is licensed under the MIT License.