Wednesday, November 29, 2023

Annotation Mastery: Seamless Detectron Integration with LabelImg


Labeling, or annotating, images has long been a challenge in the broad field of computer vision. Our exploration delves into the teamwork of LabelImg and Detectron, a powerful duo that combines precise annotation with efficient model building. LabelImg, which is easy to use and accurate, leads in careful annotation, laying a solid foundation for clear object detection.

As we explore LabelImg and get better at drawing bounding boxes, we move on to Detectron. This robust framework organizes our labeled data, making it useful for training advanced models. Together, LabelImg and Detectron make object detection accessible to everyone, whether you’re a beginner or an expert. Come along, where each annotated image helps us unlock the full power of visual information.

Detectron Integration with LabelImg

Learning Objectives

  • Getting Started with LabelImg.
  • Environment Setup and LabelImg Installation.
  • Understanding LabelImg and Its Functionality.
  • Converting VOC or Pascal Data to COCO Format for Object Detection.

This article was published as a part of the Data Science Blogathon.


Flowchart of Seamless Detectron Integration with LabelImg

Setting Up Your Environment

1. Create a Virtual Environment:

conda create -p ./venv python=3.8 -y

This command creates a conda environment in the ./venv folder of the current directory, using Python version 3.8.

2. Activate the Virtual Environment:

conda activate ./venv

Activate the virtual environment to isolate the installation of LabelImg. Because the environment was created with -p, it is activated by its path rather than by a name.

Installing and Using LabelImg

1. Install LabelImg:

pip install labelImg

Install LabelImg within the activated virtual environment.

2. Launch LabelImg:

Launch LabelImg

Troubleshooting: If You Encounter Errors Running the Script

If you encounter errors while running the script, I’ve prepared a zip archive containing the virtual environment (venv) for your convenience.

1. Download the Zip Archive:

  • Download the venv.zip archive from the Link

2. Create a LabelImg Folder:

  • Create a new folder named LabelImg on your local machine.

3. Extract the venv Folder:

  • Extract the contents of the venv.zip archive into the LabelImg folder.

4. Activate the Virtual Environment:

  • Open your command prompt or terminal.
  • Navigate to the LabelImg folder.
  • Run the following command to activate the virtual environment:
conda activate ./venv

This process ensures you have a pre-configured virtual environment ready to use with LabelImg. The provided zip archive bundles the required dependencies, allowing a smoother experience without worrying about potential installation issues.

Now, proceed with the earlier steps for installing and using LabelImg within this activated virtual environment.

Annotation Workflow with LabelImg

1. Annotate Images in PascalVOC Format:

  • Build and launch LabelImg.
  • Click ‘Change default saved annotation folder’ in Menu/File.
  • Click ‘Open Dir’ to select the image directory.
  • Use ‘Create RectBox’ to annotate objects in the image.
  • Save the annotations to the specified folder.

Steps to do Annotation Workflow with LabelImg

Inside the .xml file:

	<filename>0a8a68ee-f587-4dea-beec-79d02e7d3fa4___RS_Early.B 8461.JPG</filename>
	<path>/home/suyodhan/Documents/Blog /label/train/0a8a68ee-f587-4dea-beec-79d02e7d3fa4___RS_Early.B 8461.JPG</path>

This XML structure follows the Pascal VOC annotation format, commonly used for object detection datasets. This format provides a standardized representation of annotated data for training computer vision models. If you have additional annotated images, you can continue to generate similar XML files, one for each image and covering every annotated object in it.
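To make the format concrete, here is a minimal sketch that reads one Pascal VOC annotation with Python's standard xml.etree.ElementTree module. The sample XML string, the "leaf" label, and the helper name read_voc_boxes are illustrative assumptions and are not taken from the dataset above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down annotation in the layout LabelImg writes.
SAMPLE_XML = """
<annotation>
    <filename>example.jpg</filename>
    <size><width>256</width><height>256</height><depth>3</depth></size>
    <object>
        <name>leaf</name>
        <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
    </object>
</annotation>
"""


def read_voc_boxes(xml_text):
    """Return (filename, [(label, xmin, ymin, xmax, ymax), ...])."""
    root = ET.fromstring(xml_text.strip())
    filename = root.findtext("filename")
    boxes = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        coords = tuple(
            int(box.findtext(key)) for key in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((obj.findtext("name"),) + coords)
    return filename, boxes


if __name__ == "__main__":
    print(read_voc_boxes(SAMPLE_XML))
```

For a file on disk, ET.parse(path).getroot() can take the place of ET.fromstring.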

Converting Pascal VOC Annotations to COCO Format: A Python Script

Object detection models often require annotations in specific formats to train and evaluate effectively. While Pascal VOC is a widely used format, certain frameworks like Detectron prefer COCO annotations. To bridge this gap, we introduce a versatile Python script, voc2coco.py, designed to convert Pascal VOC annotations to the COCO format seamlessly.


# pip install lxml
# (optional: the script as written only uses the standard library)

import sys
import os
import json
import xml.etree.ElementTree as ET
import glob

START_BOUNDING_BOX_ID = 1
PRE_DEFINE_CATEGORIES = None
# If necessary, pre-define category and its id
#  PRE_DEFINE_CATEGORIES = {"aeroplane": 1, "bicycle": 2, "bird": 3, "boat": 4,
#  "bottle":5, "bus": 6, "car": 7, "cat": 8, "chair": 9,
#  "cow": 10, "diningtable": 11, "dog": 12, "horse": 13,
#  "motorbike": 14, "person": 15, "pottedplant": 16,
#  "sheep": 17, "sofa": 18, "train": 19, "tvmonitor": 20}


def get(root, name):
    """Return all direct children of root with the given tag."""
    vars = root.findall(name)
    return vars


def get_and_check(root, name, length):
    """Like get(), but validate how many matching children were found."""
    vars = root.findall(name)
    if len(vars) == 0:
        raise ValueError("Can not find %s in %s." % (name, root.tag))
    if length > 0 and len(vars) != length:
        raise ValueError(
            "The size of %s is supposed to be %d, but is %d."
            % (name, length, len(vars))
        )
    if length == 1:
        vars = vars[0]
    return vars


def get_filename_as_int(filename):
    # Despite its name, this returns the file stem as a string, because
    # the file names used here (UUID-like) are not integers.
    filename = filename.replace("\\", "/")
    filename = os.path.splitext(os.path.basename(filename))[0]
    return str(filename)


def get_categories(xml_files):
    """Generate category name to id mapping from a list of xml files.

    Arguments:
        xml_files {list} -- A list of xml file paths.

    Returns:
        dict -- category name to id mapping.
    """
    classes_names = []
    for xml_file in xml_files:
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for member in root.findall("object"):
            # Assumes <name> is the first child of <object>.
            classes_names.append(member[0].text)
    # Sort so that ids do not change between runs.
    classes_names = sorted(set(classes_names))
    return {name: i for i, name in enumerate(classes_names)}


def convert(xml_files, json_file):
    json_dict = {"images": [], "type": "instances", "annotations": [], "categories": []}
    if PRE_DEFINE_CATEGORIES is not None:
        categories = PRE_DEFINE_CATEGORIES
    else:
        categories = get_categories(xml_files)
    bnd_id = START_BOUNDING_BOX_ID
    for xml_file in xml_files:
        tree = ET.parse(xml_file)
        root = tree.getroot()
        path = get(root, "path")
        if len(path) == 1:
            filename = os.path.basename(path[0].text)
        elif len(path) == 0:
            filename = get_and_check(root, "filename", 1).text
        else:
            raise ValueError("%d paths found in %s" % (len(path), xml_file))
        ## The image id is the file name without its extension
        image_id = get_filename_as_int(filename)
        size = get_and_check(root, "size", 1)
        width = int(get_and_check(size, "width", 1).text)
        height = int(get_and_check(size, "height", 1).text)
        image = {
            "file_name": filename,
            "height": height,
            "width": width,
            "id": image_id,
        }
        json_dict["images"].append(image)
        ## Currently we do not support segmentation.
        #  segmented = get_and_check(root, 'segmented', 1).text
        #  assert segmented == '0'
        for obj in get(root, "object"):
            category = get_and_check(obj, "name", 1).text
            if category not in categories:
                new_id = len(categories)
                categories[category] = new_id
            category_id = categories[category]
            bndbox = get_and_check(obj, "bndbox", 1)
            xmin = int(get_and_check(bndbox, "xmin", 1).text) - 1
            ymin = int(get_and_check(bndbox, "ymin", 1).text) - 1
            xmax = int(get_and_check(bndbox, "xmax", 1).text)
            ymax = int(get_and_check(bndbox, "ymax", 1).text)
            assert xmax > xmin
            assert ymax > ymin
            o_width = abs(xmax - xmin)
            o_height = abs(ymax - ymin)
            ann = {
                "area": o_width * o_height,
                "iscrowd": 0,
                "image_id": image_id,
                "bbox": [xmin, ymin, o_width, o_height],
                "category_id": category_id,
                "id": bnd_id,
                "ignore": 0,
                "segmentation": [],
            }
            json_dict["annotations"].append(ann)
            bnd_id = bnd_id + 1

    for cate, cid in categories.items():
        cat = {"supercategory": "none", "id": cid, "name": cate}
        json_dict["categories"].append(cat)

    #os.makedirs(os.path.dirname(json_file), exist_ok=True)
    with open(json_file, "w") as json_fp:
        json_fp.write(json.dumps(json_dict))


if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser(
        description="Convert Pascal VOC annotation to COCO format."
    )
    parser.add_argument("xml_dir", help="Directory path to xml files.", type=str)
    parser.add_argument("json_file", help="Output COCO format json file.", type=str)
    args = parser.parse_args()
    xml_files = glob.glob(os.path.join(args.xml_dir, "*.xml"))

    # If you want to do train/test split, you can pass a subset of xml files to convert function.
    print("Number of xml files: {}".format(len(xml_files)))
    convert(xml_files, args.json_file)
    print("Success: {}".format(args.json_file))
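The core geometric step in the script is turning Pascal VOC corner coordinates (xmin, ymin, xmax, ymax) into a COCO box of the form [x, y, width, height]. The following standalone sketch isolates that step. The function name voc_box_to_coco is a hypothetical helper, and the 1-pixel shift mirrors the script's choice to subtract 1 from xmin and ymin; whether that shift suits your data depends on how your annotations were produced.

```python
def voc_box_to_coco(xmin, ymin, xmax, ymax):
    """Convert VOC corners to a COCO [x, y, width, height] box and its area.

    Follows the script's convention of subtracting 1 from xmin and ymin
    before computing the width and height.
    """
    x = xmin - 1
    y = ymin - 1
    if xmax <= x or ymax <= y:
        raise ValueError("Box must have positive width and height.")
    width = xmax - x
    height = ymax - y
    return [x, y, width, height], width * height


if __name__ == "__main__":
    bbox, area = voc_box_to_coco(10, 20, 110, 220)
    print(bbox, area)
```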

Script Overview

The voc2coco.py script simplifies the conversion process. As listed, it parses the XML with Python's built-in xml.etree.ElementTree module. Before diving into usage, let’s explore its key components:

1. Dependencies:

  • The imports in the script are all standard-library modules. The script's header also suggests installing the lxml library using pip install lxml; this is optional for the code as shown.

2. Configuration:

  • Optionally pre-define categories using the PRE_DEFINE_CATEGORIES variable. Uncomment and modify this section according to your dataset.

3. Functions:

  • get, get_and_check, get_filename_as_int: Helper functions for XML parsing.
  • get_categories: Generates a category name to ID mapping from a list of XML files.
  • convert: The main conversion function; it processes the XML files and generates the COCO format JSON.
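If you prefer fixed ids over ones discovered from the XML files, a mapping in the shape of PRE_DEFINE_CATEGORIES can be built from an ordered list of class names. This is only a small sketch: the helper build_category_map and the class names are made up for illustration, and ids start at 1 to match the commented-out example in the script.

```python
def build_category_map(class_names, start_id=1):
    """Map each unique class name to a stable integer id, in the given order."""
    mapping = {}
    for name in class_names:
        if name not in mapping:
            mapping[name] = start_id + len(mapping)
    return mapping


if __name__ == "__main__":
    # Hypothetical labels; replace with the classes used in your dataset.
    PRE_DEFINE_CATEGORIES = build_category_map(
        ["healthy", "early_blight", "late_blight"]
    )
    print(PRE_DEFINE_CATEGORIES)
```

Fixing the mapping up front keeps category ids identical across separately converted files, for example a training and a test JSON.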


Executing the script is simple: run it from the command line, providing the path to your Pascal VOC XML files and specifying the desired output path for the COCO format JSON file. Here’s an example:

python voc2coco.py /path/to/xml/files /path/to/output/output.json
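The script's comments note that a subset of XML files can be passed to the convert function to produce a train/test split. One way to build those subsets is sketched here, assuming a simple random split is acceptable. The helper split_xml_files, the 80/20 ratio, and the seed are illustrative choices, not part of the original script.

```python
import random


def split_xml_files(xml_files, test_fraction=0.2, seed=42):
    """Shuffle the file list reproducibly and return (train_files, test_files)."""
    if not 0.0 <= test_fraction <= 1.0:
        raise ValueError("test_fraction must be between 0 and 1.")
    # Sort first so the result does not depend on the order glob returns.
    files = sorted(xml_files)
    random.Random(seed).shuffle(files)
    n_test = int(round(len(files) * test_fraction))
    return files[n_test:], files[:n_test]


if __name__ == "__main__":
    demo = ["img_%02d.xml" % i for i in range(10)]  # placeholder file names
    train_files, test_files = split_xml_files(demo)
    print(len(train_files), len(test_files))
```

Each list can then be handed to convert separately, once with a training output path and once with a test output path.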


The script outputs a well-structured COCO format JSON file containing essential information about images, annotations, and categories.
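Before training, it can help to confirm that the three sections of the JSON agree with each other. The sketch below is a hypothetical helper, check_coco_dict, that looks for annotations pointing at unknown images or categories, duplicate annotation ids, and empty boxes. It assumes the key layout that the script writes and is not an official COCO validator.

```python
def check_coco_dict(coco):
    """Return a list of problems found in a COCO-style dict (empty if none)."""
    problems = []
    image_ids = {img["id"] for img in coco.get("images", [])}
    category_ids = {cat["id"] for cat in coco.get("categories", [])}
    seen_ann_ids = set()
    for ann in coco.get("annotations", []):
        if ann["image_id"] not in image_ids:
            problems.append("annotation %s: unknown image_id" % ann["id"])
        if ann["category_id"] not in category_ids:
            problems.append("annotation %s: unknown category_id" % ann["id"])
        if ann["id"] in seen_ann_ids:
            problems.append("annotation %s: duplicate id" % ann["id"])
        seen_ann_ids.add(ann["id"])
        _, _, width, height = ann["bbox"]
        if width <= 0 or height <= 0:
            problems.append("annotation %s: non-positive box size" % ann["id"])
    return problems


if __name__ == "__main__":
    # Tiny hand-written example in the layout the script produces.
    sample = {
        "images": [{"id": "img_a", "file_name": "img_a.jpg", "height": 256, "width": 256}],
        "categories": [{"id": 0, "name": "leaf", "supercategory": "none"}],
        "annotations": [
            {"id": 1, "image_id": "img_a", "category_id": 0, "bbox": [9, 19, 101, 201]}
        ],
    }
    print(check_coco_dict(sample))
```

To check a real file, load it with json.load and pass the resulting dict to the function.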

Output of COCO format JSON file


Wrapping up our journey through object detection with LabelImg and Detectron, it’s important to recognize the variety of annotation tools catering to enthusiasts and professionals. LabelImg, as an open-source gem, offers versatility and accessibility, making it a top choice.

Beyond LabelImg, alternatives such as VGG Image Annotator (VIA), RectLabel, and Labelbox step in for complex tasks and large projects. VIA is itself free and open source, while paid platforms such as RectLabel and Labelbox bring advanced features and scalability, albeit with a financial investment, ensuring efficiency in high-stakes endeavors.

Our exploration emphasizes choosing the right annotation tool based on project specifics, budget, and sophistication level. Whether sticking to LabelImg’s openness or investing in paid tools, the key is alignment with your project’s scale and goals. In the evolving field of computer vision, annotation tools continue to diversify, providing options for projects of all sizes and complexities.

Key Takeaways

  • LabelImg’s intuitive interface and advanced features make it a versatile open-source tool for precise image annotation, ideal for those entering object detection.
  • Tools like VIA, RectLabel, and Labelbox cater to complex annotation tasks and large-scale projects; the paid options among them offer advanced features and scalability.
  • The critical takeaway is choosing the right annotation tool based on project needs, budget, and desired sophistication, ensuring efficiency and success in object detection endeavors.

Resources for Further Learning:

1. LabelImg Documentation:

  • Explore the official documentation for LabelImg to gain in-depth insights into its features and functionalities.
  • LabelImg Documentation

2. Detectron Framework Documentation:

  • Dive into the documentation of Detectron, the powerful object detection framework, to understand its capabilities and usage.
  • Detectron Documentation

3. VGG Image Annotator (VIA) Guide:

  • If you’re interested in exploring VIA, the VGG Image Annotator, refer to the comprehensive guide for detailed instructions.
  • VIA User Guide

4. RectLabel Documentation:

  • Learn more about RectLabel, a paid annotation tool, by referring to its official documentation for guidance on usage and features.
  • RectLabel Documentation

5. Labelbox Learning Center:

  • Discover educational resources and tutorials in the Labelbox Learning Center to enhance your understanding of this annotation platform.
  • Labelbox Learning Center

Frequently Asked Questions

Q1: What is LabelImg, and how does it differ from other annotation tools?

A: LabelImg is an open-source image annotation tool for object detection tasks. Its user-friendly interface and versatility set it apart. Unlike some tools, LabelImg allows precise bounding box annotation, making it a preferred choice for those new to object detection.

Q2: Are there alternative paid annotation tools, and how do they compare to free options?

A: Yes. Paid annotation tools such as RectLabel and Labelbox offer advanced features and scalability; VGG Image Annotator (VIA) is another alternative, though it is itself free. While free tools like LabelImg are excellent for basic tasks, paid solutions are tailored for more complex projects, providing collaboration features and enhanced efficiency.

Q3: What is the significance of converting annotations from Pascal VOC to the COCO format?

A: Converting Pascal VOC annotations to COCO format is important for compatibility with frameworks like Detectron, which prefer COCO annotations. It ensures consistent class labeling and seamless integration into the training pipeline, facilitating the creation of accurate object detection models.

Q4: How does Detectron contribute to efficient model training in object detection?

A: Detectron is a powerful object detection framework that streamlines the model training process. It plays a crucial role in handling annotated data, preparing it for training, and optimizing the overall efficiency of object detection models.

Q5: Can I use paid annotation tools for small-scale projects, or are they mainly for enterprise-level tasks?

A: While paid annotation tools are often associated with enterprise-level tasks, they can also benefit small-scale projects. The decision depends on the specific requirements, budget constraints, and the desired level of sophistication for annotation tasks.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.
