
Label noise very high #8

@duckduck-sys

Description

Looking through the annotated samples, I noticed that label noise for the gender and age attributes of the Person class is extremely high: roughly 20 to 30 percent of the labels are wrong (see the attached photos). The largest sources of error are:

  • Many photos of the Person class have no gender assigned, even when it is clearly visible (~30%).
  • Many photos of females are wrongly labeled as male (~20%).
  • For the age categories (Young, Adult, Old), almost all photos are simply labeled Adult, even when the person in question is clearly Young or Old. Substantially more old people lack the Old label than have it.

I haven't checked the other object categories, but I suspect their label noise is of similar magnitude. It would be helpful if the repo stated that the labels in the dataset suffer from a high error rate.
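
For anyone who wants to put firmer numbers on estimates like these, here is a minimal sketch of how a manual audit could be tallied. It assumes you have reviewed a random sample and recorded each item as a `(dataset_label, reviewer_label)` pair, with `None` meaning the attribute was left unassigned. The function names and the data layout are hypothetical and not part of this repo, and the Wilson score interval is just one reasonable choice for a small sample.

```python
import math
from collections import Counter

def wilson_interval(errors, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion errors/n."""
    if n == 0:
        raise ValueError("need at least one audited sample")
    p = errors / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)

def audit_summary(audit):
    """Return {error_kind: (rate, (low, high))} for a list of
    (dataset_label, reviewer_label) pairs. Hypothetical layout:
    dataset_label is None when the attribute was never assigned."""
    n = len(audit)
    kinds = Counter()
    for given, reviewed in audit:
        if given == reviewed:
            continue  # label agrees with the reviewer, not an error
        kind = "missing" if given is None else f"{given}->{reviewed}"
        kinds[kind] += 1
    return {k: (c / n, wilson_interval(c, n)) for k, c in kinds.items()}

if __name__ == "__main__":
    # Made-up toy audit purely to show the call; not real data from this dataset.
    toy_audit = (
        [("male", "male")] * 5
        + [(None, "female")] * 3
        + [("male", "female")] * 2
    )
    for kind, (rate, (low, high)) in audit_summary(toy_audit).items():
        print(f"{kind}: {rate:.0%} (95% CI {low:.0%} to {high:.0%})")
```

With only a handful of audited photos the intervals will be wide, so a sample of a few hundred per attribute would give a more trustworthy figure to declare in the README.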

182952
183056
184930
