Race Labels in Computer Vision Data Sets Fail to Represent True Racial Diversity

Computer vision refers to the representations of the visual world that computers build from the data they have been fed. Data sets are an essential part of this process, but they come with a major problem that could end up harming members of oppressed groups as well as numerous minorities.

One problem has to do with the racial labels used in these data sets. These labels are usually based on facial recognition, but a look at the data sets built with them shows that most if not all of the faces look rather similar. That is a sign that something has gone wrong, since race is a rather vague term and there is a great deal of diversity among people who are members of the same race.

Research conducted at Northwestern University has revealed some of these flaws, making it clear that labels based on race need a lot more work before they can be considered accurate.

Much more analysis is needed to make these data sets more representative of the reality that people actually live in. Computer vision is eventually going to become an intrinsic part of people’s lives, so it is important that steps be taken to ensure that these labels correspond to the various nuances of race.
