Data ethics has recently become a hot topic, as big data technologies repeatedly fail to fulfill their promise of benefiting society.
Here are just two such failures:
Google's gorilla incident: an early image recognition algorithm by Google had not been trained on enough dark-skinned faces, and when presented with an image of a dark-skinned African American woman, it labeled her as a ``gorilla''.
Machine Bias in Tools Used by Judges: Judges in the US use scores assigned to individuals, based on their criminal records and backgrounds, as guidance when sentencing defendants. ProPublica reported that these scores are biased by the race of the defendant, claiming that the algorithms are racist.
Big data technologies have the potential to harm us on an unprecedented scale. Examples include swaying elections, inciting violence, influencing global economic policies, and amplifying racism in the criminal justice system.
To prevent the potential harms of data-driven technologies, and to ensure that principles such as Fairness, Accountability, and Transparency (FAT) are upheld, it is critical to revisit the life cycle of data, including data collection, cleaning, integration, and processing, and to develop proper tools, strategies, and metrics.
This is our goal! We consider responsibility and data ethics an important missing dimension in the development of big data technologies.
We conduct research to design tools that help people carry out data-driven tasks responsibly!