AI Dataset Includes Unauthorized Photos of Australian Children
Human Rights Watch research reveals that the LAION-5B dataset contains photos of Australian children, included without their consent, raising concerns about privacy and the protection of minors in AI technologies.
Research from Human Rights Watch (HRW) reveals that photos of Australian children have been included in LAION-5B, a dataset used to train AI image generators such as Stability AI's Stable Diffusion and Midjourney, without the knowledge or consent of the children or their families.
HRW analyzed only a tiny portion of LAION-5B, less than 0.0001% of its 5.85 billion images, or fewer than roughly 5,850 images, and still found 190 photos of Australian children. The images were scraped from a range of websites, the children in them were easily identifiable, and the accompanying data sometimes included personal information such as names, ages, and school details.
According to HRW researcher Hye Jung Han, many of the photos came from school events, were not indexed by search engines, and could not have been found without a direct link; in other words, they were not meaningfully public before being scraped. The dataset also contained images taken from unlisted YouTube videos, as well as photos of Indigenous children, the latter raising particular cultural sensitivities.
LAION stated that all reported private children's data had been removed from the dataset. The organization also cautioned that because LAION-5B is an index of links and captions rather than a store of the images themselves, removing entries from the dataset does not delete the original photos hosted on other websites.
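To make that distinction concrete, here is a minimal sketch of what such a removal looks like in practice. It assumes LAION's published parquet metadata layout with a URL column; the file path and the blocklist of reported URLs are hypothetical.

```python
# Minimal sketch: LAION-5B ships as metadata tables of image URLs and
# captions, not the images themselves. Filtering rows out of the local
# metadata removes the *references*; the photos stay online at the
# source sites. The "URL" column follows LAION's published parquet
# schema; the path and blocklist here are hypothetical examples.
import pandas as pd

def filter_reported_entries(metadata_path: str, reported_urls: set[str]) -> pd.DataFrame:
    """Drop dataset rows whose image URL was reported as private children's data."""
    df = pd.read_parquet(metadata_path)         # one row per (URL, caption) pair
    kept = df[~df["URL"].isin(reported_urls)]   # remove the reported links only
    return kept

# Even after filtering, each dropped URL still points at a live image
# hosted on the original website; only the dataset's pointer is gone.
reported = {"https://example-school-site.example/photo123.jpg"}
cleaned = filter_reported_entries("laion5b-metadata.parquet", reported)
```

The sketch illustrates LAION's own caveat: filtering deletes the dataset's pointer to an image, while the image itself remains wherever it was originally published.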
HRW highlighted the risks of such practices, warning that AI tools built on this dataset could be used to harm children. In one recent Australian case, a teenage boy was arrested for circulating AI-generated nude images of female students. HRW is calling for updated legislation to better protect children's personal data and to prohibit its misuse in AI systems.