The Dark Web's Role in Training AI: Fresh Concerns and Secret Sources

Source: Search Engine Land
TL;DR Summary

The Washington Post has built a search tool that lets users check whether their website or content was used to train AI systems as part of Google's C4 dataset. The dataset includes websites and content creators that generative AI could negatively affect. C4 is only part of the data used by Google Bard and other large language models, which also draw on Wikipedia, Reddit, and other sources. Reddit has updated its API terms and will now charge some companies, including Google and OpenAI, for access to its valuable corpus of data.
