Oliver Roick's Weblog Nobody reads this anyway.

Journalists at The Washington Post have investigated Google’s C4 data set, which has been used to train AI models at Google and Facebook. The list of sites holds very few surprises, a couple of odd choices (a World of Warcraft forum and sites that sell dumpsters) and, of course, personal blogs. A neat tool lets you search for domains to find out if a specific site, like yours, is part of the corpus.
