Tag

Content Scraping

All articles tagged with #content scraping

Perplexity AI Faces Accusations of Stealth Data Scraping and Evasion

Originally Published 5 months ago — by theregister.com

Perplexity AI has been accused of covertly scraping website content by disguising its bots and ignoring no-crawl directives, raising concerns about ethical data collection and the impact on web publishers. The bots allegedly conceal their identity so they can keep bypassing restrictions, adding to a surge in AI data scraping that threatens the sustainability of web content monetization. The dispute highlights ongoing tensions between AI companies and website owners over data access and compensation.

Reddit Sues Anthropic Over AI Data Scraping and Unfair Practices

Originally Published 7 months ago — by Awful Announcing

Reddit has filed a lawsuit against AI company Anthropic, accusing it of scraping content from sports-focused communities on Reddit without permission. The case raises broader concerns about web scraping and the use of content as AI training data, particularly with respect to user privacy and content rights.

AI Companies Accused of Ignoring Web Standards and Copyright Laws

Originally Published 1 year ago — by Tom's Hardware

Several AI companies are reportedly ignoring the Robots Exclusion Protocol (robots.txt) to scrape content from websites without permission, leading to disputes with publishers. TollBit, a content licensing startup, has highlighted widespread non-compliance, with AI firms using data for training without authorization. This has resulted in legal actions and negotiations for licensing deals, as the debate over the legality and value of using content to train generative AI continues.

Redditors Successfully Troll AI News Mill with Fake WoW Feature

Originally Published 2 years ago — by Ars Technica

Redditors pranked an AI-powered news mill by posting a fake announcement about the introduction of "Glorbo" to World of Warcraft. The news mill, called The Portal, mindlessly regurgitated the post and published an article about Glorbo, likely written by a bot. The incident exposed The Portal's automated scraping of Reddit and prompted users to try to game the bots. After the prank gained attention on social media, The Portal took down the Glorbo post and removed all World of Warcraft content from its site. The scraping is likely intended to boost The Portal's search rankings and drive traffic to the site.