Tumblr and WordPress will sell users' data to OpenAI

Automattic, the company behind WordPress and Tumblr, is planning to monetize its users' content by selling data to artificial intelligence companies, including Midjourney and OpenAI. The data from the Tumblr and WordPress.com blogging platforms will be used to train AI models.

Although the details of the deal are still unclear, the news has raised concerns among users that their private content on the two blogging platforms could be misused. 404 Media also reports that the matter caused internal conflict at Automattic, because the content collected includes private data that the company was not meant to retain.

In response to the backlash, Automattic will introduce a new feature that lets users opt out of sharing their data for AI training. In a blog post, the company asserts its commitment to giving Tumblr and WordPress users greater control over their content. It mentions the launch of a setting intended to discourage crawling by AI companies and explains that crawlers of this kind are blocked by default.

The use of blog content by companies that develop AI models is not limited to the platforms run by Automattic. Both OpenAI (ChatGPT) and Google use crawlers to collect information from websites in order to train their artificial intelligence models. The process is similar to the way search engines collect data.

How can you block OpenAI and Gemini (Bard) from taking the data on your blog?

If you own a blog or a website and do not want its content used to train the OpenAI and Gemini artificial intelligence models, you can block their robots (crawlers) from accessing it. The restriction is set in the robots.txt file.

OpenAI Crawlers

User-agent: GPTBot
Disallow: /

Gemini Crawlers

User-agent: Google-Extended
Disallow: /
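
Before uploading the file, the rules above can be checked locally. The following is a minimal sketch in Python that uses the standard library's robots.txt parser. The address https://example.com/ is only a placeholder, and Googlebot is included purely as a comparison to show that regular search indexing is not affected. Each crawler applies its own interpretation of robots.txt, so this check is an approximation, not a guarantee.

```python
from urllib.robotparser import RobotFileParser

# The two rule groups from this article, as they would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /
"""


def blocked_agents(robots_txt, agents, url="https://example.com/"):
    """Return {agent: True if the rules forbid it from fetching url}."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines; no network access.
    parser.parse(robots_txt.splitlines())
    return {agent: not parser.can_fetch(agent, url) for agent in agents}


# Googlebot is listed only to confirm that normal crawling stays allowed.
result = blocked_agents(ROBOTS_TXT, ["GPTBot", "Google-Extended", "Googlebot"])
for agent, is_blocked in result.items():
    print(f"{agent}: {'blocked' if is_blocked else 'allowed'}")
```

If your robots.txt already contains other rules, paste its full content into ROBOTS_TXT to confirm that the new lines do not conflict with the existing ones.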

After saving the robots.txt file with the new lines, go to Google Search Console: Settings > robots.txt > click the three-dot menu, then click “Request a recrawl”.


Related: GPT-5 and the new GPTBot web crawler developed by OpenAI

Tumblr and WordPress users can block OpenAI and other artificial intelligence companies from accessing their blog data by using the tools provided by Automattic.

Passionate about technology, I have been writing with pleasure on stealthsetts.com since 2006. I have extensive experience with operating systems (macOS, Windows and Linux), as well as with programming languages, blogging platforms (WordPress) and online store platforms (WooCommerce, Magento, PrestaShop).
