Number of websites blocking Google-Extended jumps 180%

The number of sites blocking Google-Extended has skyrocketed by 180% in just one month. Research conducted by the Detailed.com team exclusively for Search Engine Land shows that more than 250 websites now refuse access to Google-Extended, a standalone product token Google introduced on September 28. Ziff Davis properties, including PC Mag and Mashable, and Vox properties, including The Verge and NYMag, have joined The New York Times, Condé Nast, and Yelp in blocking Google-Extended crawling. While opinions differ on the benefits of blocking bots that use content to train large language models (LLMs), these sites aim to keep their content from being monetized by AI companies at their expense. Below is a look at how the trend has grown and the measures available to opt out fully.

Why websites are blocking Google-Extended

Introduction

In recent months, there has been a growing trend of websites blocking Google-Extended, a standalone product token introduced by Google on September 28. The token lets website owners stop their content from being used to improve Google's AI models and generative APIs. The decision to block Google-Extended has sparked a debate among brands and businesses, with arguments centered on the use of content for training AI models and the competition it creates. In this article, we will explore the reasons behind websites blocking Google-Extended and the impact it has on the digital landscape.

Debate over blocking bots

The blocking of Google-Extended is just one aspect of a broader debate over whether brands and businesses should block any bots that crawl their content. Bots such as GPTBot and CCBot are used to train large language models (LLMs), which have become integral to various AI applications. Some websites argue that by blocking these bots, they can prevent AI companies from profiting off their content and potentially competing against them. However, this debate raises questions about the value of collaboration between brands and AI companies, as well as the ethical implications of blocking access to information.

AI companies and competition

One of the main factors driving websites to block Google-Extended is the fear of competition from AI companies. By blocking access to their content, websites hope to limit the ability of AI companies to train their models using valuable data. This competition can create a power dynamic where AI companies have an advantage in developing innovative applications, potentially leaving brands and businesses at a disadvantage. This concern over competition highlights the need for a balance between protecting intellectual property and fostering collaboration in the AI industry.

The increase in websites blocking Google-Extended

Current number of blocked websites

According to research conducted by the Detailed.com team, the number of websites blocking Google-Extended has seen a significant increase. As of November 19, out of a set of 3,000 popular websites, 252 have chosen to block Google-Extended. This represents a considerable jump from just a month earlier when only 89 sites had implemented the blocking measures. The increase in the number of blocked websites demonstrates the growing concerns among brands and businesses about AI companies’ use of their content.

Comparison with previous month

The leap from 89 to 252 websites blocking Google-Extended in the span of a month amounts to an increase of roughly 180%. This substantial rise is evidence of an escalating trend of websites taking measures to control access to their content, and it reflects the urgency brands and businesses feel to protect their interests in an increasingly competitive digital landscape.

Specific websites blocking Google-Extended

Ziff Davis properties

Among the websites that have chosen to block Google-Extended are various Ziff Davis properties, including PC Mag and Mashable. These well-known tech and media websites have decided to restrict access to their content in an effort to safeguard their intellectual property and limit potential competition.

Vox properties

Vox properties, such as The Verge and NYMag, are also part of the growing list of websites blocking Google-Extended. Vox, a prominent media company, has taken a proactive stance in protecting its content from being used by AI models and APIs.

The New York Times

Even established and reputable news organizations like The New York Times have decided to block Google-Extended. This move highlights the increasing need for media outlets to maintain control over their content and the potential impact of AI technologies on the journalism industry.

Condé Nast

Condé Nast, the publisher responsible for magazines like GQ, Vogue, and Wired, has implemented blocking measures for 22 of its websites. This demonstrates that even within the publishing world, concerns about protecting content and preventing AI companies from profiting off it are prominent.

Yelp

Yelp, a platform known for user-generated reviews and a frequent critic of Google, has joined the websites blocking Google-Extended. By taking this step, Yelp is aiming to preserve the integrity of its reviews and ensure that its content cannot be used to enhance AI models or APIs.

Options for blocking Google-Extended

Blocking in robots.txt

One option for blocking Google-Extended is to include specific directives in the website’s robots.txt file. Doing so prevents Google-Extended from accessing and crawling the site’s content. However, it is important to note that this type of blocking does not prevent the content from appearing in Google’s Search Generative Experience (SGE) or being used to train it.
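
For reference, Google documents Google-Extended as a user agent token that can be targeted directly in robots.txt. A minimal example that blocks it from the entire site (the "Disallow: /" path is just an illustration; narrower paths can be used instead) looks like this:

    # Block Google-Extended from using any content on this site
    User-agent: Google-Extended
    Disallow: /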

Limitations of blocking in robots.txt

While blocking Google-Extended through the robots.txt file provides a level of control, it has limitations. Website owners must consider that their content may still be used in Google’s Search Generative Experience and could still contribute to training AI models. If the goal is to fully opt out and prevent any usage of the content, blocking Googlebot would be necessary, which effectively removes the website from search results altogether.
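
As a sketch of that trade-off, a robots.txt that blocks both tokens would look roughly like the following; note that the Googlebot rule drops the site out of Google Search entirely, so it is rarely the right choice:

    # Blocks AI training access via the Google-Extended token
    User-agent: Google-Extended
    Disallow: /

    # Blocks Google Search crawling as well - the site will no longer appear in search results
    User-agent: Googlebot
    Disallow: /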

Opting out of Search Generative Experience

To opt out of Google’s Search Generative Experience (SGE) completely, website owners can use the “nosnippet” tag. Adding it to a page’s HTML tells Google not to show the content as a snippet in search results or use it in SGE responses. This provides a more comprehensive solution for those looking to remove their content from Google’s generative search experience altogether.
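
In practice, nosnippet is applied as a robots meta tag in the page’s head (it can also be sent as an X-Robots-Tag HTTP response header). A minimal example:

    <!-- Tell Google not to show any snippet for this page -->
    <meta name="robots" content="nosnippet">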

Using ‘nosnippet’ tag

By incorporating the “nosnippet” tag, website owners can exert greater control over their content’s visibility and usage. This option removes the chance of the content appearing as snippets in search results or being used in generative responses. It offers a more nuanced approach to restricting access to content while still allowing the website to be discoverable through search engines.
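
For even finer-grained control, Google also documents a data-nosnippet HTML attribute that excludes only marked sections of a page from snippets while leaving the rest eligible, for example:

    <p>This paragraph may still appear in snippets.</p>
    <div data-nosnippet>
      <p>This section is excluded from snippets and generative responses.</p>
    </div>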

In conclusion, the increasing number of websites blocking Google-Extended reflects the growing concerns of brands and businesses regarding the use of their content by AI companies. This trend highlights the need for a balance between protecting intellectual property and fostering collaboration in the AI industry. As discussions and debates continue, it is essential for stakeholders to navigate the complexities of this issue to ensure fair practices and ethical use of data in the digital sphere.
