As of September 2019, Google no longer supports unpublished, unofficial rules in the robots exclusion protocol, a change that mainly targets the noindex directive in robots.txt.
What do these changes mean?
Put simply, Google is making this change because it wants to formalize the robots exclusion protocol (REP) as a standard. In practice, this means its web crawlers will no longer look in robots.txt for a noindex directive, so if you want to keep a page out of the index, you need to use a different method.
A few noindex alternatives
Fortunately, Google has offered a few alternatives if you’re looking to prevent your page from being indexed by their web crawlers.
- Noindex via the robots meta tag. Yes, that’s correct: you can still noindex a page as long as it’s done through the robots meta tag (or the equivalent X-Robots-Tag HTTP header). The only thing affected by Google’s update is the noindex directive in robots.txt.
- Using HTTP status codes. Namely 404 (Not Found) and 410 (Gone), which tell crawlers that the page either doesn’t exist or has been permanently removed, and should therefore be dropped from the index.
- Requiring a password. If you require a login and password for a specific page, it will generally be removed from Google’s index and no longer appear in SERPs.
- The Search Console Remove URL tool. A useful method that makes it quick and easy to temporarily prevent a URL from showing up in Google’s search results.
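The first alternative above is just a line of markup in the page’s head. As a sketch, a page you want crawled but not indexed would include something like this (the X-Robots-Tag header shown in the comment achieves the same thing for non-HTML files such as PDFs):

```html
<!DOCTYPE html>
<html>
<head>
  <!-- Tells compliant crawlers not to index this page -->
  <meta name="robots" content="noindex">
  <!-- Equivalent HTTP response header, useful for non-HTML resources:
       X-Robots-Tag: noindex -->
  <title>Example page kept out of the index</title>
</head>
<body>...</body>
</html>
```

Note that the page must remain crawlable for this to work: if robots.txt blocks the URL, the crawler never sees the meta tag.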
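The status-code alternative comes down to choosing between 404 and 410 when a crawler requests a page. The sketch below, with a hypothetical `RETIRED_PATHS` set and `status_for` helper that are not part of any real framework, illustrates the distinction:

```python
# Hypothetical set of URLs we have deliberately and permanently retired.
RETIRED_PATHS = {"/old-promo", "/discontinued-product"}

def status_for(path: str, known_paths: set) -> int:
    """Return the HTTP status a crawler should receive for a given path."""
    if path in RETIRED_PATHS:
        return 410  # Gone: the resource was removed on purpose, de-index it
    if path not in known_paths:
        return 404  # Not Found: the page simply does not exist
    return 200      # OK: serve the page normally

print(status_for("/old-promo", {"/home"}))  # 410
print(status_for("/missing", {"/home"}))    # 404
print(status_for("/home", {"/home"}))       # 200
```

Either status eventually gets a URL dropped from the index; 410 signals the removal is intentional and permanent, rather than a possibly temporary error.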
Prepared for the future
It’s important to keep these updates in mind when considering the indexing status of any page. Remember that this change only affects sites using the noindex directive in robots.txt. Other directives and commands remain as they are for the time being, so your only concern is staying up to date on your noindex approach.