What are duplicate URLs?
Duplicate URLs are live pages on a website that closely resemble other URLs elsewhere on the site. This presents a few issues for search engines such as Google and Bing: they may not know which URLs to include in or exclude from the index; they may not know where to direct the link authority (e.g. should they keep it separate or send it to one page); and they may not know which URL should rank higher. Before we explain why this is a big SEO issue for online businesses looking to improve their organic performance, let’s take a look at how to spot different types of duplicate URLs.
Four examples of duplicate URLs:
- Non-case sensitive URLs – These pages have URLs that differ only in letter case. For example, if you had two live pages on a website that could be found at example.com/WOMEN/ and example.com/women/, those web pages would be considered duplicate URLs.
- Trailing / versus no trailing / – These URLs are almost exactly the same, except one version of the URL has a trailing slash and the other doesn’t. To Google and other search engines, distinguishing URLs can be a pass-fail test. Being nearly identical in structure still means they’re not the same. Search engines will likely interpret both URLs as different from one another, and that leads to duplicate URL issues for your website.
- HTTP versus HTTPS – In 2014, Google announced that HTTPS would become a ranking signal. As websites scrambled to get SSL certificates live on their e-commerce websites, some of them overlooked the fact that Google views the HTTP and HTTPS versions of a website as different. Without the proper redirects to send customers (and search engines) to the secure site, a number of duplicate URL errors appeared in Google Search Console.
- Www versus non-www – Search engines view www and non-www versions of a website (e.g. www.example.com and example.com) as different versions of a website. Even though these are essentially the same, keep in mind that URL distinction is often all-or-nothing for search engines, which would then rank those example pages as two separate pages.
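To make the four variations concrete, here is a minimal Python sketch that groups a list of URLs by a normalized key, so that variants differing only in case, trailing slash, protocol or www prefix land in the same group. The function names (normalize, find_duplicates) and the sample URLs are illustrative assumptions, and lowercasing the whole path is only appropriate if your site does not intentionally serve different content under different cases.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Collapse case, trailing-slash, protocol and www differences into one key."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]                      # www vs non-www
    path = parts.path.lower().rstrip("/") or "/"   # case and trailing slash
    query = f"?{parts.query}" if parts.query else ""
    return host + path + query               # the protocol is left out on purpose


def find_duplicates(urls):
    """Return only the groups that contain more than one URL variant."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize(url)].append(url)
    return {key: variants for key, variants in groups.items() if len(variants) > 1}


# Hypothetical crawl or analytics export
urls = [
    "http://example.com/WOMEN/",
    "https://www.example.com/women",
    "https://www.example.com/men/",
]
duplicates = find_duplicates(urls)
for key, variants in duplicates.items():
    print(key, "->", variants)
```

Any group with more than one entry is a candidate for the checks and fixes described in the rest of this article.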
Why does this happen?
There are a number of reasons why duplicate URLs come up. If you don’t follow the right steps when securing your website, it’s likely that the HTTP version of your website will still be live. Other times, not turning on a setting in certain website platforms can lead to non-case sensitive URLs existing on the website. The causes of each type of duplicate URL vary from website to website depending on your platform, so it’s important to check the settings specific to your platform.
Is Google actually crawling these duplicate URLs? Can they even be indexed?
Although there’s concern that duplicate URLs can be indexed, this isn’t often the case. Normally, Google will choose its own canonical to try to remedy the problem (sometimes choosing the less preferred page). Google understands users want diversity in their search results, so it will often take it into its own hands to consolidate URLs.
Your main concern with these duplicate URLs should actually be diluted page authority and wasted crawl budget. As the SEO pro AJ Kohn has said, “You are what Googlebot eats.” There is a direct relationship between what Google is able to crawl and your search visibility and SERP rankings. If Google isn’t crawling the most important pages on your website often enough because it is spending crawl budget on a large number of duplicate URLs, then your ability to rank is going to decrease. Of course, this issue can vary depending on the size of your website. That said, prioritization is an important principle of optimization across all website sizes that goes hand-in-hand with de-duplication. For more details, check out our webinar about dominating high-value keywords on Google that includes tips on how to prioritize your SEO efforts.
How do I know if I have a duplicate URL problem?
Most of the time, duplicate URLs are discovered after they have already been indexed or through a clean-up in Google Search Console’s Index Coverage report. If you have not yet found out whether this issue is affecting your e-commerce website, use one of the following methods to run a quick check.
- Google Analytics Check
You may have noticed that your traffic metrics are beginning to look a little spread out in Google Analytics. To see if this is caused by duplicate URLs, go into Acquisition > All Traffic > Channels > Organic Search and select Landing Page as the Primary dimension. You can then filter by including a landing page containing a general category page on your website, such as Women.
If this search returns several variations of the same URL, each bringing in traffic (for example, /women and /WOMEN listed as separate landing pages), you are likely looking at duplicate URLs.
In this case, the next step is to run both of these URLs through Google Search Console’s inspect tool to find the indexation status of both. This should help identify if these URLs have self-referencing canonicals; if they are indexed and what Google has selected as the canonical; and if they are identified as duplicates.
- Google Search Console
You can also check for a duplicate URL issue through Google Search Console’s Index Coverage report. Once you are in Google Search Console, navigate to the Index Coverage report > Excluded. You can then open the exclusion reasons that relate to duplication (their names begin with “Duplicate”).
After clicking through these options, you can spot-check each URL with the inspect tool to get more details from Google on how they are viewing these duplicate URLs.
- Manual Check
If you have already found a potential duplicate URL issue in Google Analytics or through Google Search Console and want to double-check this another way, go into your browser and type in both of these duplicate URLs. Use an indexation plugin (such as the Deep Crawl Chrome plugin and Redirect Path Chrome plugin) to determine if both of these URLs are live 200 pages (which are simply website pages that load without an error), redirected or properly canonicalized.
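The same three outcomes (live 200 page, redirected, or canonicalized) can also be checked with a short script. The sketch below is only an illustration: it assumes you have already fetched each URL’s status code, Location header and HTML with whatever HTTP client you prefer, and the names CanonicalFinder and classify are hypothetical.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Record the href of the first <link rel="canonical"> in a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")


def classify(url, status, location=None, html=""):
    """Describe how a URL responds: redirected, error, or live 200 page."""
    if status in (301, 302, 307, 308):
        return f"redirected to {location}"
    if status != 200:
        return f"error ({status})"
    finder = CanonicalFinder()
    finder.feed(html)
    if finder.canonical is None:
        return "live 200 page, no canonical"
    if finder.canonical == url:
        return "live 200 page, self-referencing canonical"
    return f"live 200 page, canonicalized to {finder.canonical}"


# Hypothetical response body for both case variants of a category page
page = '<html><head><link rel="canonical" href="https://www.example.com/women/"></head></html>'
print(classify("https://www.example.com/WOMEN/", 200, html=page))
print(classify("https://www.example.com/women/", 200, html=page))
```

Two variants that both come back as live 200 pages with no canonical, or with self-referencing canonicals, are the ones that need one of the fixes below.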
How do I fix a duplicate URL issue?
Resolving duplicate URLs comes down to one strategy: figuring out which is the “correct” version and making sure Google also knows this information. There are a variety of solutions that can be implemented to ensure you’re serving the right URLs to search engines.
- 301 Redirects
The preferred method for resolving duplicate URLs, in many cases, is implementing 301 redirects. When combining two duplicate pages, not only will this kind of redirect stop them from competing against each other, but it will also create a stronger relevancy signal and avoid any user confusion. 301 redirects are especially helpful because they pass link authority to the destination URL in the eyes of a search engine while also taking the user where they need to go (unlike a canonical).
- Canonical Tags
Using a rel="canonical" link tag is another way of handling duplicate URL issues. This tells search engines that the given page is a copy of the URL named in the tag, and that the named URL should be treated as the preferred (or master) page. This should not have an impact on UX, since the user can still access both pages. It is mainly used to ensure that the link equity and content metrics are awarded to the original or desired page.
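How a 301 redirect is configured depends on your platform or web server, so treat the following as a sketch of the logic rather than a recipe. It is a small Python WSGI middleware that sends any non-preferred variant to a single preferred URL. The preferred form used here (HTTPS, www, lowercase path, trailing slash) and the names PREFERRED_SCHEME, PREFERRED_HOST, preferred_url and redirect_duplicates are assumptions for illustration; a real rule would also need to skip file paths such as images or stylesheets.

```python
PREFERRED_SCHEME = "https"          # assumed policy
PREFERRED_HOST = "www.example.com"  # assumed policy


def preferred_url(path: str) -> str:
    """Build the one URL that should answer with a 200 for this path."""
    path = path.lower()
    if not path.endswith("/"):
        path += "/"
    return f"{PREFERRED_SCHEME}://{PREFERRED_HOST}{path}"


def redirect_duplicates(app):
    """WSGI middleware: answer non-preferred variants with a 301."""

    def middleware(environ, start_response):
        scheme = environ.get("wsgi.url_scheme", "http")
        host = environ.get("HTTP_HOST", "")
        path = environ.get("PATH_INFO", "/")
        requested = f"{scheme}://{host}{path}"
        target = preferred_url(path)
        if requested != target:
            query = environ.get("QUERY_STRING", "")
            if query:
                target += "?" + query   # keep any query string intact
            start_response(
                "301 Moved Permanently",
                [("Location", target), ("Content-Length", "0")],
            )
            return [b""]
        return app(environ, start_response)

    return middleware
```

With this in place, a request for http://example.com/WOMEN would be answered with a 301 pointing at https://www.example.com/women/, while a request that already uses the preferred form passes through to the wrapped application untouched.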
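For reference, the canonical tag itself is a single link element placed in the head of every duplicate version of the page (and, typically, of the preferred page too). As a hedged illustration, a template helper that renders it could look like the Python sketch below; the function name canonical_link and the example URL are assumptions.

```python
from html import escape


def canonical_link(preferred_url: str) -> str:
    """Render the <link> element that names the preferred URL for a page."""
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}">'


# Emit the same tag on every variant of the Women category page
print(canonical_link("https://www.example.com/women/"))
# prints: <link rel="canonical" href="https://www.example.com/women/">
```

Using an absolute URL, including the protocol and host, keeps the tag unambiguous across the HTTP/HTTPS and www/non-www variants.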
- Let Google Decide
Google understands how common it is to have the same page available at several URLs (such as http://example.com/ and https://www.example.com/home). It has recently announced that it will automatically pick one canonical for search when it recognizes this duplication. To tell Google which URL it should prioritize (instead of letting it decide for itself), you can either use a 301 redirect from the retired URL or use the rel="canonical" link tag mentioned above.
- Backend Website Platform Settings
Depending on the platform your e-commerce website is using, a backend setting that isn’t enabled may be allowing the same page to load under several case variations. Contact your platform’s support team to see if there is a setting that makes URLs case sensitive across the website, so that only one case variation of each URL resolves.
How do I proactively prevent duplicate URLs?
Other than the obvious way of not creating duplicate URLs on your website, you can prevent duplicate pages by ensuring your entire website is secure, so that no internal links still point to lurking HTTP URLs. When creating new URLs, also make sure you follow a consistent pattern for letter case, and look for any platform settings that can prevent this issue. Be proactive and run a few checks in Google Search Console every other month to identify any potential issues that may have appeared.
Now that you’ve learned the ins and outs of duplicate URLs, you’ll be able to better manage and prioritize your website pages for Google and other search engines. Duplicate URLs are a common issue in the world of SEO, and as Jill Whalen once said, “Good SEO work only gets better over time.” Because there are a lot of technical factors involved with duplicate URLs, you may want to outsource this resolution to an e-commerce marketing agency. We offer a free e-commerce analysis for just that reason, so you can see how we’ll help you put together your holistic SEO strategy.