Today I’d like to go over three more quick tasks that site owners can do to check for technical SEO flaws on their websites. These tasks can be done in addition to the three I recommended in my original SEO Monday post on February 13th.
The first task is a very easy one: check your domain for both the www and non-www versions. Some sites may also have an index.html version, so it is helpful to check all four of these URLs at once (the www and non-www homepages, each with and without /index.html). Beyond the potential duplication issues, the problem is that links pointing to multiple versions of a page get split up, diluting the overall link value. In this example, let’s assume you had 20 links total, split evenly among the four variations of the homepage. Google would see this as four different pages with five links each.
The solution is to make sure all variations resolve to only one version of the domain, which in this example is the www version. By canonicalizing the other versions to one URL, not only do you prevent duplicate pages from appearing in the index, but you can also direct all links to one page, getting the most value from them. In the earlier example of 20 links split evenly among the four versions, redirecting them all to a single homepage URL ensures that all 20 links count toward that one page.
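On Apache servers this kind of consolidation is usually done with 301 redirects in an .htaccess file. The sketch below is a minimal, hypothetical example assuming Apache with mod_rewrite enabled and www.example.com (a placeholder domain) as the preferred version; the exact rules for your site may differ.

```apache
# Hypothetical .htaccess sketch (assumes Apache + mod_rewrite,
# and that www.example.com is the preferred version of the site).
RewriteEngine On

# 301-redirect the non-www host to the www host.
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# 301-redirect /index.html to the bare homepage URL.
RewriteRule ^index\.html$ http://www.example.com/ [R=301,L]
```

Because these are 301 (permanent) redirects, search engines should transfer the link value from the old variations to the one canonical URL.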
The next task is checking the total number of indexed pages in Google versus the total number of pages on your site. As a webmaster or site owner, you should have a fairly good estimate of the total number of pages that live on your domain. With this number in mind, run a site: search in Google (e.g. site:example.com) to check the total number of indexed pages. In this example I searched using Amazon and found 414 million pages indexed. But what if Amazon has 800 million total pages?
If there is a huge difference between the number of pages on your site and the number of indexed pages in Google, this indicates indexing issues are occurring. Are there “orphan pages” on your site that aren’t linked to by the sitemap or any other pages, making it impossible for Google to find them? Are there other low-quality pages with duplicate content, or pages buried too deep within your site structure? Looking at indexing factors like these can help you identify and resolve any indexing issues that may be occurring.
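The orphan-page check above can be automated once you have crawl data. The sketch below is a hypothetical illustration: it assumes you already have the full set of pages (say, from your CMS), a link graph from a crawl, and the URLs listed in your sitemap, all using made-up paths.

```python
# Hypothetical sketch: find "orphan" pages that neither the sitemap
# nor any other page links to, so Google has no path to discover them.
# The inputs are assumptions -- in practice they would come from a
# site crawl and the XML sitemap.

def find_orphans(all_pages, link_graph, sitemap_urls):
    """Return pages with no inbound links that are also missing
    from the sitemap."""
    linked = {target for targets in link_graph.values() for target in targets}
    return sorted(all_pages - linked - sitemap_urls)

# Example with made-up URLs: /old-promo is linked from nowhere
# and absent from the sitemap, so it is an orphan.
pages = {"/", "/about", "/old-promo"}
links = {"/": {"/about"}, "/about": {"/"}}
sitemap = {"/", "/about"}
print(find_orphans(pages, links, sitemap))  # ['/old-promo']
```

Any URL this turns up either needs internal links pointing to it, a sitemap entry, or (if it is low quality) removal.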
Finally, you can check for indexed duplicate pages caused by URL parameters on your site, as shown on the slide. In this example, session IDs are causing the same page to be indexed multiple times. This type of issue can be resolved by using canonical tags on your pages or by controlling parameters in Webmaster Tools.
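To see why session IDs create duplicates, it helps to normalize the URLs yourself. The sketch below is a hypothetical example that assumes the offending parameter is named "sessionid" (your site's parameter name may differ); stripping it collapses the variants to the single URL a rel="canonical" tag would point to.

```python
# Hypothetical sketch: collapse session-ID variants of a URL into one
# canonical form. Assumes the parameter is named "sessionid".
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

TRACKING_PARAMS = {"sessionid"}  # assumption: the offending parameter name

def canonical_url(url):
    """Strip session-style parameters so duplicate variants resolve
    to a single URL, keeping any meaningful parameters intact."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in TRACKING_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

a = canonical_url("https://www.example.com/product?sessionid=abc123")
b = canonical_url("https://www.example.com/product?sessionid=xyz789")
print(a == b, a)  # True https://www.example.com/product
```

Both session variants normalize to the same URL, which is exactly what a canonical tag or a parameter-handling setting tells Google to do on its end.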