Faceted navigation, such as filtering by price range or colour, is often useful for site visitors. However, it is frequently harmful to SEO because it generates many URLs containing duplicate content. With so many URLs, crawlers may be slow to pick up fresh content, and pages may not be indexed correctly because of the many duplicate versions. To lessen these problems and make sites that use faceted navigation more search friendly, this guide will:
- Give some background on faceted navigation and its potential pitfalls
- Point out worst practices
- Outline the corresponding best practices
- Demonstrate faceted navigation implementations
In the best-case scenario, every piece of unique content has exactly one URL, reachable via a clear navigation path within the site.
Best scenario for search engines and searchers:
- Clear navigation path to access all content
- One URL for each category page
- One URL for each product page
Undesirable effects of faceted navigation:
- Multiple URLs for the same article or product
- One product page accessible from several URLs and category pages, which gives crawlers duplicate paths to the same item
- Empty or rarely used category pages that offer no value to visitors or search engines
- Diluted indexing signals spread across several versions of one category
- Wasted server bandwidth and crawl capacity spent on duplicate content
Faceted Navigation Worst Practices And Their Corresponding Best Practices
- Worst Practice: Using non-standard encodings for URL parameters, like brackets or commas:
- key=value pairs marked with “:” instead of “=”
- multiple parameters appended with “[ ]” instead of “&”
- key=value pairs marked with “,” instead of “=”
- multiple parameters appended with “,,” instead of “&”
People can usually decode strange URL parameters, like “,,”, but crawlers have trouble interpreting non-standard implementations. Sticking to standard key=value pairs joined with “&” avoids these problems.
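To illustrate the point, here is a small sketch using Python's standard `urllib.parse` module (the URLs are hypothetical): a conventionally encoded query string parses cleanly, while a non-standard one is silently lost, which is roughly what happens to a crawler facing such URLs.

```python
from urllib.parse import urlparse, parse_qs

# Standard encoding: key=value pairs joined with "&"
standard = urlparse("https://www.example.com/items?category=gummy-candy&price=5-10")
print(parse_qs(standard.query))
# {'category': ['gummy-candy'], 'price': ['5-10']}

# Non-standard encoding: ":" instead of "=", ",," instead of "&"
nonstandard = urlparse("https://www.example.com/items?category:gummy-candy,,price:5-10")
print(parse_qs(nonstandard.query))
# {} -- the parameters are lost entirely
```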
- Worst Practice: Using file paths or directories, rather than URL parameters, to encode values that don’t change the content of a page. For example, in a URL such as www.example.com/c987/s427/, c987 is the category while s427 is a session ID that does not change the content of the page.
Crawlers have a difficult time telling useful path values, like “chevrolet-corvette”, apart from useless ones, like session IDs, when both sit directly within the path. When such values are URL parameters instead, crawlers can more easily determine that a given parameter does not change the content and therefore does not require access to every variation.
Commonly used values that do not alter page content, and should therefore be URL parameters, include:
- Tracking IDs
- Session IDs
- Referrer IDs
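As a sketch of this practice (the URL scheme and `build_url` helper are hypothetical), keeping the session ID as an ordinary key=value parameter leaves the content-defining path intact, so a crawler can recognise it as disposable:

```python
from urllib.parse import urlencode

def build_url(path, content_params, session_id=None):
    """Build a URL keeping content-defining values in the path and
    the session ID as an ordinary key=value query parameter."""
    params = dict(content_params)
    if session_id is not None:
        params["sessionid"] = session_id  # does not alter page content
    query = urlencode(params)
    return f"https://www.example.com{path}?{query}" if query else f"https://www.example.com{path}"

# Path identifies the content; sessionid is clearly a throwaway parameter
print(build_url("/category/chevrolet-corvette", {"page": "2"}, session_id="s427"))
# https://www.example.com/category/chevrolet-corvette?page=2&sessionid=s427

# Compare the worst-practice alternative, www.example.com/c987/s427/...,
# where a crawler cannot tell that "s427" is meaningless.
```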
- Worst practice: Using user-generated values as URL parameters that are crawlable and indexable but useless in search results
For example, using user-generated values such as latitude/longitude coordinates in URLs creates an almost unlimited number of pages with no search value.
Instead of allowing user-generated values to create crawlable URLs with countless variations of no search value, consider publishing category pages for the most popular values and enriching them with additional information to make them more valuable for ordinary searches. Alternatively, keep user-generated values in URLs whose crawling is disallowed in robots.txt.
- Worst practice: Appending URL parameters without logic
Extraneous URL parameters only increase duplication, which makes crawling less efficient. Strip unneeded parameters before generating URLs. If you need many parameters for user sessions, consider storing that information in a cookie rather than appending the values to the URL.
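The stripping step can be sketched in a few lines with `urllib.parse` (the parameter names in `UNNEEDED` are assumptions for illustration):

```python
from urllib.parse import urlparse, urlunparse, parse_qsl, urlencode

# Parameters assumed (hypothetically) not to change page content
UNNEEDED = {"sessionid", "trackingid", "referrerid"}

def strip_unneeded(url):
    """Drop parameters that don't alter content before emitting a link."""
    parts = urlparse(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k.lower() not in UNNEEDED]
    return urlunparse(parts._replace(query=urlencode(kept)))

print(strip_unneeded(
    "https://www.example.com/items?category=gummy-candy&sessionid=s427&trackingid=t1"))
# https://www.example.com/items?category=gummy-candy
```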
- Worst practice: Offering further filtering when no results are available
Giving users the option to select filters when there are no items left to display frustrates both users and crawlers.
Only generate URLs and links when items exist and the selection is valid; when no items are available, disable the filtering options. Showing item counts next to each filter also improves usability.
Generating URLs only for selections with existing products avoids useless URLs, shrinks your crawl space, and helps user engagement. If a page is unlikely to ever have useful content, return a 404 status code.
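A minimal sketch of this rule, assuming a hypothetical `counts` lookup of items per filter value: a link, and therefore a URL, is produced only when the filter would return results.

```python
# Hypothetical inventory counts per filter value
counts = {"red": 12, "blue": 0, "green": 3}

def filter_link(base_url, colour):
    """Return a crawlable link only when the filter yields items;
    otherwise return plain text so no URL enters the crawl space."""
    n = counts.get(colour, 0)
    if n == 0:
        return f"{colour} (0)"  # not clickable, no URL generated
    return f'<a href="{base_url}?colour={colour}">{colour} ({n})</a>'

for c in ["red", "blue", "green"]:
    print(filter_link("https://www.example.com/shoes", c))
```

The item count shown next to each filter doubles as the usability improvement mentioned above.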
Best Practices For New Or Redesigned Implementations Of Faceted Navigation
There are several options for new sites that want to implement faceted navigation in order to consolidate their indexing signals, reduce duplicate page crawling, and optimise their crawl space for pages containing unique content.
- Determine which URL parameters crawlers need in order to reach each page of unique content, i.e. the parameters required to create a path to every item. These may include category-id, item-id, page, and so on.
- Determine which parameters benefit searchers and which would merely cause duplication and unnecessary crawling.
- Consider implementing one of the following options for URLs containing unneeded parameters. Whichever you choose, make certain that unneeded parameters never appear in the click path of a user or crawler.
- First option: use rel="nofollow" on internal links
Mark all links to unneeded URLs as nofollow. This lessens the chance that a crawler discovers them and minimises the large crawl space that faceted navigation can otherwise create.
- Second option: Disallow in robots.txt
Place all URLs with unneeded parameters under a dedicated filtering directory and disallow that directory in robots.txt. This permits crawling of your unique content while preventing the crawling of unnecessary URLs.
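Such a rule can be sanity-checked with Python's built-in robots.txt parser. This sketch assumes, hypothetically, that the unneeded URLs all live under a /filtering/ directory:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block the directory that holds filtered URLs
rules = RobotFileParser()
rules.parse([
    "User-agent: *",
    "Disallow: /filtering/",
])

print(rules.can_fetch("*", "https://www.example.com/category/gummy-candy"))
# True  -- unique content stays crawlable
print(rules.can_fetch("*", "https://www.example.com/filtering/colour=red"))
# False -- unnecessary filtered URLs are blocked
```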
- Third option: Separate hosts
Consider putting all URLs with unneeded parameters on a separate host: for instance, a primary host of www.mysite.com and a secondary host of www2.mysite.com. On the secondary host, set the crawl rate in Webmaster Tools to low, while keeping the crawl rate of the primary host as high as possible.
- Avoid creating clickable links when no products exist for a filter
- Use logic to decide which parameters appear in URLs
- Remove unneeded parameters rather than appending them
- Help searchers by maintaining a consistent parameter order, with the least relevant parameters last
- Improve the indexing of valuable content by using rel="canonical" on the preferred version of each page
- Include only canonical URLs in your sitemaps
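The consistent-ordering and canonical-URL practices can be combined in one normalisation step. This is a sketch under assumptions: the parameter names and their relevance order in `PARAM_ORDER` are hypothetical.

```python
from urllib.parse import urlparse, urlunparse, parse_qsl, urlencode

# Hypothetical relevance order: most relevant parameters first
PARAM_ORDER = ["category", "colour", "price", "page"]

def canonical_url(url):
    """Normalise parameter order so every variant maps to one preferred URL."""
    parts = urlparse(url)
    params = sorted(
        parse_qsl(parts.query),
        key=lambda kv: PARAM_ORDER.index(kv[0]) if kv[0] in PARAM_ORDER else len(PARAM_ORDER),
    )
    return urlunparse(parts._replace(query=urlencode(params)))

# Two click paths that reach the same page collapse to one canonical URL
a = canonical_url("https://www.example.com/items?price=5-10&category=gummy-candy")
b = canonical_url("https://www.example.com/items?category=gummy-candy&price=5-10")
print(a == b)  # True
print(a)
# https://www.example.com/items?category=gummy-candy&price=5-10
```

The normalised URL is the one to emit in the rel="canonical" link element and to list in the sitemap.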
Best Practices For Existing Sites Already Using Faceted Navigation
First, consider whether you can still apply the best practices discussed above, such as nofollowing links to unneeded URLs. If not, crawlers have in all probability already discovered a huge crawl space, so your focus should be on minimising its further growth and consolidating your indexing signals:
- Use standard encoding for parameters
- Make sure that values that do not alter page content, such as session IDs, are implemented as standard key=value parameters, not as directories
- Don’t allow URLs or clicks to be generated when there are no items for a particular filter
- Use logic to alter which URL parameters are displayed
- Remove unneeded parameters rather than appending new values
- Help searchers by maintaining a consistent order for your parameters
- Use Webmaster Tools to configure your URL parameters if you have a good knowledge of the behaviour of URL parameters on your site.
- Improve content indexing by pointing rel="canonical" at the preferred version of each piece of content
- In your sitemap, include canonical URLs only