Fixing Huge Duplicate Content Issue for Wines.com – Case Study

I just want to be clear that I am discussing the previous owners of wines.com. The site was sold, I believe, in 2012.

As you would imagine, wines.com sells wine, but it also has a lot of content about wine. When the owners contacted me about the website, they told me their Google rankings were horrible. They just didn’t compete for the big terms their competitors were ranking for. Looking at the site for the first time, I liked its design and layout. Merchants and wineries could upload their own content to the website to be sold. But in the end this feature was the downfall of the site and the start of its SEO problems.

The first issue I saw was that the site had about 150,000 pages, but only about 30,000 (roughly 20%) were being indexed. Now that is a big problem. The site was built on Ruby on Rails, which is a great platform for developers, but the way this site was built on it was not SEO friendly. The issue started when the wineries and merchants logged in and uploaded their information. When the site created a new page for each wine, the URLs came out nearly identical, differing only by a number at the end. So you would see something like this…

http://www.wines.com/wines/paul-hobbs-pinot-noir-russian-river-2008-1
http://www.wines.com/wines/paul-hobbs-pinot-noir-russian-river-2008-2
http://www.wines.com/wines/paul-hobbs-pinot-noir-russian-river-2008-3
http://www.wines.com/wines/paul-hobbs-pinot-noir-russian-river-2008-4
http://www.wines.com/wines/paul-hobbs-pinot-noir-russian-river-2008-5

Each URL went to a different winery or merchant that was selling that wine. In this case you ended up with 5 pages (sometimes 20 pages for other wines) with near-duplicate URLs and about 90% of the content on the page duplicated. Even worse, not all of the information was being populated on each page. So you had tens of thousands of pages with very little content on them, and what was there was mostly duplicated.
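To show how you could spot this pattern across a crawl or sitemap export, here is a rough Python sketch that groups URLs differing only by a trailing number. It is not what was run on wines.com. The function name is made up for illustration, and it assumes the counter at the end is 1 to 3 digits so that a 4-digit vintage such as 2008 is not mistaken for a counter.

```python
import re
from collections import defaultdict

# Assumption: the duplicate counter is a trailing "-N" with 1 to 3 digits,
# so a 4-digit vintage like "-2008" is left alone.
COUNTER_SUFFIX = re.compile(r"-\d{1,3}$")


def group_numbered_duplicates(urls):
    """Group URLs that are identical apart from a trailing numeric counter."""
    groups = defaultdict(list)
    for url in urls:
        base = COUNTER_SUFFIX.sub("", url.rstrip("/"))
        groups[base].append(url)
    # Only bases with more than one variant are potential duplicates.
    return {base: found for base, found in groups.items() if len(found) > 1}


if __name__ == "__main__":
    prefix = "http://www.wines.com/wines/paul-hobbs-pinot-noir-russian-river-2008"
    crawl = [f"{prefix}-{n}" for n in range(1, 6)]
    for base, found in group_numbered_duplicates(crawl).items():
        print(base, len(found))
```

Any base URL that comes back with several variants is a candidate for the kind of duplication described above.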

What needed to be done was a complete overhaul of the URL structure and the content itself. The URLs needed more folders to separate the wines from the wineries and merchants.

For example…

http://www.wines.com/wines/paul-hobbs-pinot-noir-russian-river-2008

should be something like

http://www.wines.com/winery-name/merchant-name/paul-hobbs-pinot-noir-russian-river-2008

You need to get as detailed as possible in the URL, since there are a lot of duplicate names. Adding a winery name and a merchant name makes the URL more unique. You can also take it a step further by adding the wine type and other descriptors to make each URL even more distinct.
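As an illustration of that structure, here is a small Python sketch that builds a URL from a winery, a merchant and a wine name. The function names and the merchant name in the usage line are hypothetical, and the slug rules (lowercase, accents stripped, hyphens between words) are my assumption rather than anything the site actually used.

```python
import re
import unicodedata


def slugify(text):
    """Lowercase, strip accents, and turn runs of other characters into hyphens."""
    ascii_text = (
        unicodedata.normalize("NFKD", text)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def wine_url(winery, merchant, wine, base="http://www.wines.com"):
    """Build /winery-name/merchant-name/wine-name under the given base URL."""
    return "/".join([base, slugify(winery), slugify(merchant), slugify(wine)])


if __name__ == "__main__":
    # "Example Merchant" is a placeholder, not a real wines.com merchant.
    print(wine_url("Paul Hobbs", "Example Merchant",
                   "Paul Hobbs Pinot Noir Russian River 2008"))
```

Because the winery and merchant are part of the path, two merchants selling the same wine get different URLs without needing a number on the end.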

After the URL problem, you have the duplicate content and the shallow content on all of the pages. This is where more information on the wineries, merchants and the wine itself can help with Google rankings. It is easy to come up with a plan to fix this, but actually adding the content to each page is another story. This was an unrealistic task because of how the site was created, and the company didn’t have all of the content, so reaching out to all of their partners just wasn’t feasible. The easier fix was to block all of these pages from being indexed, create 1 page for each wine to get indexed, and link to the other pages internally.
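One common way to do this kind of blocking is a robots meta tag on the pages you want kept out of the index, with a canonical link on the single page you want indexed. The Python sketch below shows the idea. The function name and the `is_primary` flag are hypothetical, and this is not the code from the site.

```python
from html import escape


def head_tag_for(page_url, is_primary):
    """Return the <head> tag for a wine page.

    is_primary=True  -> the single page per wine that should be indexed.
    is_primary=False -> a merchant or winery variant kept out of the index.
    """
    if is_primary:
        # Self-referencing canonical on the one page meant to rank.
        return f'<link rel="canonical" href="{escape(page_url, quote=True)}">'
    # "noindex, follow" keeps the page out of the index but lets
    # crawlers follow its links to the rest of the site.
    return '<meta name="robots" content="noindex, follow">'


if __name__ == "__main__":
    main_page = "http://www.wines.com/wines/paul-hobbs-pinot-noir-russian-river-2008"
    print(head_tag_for(main_page, is_primary=True))
    print(head_tag_for(main_page + "-1", is_primary=False))
```

The main page would then link to each merchant page internally, so visitors can still reach every listing.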

Overall, this was an easy challenge for me to figure out and put a plan together for, but it was not an easy fix. The site was built on Ruby on Rails and wasn’t set up like a CMS, where you make 1 change and every page on the site changes, so you would have had to fix the site 1 page at a time. There might have been a way to automate the process, but that just wasn’t an option.

Since fixing the site just wasn’t worth the time and money, and starting over wasn’t an option either, they sold the site. I don’t know who owns it today or how the site is doing, but hopefully it is doing better.

What did we learn?

  • Duplicate content is a big problem
  • Make sure the platform you build your site on will make it easy for SEO and changes in the future. I love WordPress, but it won’t work for all websites
  • If you have others inputting content to your site, make sure there is a sufficient amount of content on each page. Remember, you can block Google from indexing pages if it makes sense
  • Each page of content should have at least 500 to 1,000 words on it
  • URL structure is very important. You get hurt by Google if your URLs are duplicated or too similar
  • Make sure you fully understand the capabilities of the platform you use to build your site. Don’t take the developer’s word for it, because most developers do not create sites with SEO in mind
