Google Penalty Evaluation & Recovery Plan – Case Study #1

This simple case study evaluates an inner page of a site that was hit by the Penguin 2.0 update for just a couple of keywords, based on what was observed in the organic search data in Google Analytics.

We will outline the evaluation and proposed recovery plan below. In this case it is fairly simple and straightforward (we will get into more complex case studies in the near future).

I will document the actual implementation of the recovery plan as we proceed with this project in the coming weeks. This will give us further insight into what is working and if our analysis below is accurate.

This case study does not describe how to figure out which specific Algorithm Update and Penalty the site being analyzed has been hit by.

I will cover that topic in detail in a later post, where I will show you exactly how to use Google Analytics and freely available data to determine which penalty your site has been affected by. I will also discuss a clever tool available on the market that does this for you.


Situation Overview

  • The Client has indicated that the domain (actual URL hidden for privacy reasons) has received a negative ranking factor for the keyword “ipad repair” associated with the URL.
  • Since this penalty is not sitewide and is probably targeted at this specific keyword-URL pair, the recovery process should be uncomplicated.
  • Since the site's other pages are holding their positions in the SERPs and only one internal page has been affected, this looks like an algorithmic penalty rather than a manual penalty applied sitewide.
  • The exact date of the penalty is unknown as of now, but it can be determined from the Google Analytics data for this keyword. We also need to determine whether any other URLs and/or keywords were hit. This can be analyzed once we have Analytics access.
  • Once we get access to Analytics, we will share the exact keyword and organic search traffic charts that will confirm that the site was hit by the Penguin 2.0 update. I will post this in the coming weeks as I get this data.


Backlink Data Charts & Analysis

We have used several tools to analyze real data from the client's site and its top competitors to formulate our analysis and recovery plan.

Note: The data in the initial 3 charts excludes sitewide links wherever a domain has more than 5 sitewide links to the client's site. When you are preparing the Disavow Tool document for submission, you may want to disavow entire domains where all 5 sampled links are toxic and the other sitewide links on that domain are also toxic. We can get into this later for a better understanding. For now, just note that the data does not cover all the backlinks but is limited to 5 links per site. This has been done to prevent skewing of the data in the event that there are thousands of links from one domain. Things like the ratio of .com sites, nofollows etc. would be skewed unfairly if we included these large numbers of sitewide links.
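
To make the 5-links-per-domain cap concrete, here is a minimal sketch in Python. It is not how the tools behind the charts work internally; the function name `cap_links_per_domain` and the sample URLs are made up for illustration, and it assumes the backlinks are available as a plain list of source URLs.

```python
from collections import defaultdict
from urllib.parse import urlparse


def cap_links_per_domain(backlink_urls, cap=5):
    """Keep at most `cap` backlinks per linking domain, preserving input order."""
    kept = []
    seen = defaultdict(int)  # linking domain -> number of links kept so far
    for url in backlink_urls:
        domain = urlparse(url).netloc.lower()
        if seen[domain] < cap:
            seen[domain] += 1
            kept.append(url)
    return kept


# Hypothetical sample: one domain with 1,000 sitewide links, one with 2 links.
sample = [f"http://spam-network.example/page{i}" for i in range(1000)]
sample += ["http://blog.example/post-1", "http://blog.example/post-2"]

capped = cap_links_per_domain(sample)
print(len(capped))  # 5 links from the first domain plus 2 from the second
```

Without the cap, the single sitewide domain would account for almost all of the sample and would dominate any TLD or nofollow ratio computed from it.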


From the data below, we notice that the ratio of toxic and suspicious links to healthy links is unusually high.




From the data below, out of a total of 98,547 backlinks, 95,255 (roughly 96.7%) are sitewides. This is not a healthy sign.


We will have to study the links marked as “suspicious” and, after review, classify each one as toxic or healthy. The tool cannot confirm this for us, so a manual review of each backlink is required. We do not need to check more than 5 links per domain where there is a large number of sitewides. It will be safe to set the status of the remaining sitewide links after studying the initial 5 links marked as “suspicious”.

From the data below, the following are the approximate numbers of toxic and suspicious links in the largest categories (sitewides beyond 5 per domain excluded).


(a) TOX1 (182 links) – Domain is not indexed in Google. Usually a sign of a penalty. But a non-indexed domain could also be a new domain nobody has linked to yet, or there could be a problem in the robots.txt or meta robots tag. Double-check these.

(b) SUSP1 (395 links) – At best this is a link from a page without external links, which is often the case for forums. But it also comes from a rather weak domain. Chances are that this link comes from some automated spamming activity or sits on a page of yet another link directory.

(c) SUSP4 (437 links) – The page the link comes from doesn’t rank for its title. This is usually a sign of that page or domain being penalized. However, pages with generic titles like “Partners” or “Home” have a hard time ranking #1 most of the time, so double-check for such cases.

(d) SUSP15 (144 links) – Link directory links. These links come from typical web link directory footprints, which were often set up just to artificially inflate link popularity and/or to sell links. While many SEOs filled up link profiles with these types of links for years, the practice cannot be recommended these days.

(e) SUSP14 (71 links) – Page has no PageRank™ but at least some weak links. This could be a sign that the page has been punished by Google. Google PageRank™ is more and more inaccurate, and many webmasters don’t trust that metric anymore.

From the data below, we observe the following –

(a) The ratio of NoFollow to DoFollow links is extremely unnatural.
(b) There are a large number of redirect links (these could be social media shortener links).
(c) At first glance there seems to be an unnatural level of G+1 social signals. However, we checked the competitors in the niche, and this seems to be an acceptable level.


The tool used for the one chart above and the rest below is Ahrefs. Some of its data conflicts with the initial charts above, which came from another tool. Mainly the quantity of “sitewide links” seems to conflict. We will study this later to confirm whether this is an issue worth considering and why the 2 tools are showing us conflicting data.

From the data below, the money keywords have been over-optimized to a large extent in the anchor text. This is a big issue with Penguin 1.0 and 2.0.


From the data below, the .info TLD looks highly disproportionate compared to .net and .org, which usually account for much higher numbers in a natural backlink profile.


From the data below, and as seen in the detox-report.xlsx (located inside the folder “raw data”, which I will share shortly), we observe the following. Note: since only one inner page URL was hit by an algorithmic update, we have currently filtered the detox report to examine closely the links pointing only to the inner page that was hit.

Our observations are as follows –

  1. Excessive amounts of Toxic links going to this page

  2. Excessive amounts of anchor text links with money keywords going to this inner page (for example: 257 links use “ipad repair” and 117 use “ipad screen repair”)

  3. NoFollow to DoFollow ratio is unnatural

  • The above data in the file can be mapped out in a pivot table at the appropriate time to get more details on the specific backlink profile of the inner page.
  • You may also reset the filter on the data set in the xlsx file to “all links” to observe your total backlink profile.
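
As a rough illustration of the anchor text check described above, the sketch below tallies anchors and reports the share that are money keywords. The counts for “ipad repair” (257) and “ipad screen repair” (117) come from the report; the remaining anchors and their counts are hypothetical filler, and `anchor_shares` is an illustrative helper, not part of any tool mentioned here.

```python
from collections import Counter


def anchor_shares(anchors):
    """Return each distinct anchor's share of all links, most common first."""
    counts = Counter(a.strip().lower() for a in anchors)
    total = sum(counts.values())
    return {anchor: count / total for anchor, count in counts.most_common()}


MONEY_KEYWORDS = {"ipad repair", "ipad screen repair"}

anchors = (
    ["ipad repair"] * 257             # count taken from the detox report
    + ["ipad screen repair"] * 117    # count taken from the detox report
    + ["click here"] * 60             # hypothetical
    + ["http://example.com/"] * 45    # hypothetical
)

shares = anchor_shares(anchors)
money_share = sum(s for a, s in shares.items() if a in MONEY_KEYWORDS)
print(f"money keyword share: {money_share:.1%}")
```

The same tally can be produced with a pivot table in the xlsx file; the point is only that the money keyword share is a single number you can track before and after cleanup.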

Penalty Analysis Overview

  • The biggest problem with the Client’s backlink profile is that their money keywords have been over-optimized to a large extent in the anchor text. This is a big issue with Penguin 1.0 and 2.0.
  • There is a large fraction of “unhealthy links” going to the site, of which a large chunk are going to the page that has been affected.
  • The client’s backlink portfolio has a very high ratio of sitewide links. These need to be identified and fixed, as large volumes of sitewide links from low-PR or suspicious sources can have severe negative effects.
  • We further noted that only a very small fraction of the links are NoFollow, which looks unnatural.
  • There were too many links originating on the .INFO TLD.
  • We noticed an unusually large number of G+1’s for this site. We would need to examine whether these were organic or purchased from a vendor and thus created artificially.

Google Penalty Recovery Solution

In this plan we will aim to fix only the internal page that has been affected by the update and has an unnatural looking backlink profile.

We will attempt to delete all the backlinks that we classify as unhealthy. We will then begin diluting the ratio of the money keywords used in the anchors. We can do this by editing the backlinks, deleting the toxic ones, or simply creating more healthy looking backlinks.

Since this is an algorithmic penalty on one URL only, we could optionally submit a file through the Disavow Tool to ask Google to ignore all the bad links.

We do not need to file a reconsideration request at all, since there was no manual penalty or notice from Google in the Webmaster Tools area.

We will record all our attempts to delete the toxic links, so that we can provide adequate proof of effort within the Disavow Tool submission.

Phase 1

In this phase we will work on the links marked as unhealthy and fix or remove them. We will target only the roughly 500 (479) backlinks pointing to the affected inner page.

In brief, in order to examine the links marked as “suspicious”, we will have to visit each page that is marked as “suspicious” and decide whether we should –

(a)   keep the link if it looks healthy and mark it to edit the anchor text later on, if possible, or

(b)   mark the link as unhealthy, and work towards deleting it in a later phase.

Elaborating on the above… we visit each URL marked as “suspicious” to determine whether the backlink on the page is healthy or unhealthy.

If it is healthy, we then try to edit the anchor of the backlink so that it becomes a raw URL, brand URL or generic anchor. If we cannot edit it, we mark it for later, and we will either attempt to edit it later or leave it as it is.

If it is unhealthy, we mark it down as unhealthy and attempt to delete the backlink by contacting the webmaster etc. in the later phase (to be carried out after we finish reviewing the entire list of 500-odd links).

If we cannot access the page and the page is invalid, we mark it as a dropped page.
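
The review rules above can be sketched as a small decision function. This is only one way to record the outcome of the manual review; the `LinkStatus` names, the `triage` function and the sample URLs are hypothetical, and the three yes/no inputs are judgments made by the person visiting each page.

```python
from enum import Enum


class LinkStatus(Enum):
    EDIT_ANCHOR = "healthy: change anchor to raw URL, brand or generic"
    EDIT_LATER = "healthy: anchor not editable now, revisit or leave as is"
    UNHEALTHY = "unhealthy: request removal in Phase 2"
    DROPPED = "dropped: page is invalid or cannot be accessed"


def triage(page_reachable, looks_healthy, anchor_editable):
    """Map the reviewer's three judgments onto a Phase 1 status."""
    if not page_reachable:
        return LinkStatus.DROPPED
    if not looks_healthy:
        return LinkStatus.UNHEALTHY
    return LinkStatus.EDIT_ANCHOR if anchor_editable else LinkStatus.EDIT_LATER


# Hypothetical review results for three suspicious links.
reviewed = {
    "http://forum.example/thread-9": triage(True, True, False),
    "http://directory.example/listing": triage(True, False, False),
    "http://gone.example/old-page": triage(False, False, False),
}
for url, status in reviewed.items():
    print(url, "->", status.name)
```

Recording one status per link in this way gives Phase 2 a clean list of the links marked unhealthy.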

Phase 2

In this phase we will attempt to contact the webmasters or site owners of each of the links marked above as unhealthy.

We will attempt to contact the webmaster behind each URL up to 3 times to request deletion of the backlink, leaving a gap of a few days between attempts.

We will record all our activity in a Google Docs data file, so this can be submitted to Google if desired.
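
The outreach log could be kept in a spreadsheet, but the logic is simple enough to sketch in code. In the sketch below, the 3-attempt limit comes from the plan above, while the 4-day gap is an assumption standing in for “a few days”; the `OutreachRecord` class, the URL and the dates are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

MAX_ATTEMPTS = 3           # from the plan: contact each webmaster up to 3 times
GAP = timedelta(days=4)    # assumption: "a few days" taken as 4 days


@dataclass
class OutreachRecord:
    url: str
    attempts: list = field(default_factory=list)  # dates on which requests were sent
    removed: bool = False

    def log_attempt(self, when):
        """Record one removal request; refuse to exceed the attempt limit."""
        if len(self.attempts) >= MAX_ATTEMPTS:
            raise ValueError("maximum number of contact attempts reached")
        self.attempts.append(when)

    def next_contact(self):
        """Earliest date for the next request, or None if no more are due."""
        if self.removed or len(self.attempts) >= MAX_ATTEMPTS:
            return None
        if not self.attempts:
            return date.today()
        return self.attempts[-1] + GAP


record = OutreachRecord("http://directory.example/listing")
record.log_attempt(date(2013, 6, 3))
print(record.next_contact())  # four days after the first request
```

Links whose record shows three logged attempts and `removed` still false are the candidates for the disavow file.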

Once we finish working on all the links, we will prepare the Disavow Tool request document as per Google guidelines. This document will record all the links that we managed to delete successfully, those that became invalid, and specifically those that we did not manage to delete and are asking Google to ignore. The format of this document will be shared later.

In the event that we notice a large number of unhealthy or toxic backlinks coming from the same domain and we do not get a deletion confirmation response from the webmaster, we can disavow the entire domain with the disavow command as per Google guidelines.
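
As a rough sketch of what the disavow file could look like: Google's disavow format is a plain text file with one URL per line, `domain:` lines for whole domains, and `#` for comments. The helper below builds such a file from the links we could not get removed. The threshold of 5 links for switching to a whole-domain entry is an assumption for illustration, not a Google rule, and `build_disavow` and the sample URLs are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse

DOMAIN_THRESHOLD = 5  # assumption: disavow the whole domain at 5+ unremoved links


def build_disavow(unremoved_urls, domain_threshold=DOMAIN_THRESHOLD):
    """Return disavow file text for links that could not be removed."""
    by_domain = defaultdict(list)
    for url in unremoved_urls:
        by_domain[urlparse(url).netloc.lower()].append(url)

    lines = ["# Removal requested up to 3 times per site, no deletion confirmed"]
    for domain in sorted(by_domain):
        urls = by_domain[domain]
        if len(urls) >= domain_threshold:
            lines.append(f"domain:{domain}")  # whole-domain entry
        else:
            lines.extend(sorted(urls))        # individual URL entries
    return "\n".join(lines) + "\n"


unremoved = [f"http://spam-network.example/page{i}" for i in range(6)]
unremoved.append("http://directory.example/ipad-repair")
print(build_disavow(unremoved))
```

For the sample above, the output contains one individual URL for the directory site and a single `domain:` line covering all six links from the other site.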

Note: Phase 2 can be executed in parallel with Phase 3 below.


Phase 3

After we work on classifying and attempting to edit the currently existing backlinks to the inner page, we can then proceed to create new healthy backlinks to the inner page WITHOUT using the money keywords.

At the same time, around 20% of the new links will point to the homepage, again using non-money keywords or keeping their use to a bare minimum (around 5% in this case).

Below is the breakdown of what we will aim to achieve in the anchor text ratio –

– 5% main primary money keyword

– 5% related secondary & LSI money keywords

– 30% brand keywords… vary between the exact brand name and a version with a space if the brand consists of 2 words

– 20% plain raw URLs / 8 combo rotation

– 30% generic anchors… click here, visit our page, visit our online store to buy shoes, etc. (make sure to use as diverse a set as possible, and also use custom generics derived by us)

– 10% niche specific highly generic keywords… like shoes

– include some image links when possible.
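
To turn the percentages above into concrete numbers for a link building batch, here is a minimal sketch. The percentages are the ones listed above; the category labels, the `allocate_links` function and the batch size of 200 are illustrative assumptions. Rounding uses the largest-remainder method so the counts always add up to the batch size.

```python
TARGET_MIX = {  # percentages from the breakdown above
    "primary money keyword": 5,
    "secondary / LSI money keywords": 5,
    "brand keywords": 30,
    "raw URLs": 20,
    "generic anchors": 30,
    "niche-specific generic keywords": 10,
}


def allocate_links(total_links, mix=TARGET_MIX):
    """Split `total_links` across anchor categories according to `mix`."""
    if sum(mix.values()) != 100:
        raise ValueError("anchor mix must add up to 100%")
    exact = {name: total_links * pct / 100 for name, pct in mix.items()}
    plan = {name: int(value) for name, value in exact.items()}  # round down first
    leftover = total_links - sum(plan.values())
    # Hand the remaining links to the categories with the largest fractions.
    by_remainder = sorted(mix, key=lambda n: exact[n] - plan[n], reverse=True)
    for name in by_remainder[:leftover]:
        plan[name] += 1
    return plan


plan = allocate_links(200)  # hypothetical batch of 200 new links
for name, count in plan.items():
    print(f"{count:4d}  {name}")
```

For a batch of 200 this gives 10 links for each money keyword group, 60 brand, 40 raw URL, 60 generic and 20 niche-specific generic anchors.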

Note: while creating new high quality editorial links (for example on Web 2.0 sites), each page can have 2 backlinks: one with a money keyword going to the internal page, and one with a brand or raw URL anchor going to the homepage.

Phase 4

Diluting the anchor links can be done by link pruning, editing or creating new backlinks.
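
A quick calculation shows why pruning and editing matter as much as creating new links. If m links out of t use money keywords and we add x links with non-money anchors, the money share becomes m / (t + x), so reaching a target share p requires x ≥ m / p − t. The sketch below applies this, assuming (this is an assumption, not a figure confirmed by the report) that the 479 links to the inner page include the 374 money keyword anchors (257 + 117), and using a 10% target (5% primary plus 5% secondary). `new_links_needed` is an illustrative helper.

```python
def new_links_needed(money_links, total_links, target_pct):
    """Non-money links to add so money anchors make up at most target_pct percent."""
    # money / (total + x) <= target_pct / 100  =>  total + x >= money * 100 / target_pct
    required_total = -(-money_links * 100 // target_pct)  # integer ceiling division
    return max(0, required_total - total_links)


# Dilution only: keep all 479 links, 374 of them with money keyword anchors.
print(new_links_needed(374, 479, 10))

# After pruning or re-anchoring 200 of the money keyword links (hypothetical).
print(new_links_needed(174, 279, 10))
```

Under these assumptions, dilution alone would take 3,261 new links, while first removing 200 money keyword links brings that down to 1,461.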

We strongly suggest the client swiftly create a large number of fresh links of various types (blog posts, edu links, related blog comments, guest posts, niche directory links, press release links etc.) with brand or raw URL variations as the anchor text.

The client will need to create a fraction of healthy backlinks to the main site URL with anchor text that is a raw URL, a generic word or a niche-specific generic term.

This is being done to balance the over-optimization of money keywords going to the main URL that was observed in our analysis above. We may kill two birds with one stone if we focus on placing these new links on .com, .net or .org pages and stay clear of .info sites. We could also aim to get a larger number of no-follow links, as the ratio of no-follow to do-follow is very low and unnatural.

Competition Analysis

This is done for further insight into the sites ranking at the top of the niche.

We have analyzed the backlink portfolio of 2 competitors that are still holding ranks, namely –

The backlink data observed below for these 2 sites confirms all our observations in the recovery plan and the critical recovery steps we have specified above.

Complete data sheets for this are attached and included in the zip folders (and will be uploaded here shortly).

Here is data from



Closing Notes

This case study and analysis is very basic and uncomplicated.

If you’re new to the subject, this is a good starting point. I will get into more complex penalties, evaluations and recovery plans shortly.

I will document the actual implementation of this recovery plan as we proceed with this project in the coming weeks. This will give us further insight into what is working and whether our analysis above is accurate.

Stay tuned and please sign up via the side bar form so you get notified of future posts.

If you have any questions or tips – please share them below!

