45+ Check Points for a Comprehensive SEO Site Audit

At every SEO conference I attend there are multiple site clinic sessions, where webmasters and business owners ask questions to a panel of SEO experts. What I noticed is that there is no process or structured methodology to audit a site. I also noticed that most webmasters and business owners are not very familiar with SEO or coding. That's why I created this comprehensive list of SEO check points, so any webmaster or business owner can SEO-audit their site in an easy way:

Each check point below lists its detection tool and an importance rating (* = minor, * * * * * = critical).
1. DTD (Document Type Definition) | Tool: Web Developer Toolbar | Importance: *
Every web page should have a valid DTD declaration at the top of the code. You can use the Web Developer Toolbar Firefox extension to validate it. If the page does not pass DTD validation, either the version is not declared at the top of the document, or the code is not compliant with the declared DTD version.
* Importance: Not that important

2. Obsolete or Deprecated HTML | Tool: Web Developer Toolbar | Importance: *
You can use the Web Developer Toolbar to validate your HTML. Some of the deprecated HTML tags are <font>, <center>, <strike>, <u>, etc.

3. Frames/iFrames Check | Tool: Web Developer Toolbar | Importance: * * *
Neither frames nor iframes are SEO friendly. Spiders can send traffic directly to one of the frame pages instead of the main frameset page, and most of the time internal frame pages lack navigation, which leaves the user no way to move into the rest of your site. You can simply search the source code for "frame" or "iframe", or you can use the Web Developer Toolbar to check for their presence: if "View Frame Source" under the View menu is not greyed out, frames or iframes are used on the page.
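If you'd rather script the check than eyeball the source, a quick sketch in Python could look like this (the sample page source is made up for illustration):

```python
import re

def find_frames(html: str) -> list:
    """Return all <frame>/<iframe> opening tags found in the page source."""
    return re.findall(r"<i?frame\b[^>]*>", html, flags=re.IGNORECASE)

# Hypothetical page source for illustration:
sample = '<html><body><iframe src="ads.html"></iframe></body></html>'
print(find_frames(sample))  # any non-empty result means frames are in use
```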

4. Forms | Tool: Manual | Importance: * * * * *
SPIDERS CAN'T FILL FORMS (or at least they can't fill complicated forms). In other words, spiders can't find pages that can only be reached by submitting a form. Many classifieds sites recognized this problem and created HTML site maps for spiders to follow and find all destination pages.
Examples: Indeed.com [link], Yahoo Autos [link], Yahoo Real Estate, Zillow.com, etc.

5. H1 Heading Tags | Tool: Web Developer Toolbar | Importance: * *
There are different opinions on H1 headings. Some believe you can use multiple H1 heading tags, while others, including me, recommend using just one. An H1 heading tag is like a book title or a chapter title; you can't, and shouldn't, have more than one.

6. Heading Tags Structure | Tool: Web Developer Toolbar | Importance: * *
Heading tags provide semantic structure for your content. An H3 can't come before an H2, an H4 can't come before an H3, and so on.

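The same rule can be checked programmatically. A small sketch using Python's standard html.parser (it assumes headings are the usual h1–h6 tags):

```python
from html.parser import HTMLParser

class HeadingCollector(HTMLParser):
    """Record heading levels (1-6) in the order they appear."""
    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1].isdigit():
            self.levels.append(int(tag[1]))

def heading_order_ok(html: str) -> bool:
    """True if no heading skips a level, e.g. an H3 directly after an H1."""
    collector = HeadingCollector()
    collector.feed(html)
    return all(nxt <= cur + 1
               for cur, nxt in zip(collector.levels, collector.levels[1:]))
```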

7. Image Maps | Tool: Manual | Importance: * * *
Image maps are not SEO friendly; some spiders have trouble following the links included in image maps. Either provide alternative text links to the destination pages or don't use image maps.
Detection: Search for "map id=" or "coords" in your source code.
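That source-code search can also be automated; a minimal Python sketch using the two markers mentioned above:

```python
import re

def uses_image_maps(html: str) -> bool:
    """Flag pages containing <map> tags or coords= attributes."""
    return bool(re.search(r"<map\b|\bcoords\s*=", html, flags=re.IGNORECASE))
```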

8. External CSS File | Tool: Web Developer Toolbar or Manual | Importance: * * * *
All CSS styling should live in external CSS file(s). Doing so will decrease your file size and increase your text to code ratio.
Detection: Search for "style" in your source code.

9. JavaScript | Tool: Web Developer Toolbar | Importance: * * * *
All JavaScript code should live in external JS file(s). This will decrease your file size and increase your text to code ratio.

10. JavaScript Links | Tool: Manual | Importance: * * * *
Most spiders can't comprehend or follow links inside JavaScript functions or code. Use plain text links or provide alternative text links.

11. DHTML Menus | Tool: Manual | Importance: * * * *
Same as above.

Meta Data

12. Page Title | Tool: Google Webmaster Tools | Importance: * * * * *
Every page on your site must have a unique page title, 60 – 70 characters in length. Always start with your primary keywords and place your branding at the end. *Some SEOs prefer placing the branding at the beginning of the page title, as SEOmoz.com does. If your brand is a short keyword or domain (for example Nike or TIME), go for it; but if your brand or domain name is long, you will be wasting a very important opportunity.

13. Meta Description | Tool: Google Webmaster Tools | Importance: * * *
Each page must have a unique Meta Description, 150 – 170 characters in length. Include call-to-action verbs (get, view, search, find, etc.) and your keywords. While the Meta Description is not a ranking factor, it is an extremely important CTR (click-through rate) factor; a good Meta Description entices users to click your listing.
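Both the title and description length guidelines can be checked in bulk. A rough regex-based sketch (it assumes simple markup with the name attribute before content, so treat it as an estimate, not a validator):

```python
import re

def check_lengths(html: str) -> dict:
    """Measure <title> and meta description lengths against the
    60-70 and 150-170 character guidelines above."""
    title = re.search(r"<title>(.*?)</title>", html,
                      re.IGNORECASE | re.DOTALL)
    desc = re.search(
        r'<meta\s+name=["\']description["\']\s+content=["\'](.*?)["\']',
        html, re.IGNORECASE | re.DOTALL)
    title_len = len(title.group(1).strip()) if title else 0
    desc_len = len(desc.group(1).strip()) if desc else 0
    return {
        "title_len": title_len, "title_ok": 60 <= title_len <= 70,
        "desc_len": desc_len, "desc_ok": 150 <= desc_len <= 170,
    }
```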

14. Meta Keywords | Tool: Google Webmaster Tools | Importance: * * *
Most search engines don't consider Meta Keywords a ranking factor, but the tag is still a useful piece of code for some minor search engines. Don't abuse it; I personally don't recommend including more than 10 – 15 keywords in the tag.

15. Robots Meta Tags | Tools: Google Webmaster Tools, Robots.txt Syntax Checker | Importance: * * *

Examine your robots meta tags. In general you should use:

<meta name="robots" content="index, follow">

The content="robots-terms" value is a comma-separated list that may contain one or more of the following commands, without regard to case: noindex, nofollow, all, index and follow. If you decide to include a Robots Meta Tag, use the code above to allow all spiders to index and follow the page.

For printer-friendly pages, you may want to use the following tag to disallow spiders from indexing the page and prevent duplicate content:

<meta name="robots" content="noindex, follow">
Content

16. Length | Tool: Manual | Importance: * * * * *
Each page must have at least 250 – 350 words of unique content, excluding all boilerplate elements (header, footer, side rails, and other common template elements).

17. Onsite Duplicate Content | Tool: Google Webmaster Tools | Importance: * * *
Duplicate content is a serious issue; it may dilute your link equity and waste spiders' resources. While Google indicates duplicate content does not trigger a penalty, it surely decreases the crawl rate of your site. Read SEOmoz for more info on duplicate content.

18. Flash Objects | Tool: Manual | Importance: * * *
While Google indicates it is now able to index content within Flash objects, it is advisable not to use Flash to deliver content; limit Flash to visual presentation. Even though Google crawls Flash content, other major search engines still do not, and even with Google, Flash objects need to be optimized. One thing is for sure: don't build the whole site in Flash; just use Flash elements inside HTML pages.

URLs

19. Dynamic URLs | Tool: Manual | Importance: * * * *
If your URLs contain 2 parameters or fewer, don't bother changing them, especially if the parameters are descriptive words and not just a bunch of numbers.
Example: http://www.your-domain.com/page.php?city=jersey-city&listing=condos-for-sale

But if your URLs contain more parameters, consider using mod_rewrite so you can include SEO-optimized keywords in your URLs.
– Try to create short, descriptive, static-looking and flat URLs.
– Avoid using file extensions so you can switch between platforms without future issues.
– Use hyphens instead of underscores.
Example: http://www.your-domain.com/jersey-city/condos-for-sale/
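For example, mapping the friendly URL above onto the dynamic one could be sketched in .htaccess like this (assuming Apache with mod_rewrite enabled, and that page.php accepts the city and listing parameters from the first example):

```apache
RewriteEngine On
# /jersey-city/condos-for-sale/ -> page.php?city=jersey-city&listing=condos-for-sale
RewriteRule ^([a-z0-9-]+)/([a-z0-9-]+)/$ page.php?city=$1&listing=$2 [L,QSA]
```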

20. Redirect Attacks | Tool: Manual | Importance: * * *
Some sites utilize redirects similar to:
http://www.your-domain.com/go.php?url=http://www.external-domain.com

This is a dangerous method of redirecting and can be hijacked by spammers to redirect to bad neighborhoods.

21. Sort-By Links | Tool: Manual | Importance: * * * *
Many shopping carts and classifieds sites let users sort results by multiple criteria (price, relevancy, date, popularity, most discussed, weight, etc.). This creates multiple URLs generating similar pages (duplicate content). Add the NOFOLLOW attribute to the sort links, block the sort URLs in robots.txt if they follow a different URL pattern than the original page, or use JavaScript links.

22. Session IDs | Tool: Manual | Importance: * * * *
Many shopping carts and other web solutions include session IDs in the URL, which creates duplicate content. Disable session IDs in URLs from the server config file if possible.
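If your site runs on PHP, for instance, two standard php.ini directives control this behavior (a sketch; check your own platform's equivalent setting):

```ini
; keep PHPSESSID out of URLs
session.use_trans_sid = 0
; track sessions with cookies only
session.use_only_cookies = 1
```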

Navigation & Site Architecture

23. Home Page Link & Logo | Tool: Manual | Importance: * * * * *
Your navigation must include a link titled "Home". This is especially important for users coming from search or landing on one of your internal pages, to give them a way back to your home page. Your logo should also link back to your home page to satisfy user expectations.
It is advisable to disable the Home link on the home page itself.

24. Sub-Directories | Tool: Manual | Importance: * * * *
Don't create deep sub-directories (more than 3 directories deep); in other words, no page should be more than 3 clicks away from the home page. You can achieve this in several ways: create section pages, archive pages, A-Z pages, etc.

25. Top Navigation Links | Tool: Manual | Importance: * * * *
Don't use images, Flash or DHTML for your top navigation links; use text/CSS links.

26. Navigation Consistency | Tool: Manual | Importance: * * * *
Some sites change the top navigation as you move from page to page. This is a poor user experience, and it irritates users, who end up clicking the back button to find a top navigation link from a previous page. Keep your main navigation consistent across all pages.

Duplicate Content & Canonicalization

27. Canonicalization: domain.com vs. www.domain.com | Tool: Manual | Importance: * * *
Most search engines treat domain.com pages and www.domain.com pages as two different sites, which can split your link equity. Google will even let you set your preferred domain. To avoid this problem, add the following code to your ".htaccess" file:

domain.com to www.domain.com
Options +FollowSymlinks
RewriteEngine on
RewriteCond %{HTTP_HOST} ^domain.com [NC]
RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,NC]

28. Canonicalization: Home Page / Section Pages | Tool: Manual | Importance: * * * *
Many platforms serve the same page at multiple URLs; the most common cases are the home page and section pages.

Example:
http://www.domain-name.com
http://www.domain-name.com/index.html

http://www.domain-name.com/news/
http://www.domain-name.com/news/index.html

The best approach is to always redirect the URLs ending in an index file name to the URLs ending with a slash; this will also save you future headaches when switching platforms. To do so, add the following code to your ".htaccess" file:

RewriteCond %{THE_REQUEST} ^[^/]*/index\.html [NC]
RewriteRule . / [R=301,L]

29. HTTPS Link Split | Tool: Manual | Importance: * * * *
Make sure to use absolute links (http://www.domain.com/page.php) rather than relative links (/page.php). On secure pages, relative links will make spiders crawl https versions of your non-secure pages, creating a duplicate copy of your whole site.
You can also exclude all your https pages in your robots.txt file.

Other

30. Home Page: PageRank 0 or Not Available | Tool: Manual | Importance: * * * * *
If your site has been online for more than a year and/or you have strong, quality incoming links, yet your home page PageRank is still zero, this may indicate a Google penalty. Investigate your site and search for any black hat tactics that may be implemented on it (with or without your knowledge). Also check whether any of your external outgoing links point to bad neighborhoods.

31. Robots.txt File | Tool: Manual | Importance: * * * * *
You would be surprised how many sites have blocked spiders by mistake. I have even encountered system admins who blocked spiders to fix server performance issues! Make sure you check your robots.txt file regularly. You can also view blocked pages in Google Webmaster Tools and verify that you haven't blocked them unintentionally.
Some of the pages you may want to block: printer-friendly pages, https pages, sort pages, etc.
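A minimal robots.txt along those lines might look like this (the paths are hypothetical; replace them with your own printer-friendly and sort URL patterns):

```
User-agent: *
Disallow: /print/
Disallow: /search/sort/
```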

32. Visibility | Tool: Manual | Importance: * * * * *
Check whether you are indexed in the search engines, see how many pages are indexed, and compare that to how many pages you actually have. If you are not indexed and your site is not new, you might have been banned; check whether any black hat SEO tactics were applied to your site.
Detection: use the following command to check if your site is indexed in search engines:
site:your-domain-name.com

33. XML Spider Sitemap | Tool: Manual | Importance: * * * *
If your site has more than 100 pages, or if you produce content on a frequent basis, you should consider creating an XML Sitemap.
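An XML Sitemap follows the sitemaps.org protocol; here is a minimal one-URL sketch (the URL and dates are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.your-domain.com/jersey-city/condos-for-sale/</loc>
    <lastmod>2011-01-15</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
```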

34. HTML Site Map | Tool: Manual | Importance: * * * *
Every site should have a comprehensive HTML site map page containing hard-coded direct links to all pages on the site. This:
– lets spiders easily find and crawl all your pages,
– puts all your pages 2 clicks away from the home page,
– and prevents orphaned pages with no incoming links.

35. Custom 404 Page | Tool: Manual | Importance: * * *
You should create a custom 404 page that provides the following elements:
– An apology message ("sorry, the page you are looking for was not found")
– Suggestions of related content
– A simple search form
– A simple HTML site map of your most important pages
Google recently launched an enhanced 404 widget that recommends pages to users.

36. Popup Windows | Tool: Manual | Importance: * *
Many browsers and toolbars block popup windows by default; make sure you open links in a new window instead of a popup. Besides, popups are irritating.

37. Page Size | Tool: Manual | Importance: * * *
It is recommended not to exceed 150 – 200K per page, including images and external files. Large pages can end up only partially indexed.

38. Check Server Header (Server Response) | Importance: * * *
It is always good to check whether your server is returning the correct codes. Here are the HTTP status codes a web server can return:
200 OK
301 Moved Permanently
302 Found
304 Not Modified
307 Temporary Redirect
400 Bad Request
401 Unauthorized
403 Forbidden
404 Not Found
410 Gone
500 Internal Server Error
501 Not Implemented
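A quick way to fetch and label these codes with Python's standard library (a sketch; fetch_status needs network access, so run it against your own host):

```python
import http.client

STATUS_NAMES = {
    200: "OK", 301: "Moved Permanently", 302: "Found", 304: "Not Modified",
    307: "Temporary Redirect", 400: "Bad Request", 401: "Unauthorized",
    403: "Forbidden", 404: "Not Found", 410: "Gone",
    500: "Internal Server Error", 501: "Not Implemented",
}

def describe_status(code: int) -> str:
    """Translate a status code into the names listed above."""
    return STATUS_NAMES.get(code, "Unknown")

def fetch_status(host: str, path: str = "/") -> int:
    """Issue a HEAD request and return the raw status code."""
    conn = http.client.HTTPConnection(host, timeout=10)
    conn.request("HEAD", path)
    status = conn.getresponse().status
    conn.close()
    return status

# e.g. fetch_status("www.your-domain.com") should return 200 for a healthy home page
```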

39. Accessibility | Tool: Manual | Importance: * * *
Make sure your links have TITLE attributes: <a href="http://www.domain.com/page.php" title="describe the destination page">. Also make sure your meaningful visible images have ALT attributes: <img src="http://www.domain.com/images/image.jpg" alt="text describing the image">. You should include some keywords in these attributes, but don't overdo it.

40. Number of Links on a Page | Tool: Manual | Importance: * * *
Per Google's Webmaster Guidelines, don't include more than 100 links on a page. The more links you have on a page, the less link juice you pass to each destination page.

41. Broken Links | Importance: * * *
Each broken link counts against your site; search engines want to send users to reliable sites with a good user experience. Check your links and fix all broken ones.
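If you want to roll your own checker rather than use a tool, the first step is collecting every href so you can request each one and flag bad responses. A sketch using Python's standard html.parser:

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect href targets from <a> tags; request each one afterwards
    and flag any that return 404/410/500."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def extract_links(html: str) -> list:
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links
```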

42. Text to Code Ratio | Importance: * * *
Try to have a text to code ratio of at least 20 – 25%. The more text you have, the better your results: more text means better density and prominence for your keywords.
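Several online tools compute this ratio; a rough Python approximation (scripts, styles, and tags are stripped with regexes, so treat the number as an estimate):

```python
import re

def text_to_code_ratio(html: str) -> float:
    """Visible text length divided by total source length."""
    # drop script/style blocks entirely, then strip the remaining tags
    no_scripts = re.sub(r"(?is)<(script|style)\b.*?</\1>", "", html)
    text = re.sub(r"(?s)<[^>]+>", "", no_scripts)
    return len(text.strip()) / len(html) if html else 0.0
```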

43. Other File Types | Tool: Manual | Importance: * * *
If your site contains alternative file types (such as PDFs), make sure to optimize those files as well. For PDFs, click File > Properties and enter a title, description, keywords, etc. To check whether your PDFs are indexed in Google, use the following command:
site:your-domain.com filetype:pdf

Avoid

44. Meta Refresh | Tool: Manual | Importance: * * * * *
Some search engines consider meta refresh tags a spamming technique, while others treat a meta refresh as a redirect. You should be on the safe side and use on-page or server-level 301/302 redirects.

45. Hidden Text | Tool: Manual | Importance: * * * * *
Search engines consider hidden text a black hat SEO technique that may result in a complete ban. Hidden text can be any of the following:
– White text on a white background (exact or similar colors of text and background)
– Text placed off-screen using CSS
– Text hidden with the display:none CSS style
– Very tiny text
– Text behind an image
– Text far below the fold with a large amount of white space in between

Any text that is not easily viewed by users is considered hidden text. Use Select All (Ctrl + A) to find potentially hidden text.
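Some of these tricks leave fingerprints in the CSS that you can grep for. A sketch (the patterns are heuristics I chose for illustration; they flag candidates for manual review, not proof of spam):

```python
import re

HIDDEN_PATTERNS = [
    r"display\s*:\s*none",          # hidden blocks
    r"visibility\s*:\s*hidden",     # invisible but space-reserving text
    r"text-indent\s*:\s*-\d",       # text pushed off-screen
    r"font-size\s*:\s*[01]px",      # very tiny text
]

def hidden_text_suspects(html: str) -> list:
    """Return the hidden-text patterns that appear in the page source."""
    return [pattern for pattern in HIDDEN_PATTERNS
            if re.search(pattern, html, flags=re.IGNORECASE)]
```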

46. Hidden Links | Tool: Manual | Importance: * * * * *
Hidden links are considered a black hat technique, similar to hidden text. Don't use single-pixel links or any other way of hiding your links.

47. Spamming | Tool: Manual | Importance: * * * * *
NO domain spamming.
NO keyword stuffing.
NO keywords in NOSCRIPT tags.

48. Over Optimization | Tool: Manual | Importance: * * * * *
I wrote an extensive guide on the different items to check for the Over Optimization Penalty (OOP).

I hope this list is helpful. I will keep updating it and adding more check points.
