Wednesday, July 22, 2009

Controlling Search Engine Spiders for Improved Rankings

This is your SiteProNews/ExactSeek Webmaster Newsletter!

JULY 22, ISSUE #1265

Site of the Day

DoNanza bills itself as the world's biggest search engine for online freelance projects. It aggregates hundreds of freelance marketplaces into one place. If you are looking for jobs to match your expertise, start here.

Does your web site qualify as a SPN Site of the Day? Webmaster resource sites can apply via email: sotd@sitepronews.com

App of the Day

SocialSeek is a social monitoring tool that you can use to find references about yourself or others in blogs, tweets, videos, images, etc. Search by topic and city. Get real-time automatic updates and notifications. Track chatter by city or anywhere. Compare topics, generate quick charts and export results to CSV and PDF. Freeware for Windows and Mac.

If you have a Webmaster App that you would like listed on the SPN site, send us an email with details to: wapps@sitepronews.com



Controlling Search Engine Spiders for Improved Rankings

By Eric Johnson (c) 2009


When it comes to getting your website listed at the top of the search engines' keyword rankings, it is essential to gain a deeper understanding of the search engine spiders that crawl your website. After all, it is the spiders that determine the relevance of your website and decide where your site will land on the search engine results page. By learning how to direct the spiders, you can help your website rise in the rankings.

Gaining Control with the Help of Robots.txt

You may think that gaining control of search engine spiders is an impossible task, but it is easier than you might expect thanks to a handy little tool called the robots.txt file. With the robots.txt file, you can direct the spiders to the most important pages on your website while preventing them from wasting time on more obscure pages, such as your About Us and Privacy Policy pages. These pages won't do much to improve your search engine ranking and won't help your target market find your website, so why should the spiders spend their limited crawl time on them when ranking your site?


Another benefit of using a robots.txt file is that it prevents the spiders from indexing duplicate pages. This matters because duplicate content can actually reduce your search engine ranking. So, while you are making changes to your website or working on an area that isn't fully developed yet, you can instruct the spiders to leave those pages alone until you are ready for them to be crawled. The same is true if you have a blog on your website: a blog post created in WordPress will show up on the main post page, on an archive page, on a category page and on a tag page. With the help of the robots.txt file, you can instruct the spiders to look only at the main post page.
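For the WordPress case just described, a minimal robots.txt sketch might look like the following. The path prefixes shown are WordPress defaults and will vary with your permalink settings, so treat them as assumptions to adapt rather than rules to copy:

     User-agent: *
     Disallow: /category/
     Disallow: /tag/
     Disallow: /2009/

This keeps the category, tag and date-archive copies of each post out of the index while the post itself remains crawlable.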

With your robots.txt file, you can tell the search engine spiders which pages they should and should not crawl and index. Keep in mind, however, that the robots.txt tool is meant to prevent search engine spiders from crawling certain pages. Therefore, you only need rules for the pages you don't want the spiders to crawl.

Implementing the Robots.txt Tool

To successfully use the robots.txt tool, you first need to determine which pages you don't want the spiders to search. Then, slowly begin making the changes to your site. By using the tool on only one or two pages at a time, you will be better capable of identifying mistakes that you may have made during the process.

To make your changes, you will need to add the robots.txt file to the root directory of your domain or of your subdomains; adding it to a subdirectory will not work. For example, you may place the file at http://domain.com/robots.txt or at http://privacypolicy.domain.com/robots.txt, but placing it at a subdirectory URL such as http://www.domain.com/privacypolicy/robots.txt will not work. With just one robots.txt file in your root directory, you can manage your entire site. If you have subdomains, however, you will need a robots.txt file for each one you want to manage. You will also need separate robots.txt files for your secure (https) and nonsecure (http) pages.

Creating a Robots.txt File

Creating a robots.txt file is a relatively simple process: create a plain text file named robots.txt in any text editor, such as TextPad, Notepad or Apple TextEdit. Your robots.txt file only needs to contain two lines to be effective. If you wanted to stop the spiders from crawling the archives of the blog on your site, for example, you would add the following to your robots.txt file:

     User-agent: *
     Disallow: /archives/
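If you want to sanity-check a rule before uploading it, Python's standard urllib.robotparser module evaluates simple prefix rules like this one the same way a compliant crawler would. The archive and blog paths below are illustrative, not taken from any real site:

```python
# Sketch: verify a robots.txt rule with Python's standard library.
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /archives/",
])

# Archive pages are blocked for every crawler;
# other paths remain crawlable.
print(rp.can_fetch("Googlebot", "/archives/2009/07/"))  # False
print(rp.can_fetch("Googlebot", "/blog/latest-post"))   # True
```

Note that urllib.robotparser implements the original robots.txt convention of plain prefix matching; it does not understand the wildcard patterns discussed later in this article.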



The "User-agent" line defines which search engine spiders the rule applies to. By placing an asterisk (*) here, you are instructing all search engine spiders to avoid the specified pages. You can, however, target a specific search engine spider by replacing the asterisk with its user-agent name:

     * Google - Googlebot

     * Yahoo - Slurp

     * Microsoft - msnbot

     * Ask - Teoma

The "Disallow" line specifies which part of the site you want the spiders to ignore. So, if you want the spiders to ignore the categories portion of your blog, for example, you would replace "archives" with "category", and so on. If you want the spiders to ignore multiple sections, simply add a new "Disallow" line for each area. Just as you can name areas that you want all spiders to avoid, you can also list areas that you want particular spiders to view. For example, while you may want most spiders to avoid a specific area, you may want the MSN media bot, the Google image bot or the Google AdWords bot to visit it. In that case, you can use the asterisk to instruct all search engines to avoid the area while explicitly allowing a specific spider into that same area. If you want Google's AdSense bot (Mediapartners-Google) to access a folder, for example, you would create the following rules:

     User-agent: *
     Disallow: /folder/

     User-agent: Mediapartners-Google
     Allow: /folder/
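This pairing can likewise be sanity-checked with Python's standard urllib.robotparser; the folder and page names here are placeholders:

```python
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /folder/",
    "",
    "User-agent: Mediapartners-Google",
    "Allow: /folder/",
])

# Ordinary crawlers fall under the * group and are blocked,
# while the AdSense bot matches its own group and is allowed in.
print(rp.can_fetch("Googlebot", "/folder/page.html"))            # False
print(rp.can_fetch("Mediapartners-Google", "/folder/page.html")) # True
```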

You can also use your robots.txt files to prevent dynamic URLs from being indexed by the search engine spiders. You can accomplish this with the following template:

     User-agent: *
     Disallow: /*&


With this command, you are telling the spiders to skip any URL that matches the pattern, so only one version of the page gets indexed. For example, if you had the following dynamic URLs:

     * /greatcars/details.php?propcode=ANCHORS&SRCH=tr

     * /greatcars/details.php?propcode=ANCHORS&vr=1

     * /greatcars/details.php?propcode=ANCHORS

Your robots.txt instructions will tell the spiders to list only the third example, because the rule disallows any URL that contains the & symbol anywhere after the leading forward slash (/). You can use the same strategy to block any URL containing a question mark:

     User-agent: *
     Disallow: /*?

Or, you can block all directories that contain a specific word in the URL. For example, you might create a robots.txt file such as the following:

     User-agent: *
     Disallow: /corvette*/

With this command, any page whose URL path begins with the word "corvette" (for example, /corvette/ or /corvettes-for-sale/) will not be crawled by the spiders. It is important to use caution with these wildcard directives, however, as they cause the spiders to avoid every matching page; you may accidentally block pages that you do want indexed. If you want to block all but one or two pages whose URLs contain a specific word, you can add an "Allow" line for each page that should still be indexed. In that case, your robots.txt file would look something like this:

     User-agent: *
     Disallow: /corvette*/
     Allow: /greatcars/corvettesandvipers/details.html

It is also possible for you to instruct the spiders to avoid an entire folder on your website while still allowing it to access specific pages within that folder.
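Because wildcard support is an extension to the original robots.txt convention (Python's standard parser, for one, ignores it), it can help to reason about these patterns explicitly. The sketch below is only a rough approximation of how wildcard-aware crawlers such as Googlebot match them — * matches any run of characters and matching is anchored at the start of the path — and is not any engine's actual implementation:

```python
import re

def robots_pattern_matches(pattern: str, path: str) -> bool:
    """Approximate wildcard matching for robots.txt Disallow patterns:
    '*' matches any sequence of characters, '$' anchors the end of the URL.
    An illustration only, not a crawler's real matcher."""
    regex = ""
    for ch in pattern:
        if ch == "*":
            regex += ".*"   # wildcard: any run of characters
        elif ch == "$":
            regex += "$"    # end-of-URL anchor
        else:
            regex += re.escape(ch)
    # re.match anchors at the start of the path, like robots.txt rules
    return re.match(regex, path) is not None

# The /*& rule blocks the first two dynamic URLs but not the third:
print(robots_pattern_matches("/*&", "/greatcars/details.php?propcode=ANCHORS&vr=1"))  # True
print(robots_pattern_matches("/*&", "/greatcars/details.php?propcode=ANCHORS"))       # False
# /corvette*/ blocks any directory whose name starts with "corvette":
print(robots_pattern_matches("/corvette*/", "/corvettes/index.html"))                 # True
```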


Read the rest of Eric's article "Controlling Search Engine Spiders..." at:

http://www.sitepronews.com/2009/07/21/controlling-search-engine-spiders...

About The Author
Please visit TopClickMedia for any kind of SEO help.







Top Webmaster Headlines

  • Mozilla Releases Firefox 3.7 Mockups
  • Microsoft/Yahoo Search Deal Today? Not So Fast
  • Microsoft To Shut Down YouTube Wannabe Soapbox
  • 100,000 users to get Google Wave this fall
  • Gmail stifles marketing email deliverability
  • Can Barnes & Noble Challenge Amazon's eBook Empire?

Breaking Blog News

  • DOJ vs Google - Another Microsoft Fiasco?
  • Yahoo Unveils New Home Page
  • Search Spend Projected to Grow Despite Fears
  • Further Study on the Value of Image Optimization
  • Google News Drops RSS Buttons Due To Spam?
  • How To Research, Create And Distribute Highly-Linkable Content

Recent Articles Posted on SiteProNews.com

How Glossaries and FAQs Can Improve Search Engine Rankings  By Ross Dunn - In this article I will account for a couple of (ranking) techniques that appear to be overlooked by many but have proven time and time again to work; the creation of an on-site glossary and frequently asked questions (FAQ) section.

How To Search Engine Optimize (SEO) an AJAX or Web 2.0 Site  By Daryl Quenet - All of the major search engine ranking algorithms have components that relate to the content that is contained on the website. Typically these components relate to Keyword Densities, number of words, content location, and sometimes age of content

How to Lose a Prospect's Attention in 5 Seconds or Less  By Kelley Robertson - When you make contact with a new prospect - either by telephone or in a face-to-face meeting - you have an extremely short window of time to connect with them. If you fail to achieve this, they will quickly tune you out.






Jayde Online, Inc.
Suite 238, 23-845 Dakota Street
Winnipeg, MB R2M 5M3
(c) Copyright 2009 All rights reserved
