Robots.txt file guide

By David Callan

Every search engine ranks pages using its own algorithm, and it's for this reason that some people like to optimize pages for each particular search engine. Usually these pages are only slightly different, but that slight difference can make all the difference when it comes to ranking high. However, because search engine spiders crawl through sites indexing every page they can find, they might come across your search-engine-specific optimized pages and notice that they're very similar. The spiders may then think you're spamming and do one of two things: ban your site altogether or severely punish you in the form of lower rankings.

So what can you do to, say, stop Google indexing pages that are meant for Altavista? The solution is really quite simple, and I'm surprised that more webmasters who optimize for each search engine don't use it. It's done using a robots.txt file which resides on your webspace.

A robots.txt file is a vital part of any webmaster's battle against getting banned or punished by the search engines if he or she designs different pages for different search engines. The robots.txt file is just a simple text file, as the file extension suggests. It's created using a simple text editor like Notepad or Wordpad; complicated word processors such as Microsoft Word will only corrupt the file.

Here's the code you need to insert into the file. The User-Agent: and Disallow: parts are compulsory and never change, while the parts in brackets you'll have to change to suit the spider and the file which you want it to avoid:
User-Agent: (Spider Name)
Disallow: (File Name)

The User-Agent is the name of the search engine's spider and Disallow is the name of the file that you don't want that spider to crawl. I'm not entirely sure whether the code is case sensitive or not, but I do know that the code above works, so just to be sure check that the U and A are in caps, and likewise the D in Disallow. You have to start a new batch of code for each engine, but if you want to list multiple disallowed files you can place them one under another. For example, here's a batch for Slurp (Inktomi's spider) -

User-Agent: Slurp
Disallow: /internet-marketing-gg.html
Disallow: /advertising-secrets-gg.html
Disallow: /internet-marketing-al.html
Disallow: /advertising-secrets-al.html

In the above code I have disallowed Inktomi from spidering two pages optimized for Google (internet-marketing-gg.html & advertising-secrets-gg.html) and two pages optimized for Altavista (internet-marketing-al.html & advertising-secrets-al.html). If Inktomi were allowed to spider these pages as well as the pages specifically made for it, I'd run the risk of being banned or penalized, so it's always a good idea to use a robots.txt file.
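Since each engine gets its own batch, a complete robots.txt for the scheme above might look something like this sketch. Googlebot (Google's spider) and Scooter (Altavista's spider) are real User-Agent names, and the idea is to keep each engine away from the pages optimized for the others:

User-Agent: Googlebot
Disallow: /internet-marketing-al.html
Disallow: /advertising-secrets-al.html

User-Agent: Scooter
Disallow: /internet-marketing-gg.html
Disallow: /advertising-secrets-gg.html

User-Agent: Slurp
Disallow: /internet-marketing-gg.html
Disallow: /advertising-secrets-gg.html
Disallow: /internet-marketing-al.html
Disallow: /advertising-secrets-al.html

A blank line separates one batch from the next, and each batch applies only to the spider named in its User-Agent line.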
I mentioned earlier that the robots.txt file resides on your webspace, but where on your webspace? The root directory, that's where - the file should be reachable at www.yoursite.com/robots.txt. If you upload it to a sub-directory it won't work. If you want to block certain engines from certain files that do not reside in your root directory, you simply point to the right directory and then list the file as normal. For example (the directory name here is just for illustration) -

User-Agent: Slurp
Disallow: /articles/internet-marketing-gg.html
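As a side note, you don't have to list such files one by one. Ending the Disallow value at the directory keeps the spider out of everything inside it - a sketch, again assuming the illustrative /articles/ directory:

User-Agent: Slurp
Disallow: /articles/

This is handy if, say, you keep all the pages meant for one engine together in a single folder.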
Here are the spider names of a few of the big engines; do a search for 'search engine user agent names' on Google to find more.

Google - Googlebot
Altavista - Scooter
Inktomi - Slurp
Excite - ArchitextSpider

Be sure to check over the file before uploading it, as you may have made a simple mistake, which could mean your pages being indexed by engines you don't want indexing them or, even worse, none of your pages being indexed at all.
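That last disaster is easier to cause than you might think: a Disallow line with no value after it means 'disallow nothing', while a lone forward slash means 'disallow everything'. A quick sketch of the difference, using Googlebot as the example spider:

User-Agent: Googlebot
Disallow:

The empty Disallow above lets Googlebot spider every page on the site. Change it to the line below, however, and Googlebot is locked out of the whole site:

User-Agent: Googlebot
Disallow: /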
A little note before I go: I have listed the User-Agent names of a few of the big search engines, but in reality it's not worth creating different pages for more than six or seven search engines. It's very time consuming, and the results would be similar to those you'd get by creating different pages for only the top five. More is not always best.