Friday, November 29, 2013

Create & add custom robots.txt file in Blogger – Crawl & index

Have you ever heard of robots.txt, or have you added your own custom robots.txt file to your Blogger blog? Here we will discuss what robots.txt is and how to create and add a custom robots.txt file in Blogger. All of this deals with the crawling and indexing of your blog, which affects your blog's SEO, so take note of this article.

In Blogger you have a section called Search preferences that covers your blog's meta tags, errors/redirection, and crawling and indexing. There you can manage how your blog appears to search engines, and to maintain a healthy blog you need to configure your blog's search preferences effectively. A few posts back we wrote an article on how to use Blogger custom redirects. If you have broken links in your blog, or if you changed a blog post URL, you can make use of Blogger custom redirects. Here let's look at robots.txt and the use of adding a custom robots.txt file in Blogger.


What is robots.txt?
Search engines like Google send out spiders or crawlers, programs that travel all around the web. When these crawlers reach your site, they first go through your robots.txt file to check the robots exclusion rules before crawling and indexing your pages.
Robots.txt is a plain text file, available on most websites, that a webmaster uses to advise these crawlers about accessing particular pages on a site. The pages that are restricted in your robots.txt file won't be crawled and indexed in search results. However, those pages are still viewable publicly by normal visitors.

Each Blogger blog has a robots.txt file that comes by default, and it looks something like the one below. You can check your blog's robots.txt file by adding /robots.txt after your domain name. (http://yoursite.blogspot.com/robots.txt)
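For reference, a typical default Blogger robots.txt at the time of writing looks something like this (with yoursite standing in for your own blog's address):

User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Allow: /

Sitemap: http://yoursite.blogspot.com/feeds/posts/default?orderby=UPDATED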

So what are these?
As you can see above, the default robots.txt file has a few entries: User-agent: Mediapartners-Google, User-agent: *, Disallow, Allow and Sitemap. If you are not familiar with these, here is the explanation.
First, you need to know what a user agent is: a software agent or piece of client software that acts on your behalf.

Mediapartners-Google – Mediapartners-Google is the user agent for Google AdSense, which is used to serve more relevant ads on your site based on your content. If you disallow it, you won't be able to see any ads on your blocked pages.

User-agent: * – Now you know what a user agent is, so what is User-agent: *? A user-agent marked with an asterisk (*) applies to all crawlers and robots, whether Bing robots, affiliate crawlers or any other client software.

Disallow: By adding a Disallow rule you are telling robots not to crawl and index the matching pages. Below User-agent: * you can see Disallow: /search, which means your blog's search results are disallowed by default. You are blocking crawlers from the /search directory that comes right after your domain name. That is, a search page like http://yoursite.blogspot.com/search/label/yourlabel will not be crawled and never be indexed.

Allow – Allow: / refers to your blog's root and means you are specifically allowing search engines to crawl those pages.
Sitemap: The sitemap helps crawlers find and index all your accessible pages, and so in the default robots.txt you can see that your blog specifically points crawlers to your sitemap. You can learn more about the Blogger sitemap here. There is an issue with the default Blogger sitemap, so learn how to create a sitemap in Blogger and notify search engines.

What pages should I disallow in Blogger?
This question is a little tricky, and we cannot predict which pages to allow and which to disallow on your blog. You can disallow pages like your privacy policy, terms & conditions, cloaked affiliate links, labels and search results; it all depends on you. That said, since these pages can bring you some reasonable traffic from search results, it is generally not recommended that you disallow your labels, privacy policy and TOS pages.

How to disallow pages in Blogger using robots.txt
You can prevent search engines from crawling and indexing particular posts or pages in Blogger using your robots.txt file.
There is usually no reason to block search engines from a particular post, but if you wish to, just add Disallow: /year/month/your-post-url.html to your robots.txt file. That is, copy the part of your post URL that comes after your domain name and add it to your robots.txt file.
The same goes for disallowing a particular page: copy the page URL that comes after your domain name and add it like this: Disallow: /p/your-page.html.
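For instance, a custom robots.txt that keeps the default rules and additionally blocks one hypothetical post and one hypothetical page would look like this (the post and page URLs here are placeholders, so substitute your own):

User-agent: *
Disallow: /search
Disallow: /2013/11/your-post-url.html
Disallow: /p/your-page.html
Allow: /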

The best and recommended robots.txt file for Blogger
Only use a custom robots.txt file if you are 100% sure of what you are doing. Improper use of a custom robots.txt can harm your site's rankings. So for best results it is recommended that you stick with the default robots.txt rules in Blogger, which work well, but change the default sitemap line to your custom Blogger sitemap.
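Putting that together, a custom robots.txt that keeps the default rules but swaps in the full-feed sitemap described in the next article might look like this (again, replace yoursite with your own blog name):

User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Allow: /

Sitemap: http://yoursite.blogspot.com/atom.xml?redirect=false&start-index=1&max-results=500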

How to create and add custom robots.txt file in Blogger
In WordPress you have to create a robots.txt file in Notepad and upload it to the web root directory, but in Blogger you can add the robots.txt file easily from your blog dashboard. To add a custom robots.txt, just log in to your Blogger profile and select your blog. Now head to Dashboard >> Settings >> Search preferences, and you will see Custom robots.txt in the Crawling and indexing section. Click Edit, enable custom robots.txt content and add your robots.txt file.


Once done, click Save changes. Now, to check your robots.txt, just add /robots.txt at the end of your blog URL and you will see your custom robots.txt file. After adding your custom robots.txt file you can submit your blog to search engines. Learn how to submit your blog to Google, Bing and Yahoo.
Hope this post clearly explained what robots.txt is and how to create and add a custom robots.txt file in Blogger. Please share it, and if you have any other doubts about Blogger robots.txt, feel free to comment below.

How to create sitemap for Blogger blog – Blogger sitemap XML


A sitemap is nothing but a list of the accessible pages on your website. Sitemaps help search engines like Google, Yahoo and Bing easily crawl the pages on your site, which helps them index it better. As a blogger you should create a sitemap so that whenever you publish a new post, search engines can crawl and index it easily. In this post let's see how to create a sitemap for a Blogger blog and submit it to Google Webmaster Tools and in robots.txt.

 

Default Blogger sitemap

By default your Blogger blog has a sitemap, but the issue with that sitemap is that it only lists your most recent blog posts (example). A proper sitemap should contain a list of all your pages so that search engines know the complete structure of your site. Let's see how to create a sitemap for a Blogger blog.

 

How to create sitemap for Blogger

Creating a sitemap is very simple, and this sitemap works for both self-hosted Blogger blogs and normal Blogger blogs. Just add atom.xml?redirect=false&start-index=1&max-results=500 next to your blog's URL. See the example below.
http://yourblogname.blogspot.com/atom.xml?redirect=false&start-index=1&max-results=500
Now you have created a sitemap for your Blogger blog, but you need to tell search engines about it so that bots can learn your site structure. There are two ways to tell search engines about your sitemap:
1. Showing your sitemap in your robots.txt file
2. Submitting your sitemap in Google Webmaster Tools

 

Adding your Blogger sitemap in robots.txt file

Log in to your Blogger blog and go to Dashboard >> Settings >> Search preferences and edit the Custom robots.txt section. Enable it, paste the following text and click Save changes. Make sure to change the blog name in the Blogger sitemap below.

User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow:
Allow: /

Sitemap: http://debaonline4u.blogspot.com/atom.xml?redirect=false&start-index=1&max-results=500
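Note that this feed returns at most 500 posts. If your blog has more than 500 posts, one approach (assuming Blogger's usual feed pagination by start-index) is to add a second Sitemap line that starts where the first one leaves off:

Sitemap: http://yoursite.blogspot.com/atom.xml?redirect=false&start-index=501&max-results=500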



So whenever search engines crawl your site, they will access your Blogger sitemap from robots.txt.

 

Submitting your Blogger sitemap in Google webmaster tools

Log in to your Google Webmaster Tools account and select your website. In your site dashboard, click Sitemaps. Now click Add/Test Sitemap, enter your sitemap (just the atom.xml?redirect=false&start-index=1&max-results=500 part) and submit it.


That's it, you have successfully submitted your sitemap to Google Webmaster Tools, and it's now ready for crawling and indexing. You can find the number of pages crawled and indexed in Google Webmaster Tools.



If you like, you can also submit your Blogger sitemap to Bing Webmaster Tools.
Hope this helped you learn how to create a sitemap for Blogger and submit it in robots.txt and Google Webmaster Tools. Please leave your comments below.






Monday, November 25, 2013

Inspirational Stories #128 - Aesop's tales # 13 - A Man and his two Wives


In the old days, when men were allowed to have many wives, a middle-aged Man had one wife that was old and one that was young; each loved him very much, and desired to see him like herself.
 
Now the Man's hair was turning grey, which the young Wife did not like, as it made him look too old for her husband. So every night she used to comb his hair and pick out the white ones. But the elder Wife saw her husband growing grey with great pleasure, for she did not like to be mistaken for his mother. So every morning she used to arrange his hair and pick out as many of the black ones as she could. The consequence was the Man soon found himself entirely bald.

Moral: Yield to all and you will soon have nothing to yield.