Telling a search engine not to index a page sounds relatively easy. According to the specs, all you need to do is add the NOINDEX value to the robots meta tag on that page. Sounds like you’re all set. But you might not be.
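For reference, here is what that tag looks like. It goes in the page’s head section (this snippet is just an illustrative sketch, not taken from any particular site):

```html
<!-- Place inside the <head> of the page you want kept out of the index -->
<meta name="robots" content="noindex">
```

Google also supports delivering the same directive as an X-Robots-Tag HTTP header, which is handy for non-HTML files like PDFs.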
Why? It turns out that if you have also used your Robots.txt file to tell Google not to crawl the pages you have NOINDEXed, Google will never see the meta tags at all. In Google’s processing, Robots.txt takes precedence over the robots meta tag: a page that is blocked from crawling is never fetched, so the NOINDEX directive on it is never read. Translation: the robots meta tag is effectively ignored if you exclude the page from crawling via Robots.txt.
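This is the kind of conflicting setup that causes the problem (the path here is purely hypothetical):

```
# robots.txt — this Disallow blocks Google from fetching the page,
# so a noindex meta tag on /private-page.html is never seen
User-agent: *
Disallow: /private-page.html
```

Remove the Disallow line for that page and Google can crawl it, read the NOINDEX directive, and keep it out of the index.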
So even though the page is not crawled, it can still be indexed if other pages link to it; Google may list the URL without ever having read its content. If you want to prevent a page from being indexed, use only the robots meta tag, and make sure Robots.txt does not block crawling of that page, and you should be off to the races.
Eric Enge leads the Digital Marketing practice for Perficient Digital. He designs studies and produces industry-related research to help prove, debunk, or evolve assumptions about digital marketing practices and their value. Eric is a writer, blogger, researcher, teacher, and keynote speaker and panelist at major industry conferences. Partnering with several other experts, Eric served as the lead author of The Art of SEO.