A client recently had an issue where a woman's home address and phone number were being listed in the Google search results. It was imperative to get this info out of the search results immediately, but they couldn't work out what was going on, as the file containing her info no longer seemed to be on the site. They asked my advice, and this article is about my response.
Firstly, Google holds on to information long after it has gone from your site. It's called the "Google Cache": Google stores a copy of each page as it looked when it was last crawled, so even if a page doesn't have that info on it right now, the old copy can still turn up through Google's search results until Google gets around to crawling the page again. A name that doesn't appear very often on the web will bring up very old instances, because nothing newer is competing with them in the results.
See
http://www.googleguide.com/cached_pages.html
Secondly, Google indexes EVERYTHING it can reach on your website unless you explicitly tell it not to. To tell the Google/Yahoo spiders NOT to index something, you have to set up exclusion mechanisms (a robots.txt file, or "noindex"/"nofollow" robots meta tags) for the directories and pages where sensitive info is located. Better still, never put sensitive info into publicly web-accessible databases or files at all, because even if a file is blocked this way, if someone else links to the file or web page from their own website, the URL may still get into Google's indexes.
See
http://www.google.com/support/webmasters/bin/answer.py?hl=en&an...
Other useful resources:
http://www.stormthecastle.com/earning-revenue/make-a-robots-txt-fil...
More advanced (but heavily Google-centric):
http://tools.seobook.com/robots-txt/
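To make the robots.txt idea a bit more concrete, here is a minimal sketch in Python that checks whether a given set of robots.txt rules would keep a well-behaved spider out of a directory. The /private/ directory, the example.com URLs and the is_crawlable helper are all made up for illustration; only urllib.robotparser comes from the Python standard library.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents: the /private/ directory is an assumption,
# standing in for wherever the sensitive files live on your own site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if these robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Both URLs are illustrative only.
    for url in (
        "http://www.example.com/private/addresses.html",
        "http://www.example.com/index.html",
    ):
        print(url, "crawlable" if is_crawlable(ROBOTS_TXT, url) else "blocked")
```

Bear in mind this only tells you what a polite spider is asked to do. As mentioned above, a blocked URL can still end up in the index if another site links to it, so a check like this is no substitute for keeping sensitive files off the public web.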
Thirdly, getting Google to stop showing some info in its results can be really hard. You would have to go here
http://www.google.com/quality_form and report that you want something removed.
But then, depending upon the member of staff who deals with your request, you may have to prove that you are not infringing anyone's right to free speech or freedom of information before it is removed. Google's cache is a bit too powerful!!!! ;-) Big brother=GOOGLE!!!