Security Through Obscurity is Not Very Secure
The Web is very useful for passing information to and from people all around the world. But if you are putting up information that you don't want the world to see, you need to do more than just hide it in a secret directory. There are many ways that a search engine can find files that are hidden and unlinked, such as:
- Search engine spiders can fill out forms and spider the results pages
- They can also read referrer information to see where a visitor has come from. This means that if someone visits your hidden page and then goes on to Google, Google can learn the address of that page and add it to its index.
- Even if you don't link to the page, if someone else does, then eventually the Googlebot will find the link and spider the page.
- In the same way, even if you don't use a search engine add URL page, someone else might add your hidden page for you.
Everything That is Not Actively Hidden Can be Found by a Search Engine
Everything that is stored on your website in public (non-password protected) directories is visible to a robot or search engine. At first blush, this might sound like a good thing. After all, isn't one of the goals of Web development to create pages and sites that are found and spidered by search engines? But you might be surprised at what search engines are now finding and including in their indexes.
Google and other search engines have tools that search the Web for specific files and file types. And they don't just search the file names. Many of these file types are indexable, meaning the search engine can read the contents and index them as well. Even text in images is soon going to be indexable by search engines.
So if you have secret or private information in any of the following file types, you should not rely on security through obscurity to protect it.
- Acrobat (.pdf)
- PostScript (.ps)
- Word Documents (.doc)
- Excel Spreadsheets (.xls)
- PowerPoint Presentations (.ppt)
- Rich Text Format (.rtf)
- Flash (.swf and .fla)
- Images (.gif, .jpg, .png, and others)
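One way to act on the list above is to check what is already sitting in your public directories. The following is a minimal Python sketch, not a complete audit tool: it walks a directory tree and reports every file whose extension appears in the list. The function name find_indexable_files and the idea of pointing it at your web root are assumptions made for this illustration.

```python
from pathlib import Path

# Extensions taken from the list above; "and others" is not covered,
# so extend this set to match what your own site actually serves.
INDEXABLE_EXTENSIONS = {
    ".pdf", ".ps", ".doc", ".xls", ".ppt", ".rtf",
    ".swf", ".fla", ".gif", ".jpg", ".png",
}


def find_indexable_files(web_root):
    """Return sorted paths (relative to web_root) of files a search
    engine could read and index, judged by file extension alone."""
    root = Path(web_root)
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in INDEXABLE_EXTENSIONS
    )
```

Calling find_indexable_files on the directory your Web server publishes gives you a list to review by hand. Anything on that list that you would not want a stranger to read should be moved or protected.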
It's All Vulnerable
If you put up any files on the website that you don't want to be found, they should be in a password protected directory. If they aren't, they are visible, and search engines can and will spider them.
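Password protection is normally switched on in the Web server's configuration rather than written by hand, but the idea behind it is simple: a request for a protected file is refused unless it carries valid credentials. The Python sketch below illustrates that idea using HTTP Basic authentication. The user name, password, /private/ directory, and function names are all hypothetical, and the plain SHA-256 hash is a simplification; a real site should rely on its server's built-in password protection.

```python
import base64
import hashlib
import hmac

# Hypothetical credentials for illustration only. A real site would keep
# these in its server configuration, never in source code.
EXPECTED_USER = "editor"
EXPECTED_PASSWORD_SHA256 = hashlib.sha256(b"correct horse").hexdigest()
PROTECTED_PREFIX = "/private/"  # hypothetical protected directory


def is_authorized(authorization_header):
    """Return True only if the header carries the expected Basic credentials."""
    if not authorization_header or not authorization_header.startswith("Basic "):
        return False
    try:
        raw = base64.b64decode(authorization_header[6:], validate=True)
        decoded = raw.decode("utf-8")
    except ValueError:  # covers bad base64 and bad UTF-8
        return False
    user, sep, password = decoded.partition(":")
    if not sep:
        return False
    password_hash = hashlib.sha256(password.encode("utf-8")).hexdigest()
    # compare_digest avoids leaking information through comparison timing.
    user_ok = hmac.compare_digest(user.encode("utf-8"), EXPECTED_USER.encode("utf-8"))
    password_ok = hmac.compare_digest(password_hash, EXPECTED_PASSWORD_SHA256)
    return user_ok and password_ok


def status_for(path, authorization_header):
    """Return the HTTP status a server would send: 401 for a protected
    path without valid credentials, 200 otherwise."""
    if path.startswith(PROTECTED_PREFIX) and not is_authorized(authorization_header):
        return 401
    return 200
```

A search engine spider does not send credentials, so in this sketch every request it makes for a file under the protected directory is answered with 401 and the contents are never handed over.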
How You Can Protect Your Files
There are several ways to protect your files:
- Don't put them up on the site
This is the most secure method. If you don't want your files to be seen by people, avoid putting them on a website or even a computer with a Web server on it.
- Put them in a password protected directory
- Put up a robots.txt file
This will keep "law-abiding" robots from spidering the specified pages, but it does nothing to stop robots that don't follow those rules. In fact, it acts as a flag to some that those directories might contain sensitive documents and materials.
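The limits of robots.txt are easy to see in a short Python sketch. It uses the standard library's urllib.robotparser to show how a well-behaved robot reads the file, and a small helper to show what anyone else can learn from it. The robots.txt contents, the /private/ directory, the robot name ExampleBot, and the example.com addresses are made up for this illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking all robots to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def polite_crawler_may_fetch(url):
    """What a "law-abiding" robot decides after reading robots.txt."""
    return parser.can_fetch("ExampleBot", url)


def advertised_paths(robots_txt):
    """Return every Disallow path in the file. robots.txt is public,
    so any visitor or badly behaved robot can build this same list."""
    paths = []
    for line in robots_txt.splitlines():
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths
```

The first function only describes a robot that chooses to obey; nothing in robots.txt makes the server refuse a request. The second function returns the very directories you hoped to keep quiet, which is why robots.txt works best alongside password protection rather than in place of it.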
When building and maintaining a website, it's important to keep security in mind at all times. "Security through obscurity", the idea that people won't find a page if it isn't linked, is a mistaken assumption. If you put up a document or file on a website, you should assume that someone will find and read it. If you don't want it found, think really carefully before you post it to your website.