Google-Hacking Made Easy -- Visual Studio Magazine

By William Jackson
03/06/2008

With a name like "Cult of the Dead Cow" you know these guys are probably up to no good, and they are living up to expectations with the release of Goolag Scan, a tool to automate the use of search engines to scan for vulnerable applications, back doors and sensitive information on Web sites.

This is a technique called "Google hacking," named for the Web's predominant search engine, and it isn't new. What's new is the improved tool that makes it easier to do the searches.

"I don't think they have anything new in terms of new capabilities," said Amichai Shulman, chief technology officer of Imperva Inc. of Foster City, Calif., and head of the company's Application Defense Center. "They do have a tool that makes Google hacking more accessible to script kiddies."

Goolag Scan runs on Windows and has a good graphical interface along with a library of about 1,500 carefully crafted searches that can reveal sensitive information about or from queried Web sites. The tool is neutral; it can be used by administrators for penetration testing, by application owners to identify weaknesses or by hackers to find vulnerabilities to exploit.

"Tools like this scanner are a wake-up call for application owners," Shulman said. "And that is a good thing. The issue of data leakage into search engines is a big issue."

The Cult of the Dead Cow has said much of its research in this area has been against government servers where it has been able to turn up sensitive information that has been unwittingly exposed.

"With a lot of script kiddies having this tool, I think the government can expect a rough period of headlines," Shulman said.

The practice of using search engines to find sensitive information has been around for years. Johnny Long, a security researcher and penetration tester for Computer Sciences Corp. in El Segundo, Calif., wrote the book on the subject, Google Hacking for Penetration Testers, in 2005. The government became acutely aware of the practice in the wake of the terrorist attacks in 2001, Long said. It is one of the reasons agencies began scrubbing Web sites of sensitive data following the attacks.

The primary difference between Google hacking and an ordinary Google search is the frame of mind, because in both cases the search engine is being used as intended. It's all a matter of which queries are run and how the resulting hits are used.

For example, a Google hacker might ignore the content of the links returned in a search and focus instead on the names of the servers that responded. Or a properly constructed query might return a list of Social Security numbers along with the names and addresses of their holders.
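To make the first idea concrete, here is a minimal Python sketch of looking at server names rather than page content. It assumes the search hits have already been collected as a plain list of URLs; the function name `server_names` and the sample URLs (on the reserved example.com domain) are hypothetical and are not taken from Goolag Scan or any other tool.

```python
from urllib.parse import urlparse


def server_names(result_urls):
    """Return the distinct host names in a list of result URLs, in first-seen order."""
    seen = []
    for url in result_urls:
        host = urlparse(url).hostname  # None when the string has no network location
        if host and host not in seen:
            seen.append(host)
    return seen


# Hypothetical hits; example.com is a reserved documentation domain.
hits = [
    "https://www.example.com/index.html",
    "http://intranet.example.com/login",
    "https://www.example.com/about",
    "http://printer01.example.com:8080/status",
]

print(server_names(hits))
# ['www.example.com', 'intranet.example.com', 'printer01.example.com']
```

An administrator could run the same kind of pass over hits for the organization's own domain to see which internal host names a search engine has already indexed.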

Long compiled a catalog of more than 1,300 such queries that are used by legitimate developers of penetration-testing tools. Queries can return hits containing:

Login pages for a variety of services and servers.
Security logs from firewalls, honeypots and intrusion detection and prevention systems that can reveal a wealth of details on vulnerabilities.
Lists of networked devices such as printers and cameras.
Servers operating with default configurations, which could include default passwords.

Long kept his catalog of queries confidential, but they are not secrets and the new Goolag Scan tool has its own catalog to use.
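For an administrator auditing a single site, the kinds of hits listed above might be approached along these lines. This is only a sketch: the query templates are illustrative guesses built from widely documented search operators (`site:`, `intitle:`, `filetype:`), not entries from Long's catalog or from Goolag Scan, and the function name `build_audit_queries` is hypothetical.

```python
def build_audit_queries(domain, templates):
    """Restrict each query template to one domain by prefixing a site: operator."""
    return {label: f"site:{domain} {query}" for label, query in templates.items()}


# Illustrative templates only (assumptions, one per category of hit).
TEMPLATES = {
    "login pages": 'intitle:"login"',
    "log files": "filetype:log",
    "device pages": 'intitle:"printer status"',
    "open directory listings": 'intitle:"index of"',
}

# example.com stands in for a domain the administrator is authorized to test.
for label, query in build_audit_queries("example.com", TEMPLATES).items():
    print(f"{label}: {query}")
```

Scoping every query to one domain keeps the exercise a self-audit: the point is to find what the organization itself has leaked before someone else does.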

Google has taken steps to block the technique -- or at least make it less easy -- by blocking blatantly automated searches. However, this also can stop or slow down legitimate penetration-testing, giving an advantage to the hacker who can search slowly for a limited number of vulnerabilities. The administrator doing penetration testing has to scan for all 1,500 or so vulnerabilities to be secure.
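The asymmetry can be put in rough numbers. The sketch below assumes a pace of one query every 30 seconds, a figure chosen purely for illustration; the article does not say what rate triggers Google's blocking, and the function name `scan_hours` is hypothetical.

```python
def scan_hours(num_queries, seconds_per_query):
    """Hours needed to issue num_queries at a fixed pace of one per seconds_per_query."""
    return num_queries * seconds_per_query / 3600


PACE = 30  # assumed seconds per query; not a documented threshold

print(scan_hours(1500, PACE))  # full catalog: 12.5 hours
print(scan_hours(10, PACE))    # a handful of targeted queries: a few minutes
```

Under that assumption, a defender working through the whole catalog needs more than half a day of paced searching, while an attacker interested in a few specific weaknesses finishes in minutes.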

Still, the best defense against this type of problem is to be proactive, Shulman said. "Find the leakage before others find it."

About the Author

William Jackson is the senior writer for Government Computer News (GCN.com).
