Some deep web sites have shut down: Zuula.com and infomine. I also added a nifty archive site for California.
Archive for Deep web
http://www.shodanhq.com/ has been added to the list for Net Sec deep web assets. Should have added it 2 years ago, my bad.
A nifty multi-search tool that lets you group answers by topic or domain extension (e.g. .edu or .gov).
We’ve added Getty; they have part of their collection online.
Getty Research Institute – http://www.getty.edu/research/library/ – The Getty Research Institute library collections include over one million books, periodicals, study photographs, and auction catalogs as well as extensive special collections of rare and unique materials. Focusing on art history, architecture, and related fields, they begin with the archaeology of prehistory and extend to the contemporary moment.
Getting sources of data is always a problem when tackling a statistical or data mining project. Here are two very nice deep web assets:
http://databank.worldbank.org/data/home.aspx – Speciality statistical data on all kinds of subjects, from countries' GDP to levels of blindness.
http://www.quandl.com/ – An awesome collection of 9,000,000 financial, economic, and social datasets.
Added http://www.scirus.com for searching academic docs, pointed out to me by a user, and a military hardware site, http://semanticommunity.info.
I was shown a nice search engine for embedded, zipped, or otherwise obscured files.
It looks in FTP, ZIP, RAR and other formats.
Also added the National Security Archive.
While I was researching this morning's massive meteorite strike in Russia, one deep web resource showed how useful it is for extracting videos for local OSINT.
It is being added to the deep web list.
Some brave, bold academic decided to stand up to a commercial pay-fee deep web site that holds academic articles. You will read about the result below, but in essence, bad things happened.
This highlights a moral battle. There is a great deal of financial gain for the gatekeepers of wonderful data behind the fee-pay portals of Deep Web sites. Controversy erupts when the data can be argued to be in the public domain.
From Aljazeera News –
On July 19, 2011, Aaron Swartz, a computer programmer and activist, was arrested for downloading 4.8 million academic articles. The articles constituted nearly the entire catalogue of JSTOR, a scholarly research database. Universities that want to use JSTOR are charged as much as $50,000 in annual subscription fees.
Individuals who want to use JSTOR must shell out an average of $19 per article. The academics who write the articles are not paid for their work, nor are the academics who review it. The only people who profit are the 211 employees of JSTOR.
Swartz thought this was wrong. The paywall, he argued, constituted “private theft of public culture”. It hurt not only the greater public, but also academics who must “pay money to read the work of their colleagues”.
For attempting to make scholarship accessible to people who cannot afford it, Swartz is facing a $1 million fine and up to 35 years in prison. The severity of the charges shocked activists fighting for open access publication. But it shocked academics too, for different reasons.
http://www.aljazeera.com/indepth/opinion/2012/10/20121017558785551.html