nutch-dev mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Nutch Wiki] Update of "PublicServers" by RBalmes
Date Fri, 25 Dec 2009 09:38:13 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The "PublicServers" page has been changed by RBalmes.
http://wiki.apache.org/nutch/PublicServers?action=diff&rev1=71&rev2=72

--------------------------------------------------

  = Public search engines using Nutch =
+ 
  Please sort by name alphabetically
  
-  * [[http://askaboutoil.com|AskAboutOil]] is a vertical search portal for the petroleum
industry.
+   * [[http://askaboutoil.com|AskAboutOil]] is a vertical search portal for the petroleum
industry.
  
-  * [[http://www.asbestosinfo.info|Asbestos]] is a vertical search portal and discussion
forum for the asbestos and related information.
+   * [[http://www.asbestosinfo.info|Asbestos]] is a vertical search portal and discussion
forum for the asbestos and related information.
  
-  * [[http://www.baynote.com/go|Baynote]] provides free hosted Nutch search for businesses.
+   * [[http://www.baynote.com/go|Baynote]] provides free hosted Nutch search for businesses.
  
-  * [[http://betherebesquare.com|BeThere BeSquare]] is an Event Search Engine for the San
Francisco Bay Area that allows users to specify keywords, date, city, address, and category
and get details about events in 4 different views.
+   * [[http://betherebesquare.com|BeThere BeSquare]] is an Event Search Engine for the San
Francisco Bay Area that allows users to specify keywords, date, city, address, and category
and get details about events in 4 different views.
  
-  * [[http://www.bible-ref.om/|Biible]] is the first biblical search engine that allows people
to search the web for comments of biblical verse or range of verse. 6 major languages are
fully recognized and 150 partially for now. Based on Nutch.
+   * [[http://www.bigsearch.ca/|Bigsearch.ca]] uses nutch open source software to deliver
its search results.
  
-  * [[http://www.bigsearch.ca/|Bigsearch.ca]] uses nutch open source software to deliver
its search results.
+   * [[http://busytonight.com/|BusyTonight]]: Search for any event in the United States,
by keyword, location, and date. Event listings are automatically crawled and updated from
original source Web sites.
  
-  * [[http://busytonight.com/|BusyTonight]]: Search for any event in the United States, by
keyword, location, and date. Event listings are automatically crawled and updated from original
source Web sites.
+   * [[http://www.centralbudapest.com/search|Central Budapest Search]] is a search engine
for English language sites focussing on Budapest news, restaurants, accommodation, life and
events.
  
-  * [[http://www.centralbudapest.com/search|Central Budapest Search]] is a search engine
for English language sites focussing on Budapest news, restaurants, accommodation, life and
events.
+   * [[http://circuitscout.com|Circuit Scout]] is a search engine for electrical circuits.
  
-  * [[http://circuitscout.com|Circuit Scout]] is a search engine for electrical circuits.
+   * [[http://www.comtecsearch.com|Comtec Search]] is a search engine for UK Tour Operator
Package Holiday Brochures.
  
-  * [[http://www.comtecsearch.com|Comtec Search]] is a search engine for UK Tour Operator
Package Holiday Brochures.
+   * [[http://www.coder-suche.de|Coder-Suche.de]] searchs for coding stuff like apis, documentations,
tutorials, openBooks and more. Its origin is german, its contents are mainly english.
  
-  * [[http://www.coder-suche.de|Coder-Suche.de]] searchs for coding stuff like apis, documentations,
tutorials, openBooks and more. Its origin is german, its contents are mainly english.
+   * [[http://campusgw.library.cornell.edu/|Cornell University Library]] is collaborating
with the research group of Thorsten Joachims to develop a learning search engine for library
web pages based on Nutch. The nutch-based search engine is near the bottom of the page.
  
-  * [[http://campusgw.library.cornell.edu/|Cornell University Library]] is collaborating
with the research group of Thorsten Joachims to develop a learning search engine for library
web pages based on Nutch. The nutch-based search engine is near the bottom of the page.
+   * [[http://search.creativecommons.org/|Creative Commons]] is a search engine for creative
commons licensed material.
  
-  * [[http://search.creativecommons.org/|Creative Commons]] is a search engine for creative
commons licensed material.
+   * [[http://www.dadi360.com/|Dadi360]] Usee nutch search engine for providing search of
Chinese language websites in North America.
  
-  * [[http://www.dadi360.com/|Dadi360]] Usee nutch search engine for providing search of
Chinese language websites in North America.
+   * [[http://www.ecolicommunity.org/Websearch|Ecolhub Web Search]] an E. coli specific search
engine based on Nutch. EcoliHub WebSearch includes only those sites relevant to E. coli, thereby
reducing the number of spurious hits. Searches can be optionally limited to your choice of
resources. More than 110,000 pages to search. More resources getting added.
  
-  * [[http://www.ecolicommunity.org/Websearch|Ecolhub Web Search]] an E. coli specific search
engine based on Nutch. EcoliHub WebSearch includes only those sites relevant to E. coli, thereby
reducing the number of spurious hits. Searches can be optionally limited to your choice of
resources. More than 110,000 pages to search. More resources getting added.
+   * [[http://www.epivista.de/|Epivista]] is a search engine of epilepsy related web sites.
  
-  * [[http://www.epivista.de/|Epivista]] is a search engine of epilepsy related web sites.
+   * [[http://www.eroscanner.com/|eroscanner]] is a search engine for german adult stuff.
Watching the quality of ranking in this hard-fought area might be very interesting. (Warning:
'''NSFW''')
  
-  * [[http://www.eroscanner.com/|eroscanner]] is a search engine for german adult stuff.
Watching the quality of ranking in this hard-fought area might be very interesting. (Warning:
'''NSFW''')
+   * [[http://www.ertech.ch/|ertech]] uses nutch as its search engine. It is integrated with
the CMS system aarcat from aarboard.
  
-  * [[http://www.ertech.ch/|ertech]] uses nutch as its search engine. It is integrated with
the CMS system aarcat from aarboard.
+   * [[http://www.erzsuche.de|Erzsuche.de]] is a local search engine for the Erzgebirge (For
what? It is the home of the nutcracker) With spell check feature
  
-  * [[http://www.erzsuche.de|Erzsuche.de]] is a local search engine for the Erzgebirge (For
what? It is the home of the nutcracker) With spell check feature
+   * [[http://search.fileratings.com|FileRatings Search]] is a search engine of software
product.
  
-  * [[http://search.fileratings.com|FileRatings Search]] is a search engine of software product.
+   * [[http://www.gensphere.org/|GenSphere]] - Genealogy Search Engine based on Nutch.
  
-  * [[http://www.gensphere.org/|GenSphere]] - Genealogy Search Engine based on Nutch.
+   * [[http://www.gina-erotic-search.net/|Gina Wild Erotic Search Engine]] is based on nutch
and uses the language identifier modul to present results according to the choosen language.
 (Warning: '''NSFW''')
  
-  * [[http://www.gina-erotic-search.net/|Gina Wild Erotic Search Engine]] is based on nutch
and uses the language identifier modul to present results according to the choosen language.
 (Warning: '''NSFW''')
+   * [[http://www.jboss.com/search.jsp?query=http&x=0&y=0|jboss homepage]] The jboss
(tm) homepage runs a nutch as homepage search engine.
  
-  * [[http://www.jboss.com/search.jsp?query=http&x=0&y=0|jboss homepage]] The jboss
(tm) homepage runs a nutch as homepage search engine.
+   * [[http://www.jcintersonic.com/|J&C Intersonic]] uses nutch as its search engine.
  
-  * [[http://www.jcintersonic.com/|J&C Intersonic]] uses nutch as its search engine.
+   * [[http://www.jumblefox.com.au/|Jumble Fox]] - The Australian Search Engine
  
-  * [[http://www.jumblefox.com.au/|Jumble Fox]] - The Australian Search Engine
+   * [[http://www.knowmydestination.com/|KnowMyDestination]] - Search Engine for Travel related
stuff. We have created this search engine by using Google WebAPIs to fetch relavant start
URLs and then use Nutch to crawl and index those URLs.
  
-  * [[http://www.knowmydestination.com/|KnowMyDestination]] - Search Engine for Travel related
stuff. We have created this search engine by using Google WebAPIs to fetch relavant start
URLs and then use Nutch to crawl and index those URLs.
+   * [[http://krugle.com|Krugle]] uses Nutch to crawl web pages for code, archives and technically-interesting
content. We also use a modified version of Nutch to crawl CVS/Subversion repositories, and
the NutchBean/distributed searcher support to search and generate hits for code and tech info
queries.
  
-  * [[http://krugle.com|Krugle]] uses Nutch to crawl web pages for code, archives and technically-interesting
content. We also use a modified version of Nutch to crawl CVS/Subversion repositories, and
the NutchBean/distributed searcher support to search and generate hits for code and tech info
queries.
+   * [[http://www.labforculture.org|LabforCulture]] - The essential tool for everyone in
arts and culture who creates, collaborates, shares and produces across borders in Europe.
  
-  * [[http://www.labforculture.org|LabforCulture]] - The essential tool for everyone in arts
and culture who creates, collaborates, shares and produces across borders in Europe.
+   * [[http://LOOQ.EU/|LOOQ.EU]] - European search engine which indexes sites in Europe.
  
-  * [[http://LOOQ.EU/|LOOQ.EU]] - European search engine which indexes sites in Europe.
+   * [[http://LDSsearch.com/|LDSsearch.com]] - Search engine which indexes sites with a positive
bias toward the mormon church.
  
-  * [[http://LDSsearch.com/|LDSsearch.com]] - Search engine which indexes sites with a positive
bias toward the mormon church.
+   * [[http://www.millionpixelsearchpage.com|The Million Pixel Search Page]] - Search engine
for Alex Tew's [[http://www.milliondollarhomepage.com|Million Dollar Homepage]].
  
-  * [[http://www.millionpixelsearchpage.com|The Million Pixel Search Page]] - Search engine
for Alex Tew's [[http://www.milliondollarhomepage.com|Million Dollar Homepage]].
+   * [[http://www.misterbot.fr|Misterbot.fr]] a search engine for french language web sites.
  
-  * [[http://www.misterbot.fr|Misterbot.fr]] a search engine for french language web sites.
+   * [[http://search.mountbatten.net|Mountbatten Search]] a search engine that crawls only
the part of the Internet located in Uganda.
  
-  * [[http://search.mountbatten.net|Mountbatten Search]] a search engine that crawls only
the part of the Internet located in Uganda.
+   * [[http://www.mozdex.com|mozDex]].com Running Nutch SVN release with Clustering &
Ontology support enabled.
  
-  * [[http://www.mozdex.com|mozDex]].com Running Nutch SVN release with Clustering &
Ontology support enabled.
+   * [[http://www.myopensourcejobs.com|MyOpensourcejobs]] A Opensource skills jobs site using
NUTCH and LAMP based    DRUPAL CMS.
  
-  * [[http://www.myopensourcejobs.com|MyOpensourcejobs]] A Opensource skills jobs site using
NUTCH and LAMP based    DRUPAL CMS.
+   * [[http://www.nsyght.com|Nsyght.com]] is a social search engine that customizes a users
search based on their social graph.
  
-  * [[http://www.nsyght.com|Nsyght.com]] is a social search engine that customizes a users
search based on their social graph.
+   * [[http://www.nursewebsearch.com|Nurse Web Search]] - Health Internet Search Engine.
  
-  * [[http://www.nursewebsearch.com|Nurse Web Search]] - Health Internet Search Engine.
+   * [[http://www.netluchs.de/|Netluchs.de]] Searchengine for german language websites.
  
-  * [[http://www.netluchs.de/|Netluchs.de]] Searchengine for german language websites.
+   * [[http://nowaccepting.com|NowAccepting.com]] is a job search engine.
  
-  * [[http://nowaccepting.com|NowAccepting.com]] is a job search engine.
+   * [[http://www.playfuls.com/|Playfuls.com]] is a search engine that indexes the most important
english gaming-related websites.
  
-  * [[http://www.playfuls.com/|Playfuls.com]] is a search engine that indexes the most important
english gaming-related websites.
+   * [[http://www.gouv.qc.ca/|Government of Quebec websites]] Over 400 websites of the government
of Quebec (Canada) are indexed by Nutch. The Web application has been developped by [[http://www.doculibre.com/index_en.html/|Doculibre
inc.]]
  
-  * [[http://www.gouv.qc.ca/|Government of Quebec websites]] Over 400 websites of the government
of Quebec (Canada) are indexed by Nutch. The Web application has been developped by [[http://www.doculibre.com/index_en.html/|Doculibre
inc.]]
+   * [[http://search2.net/|search2.net]] is a general search engine with an international
index.
+   * [[http://www.searchmitchell.com/|SearchMitchell.com]] is a community search engine for
businesses and organizations in Mitchell, SD.
  
+   * [[http://www.umkreisfinder.de/|UmkreisFinder.de]] is running the [[GeoPosition]] plugin
for local searches in Germany and in German. Please insert a search term in the first field,
a German city name in the second field and choose a perimeter at the last field.
-  * [[http://search2.net/|search2.net]] is a search engine based on Nutch.
-  * [[http://www.searchmitchell.com/|SearchMitchell.com]] is a community search engine for
businesses and organizations in Mitchell, SD.
  
-  * [[http://www.umkreisfinder.de/|UmkreisFinder.de]] is running the GeoPosition plugin for
local searches in Germany and in German. Please insert a search term in the first field, a
German city name in the second field and choose a perimeter at the last field.
+   * [[http://webharvest.gov|Webharvest.gov]] offers full-text search of nearly 100 million
resources collected from US Federal Government websites as part of the National Archive and
Records Administration's 2004 Presidential Term Web Harvest
  
-  * [[http://webharvest.gov|Webharvest.gov]] offers full-text search of nearly 100 million
resources collected from US Federal Government websites as part of the National Archive and
Records Administration's 2004 Presidential Term Web Harvest
+   * [[http://www.werelate.org|WeRelate.org]] offers a verticle genealogy search and a MediaWiki
site featuring 1.3 million sources plus information for names and places.
  
-  * [[http://www.werelate.org|WeRelate.org]] offers a verticle genealogy search and a MediaWiki
site featuring 1.3 million sources plus information for names and places.
+   * [[http://www.synoo.com:8080|Synoo.com]] is a small web search engine
  
-  * [[http://www.synoo.com:8080|Synoo.com]] is a small web search engine
+   * [[http://www.tokenizer.org|Tokenizer]] is an online shopping search engine partially
powered by Nutch
  
-  * [[http://www.tokenizer.org|Tokenizer]] is an online shopping search engine partially
powered by Nutch
+   * [[http://www.utilitysearch.info/|UtilitySearch]] is a search engine for the regulated
utility industries (Electricity, Water, Gas, and Telecommunications) in the United States
and Canada.
+   * [[http://search.tamilsweb.com/|TamilSWeb Search]] is a search engine geared toward south
asian web content.
  
-  * [[http://www.utilitysearch.info/|UtilitySearch]] is a search engine for the regulated
utility industries (Electricity, Water, Gas, and Telecommunications) in the United States
and Canada.
-  * [[http://search.tamilsweb.com/|TamilSWeb Search]] is a search engine geared toward south
asian web content.
- 
