hadoop-common-dev mailing list archives

From "Dan Segel" <danse...@gmail.com>
Subject Gigablast.com search engine - 10 BILLION PAGES!
Date Thu, 05 Jun 2008 19:20:20 GMT
Our ultimate goal is essentially to replicate the gigablast.com search
engine. They claim to keep 10 billion pages indexed, spidered, and updated
on a routine basis with fewer than 500 servers. I am planning for 500
million pages indexed per node, with a total of 20 nodes. Each node will
have two quad-core processors, 4 TB of storage (RAID 5), and 32 GB of RAM.
I believe this can be done, but how many searches per second do you think
would be realistic in this setup? We are aiming for roughly 25 searches per
second, ultimately spread across the 20 nodes. I could really use some
advice on this one.
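
As a rough sanity check, here is a minimal Python sketch of the arithmetic
behind these targets. It assumes a document-partitioned index (each node
holds its own 500-million-page shard, so every query fans out to all 20
nodes); the figures are straight arithmetic from the numbers above, not
measured benchmarks:

    # Back-of-envelope sizing from the targets stated above.
    NODES = 20
    PAGES_PER_NODE = 500_000_000   # 500 million pages indexed per node
    CLUSTER_QPS = 25               # target searches per second overall

    total_pages = NODES * PAGES_PER_NODE

    # Assumption: with a document-partitioned index, every query is
    # broadcast to all shards, so each node sees the full cluster query
    # rate against its own 500M-page shard (the load is not divided by 20).
    per_node_qps = CLUSTER_QPS

    print(f"Total pages indexed: {total_pages:,}")            # 10,000,000,000
    print(f"Per-node query rate: {per_node_qps} searches/s")  # 25 per shard

If instead the corpus were fully replicated on every node and queries were
load-balanced across them, each node would see only 25 / 20 = 1.25 searches
per second, but it would then need to hold all 10 billion pages locally.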

    Thanks,
    D. Segel
