hadoop-common-user mailing list archives

From Teruhiko Kurosaka <K...@basistech.com>
Subject How big data and/or how many machines do I need to take advantage of Hadoop?
Date Wed, 31 Aug 2011 08:48:31 GMT
Hadoop newbie here.

I wrapped my company's entity extraction product in a Hadoop task
and gave it a large file, on the order of 100 MB.
I have 4 VMs running on a 24-core server: two of them are the
slave nodes, one is the namenode, and one is the job tracker.
It turned out that processing the same amount of data takes longer
with Hadoop than processing it serially.

I am curious how I can see the advantage of Hadoop.
Is having many physical machines essential?
Would I need to process terabytes of data? What is the
minimum setup where I could see the advantage of Hadoop?
T. "Kuro" Kurosaka
