hadoop-common-user mailing list archives

From Dieter Plaetinck <dieter.plaeti...@intec.ugent.be>
Subject Re: HDFS and Openstack - avoiding excessive redundancy
Date Mon, 14 Nov 2011 10:32:23 GMT
Or, more generally:
isn't using virtualized I/O counterproductive when dealing with Hadoop M/R?
I would think that for running Hadoop M/R you'd want predictable and consistent I/O on each
node, not to mention that your bottlenecks are usually disk I/O (and maybe CPU). Virtualisation
makes things less performant and less predictable, and so inferior.  Or am I missing something?

Dieter

On Sat, 12 Nov 2011 07:54:05 +0000
Graeme Seaton <lists@graemes.com> wrote:

> One advantage to using Hadoop replication though, is that it provides
> a greater pool of potential servers for M/R jobs to execute on.  If
> you simply use Openstack replication it will appear to the JobTracker
> that a particular block only exists on a single server and should
> only be executed on that node.  This may have an impact
> depending on your workload profile.
> 
> Regards,
> Graeme
> 
> On 12/11/11 07:24, Dejan Menges wrote:
> > The replication factor for HDFS can easily be changed to 1 in
> > hdfs-site.xml if you don't need its redundancy.
> >
> > Regards,
> > Dejo
> >
> > Sent from my iPhone
> >
> > On 12. 11. 2011., at 03:58, Edmon Begoli<ebegoli@gmail.com>  wrote:
> >
> >> A question related to standing up cloud infrastructure for running
> >> Hadoop/HDFS.
> >>
> >> We are building up an infrastructure using Openstack which has its
> >> own storage management redundancy.
> >>
> >> We are planning to use Openstack to instantiate Hadoop nodes (HDFS,
> >> M/R tasks, Hive, HBase)
> >> on demand.
> >>
> >> The problem is that HDFS by design creates three copies of the
> >> data, so there is a 4x redundancy
> >> which we would prefer to avoid.
> >>
> >> I am asking here if anyone has had a similar case and if anyone has
> >> had any helpful solution to recommend.
> >>
> >> Thank you in advance,
> >> Edmon
> 
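
For reference, the hdfs-site.xml change Dejan mentions would look roughly like
the fragment below. This is a minimal sketch, assuming the standard
dfs.replication property and a value of 1 as suggested in the thread; the
right value depends on how far you trust the underlying Openstack storage, and
on Graeme's point that fewer HDFS replicas give the JobTracker fewer nodes
with local data to schedule on.

```xml
<!-- hdfs-site.xml: sketch only; the value 1 is the thread's suggestion, not a recommendation -->
<configuration>
  <property>
    <!-- Default number of replicas HDFS keeps for each block of newly written files -->
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
```

Note that this setting is a default applied when files are created; files
already in HDFS keep the replication factor they were written with unless it
is changed explicitly.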

