hadoop-user mailing list archives

From Sundeep Kambhampati <kambh...@cse.ohio-state.edu>
Subject Re: How to speed up Hadoop?
Date Fri, 06 Sep 2013 01:16:56 GMT
On 9/5/2013 8:57 PM, Preethi Vinayak Ponangi wrote:
> Solution 1: Throw more hardware at the cluster. That's the whole point 
> of hadoop.
> Solution 2: Try to optimize the mapreduce jobs. It depends on what 
> kind of jobs you are running.
>
> I wouldn't suggest decreasing the replication factor, as it kind of 
> defeats the purpose of using Hadoop. You could do this if you can't 
> get more hardware and are running experimental, non-critical, 
> non-production data.
>
> What kind of Hadoop monitoring are you talking about?
>
> Regards,
> Vinayak.
>
>
> On Thu, Sep 5, 2013 at 7:51 PM, Chris Embree <cembree@gmail.com> wrote:
>
>     I think you just went backwards. More replicas (generally
>     speaking) are better.
>
>     I'd take 60 cheap 1U servers over 20 "highly fault tolerant"
>     ones for almost every problem. I'd get them for the same or less
>     money, too.
>
>
>
>
>     On Thu, Sep 5, 2013 at 8:41 PM, Sundeep Kambhampati
>     <kambhamp@cse.ohio-state.edu> wrote:
>
>         Hi all,
>
>             I am looking for ways to configure Hadoop in order to speed
>         up data processing. Assuming all my nodes are highly fault
>         tolerant, will setting the data replication factor to 1 speed up
>         the processing? Is there some way to disable the failure
>         monitoring done by Hadoop?
>
>         Thank you for your time.
>
>         -Sundeep
>
>
>
Thank you for your input. I can't currently add more hardware.

By monitoring I mean something like speculative execution.

Regards,
Sundeep
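
For reference, the two settings discussed in this thread (replication factor and speculative execution) are controlled by configuration properties. The following is only a minimal sketch: it assumes the Hadoop 1.x property names (Hadoop 2.x renamed the speculative execution properties to mapreduce.map.speculative and mapreduce.reduce.speculative), and the helper build_configuration is a hypothetical name used for illustration. It prints the XML fragments that would go into mapred-site.xml and hdfs-site.xml. Whether either change actually speeds up a given job is not established here; the replies above advise against lowering replication.

```python
import xml.etree.ElementTree as ET

# Hadoop 1.x property names (an assumption about the cluster version;
# Hadoop 2.x uses mapreduce.map.speculative / mapreduce.reduce.speculative).
MAPRED_PROPS = {
    "mapred.map.tasks.speculative.execution": "false",
    "mapred.reduce.tasks.speculative.execution": "false",
}

# dfs.replication is only the default for newly written files.
HDFS_PROPS = {"dfs.replication": "1"}


def build_configuration(props):
    """Return a Hadoop-style <configuration> XML string for the given properties."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_configuration(MAPRED_PROPS))  # contents for mapred-site.xml
    print(build_configuration(HDFS_PROPS))    # contents for hdfs-site.xml
```

Files already stored in HDFS keep the replication factor they were written with; changing it for existing data is a separate step (for example with hadoop fs -setrep).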
