hbase-user mailing list archives

From "Daniel Leffel" <daniel.lef...@gmail.com>
Subject Re: Writes - Poor performance on EC2
Date Thu, 21 Aug 2008 20:48:12 GMT
As a person with way too much experience with small instances on EC2,
let me tell you: go with x-large instances and make sure they are all
in the same availability zone. You'll have a much happier experience.

On 8/21/08, Jim Kellerman <jim@powerset.com> wrote:
> > -----Original Message-----
>  > From: Manish Katyal [mailto:manish.katyal@gmail.com]
>  > Sent: Thursday, August 21, 2008 11:24 AM
>  > To: hbase-user@hadoop.apache.org
>  > Subject: Re: Writes - Poor performance on EC2
>  >
>  > By looking at the iostat numbers, it appears the problem is that my data is
>  > being inserted in the reduce step - as a result only 2 of the region servers
>  > (# equal to tasktrackers) are being used at any given time (in fact, are
>  > getting slammed while the others are idle).
>  > I guess the solution is:
>  > - either randomly sort the data so the writes will be performed against
>  > different region servers (load balancing). The downside, the writes will
>  > take longer.
>
>
> This is not true at all. Random writes run at the same speed as sequential writes, and
> they cause region splits sooner, so they will actually perform better than sequential
> writes during a bulk upload.
>
>
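
To illustrate the load-balancing point in the quoted exchange, here is a rough, self-contained simulation. It does not use HBase at all: the split keys, the row-key format, and the helper names (`region_for`, `regions_touched`) are hypothetical, and regions are modeled as plain sorted key ranges. It only sketches why sorted input keeps hitting one region while shuffled input spreads across several.

```python
import bisect
import random

# Hypothetical split points: four simulated regions covering row0000..row0999.
SPLIT_KEYS = ["row0250", "row0500", "row0750"]

def region_for(row_key):
    """Return the index of the simulated region whose key range holds row_key."""
    return bisect.bisect_right(SPLIT_KEYS, row_key)

def regions_touched(keys, window):
    """Return the set of distinct regions hit by the first `window` writes."""
    return {region_for(k) for k in keys[:window]}

# Sorted keys, as they would typically arrive out of a reduce step.
sorted_keys = ["row%04d" % i for i in range(1000)]

# The same keys in random order (fixed seed so the run is repeatable).
shuffled_keys = sorted_keys[:]
random.Random(42).shuffle(shuffled_keys)

sequential_regions = regions_touched(sorted_keys, 100)
shuffled_regions = regions_touched(shuffled_keys, 100)

print("regions hit by first 100 sorted writes:  ", sorted(sequential_regions))
print("regions hit by first 100 shuffled writes:", sorted(shuffled_regions))
```

In this toy model the first 100 sorted writes all land in a single region, while the shuffled writes reach more than one. Whether that translates into better throughput on a real cluster depends on how many regions already exist and where they are hosted.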
