hbase-user mailing list archives

From "George P. Stathis" <gstat...@traackr.com>
Subject Re: Latency related configs for 0.90
Date Wed, 20 Apr 2011 13:15:13 GMT
Sorry to bump this, but we could really use a hand here. Right now, we are
having a very hard time getting repeatable read/write consistency. Any
suggestions are welcome.


On Tue, Apr 19, 2011 at 3:08 PM, George P. Stathis <gstathis@traackr.com>wrote:

> Hi all,
> In this chapter of our 0.89-to-0.90 migration saga, we are seeing what we
> suspect might be latency-related artifacts.
> The setting:
>    - Our EC2 dev environment running our CI builds
>    - CDH3 U0 (both hadoop and hbase) setup in pseudo-clustered mode
> We have several unit tests that started failing mysteriously, in random
> ways, as soon as we migrated our EC2 CI build to the new 0.90 CDH3. Those
> tests ran against 0.89 and never failed before. They also run fine on our
> local MacBooks. On EC2, we are seeing many cases where the setup data is
> not persisted in time for the tests to assert against it, and it is not
> always torn down properly afterwards.
> We first suspected our new code around secondary indexes, but we have
> extensive unit tests around them that give us solid confidence that they
> work properly in our CRUD scenarios. We also performance tested against
> the old hbase-trx contrib code, and our new secondary indexes seem to run
> slightly faster as well (of course, that could be due to the bump from
> 0.89 to 0.90).
> We first started seeing issues when running our Hudson build on the same
> machine as the HBase pseudo-cluster. We figured that was putting too much
> load on the box, so we created a separate large EC2 instance to host just
> the 0.90 stack. At times, this migration nearly quadrupled the number of
> failing unit tests. The only difference between the first and second CI
> setups is the network in between.
> Before we start tearing our code apart line by line, I'd like to see if
> there are latency-related configuration tweaks we could try to make the
> setup more resilient to network lag. Are there any HBase/ZooKeeper settings
> that might help? For instance, we see things such as HBASE_SLAVE_SLEEP
> in hbase-env.sh. Can that help?
> Any suggestions are more than welcome. Also, the overview above may not be
> enough to go on, so please let me know if I could provide more details.
> Thank you in advance for any help.
> -GS
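
For reference, the latency-related knobs commonly tuned in hbase-site.xml for slow or lossy networks look like the fragment below. The property names are real 0.90-era settings, but the values shown are illustrative assumptions only, not recommendations for this setup:

```xml
<!-- hbase-site.xml: illustrative latency-related settings (example values) -->
<property>
  <name>zookeeper.session.timeout</name>
  <value>60000</value> <!-- ms before a region server's ZooKeeper session expires -->
</property>
<property>
  <name>hbase.rpc.timeout</name>
  <value>60000</value> <!-- ms a client waits on a single RPC before failing it -->
</property>
<property>
  <name>hbase.client.pause</name>
  <value>1000</value> <!-- ms the client sleeps between retries -->
</property>
<property>
  <name>hbase.client.retries.number</name>
  <value>10</value> <!-- retries before a client operation gives up -->
</property>
```

Raising the retry count and pause makes clients more tolerant of transient network lag at the cost of slower failure detection; the ZooKeeper session timeout governs how long a region server can be unreachable before it is declared dead.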
