hbase-dev mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject scanner is returning everything in parent region plus one of the daughters?
Date Sun, 14 Jun 2009 16:52:39 GMT
This possibly belongs in one of the open issues filed over the
past few days:

Insert 1000 rows with random row keys, and induce a split (see test.rb
attached to HBASE-1500). I would expect a row count to return no more
than 1000 rows. However, the following is a series of row counts obtained
by running the test 5 times, with total reinitialization in between:


The shell also provides a clue:

    Current count: 1000, row: ffdcee2a75742697b375edef62fa4b75

    1516 row(s) in 2.9530 seconds

It looks like the scanner fully iterates the parent region first and
then, in addition, one of the daughters?

Also, as these issues come up, kindly consider adding test cases to the
test suite to catch these regressions. It seems the current coverage for
scanners is letting big issues pass unnoticed.

One thing we could do right away is commit my 'test.rb', reimplemented
as Java/JUnit, into the suite, with some additional logic to test that
the scanners return exactly the count of unique row keys inserted. If
nobody votes -1, I will go ahead and do that.
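
To make the proposed check concrete, here is a minimal sketch in plain Java of the comparison such a test could make: every inserted row key comes back exactly once, and nothing else comes back. It deliberately does not touch the HBase client API. The names `ScanCountCheck`, `duplicateRows`, and `scanMatchesInserted` are hypothetical, and the stand-in data in `main` only mimics a scan that returns the parent's rows plus one daughter's rows a second time.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: compares what a scan returned with what was inserted. */
final class ScanCountCheck {

    /** Row keys that the scan returned more than once, in first-seen order. */
    static Set<String> duplicateRows(List<String> scanned) {
        Set<String> seen = new LinkedHashSet<>();
        Set<String> dups = new LinkedHashSet<>();
        for (String row : scanned) {
            if (!seen.add(row)) {   // add() returns false if the key was already seen
                dups.add(row);
            }
        }
        return dups;
    }

    /** True only if the scan returned each inserted row key exactly once and nothing else. */
    static boolean scanMatchesInserted(Set<String> inserted, List<String> scanned) {
        // Equal sizes plus equal key sets rules out both duplicates and missing rows.
        return scanned.size() == inserted.size()
            && new HashSet<>(scanned).equals(inserted);
    }

    public static void main(String[] args) {
        // Stand-in data (assumption): 4 inserted keys; the "scan" returns all 4,
        // then 2 of them again, as if one daughter were iterated after the parent.
        Set<String> inserted = new LinkedHashSet<>(List.of("a", "b", "c", "d"));
        List<String> scanned = new ArrayList<>(inserted);
        scanned.add("c");
        scanned.add("d");

        System.out.println("inserted=" + inserted.size() + " scanned=" + scanned.size());
        System.out.println("duplicates=" + duplicateRows(scanned));
        System.out.println("ok=" + scanMatchesInserted(inserted, scanned));
    }
}
```

In a real JUnit test the `scanned` list would be filled by iterating a scanner over the table after the split; that wiring is left out of this sketch.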

  - Andy
