hbase-user mailing list archives

From Peter Wolf <opus...@gmail.com>
Subject Re: Confirming a Bug
Date Fri, 23 Mar 2012 12:04:37 GMT
Hi Michel,

I agree it doesn't make sense, but then I believe we are tracking a bug.

I don't know about speculative execution, but I certainly did not switch 
it on.

I am just counting the number of rows that come back in the Result.

If you are interested in this, try my unit test.  I'd be very interested 
to see if it behaves the same for others.

http://dl.dropbox.com/u/68001072/HBaseScanCacheBug.java


Here is the output.  It shows how the number of results and key value pairs
varies as the caching value is changed and families are added.  It shows the
bug starting with 3 families and a caching value of 5000.  It also shows a new
bug, where the query always fails with an IOException with 4 families.

CacheSize FamilyCount ResultCount KeyValueCount
1000 1 10000 10
5000 1 10000 10
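
For readers who cannot fetch the linked file, the kind of measurement tabulated above can be sketched roughly as follows. This is not the linked test: it is a self-contained illustration that swaps the HBase scanner for a hypothetical in-memory stand-in (`nextBatch`, `TOTAL_ROWS`, and the class name are all assumptions). In a real client the batch size would presumably be set with `Scan.setCaching` and the families with `Scan.addFamily`. The point it illustrates is the invariant being checked: the counts should not depend on the caching value.

```java
import java.util.ArrayList;
import java.util.List;

public class ScanCountSketch {

    // Hypothetical table size; every row holds one key value per family.
    public static final int TOTAL_ROWS = 10000;

    // Stand-in for one scanner round trip: returns up to `caching` rows
    // starting at row `start`. Each row is represented only by the number
    // of key values it carries.
    public static List<Integer> nextBatch(int start, int caching, int families) {
        List<Integer> batch = new ArrayList<>();
        int end = Math.min(start + caching, TOTAL_ROWS);
        for (int row = start; row < end; row++) {
            batch.add(families);
        }
        return batch;
    }

    // Drains the stand-in scanner and returns {resultCount, keyValueCount}.
    public static long[] count(int caching, int families) {
        long results = 0;
        long keyValues = 0;
        int start = 0;
        while (true) {
            List<Integer> batch = nextBatch(start, caching, families);
            if (batch.isEmpty()) {
                break; // no more rows
            }
            for (int kvs : batch) {
                results++;
                keyValues += kvs;
            }
            start += batch.size();
        }
        return new long[] {results, keyValues};
    }

    public static void main(String[] args) {
        // Same column layout as the output above; the numbers come from the
        // in-memory stand-in and are not meant to reproduce the real run.
        System.out.println("CacheSize FamilyCount ResultCount KeyValueCount");
        for (int families = 1; families <= 4; families++) {
            for (int caching : new int[] {1000, 5000}) {
                long[] c = count(caching, families);
                System.out.println(caching + " " + families + " " + c[0] + " " + c[1]);
            }
        }
    }
}
```

With a correct scanner, rows that differ only in CacheSize should report identical ResultCount and KeyValueCount; a difference between them is the symptom described in this thread.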



On 3/23/12 7:55 AM, Michel Segel wrote:
> Peter, that doesn't make sense.
>
> I mean I believe you in what you are saying, but don't see how a VPN would
> cause this variance in results.
>
> Do you have any speculative execution turned on?
>
> Are you counting just the numbers of rows in the result set, or are you
> using counters in the map reduce? (I'm assuming that you are running a
> map/reduce, and not just a simple connection and single threaded scan...).
>
> I apologize if this has already been answered; I haven't been following this too closely.
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
>
> On Mar 22, 2012, at 8:01 PM, Peter Wolf<opus111@gmail.com>  wrote:
>
>> Hello again Lars and Lars,
>>
>> Here is some additional information that may help you track this down.
>>
>> I think this behavior has something to do with my VPN.  My servers are on
>> the Amazon Cloud and I normally run my client on my laptop via a VPN
>> (Tunnelblick: OS X 10.7.3; Tunnelblick 3.2.3 (build 2891.2932)).  This is
>> where I see the buggy behavior I describe.
>>
>> However, when my Client is running on an EC2 machine, then I get different
>> behavior.  I can not prove that it is always correct, but in at least one
>> case my current code does not work on my laptop, but gets the correct
>> number of results on an EC2 machine.  Note that my scans are also much
>> faster on the EC2 machine.
>>
>> I will do more tests to see if I can localize it further.
>>
>> Hope this helps
>> Thank you again
>> Peter
>>
>>
>> On 3/19/12 2:24 PM, Peter Wolf wrote:
>>> Hello Lars and Lars,
>>>
>>> Thank you for your help and attention.
>>>
>>> I wrote a standalone test that exhibits the bug.
>>>
>>> http://dl.dropbox.com/u/68001072/HBaseScanCacheBug.java
>>>
>>> Here is the output.  It shows how the number of results and key value
>>> pairs varies as the caching value is changed and families are added.  It
>>> shows the bug starting with 3 families and a caching value of 5000.  It
>>> also shows a new bug, where the query always fails with an IOException
>>> with 4 families.
>>>
>>> CacheSize FamilyCount ResultCount KeyValueCount
>>> 1000 1 10000 10
>>> 5000 1 10000 10

