hbase-user mailing list archives

From Doug Meil <doug.m...@explorysmedical.com>
Subject Re: Confirming a Bug
Date Fri, 23 Mar 2012 18:41:02 GMT

Speculative execution is on by default.

http://hbase.apache.org/book.html#mapreduce.specex
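
For reference, a minimal sketch of switching it off cluster-wide. It assumes the Hadoop 1.x-era property names; newer Hadoop releases use different names, so check the version in use. The fragment would go in mapred-site.xml:

```xml
<!-- Sketch only: assumes Hadoop 1.x property names; verify against your Hadoop version -->
<property>
  <name>mapred.map.tasks.speculative.execution</name>
  <value>false</value>
</property>
<property>
  <name>mapred.reduce.tasks.speculative.execution</name>
  <value>false</value>
</property>
```

The same properties can also be set on an individual job's configuration instead of cluster-wide.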





On 3/23/12 8:04 AM, "Peter Wolf" <opus111@gmail.com> wrote:

>Hi Michel,
>
>I agree it doesn't make sense, but then I believe we are tracking a bug.
>
>I don't know about speculative execution, but I certainly did not switch
>it on.
>
>I am just counting the number of rows that come back in the Result.
>
>If you are interested in this, try my unit test.  I'd be very interested
>to see if it behaves the same for others.
>
>http://dl.dropbox.com/u/68001072/HBaseScanCacheBug.java
>
>
>Here is the output.  It shows how the number of results and key value
>pairs varies as caching is changed, and families are included.  It shows
>the bug starting with 3 families and 5000 caching.  It also shows a new
>bug, where the query always fails with an IOException with 4 families.
>
>CacheSize  FamilyCount  ResultCount  KeyValueCount
>1000       1            10000        10
>5000       1            10000        10
>
>
>
>On 3/23/12 7:55 AM, Michel Segel wrote:
>> Peter, that doesn't make sense.
>>
>> I mean I believe you in what you are saying, but don't see how a VPN
>>would cause this variance in results.
>>
>> Do you have any speculative execution turned on?
>>
>> Are you counting just the numbers of rows in the result set, or are you
>>using counters in the map reduce? (I'm assuming that you are running a
>>map/reduce, and not just a simple connection and single threaded
>>scan...).
>>
>> I apologize if this has already been answered; I haven't been following
>>this too closely.
>>
>> Sent from a remote device. Please excuse any typos...
>>
>> Mike Segel
>>
>> On Mar 22, 2012, at 8:01 PM, Peter Wolf<opus111@gmail.com>  wrote:
>>
>>> Hello again Lars and Lars,
>>>
>>> Here is some additional information that may help you track this down.
>>>
>>> I think this behavior has something to do with my VPN.  My servers are
>>>on the Amazon Cloud and I normally run my client on my laptop via a VPN
>>>(Tunnelblick: OS X 10.7.3; Tunnelblick 3.2.3 (build 2891.2932)).  This
>>>is where I see the buggy behavior I describe.
>>>
>>> However, when my Client is running on an EC2 machine, then I get
>>>different behavior.  I cannot prove that it is always correct, but in
>>>at least one case my current code does not work on my laptop, but gets
>>>the correct number of results on an EC2 machine.  Note that my scans
>>>are also much faster on the EC2 machine.
>>>
>>> I will do more tests to see if I can localize it further.
>>>
>>> Hope this helps
>>> Thank you again
>>> Peter
>>>
>>>
>>> On 3/19/12 2:24 PM, Peter Wolf wrote:
>>>> Hello Lars and Lars,
>>>>
>>>> Thank you for your help and attention.
>>>>
>>>> I wrote a standalone test that exhibits the bug.
>>>>
>>>> http://dl.dropbox.com/u/68001072/HBaseScanCacheBug.java
>>>>
>>>> Here is the output.  It shows how the number of results and key value
>>>>pairs varies as caching is changed, and families are included.  It
>>>>shows the bug starting with 3 families and 5000 caching.  It also
>>>>shows a new bug, where the query always fails with an IOException with
>>>>4 families.
>>>>
>>>> CacheSize  FamilyCount  ResultCount  KeyValueCount
>>>> 1000       1            10000        10
>>>> 5000       1            10000        10
>
>


