hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Get and Scan return different results in 0.20.2
Date Sat, 23 Jan 2010 00:14:58 GMT
So after an offline discussion and some more discussion on IRC, it was
found that the problem was similar to
http://issues.apache.org/jira/browse/HBASE-29 and was caused by clock
skew. The fact that they set their own timestamps exacerbates the problem
because the different clients had wildly different dates; if it were
the region server setting the ts then it would be more consistent.

For the user, the fix is to resolve the clock skew; on the HBase side,
we need to make the get behave more like the scan.
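
To illustrate why client-set timestamps plus clock skew can make two read paths disagree, here is a toy sketch. It is not HBase code and does not reproduce the actual get or scan implementation; the `Cell` class, both read functions, and the timestamp values are hypothetical. It only assumes that one read path picks the version with the highest timestamp while another trusts write order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    timestamp: int  # version timestamp supplied by the writing client
    value: str


def read_by_timestamp(writes):
    """Return the version with the highest timestamp, ignoring arrival order."""
    return max(writes, key=lambda cell: cell.timestamp)


def read_by_arrival(writes):
    """Return the last version written, assuming later writes have later timestamps."""
    return writes[-1]


# Hypothetical writes, listed in the order the server received them.
# Client A's clock runs ahead of client B's, so the later write carries
# the smaller timestamp.
writes = [
    Cell(timestamp=2_000, value="from client A (clock ahead)"),
    Cell(timestamp=1_000, value="from client B (clock behind)"),
]

newest_by_ts = read_by_timestamp(writes)
newest_by_arrival = read_by_arrival(writes)

print("by timestamp:", newest_by_ts)
print("by arrival:  ", newest_by_arrival)
```

With synchronized clocks, or with the server assigning timestamps, arrival order and timestamp order agree and both functions return the same version.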

J-D

On Fri, Jan 22, 2010 at 12:11 PM, Joost Ouwerkerk <joost@openplaces.org> wrote:
> We do set an explicit timestamp, and I understand that we may be among the
> few in this regard.  We haven't performed any deletes on those rows.  I will
> try flushing and let you know...
>
> On Fri, Jan 22, 2010 at 1:52 PM, Stack <stack@duboce.net> wrote:
>
>> How were cells inserted?  With explicit timestamp?  Any deletes
>> floating around?  If you flush the region, does the behavior change?
>> (See 'tools' in the shell.... do hbase> flush 'regionname'... you'll
>> have to figure out the region that is hosting the row you are looking
>> at).  Can you bundle up the region that these cells are in and pass it
>> to us somehow?
>> St.Ack
>>
>> On Fri, Jan 22, 2010 at 7:56 AM, Joost Ouwerkerk <joost@openplaces.org> wrote:
>> > We're seeing some dangerously inconsistent behaviour in retrieving data
>> > from HBase.  In particular circumstances whose conditions are still
>> > unclear, get and scan (without timestamp params) are returning different
>> > versions of a column.  We are running 0.20.2.  See below for evidence.
>> >
>> > hbase(main):006:0> scan 'generated_pages',{STARTROW=>'240:http://com.golflink.www/golf-courses/course.aspx?course=1008656',LIMIT=>2,COLUMNS=>['attribute:url']}
>> > ROW                          COLUMN+CELL
>> >  240:http://com.golflink.www/golf-courses/course.aspx?course=1008656 column=attribute:url, timestamp=*5429280163307928320*, value=\001http://www.golflink.com/golf-courses/course.aspx?course=1008656
>> > 2 row(s) in 0.0100 seconds
>> >
>> > hbase(main):007:0> get 'generated_pages', '240:http://com.golflink.www/golf-courses/course.aspx?course=1008656', COLUMN=>'attribute:url'
>> > timestamp=*5429243797819101088*, value=\001http://www.golflink.com/golf-courses/course.aspx?course=1008656
>> > 1 row(s) in 0.0020 seconds
>> >
>> > Any ideas about how this is possible?
>> >
>> > joost.
>> >
>>
>
