hbase-user mailing list archives

From stack <st...@duboce.net>
Subject Re: Issue with bulk loader tool
Date Sat, 07 Nov 2009 23:00:15 GMT
It's what Lars says, Murali: a region's startkey is inclusive and its endkey
exclusive.  If the row exists, it should be in the region that has it as its
start key (it will not be duplicated in both).

For .META., there is usually only one Region instance in a .META. table.
Its startkey will be the empty key, so it's not surprising that its first key is
different from the empty key.  What do you see when you look at the second
region in your just uploaded table?  I'd expect the key 666629fe4378c096 to
be first in the region whose startkey is 666629fe4378c096.
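
To make the inclusive-start, exclusive-end rule concrete, here is a minimal
sketch in plain Java. It is not HBase client code: the class and method names
are made up for illustration, and it compares keys as Strings, which only
agrees with byte-wise row ordering because the example keys are lowercase
ASCII hex. The start keys are the ones quoted for test12 in this thread.

```java
import java.util.TreeMap;

// Illustration only: RegionLookupSketch is a hypothetical name, not an HBase class.
class RegionLookupSketch {
    // Maps each region's start key to a label for that region.
    private final TreeMap<String, String> regionsByStartKey = new TreeMap<>();

    void addRegion(String startKey, String label) {
        regionsByStartKey.put(startKey, label);
    }

    // A region holds rows in [startKey, endKey), so the owner of a row is the
    // region with the greatest start key that is <= the row key.
    String regionFor(String rowKey) {
        return regionsByStartKey.floorEntry(rowKey).getValue();
    }

    // Builds the three-region layout discussed in this thread.
    static RegionLookupSketch test12() {
        RegionLookupSketch table = new RegionLookupSketch();
        table.addRegion("", "region1");                 // first region: empty start key
        table.addRegion("333305184e0f7c3e", "region2");
        table.addRegion("666629fe4378c096", "region3");
        return table;
    }

    public static void main(String[] args) {
        RegionLookupSketch table = test12();
        // A key equal to a region boundary belongs to the region that starts there.
        System.out.println(table.regionFor("333305184e0f7c3e")); // region2
        System.out.println(table.regionFor("00000d7d4f36c112")); // region1
        System.out.println(table.regionFor("666629fe4378c096")); // region3
    }
}
```

Under this rule a get on 333305184e0f7c3e is expected to go to the second
region, so the row has to be stored there for the get to find it.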

Thanks for figuring MAPREDUCE-565 could trip us up.  Your hadoop is not
0.20.1?
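
For reference, the property a total-order partitioner has to provide for this
kind of load can be sketched as below. This is plain Java, not Hadoop's
partitioner API; the class name is hypothetical and the split points are
borrowed from the region boundaries quoted in this thread. A key equal to a
split point goes to the higher partition, which lines up with the inclusive
startkey described above.

```java
import java.util.Arrays;

// Illustration only: SplitPointPartitionSketch is a hypothetical name.
class SplitPointPartitionSketch {
    private final String[] splitPoints; // must be sorted ascending

    SplitPointPartitionSketch(String... sortedSplitPoints) {
        this.splitPoints = sortedSplitPoints.clone();
    }

    // Partition i receives keys in [splitPoints[i-1], splitPoints[i]).
    int partitionFor(String key) {
        int pos = Arrays.binarySearch(splitPoints, key);
        // Exact match at index pos: the key is the first key of partition pos + 1.
        // Otherwise binarySearch returns -(insertionPoint) - 1, and the insertion
        // point equals the number of split points that sort below the key.
        return pos >= 0 ? pos + 1 : -(pos + 1);
    }

    public static void main(String[] args) {
        SplitPointPartitionSketch p =
            new SplitPointPartitionSketch("333305184e0f7c3e", "666629fe4378c096");
        System.out.println(p.partitionFor("00000d7d4f36c112")); // 0
        System.out.println(p.partitionFor("333305184e0f7c3e")); // 1
        System.out.println(p.partitionFor("666629fe4378c096")); // 2
    }
}
```

If the job ignores the configured partitioner, nothing guarantees this mapping
from key ranges to reducers.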

Yours,
St.Ack



On Sat, Nov 7, 2009 at 7:58 AM, Murali Krishna. P <muralikpbhat@yahoo.com> wrote:

> Thanks Lars for the clarification,
>    But where does the record reside? Is it duplicated in both regions?
> When I use HFile.Reader, the first key in the second region is different.
> Maybe this behaviour (overlap) is only in .META.?
>    The issue is that when I request that boundary record, it is loading
> the next region.
>
> 09/11/07 07:52:05 DEBUG client.HConnectionManager$TableServers: Cached
> location address: 76.13.20.58:60020, regioninfo: REGION => {NAME =>
> '.META.,,1', STARTKEY => '', ENDKEY => '', ENCODED => 1028785192, TABLE =>
> {{NAME => '.META.', IS_META => 'true', MEMSTORE_FLUSHSIZE => '16384',
> FAMILIES => [{NAME => 'historian', VERSIONS => '2147483647', COMPRESSION =>
> 'NONE', TTL => '604800', BLOCKSIZE => '8192', IN_MEMORY => 'false',
> BLOCKCACHE => 'false'}, {NAME => 'info', VERSIONS => '10', COMPRESSION =>
> 'NONE', TTL => '2147483647', BLOCKSIZE => '8192', IN_MEMORY => 'false',
> BLOCKCACHE => 'false'}]}}
> 09/11/07 07:52:05 DEBUG client.HConnectionManager$TableServers: Cached
> location address: 76.13.20.114:60020, regioninfo: REGION => {NAME =>
> 'test12,333305184e0f7c3e,1257515988652', STARTKEY => '333305184e0f7c3e',
> ENDKEY => '666629fe4378c096', ENCODED => 170637321, TABLE => {{NAME =>
> 'test12', FAMILIES => [{NAME => 'image', VERSIONS => '3', COMPRESSION =>
> 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
> BLOCKCACHE => 'true'}]}}
>
>  Thanks,
> Murali Krishna
>
>
>
>
> ________________________________
> From: Lars George <lars@worldlingo.com>
> To: "hbase-user@hadoop.apache.org" <hbase-user@hadoop.apache.org>
> Sent: Sat, 7 November, 2009 9:19:37 PM
> Subject: Re: Issue with bulk loader tool
>
> Hi Murali,
>
> What you see is normal; the last keys do indeed overlap. The last key of a
> region is exclusive and marks the first key of the subsequent region.
>
> Lars
>
> On Nov 7, 2009, at 9:05, "Murali Krishna. P" <muralikpbhat@yahoo.com>
> wrote:
>
> > Hi,
> > I got it resolved. https://issues.apache.org/jira/browse/HADOOP-5750 was
> > causing this: even though I supplied a custom total ordering partitioner, it
> > didn't use that.
> >
> >
> >   Now the regions look properly sorted, but I am facing a new issue. The last
> > key of each region is not retrievable. The table.jsp page shows the
> > start and end keys wrongly.
> > For example, take the first 2 regions:
> > region1: start : end: 333305184e0f7c3e
> > region2: start: 333305184e0f7c3e end: 666629fe4378c096
> >
> > The end key of first region = start key of second ??
> >
> > If I get the first and last key using HFile.Reader, it shows as follows:
> >
> > HFileUtil /hbase/test12/98766318/image/9052388247118781160
> > FirstKey:00000d7d4f36c112imagevalue�������
> > LastKey:333305184e0f7c3eimagevalue�������
> >
> > HFileUtil /hbase/test12/170637321/image/7602871928600243730
> > FirstKey:33338d45cc2491b8imagevalue�������
> > LastKey:666629fe4378c096imagevalue�������
> >
> > So, according to this, the first key of the 2nd region is 33338d45cc2491b8, not
> > 333305184e0f7c3e, which is correct!
> >
> > Now when I do a get on 333305184e0f7c3e with debug on, it is loading the
> > second region, which is wrong!
> >
> > Something went wrong with the index?
> >
> > Thanks,
> > Murali Krishna
> >
> >
> >
> >
> > ________________________________
> > From: stack <stack@duboce.net>
> > To: hbase-user@hadoop.apache.org
> > Sent: Sat, 7 November, 2009 6:26:03 AM
> > Subject: Re: Issue with bulk loader tool
> >
> > On Fri, Nov 6, 2009 at 12:58 AM, Murali Krishna. P
> > <muralikpbhat@yahoo.com> wrote:
> >
> >> Hi,
> >> If I increase hbase.hregion.max.filesize so that all the records are held in
> >> one region (and one reducer), all the records are retrievable. If one
> >> reducer creates multiple hfiles or multiple reducers create one hfile each,
> >> the problem occurs.
> >>
> >>
> >
> > Multiple hfiles in a region?  Or are you saying if a reducer creates
> > multiple regions?  There is supposed to be one file per region only when
> > done.
> >
> > Thanks for digging in,
> > St.Ack
> >
> >
> >
> >
> >> Does that give any clue?
> >>
> >> Thanks,
> >> Murali Krishna
> >>
> >>
> >>
> >>
> >> ________________________________
> >> From: Murali Krishna. P <muralikpbhat@yahoo.com>
> >> To: hbase-user@hadoop.apache.org
> >> Sent: Thu, 5 November, 2009 6:34:20 PM
> >> Subject: Re: Issue with bulk loader tool
> >>
> >> Hi Stack,
> >> Sorry, could not look into this last week...
> >>
> >> I have the problem with the HTable interface as well. Some records I am not
> >> able to retrieve from HTable either.
> >> I lost the old table, but reproduced the problem with a different table.
> >>
> >> I cannot send the region since it is very large. I will try to give as much
> >> info as possible here :)
> >>
> >> There are 5 regions in total in that table, as below:
> >>
> >> Name                                  Encoded Name  Start Key         End Key
> >> test1,,1257414794600                  106817540                       fffe9c7f87c8332a
> >> test1,fffe9c7f87c8332a,1257414794616  1346846599    fffe9c7f87c8332a  fffebe279c0ac4d2
> >> test1,fffebe279c0ac4d2,1257414794628  1835851728    fffebe279c0ac4d2  fffec418284d6fbc
> >> test1,fffec418284d6fbc,1257414794637  1078205908    fffec418284d6fbc  fffef7a12ea22498
> >> test1,fffef7a12ea22498,1257414794647  1515378663    fffef7a12ea22498
> >>
> >> I am looking for a key, say 000011d1bc8cd6fe. This should be in the first
> >> region?
> >>
> >> Using the hfile tool,
> >> org.apache.hadoop.hbase.io.hfile.HFile -k -f
> >> /hbase/test1/106817540/image/3828859735461759684 -v -m -p | grep
> >> 000011d1bc8cd6fe
> >> The first region doesn't have it. Not sure what happened to that record.
> >>
> >> For a working key, it gives the record properly as below
> >> K: \x00\x100003bdd08ca88ee2\x05imagevalue\x7F\xFF\xFF\xFF\xFF\xFF\xFF\xFF\x04
> >> V: \xFF...
> >>
> >> Please let me know if you need more information
> >>
> >> Thanks,
> >> Murali Krishna
> >>
> >>
> >>
> >>
> >> ________________________________
> >> From: stack <stack@duboce.net>
> >> To: hbase-user@hadoop.apache.org
> >> Sent: Mon, 2 November, 2009 11:05:43 PM
> >> Subject: Re: Issue with bulk loader tool
> >>
> >> Murali:
> >>
> >> Any developments worth mentioning?
> >>
> >> St.Ack
> >>
> >>
> >> On Fri, Oct 30, 2009 at 10:14 AM, stack <stack@duboce.net> wrote:
> >>
> >>> That is interesting.  It'd almost point to a shell issue.  Enable DEBUG so
> >>> client can see it.  Then rerun shell.  Is it at least loading the right
> >>> region?  (The region's start and end keys span the asked-for key?)  I took a
> >>> look at your attached .META. scan.  All looks good there.  The region
> >>> specifications look right.  If you want to bundle up the region that is
> >>> failing -- the one that the failing key comes out of, I can take a look
> >>> here.  You could also try playing with the HFile tool: ./bin/hbase
> >>> org.apache.hadoop.hbase.io.hfile.HFile.  Run the former and it'll output
> >>> usage.  You should be able to get it to dump content of the region (you need
> >>> to supply flags like -v to see actual keys to the HFile tool else it just
> >>> runs its check silently).  Check for your key.  Check things like
> >>> timestamp on it.  Maybe it's 100 years in advance of now or something?
> >>>
> >>> Yours,
> >>> St.Ack
> >>>
> >>>
> >>> On Fri, Oct 30, 2009 at 9:01 AM, Murali Krishna. P <muralikpbhat@yahoo.com> wrote:
> >>>
> >>>> Attached ".META"
> >>>>
> >>>> Interesting, I was able to get the row from HTable via java code. But from
> >>>> the shell, I am still getting the following
> >>>>
> >>>> hbase(main):004:0> get 'TestTable2', 'ffffef95bcbf2638'
> >>>> 0 row(s) in 1.2250 seconds
> >>>>
> >>>> Thanks,
> >>>> Murali Krishna
> >>>>
> >>>>
> >>>>
> >>>> ------------------------------
> >>>> *From:* stack <stack@duboce.net>
> >>>> *To:* hbase-user@hadoop.apache.org
> >>>> *Sent:* Fri, 30 October, 2009 8:39:46 PM
> >>>> *Subject:* Re: Issue with bulk loader tool
> >>>>
> >>>> Can you send a listing of ".META."?
> >>>>
> >>>> hbase> scan ".META."
> >>>>
> >>>> Also, can you bring a region down from hdfs, tar and gzip it, and then put
> >>>> it someplace I can pull so I can take a look?
> >>>>
> >>>> Thanks,
> >>>> St.Ack
> >>>>
> >>>>
> >>>> On Fri, Oct 30, 2009 at 3:31 AM, Murali Krishna. P
> >>>> <muralikpbhat@yahoo.com> wrote:
> >>>>
> >>>>> Hi guys,
> >>>>> I created a table according to hbase-48: a mapreduce job which creates
> >>>>> HFiles, and then used the loadtable.rb script to create the table. Everything
> >>>>> worked fine and I was able to scan the table. But when I do a get for a key
> >>>>> displayed in the scan output, it is not retrieving the row. The shell says 0
> >>>>> rows.
> >>>>>
> >>>>> I tried using one reducer to ensure total ordering, but still the same
> >>>>> issue.
> >>>>>
> >>>>>
> >>>>> My mapper is like:
> >>>>> context.write(new
> >>>>> ImmutableBytesWritable(((Text)key).toString().getBytes()), new
> >>>>> KeyValue(((Text)key).toString().getBytes(), "family1".getBytes(),
> >>>>>                   "column1".getBytes(), getValueBytes()));
> >>>>>
> >>>>>
> >>>>> Please help me investigate this.
> >>>>>
> >>>>> Thanks,
> >>>>> Murali Krishna
> >>>>>
> >>>>
> >>>
> >>>
> >>
>
