hbase-user mailing list archives

From stack <st...@duboce.net>
Subject Re: Issue with bulk loader tool
Date Sun, 08 Nov 2009 06:51:48 GMT
So, do you think we are dropping the first key in the region?
Thanks,
St.Ack

On Sat, Nov 7, 2009 at 9:17 PM, Murali Krishna. P <muralikpbhat@yahoo.com> wrote:

> No, the first key is 6666909d611e8d7e for the region which says startKey is
> 666629fe4378c096.
> (this is actually the next key in the order).
>
> HFile -p:-
> Scanning -> /hbase/test12/336573097/image/2362265315474952099
> K: \x00\x106666909d611e8d7e\x05imagevalue\x7F\x..
>
>  HFileUtil /hbase/test12/336573097/image/2362265315474952099 :-
> FirstKey:6666909d611e8d7eimagevalue�������
> LastKey:99998c8f356b0d86imagevalue�������
>
> But the scan .META. shows the start key as 666629fe4378c096. (attached
> .META.)
>
> This seems to be the case for all the regions. (the actual firstKey is next
> one from claimed firstKey)
>
> I am on hadoop0.20.0
>
> Thanks,
> Murali Krishna
>
>
> ------------------------------
> *From:* stack <stack@duboce.net>
> *To:* hbase-user@hadoop.apache.org
> *Sent:* Sun, 8 November, 2009 4:30:15 AM
>
> *Subject:* Re: Issue with bulk loader tool
>
> It's what Lars says, Murali: a region's startkey is inclusive and its endkey
> exclusive.  If the row exists, it should be in the region that has it for a
> start key (it will not be duplicated in both).
>
> For .META., there is usually only one Region instance in a .META. table.
> Its startkey will be the empty key, so it's not surprising its first key is
> different from the empty key.  What do you see when you look at the second
> region in your just uploaded table?  I'd expect the key 666629fe4378c096 to
> be first in the region whose startkey is 666629fe4378c096.
>
> Thanks for figuring MAPREDUCE-565 could trip us up.  Your hadoop is not
> 0.20.1?
>
> Yours,
> St.Ack
>
>
>
> On Sat, Nov 7, 2009 at 7:58 AM, Murali Krishna. P
> <muralikpbhat@yahoo.com> wrote:
>
> > Thanks Lars for the clarification,
> >    But where does the record reside? Is it duplicated in both
> > regions? When I use HFile.Reader, the first key in the second region is
> > different. Maybe this behaviour (overlap) is only in .META.?
> >    The issue is that when I request that boundary record, it is
> > loading the next region.
> >
> > 09/11/07 07:52:05 DEBUG client.HConnectionManager$TableServers: Cached
> > location address: 76.13.20.58:60020, regioninfo: REGION => {NAME =>
> > '.META.,,1', STARTKEY => '', ENDKEY => '', ENCODED => 1028785192, TABLE =>
> > {{NAME => '.META.', IS_META => 'true', MEMSTORE_FLUSHSIZE => '16384',
> > FAMILIES => [{NAME => 'historian', VERSIONS => '2147483647', COMPRESSION =>
> > 'NONE', TTL => '604800', BLOCKSIZE => '8192', IN_MEMORY => 'false',
> > BLOCKCACHE => 'false'}, {NAME => 'info', VERSIONS => '10', COMPRESSION =>
> > 'NONE', TTL => '2147483647', BLOCKSIZE => '8192', IN_MEMORY => 'false',
> > BLOCKCACHE => 'false'}]}}
> > 09/11/07 07:52:05 DEBUG client.HConnectionManager$TableServers: Cached
> > location address: 76.13.20.114:60020, regioninfo: REGION => {NAME =>
> > 'test12,333305184e0f7c3e,1257515988652', STARTKEY => '333305184e0f7c3e',
> > ENDKEY => '666629fe4378c096', ENCODED => 170637321, TABLE => {{NAME =>
> > 'test12', FAMILIES => [{NAME => 'image', VERSIONS => '3', COMPRESSION =>
> > 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
> > BLOCKCACHE => 'true'}]}}
> >
> >  Thanks,
> > Murali Krishna
> >
> >
> >
> >
> > ________________________________
> > From: Lars George <lars@worldlingo.com>
> > To: "hbase-user@hadoop.apache.org" <hbase-user@hadoop.apache.org>
> > Sent: Sat, 7 November, 2009 9:19:37 PM
> > Subject: Re: Issue with bulk loader tool
> >
> > Hi Murali,
> >
> > What you see is normal; the last keys do indeed overlap. The last key of a
> > region is exclusive and marks the first key of the subsequent region.
> >
> > Lars
> >
> > On Nov 7, 2009, at 9:05, "Murali Krishna. P" <muralikpbhat@yahoo.com>
> > wrote:
> >
> > > Hi,
> > > > I got it resolved. https://issues.apache.org/jira/browse/HADOOP-5750 was
> > > > causing this; even though I supplied a custom total ordering
> > > > partitioner, it didn't use that.
> > > >
> > > >
> > > >  Now the regions look properly sorted, but I am facing a new issue. The
> > > > last key of each region is not retrievable. The table.jsp page shows
> > > > the start and end keys wrongly.
> > > > For example, take the first 2 regions:
> > > region1: start : end: 333305184e0f7c3e
> > > region2: start: 333305184e0f7c3e end: 666629fe4378c096
> > >
> > > The end key of first region = start key of second ??
> > >
> > > > If I get the first and last key using HFile.Reader, it shows as follows:
> > >
> > > HFileUtil /hbase/test12/98766318/image/9052388247118781160
> > > FirstKey:00000d7d4f36c112imagevalue�������
> > > LastKey:333305184e0f7c3eimagevalue�������
> > >
> > > HFileUtil /hbase/test12/170637321/image/7602871928600243730
> > > FirstKey:33338d45cc2491b8imagevalue�������
> > > LastKey:666629fe4378c096imagevalue�������
> > >
> > > > So, according to this, the first key of the 2nd region is
> > > > 33338d45cc2491b8, not 333305184e0f7c3e, which is correct!
> > > >
> > > > Now when I do a get on 333305184e0f7c3e with debug on, it is loading
> > > > the second region, which is wrong!
> > > >
> > > > Something went wrong with the index?
> > >
> > > Thanks,
> > > Murali Krishna
> > >
> > >
> > >
> > >
> > > ________________________________
> > > From: stack <stack@duboce.net>
> > > To: hbase-user@hadoop.apache.org
> > > Sent: Sat, 7 November, 2009 6:26:03 AM
> > > Subject: Re: Issue with bulk loader tool
> > >
> > > On Fri, Nov 6, 2009 at 12:58 AM, Murali Krishna. P
> > > > <muralikpbhat@yahoo.com> wrote:
> > >
> > >> Hi,
> > > >> If I increase hbase.hregion.max.filesize so that all the records fit in
> > > >> one region (and one reducer), all the records are retrievable. If one
> > > >> reducer creates multiple hfiles or multiple reducers create one hfile
> > > >> each, the problem occurs.
> > >>
> > >>
> > >
> > > > Multiple hfiles in a region?  Or are you saying a reducer creates
> > > > multiple regions?  There is supposed to be only one file per region
> > > > when done.
> > >
> > > Thanks for digging in,
> > > St.Ack
> > >
> > >
> > >
> > >
> > >> Does that give any clue?
> > >>
> > >> Thanks,
> > >> Murali Krishna
> > >>
> > >>
> > >>
> > >>
> > >> ________________________________
> > >> From: Murali Krishna. P <muralikpbhat@yahoo.com>
> > >> To: hbase-user@hadoop.apache.org
> > >> Sent: Thu, 5 November, 2009 6:34:20 PM
> > >> Subject: Re: Issue with bulk loader tool
> > >>
> > >> Hi Stack,
> > >> Sorry, could not look into this last week...
> > >>
> > > >> I hit the problem with the HTable interface as well. Some records I am
> > > >> not able to retrieve through HTable either.
> > > >> I lost the old table, but reproduced the problem with a different table.
> > > >>
> > > >> I cannot send the region since it is very large. I will try to give as
> > > >> much info as possible here :)
> > >>
> > > >> There are 5 regions in total in that table, as below:
> > > >>
> > > >> Name                                  Encoded Name  Start Key         End Key
> > > >> test1,,1257414794600                  106817540                       fffe9c7f87c8332a
> > > >> test1,fffe9c7f87c8332a,1257414794616  1346846599    fffe9c7f87c8332a  fffebe279c0ac4d2
> > > >> test1,fffebe279c0ac4d2,1257414794628  1835851728    fffebe279c0ac4d2  fffec418284d6fbc
> > > >> test1,fffec418284d6fbc,1257414794637  1078205908    fffec418284d6fbc  fffef7a12ea22498
> > > >> test1,fffef7a12ea22498,1257414794647  1515378663    fffef7a12ea22498
> > >>
> > > >> I am looking for a key, say 000011d1bc8cd6fe. This should be in the
> > > >> first region?
> > > >>
> > > >> Using the hfile tool:
> > > >> org.apache.hadoop.hbase.io.hfile.HFile -k -f
> > > >> /hbase/test1/106817540/image/3828859735461759684 -v -m -p | grep
> > > >> 000011d1bc8cd6fe
> > > >> The first region doesn't have it. Not sure what happened to that record.
> > >>
> > >> For a working key, it gives the record properly as below
> > > >> K: \x00\x100003bdd08ca88ee2\x05imagevalue\x7F\xFF\xFF\xFF\xFF\xFF\xFF\xFF\x04
> > > >> V: \xFF...
> > >>
> > >> Please let me know if you need more information
> > >>
> > >> Thanks,
> > >> Murali Krishna
> > >>
> > >>
> > >>
> > >>
> > >> ________________________________
> > >> From: stack <stack@duboce.net>
> > >> To: hbase-user@hadoop.apache.org
> > >> Sent: Mon, 2 November, 2009 11:05:43 PM
> > >> Subject: Re: Issue with bulk loader tool
> > >>
> > >> Murali:
> > >>
> > >> Any developments worth mentioning?
> > >>
> > >> St.Ack
> > >>
> > >>
> > >> On Fri, Oct 30, 2009 at 10:14 AM, stack <stack@duboce.net> wrote:
> > >>
> > > >>> That is interesting.  It'd almost point to a shell issue.  Enable DEBUG
> > > >>> so the client can see it.  Then rerun the shell.  Is it at least loading
> > > >>> the right region?  (Do the region's start and end keys span the
> > > >>> asked-for key?)  I took a look at your attached .META. scan.  All looks
> > > >>> good there.  The region specifications look right.  If you want to
> > > >>> bundle up the region that is failing -- the one that the failing key
> > > >>> comes out of -- I can take a look here.  You could also try playing with
> > > >>> the HFile tool: ./bin/hbase org.apache.hadoop.hbase.io.hfile.HFile.
> > > >>> Run that command and it'll output usage.  You should be able to get it
> > > >>> to dump the content of the region (you need to supply flags like -v to
> > > >>> the HFile tool to see actual keys, else it just runs its check
> > > >>> silently).  Check for your key.  Check things like the timestamp on it.
> > > >>> Maybe it's 100 years in advance of now or something?
> > >>>
> > >>> Yours,
> > >>> St.Ack
> > >>>
> > >>>
> > > >>> On Fri, Oct 30, 2009 at 9:01 AM, Murali Krishna. P
> > > >>> <muralikpbhat@yahoo.com> wrote:
> > >>>
> > >>>> Attached ".META"
> > >>>>
> > > >>>> Interesting, I was able to get the row from HTable via java code.
> > > >>>> But from the shell, I am still getting the following:
> > >>>>
> > >>>> hbase(main):004:0> get 'TestTable2', 'ffffef95bcbf2638'
> > >>>> 0 row(s) in 1.2250 seconds
> > >>>>
> > >>>> Thanks,
> > >>>> Murali Krishna
> > >>>>
> > >>>> Thanks,
> > >>>> Murali Krishna
> > >>>>
> > >>>>
> > >>>> ------------------------------
> > >>>> *From:* stack <stack@duboce.net>
> > >>>> *To:* hbase-user@hadoop.apache.org
> > >>>> *Sent:* Fri, 30 October, 2009 8:39:46 PM
> > >>>> *Subject:* Re: Issue with bulk loader tool
> > >>>>
> > >>>> Can you send a listing of ".META."?
> > >>>>
> > >>>> hbase> scan ".META."
> > >>>>
> > > >>>> Also, can you bring a region down from hdfs, tar and gzip it, and
> > > >>>> then put it someplace I can pull so I can take a look?
> > >>>>
> > >>>> Thanks,
> > >>>> St.Ack
> > >>>>
> > >>>>
> > >>>> On Fri, Oct 30, 2009 at 3:31 AM, Murali Krishna. P
> > > >>>> <muralikpbhat@yahoo.com> wrote:
> > >>>>
> > > >>>>> Hi guys,
> > > >>>>> I created a table according to HBASE-48: a mapreduce job creates
> > > >>>>> HFiles, and then I used the loadtable.rb script to create the
> > > >>>>> table. Everything worked fine and I was able to scan the table.
> > > >>>>> But when I do a get for a key displayed in the scan output, it is
> > > >>>>> not retrieving the row. The shell says 0 rows.
> > > >>>>>
> > > >>>>> I tried using one reducer to ensure total ordering, but still the
> > > >>>>> same issue.
> > >>>>>
> > >>>>>
> > >>>>> My mapper is like:
> > > >>>>> context.write(
> > > >>>>>     new ImmutableBytesWritable(((Text)key).toString().getBytes()),
> > > >>>>>     new KeyValue(((Text)key).toString().getBytes(), "family1".getBytes(),
> > > >>>>>                  "column1".getBytes(), getValueBytes()));
> > >>>>>
> > >>>>>
> > >>>>> Please help me investigate this.
> > >>>>>
> > >>>>> Thanks,
> > >>>>> Murali Krishna
> > >>>>>
> > >>>>
> > >>>
> > >>>
> > >>
> >
>
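
The inclusive-startkey, exclusive-endkey rule discussed in this thread can be sketched in plain Java, without any HBase classes. This is only an illustration of the rule, not HBase's actual lookup code: the class RegionLookup and its methods addRegion and locate are hypothetical names, and it assumes row keys are ASCII hex strings, so String ordering matches byte ordering. The region boundaries are the two quoted from table.jsp above.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: picks the region for a row key using
// "start key inclusive, end key exclusive" semantics.
public class RegionLookup {
    // Regions keyed by start key; the value is the end key ("" means unbounded).
    private final TreeMap<String, String> regions = new TreeMap<>();

    void addRegion(String startKey, String endKey) {
        regions.put(startKey, endKey);
    }

    // Returns the start key of the region that should hold rowKey,
    // or null if no registered region covers it.
    String locate(String rowKey) {
        // Greatest start key <= rowKey: this makes the start key inclusive.
        Map.Entry<String, String> entry = regions.floorEntry(rowKey);
        if (entry == null) {
            return null;
        }
        String endKey = entry.getValue();
        // rowKey must sort strictly before the end key: the end key is exclusive.
        boolean beforeEnd = endKey.isEmpty() || rowKey.compareTo(endKey) < 0;
        return beforeEnd ? entry.getKey() : null;
    }

    public static void main(String[] args) {
        RegionLookup table = new RegionLookup();
        table.addRegion("", "333305184e0f7c3e");                 // region1 from the thread
        table.addRegion("333305184e0f7c3e", "666629fe4378c096"); // region2 from the thread

        // The boundary key belongs to the region that has it as its start key.
        System.out.println("[" + table.locate("333305184e0f7c3e") + "]");
        // A key below the boundary stays in the first region (empty start key).
        System.out.println("[" + table.locate("00000d7d4f36c112") + "]");
    }
}
```

Under this rule a get on 333305184e0f7c3e is expected to go to the second region, which matches what the client does in the DEBUG output above. If the HFile for that region starts at 33338d45cc2491b8 instead, the boundary row is in neither file the client will consult, which is consistent with the question of whether the first key of each region is being dropped.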
