hbase-user mailing list archives

From Rohit Kelkar <rohitkel...@gmail.com>
Subject Re: Scanner problem after bulk load hfile
Date Tue, 16 Jul 2013 23:39:20 GMT
Now it's working correctly. I had to call
myTableWriter.appendTrackedTimestampsToMetadata() after writing my KVs and
before closing the file.
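
For anyone hitting the same problem, here is a minimal sketch of the full
write-then-load sequence (a sketch against the 0.94-era API; hbaseConf,
hdfs, myHFilePath, the row/cf/key/value variables, outputHFileBaseDir, and
mytablename are placeholders from the thread below):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.hfile.CacheConfig;
import org.apache.hadoop.hbase.io.hfile.HFile;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;
import org.apache.hadoop.hbase.regionserver.StoreFile;

// Write the HFile.
StoreFile.Writer myTableWriter = new StoreFile.WriterBuilder(hbaseConf,
        new CacheConfig(hbaseConf), hdfs,
        HFile.DEFAULT_BLOCKSIZE).withFilePath(myHFilePath).build();
KeyValue kv = new KeyValue(row.getBytes(), cf.getBytes(),
        keyStr.getBytes(), System.currentTimeMillis(), valueStr.getBytes());
myTableWriter.append(kv);
// The step I was missing: record the tracked timestamp range in the
// file's metadata before closing, so that column-filtered scans against
// the bulk-loaded file work without a major compaction.
myTableWriter.appendTrackedTimestampsToMetadata();
myTableWriter.close();

// Bulk load the finished file, then scan as usual.
LoadIncrementalHFiles loadTool = new LoadIncrementalHFiles(hbaseConf);
HTable myTable = new HTable(hbaseConf, mytablename.getBytes());
loadTool.doBulkLoad(new Path(outputHFileBaseDir + "/" + mytablename), myTable);
myTable.close();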

- R


On Tue, Jul 16, 2013 at 6:20 PM, Rohit Kelkar <rohitkelkar@gmail.com> wrote:

> Oh wait. Didn't realise that I had the HBaseAdmin major compact code
> turned on when I tested :(
> It is still not working. Following is the code -
>
> StoreFile.Writer myHfileWriter = new StoreFile.WriterBuilder(hbaseConf,
>         new CacheConfig(hbaseConf), hdfs,
>         HFile.DEFAULT_BLOCKSIZE).withFilePath(myHFilePath).build();
> KeyValue kv = new KeyValue(row.getBytes(), cf.getBytes(),
>         keyStr.getBytes(), System.currentTimeMillis(), valueStr.getBytes());
> myHfileWriter.append(kv);
> myHfileWriter.close();
>
> - R
>
>
> On Tue, Jul 16, 2013 at 6:15 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>
>> Looks like the following should be put in RefGuide.
>>
>> Cheers
>>
>> On Tue, Jul 16, 2013 at 3:40 PM, lars hofhansl <larsh@apache.org> wrote:
>>
>> > Hah. Was *just* about to reply with this. The fix in HBASE-8055 is not
>> > strictly necessary.
>> > How did you create your HFiles? See this comment:
>> >
>> > https://issues.apache.org/jira/browse/HBASE-8055?focusedCommentId=13600499&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13600499
>> >
>> > -- Lars
>> > ________________________________
>> > From: Jimmy Xiang <jxiang@cloudera.com>
>> > To: user <user@hbase.apache.org>
>> > Sent: Tuesday, July 16, 2013 2:41 PM
>> > Subject: Re: Scanner problem after bulk load hfile
>> >
>> >
>> > HBASE-8055 should have fixed it.
>> >
>> >
>> > On Tue, Jul 16, 2013 at 2:33 PM, Rohit Kelkar <rohitkelkar@gmail.com>
>> > wrote:
>> >
>> > > This ( http://pastebin.com/yhx4apCG ) is the error on the region server
>> > > side when I execute the following on the shell -
>> > > get 'mytable', 'myrow', 'cf:q'
>> > >
>> > > - R
>> > >
>> > >
>> > >
>> > >
>> > > On Tue, Jul 16, 2013 at 3:28 PM, Jimmy Xiang <jxiang@cloudera.com> wrote:
>> > >
>> > > > Do you see any exception/logging in the region server side?
>> > > >
>> > > >
>> > > > On Tue, Jul 16, 2013 at 1:15 PM, Rohit Kelkar <rohitkelkar@gmail.com> wrote:
>> > > >
>> > > > > Yes. I tried everything from myTable.flushCommits() to
>> > > > > myTable.clearRegionCache() before and after the
>> > > > > LoadIncrementalHFiles.doBulkLoad(). But it doesn't seem to work.
>> > > > > This is what I am doing right now to get things moving although I
>> > > > > think this may not be the recommended approach -
>> > > > >
>> > > > > HBaseAdmin hbaseAdmin = new HBaseAdmin(hbaseConf);
>> > > > > hbaseAdmin.majorCompact(myTableName.getBytes());
>> > > > > myTable.close();
>> > > > > hbaseAdmin.close();
>> > > > >
>> > > > > - R
>> > > > >
>> > > > >
>> > > > > On Mon, Jul 15, 2013 at 9:14 AM, Amit Sela <amits@infolinks.com> wrote:
>> > > > >
>> > > > > > Well, I know it's kind of voodoo but try it once before
>> > > > > > pre-split and once after. Worked for me.
>> > > > > >
>> > > > > >
>> > > > > > On Mon, Jul 15, 2013 at 7:27 AM, Rohit Kelkar <rohitkelkar@gmail.com> wrote:
>> > > > > >
>> > > > > > > Thanks Amit, I am also using 0.94.2. I am also pre-splitting
>> > > > > > > and I tried the table.clearRegionCache() but it still doesn't work.
>> > > > > > >
>> > > > > > > - R
>> > > > > > >
>> > > > > > >
>> > > > > > > On Sun, Jul 14, 2013 at 3:45 AM, Amit Sela <amits@infolinks.com> wrote:
>> > > > > > >
>> > > > > > > > If new regions are created during the bulk load (are you
>> > > > > > > > pre-splitting?), maybe try myTable.clearRegionCache() after the
>> > > > > > > > bulk load (or even after the pre-splitting if you do pre-split).
>> > > > > > > > This should clear the region cache. I needed to use this because
>> > > > > > > > I am pre-splitting my tables for bulk load.
>> > > > > > > > BTW I'm using HBase 0.94.2.
>> > > > > > > > Good luck!
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > On Fri, Jul 12, 2013 at 6:50 PM, Rohit Kelkar <rohitkelkar@gmail.com> wrote:
>> > > > > > > >
>> > > > > > > > > I am having problems while scanning a table created using HFile.
>> > > > > > > > > This is what I am doing -
>> > > > > > > > > Once the HFile is created I use the following code to bulk load -
>> > > > > > > > >
>> > > > > > > > > LoadIncrementalHFiles loadTool = new LoadIncrementalHFiles(conf);
>> > > > > > > > > HTable myTable = new HTable(conf, mytablename.getBytes());
>> > > > > > > > > loadTool.doBulkLoad(new Path(outputHFileBaseDir + "/" + mytablename),
>> > > > > > > > >         myTable);
>> > > > > > > > >
>> > > > > > > > > Then scan the table using -
>> > > > > > > > >
>> > > > > > > > > HTable table = new HTable(conf, mytable);
>> > > > > > > > > Scan scan = new Scan();
>> > > > > > > > > scan.addColumn("cf".getBytes(), "q".getBytes());
>> > > > > > > > > ResultScanner scanner = table.getScanner(scan);
>> > > > > > > > > for (Result rr = scanner.next(); rr != null; rr = scanner.next()) {
>> > > > > > > > >     numRowsScanned += 1;
>> > > > > > > > > }
>> > > > > > > > >
>> > > > > > > > > This code crashes with the following error -
>> > > > > > > > > http://pastebin.com/SeKAeAST
>> > > > > > > > > If I remove the scan.addColumn from the code then the code works.
>> > > > > > > > >
>> > > > > > > > > Similarly on the hbase shell -
>> > > > > > > > > - A simple count 'mytable' in the hbase shell gives the correct count.
>> > > > > > > > > - A scan 'mytable' gives correct results.
>> > > > > > > > > - get 'mytable', 'myrow', 'cf:q' crashes
>> > > > > > > > >
>> > > > > > > > > The hadoop dfs -ls /hbase/mytable shows the .tableinfo, .tmp, the
>> > > > > > > > > directory for the region etc.
>> > > > > > > > >
>> > > > > > > > > Now if I do a major_compact 'mytable' and then execute my code with
>> > > > > > > > > the scan.addColumn statement then it works. Also the get 'mytable',
>> > > > > > > > > 'myrow', 'cf:q' works.
>> > > > > > > > >
>> > > > > > > > > My question is -
>> > > > > > > > > What is major_compact doing to enable the scanner that the
>> > > > > > > > > LoadIncrementalHFiles tool is not? I am sure I am missing a step
>> > > > > > > > > after the LoadIncrementalHFiles.
>> > > > > > > > >
>> > > > > > > > > - R
