hbase-user mailing list archives

From Nick Pinckernell <...@illx.org>
Subject Re: Pig load to HBase not invoking coprocessor
Date Sun, 25 Mar 2012 15:53:38 GMT
In the pig 0.9.2 source, HBaseStorage has two put.add(family,
qualifier, ts, value) statements.  Simply invoking the other overload,
put.add(family, qualifier, value), without the 'ts' that pig creates via
'long ts = System.currentTimeMillis();', fixed the issue.  I now see my
coprocessor code being called.  It looks like those puts were just using
an incorrect row version.
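
For anyone hitting the same thing, the change amounts to roughly this
(a sketch against the pig 0.9.2 HBaseStorage source; variable names follow
the snippet above, not necessarily the actual file):

    // before: pig pins the KeyValue timestamp to the wall clock, so it
    // no longer matches the Put's own ts (LATEST_TIMESTAMP)
    long ts = System.currentTimeMillis();
    put.add(family, qualifier, ts, value);

    // after: let HBase assign the timestamp; the KeyValue and the Put
    // now agree, and Put.has() can find the KeyValue again
    put.add(family, qualifier, value);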

I'll let the pig folks know and create a jira for them.
Thanks for the help!

On Sun, Mar 25, 2012 at 8:47 AM, Ted Yu <yuzhihong@gmail.com> wrote:

> Do you know whether the has() method was called from line 207 or line 192?
>
> According to pig code:
>    public Put createPut(Object key, byte type) throws IOException {
>        Put put = new Put(objToBytes(key, type));
> Leading to the following ctor:
>  public Put(byte [] row, RowLock rowLock) {
>      this(row, HConstants.LATEST_TIMESTAMP, rowLock);
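>
> Put side by side (my reading of the two code paths, names as above):
>
>    // Put's internal ts, via the ctor chain above:
>    this.ts = HConstants.LATEST_TIMESTAMP;    // i.e. Long.MAX_VALUE
>    // but HBaseStorage adds each KeyValue with its own timestamp:
>    long ts = System.currentTimeMillis();
>    put.add(family, qualifier, ts, value);    // kv ts != this.ts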
>
> FYI
>
> On Sun, Mar 25, 2012 at 1:35 AM, Nick Pinckernell <nap@illx.org> wrote:
>
> > Thank you!  That got me in the right direction.  Yes, my region observer
> > overrides prePut().
> >
> > Here is what I found out through debugging the region server:
> > When using the HBase client API, the Put has the correct KeyValue
> > timestamp (it matches the Mutation's 'ts'), but when Pig loads the data
> > the timestamps do not match, so the Put.has() method [line 255] does not
> > return true at line 273, failing the following check:
> >
> >        if (Arrays.equals(kv.getFamily(), family) &&
> >            Arrays.equals(kv.getQualifier(), qualifier)
> >            && kv.getTimestamp() == ts) {
> >
> > failing on 'kv.getTimestamp() == ts'
> >
> > I'm not yet sure why the KeyValue timestamp (returned by
> > KeyValue.getTimestamp()) is being set incorrectly by the pig load.
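> >
> > The mismatch can be reproduced in isolation with something like this
> > (hypothetical snippet against the 0.92 client API; usual imports and
> > made-up row/column names assumed):
> >
> >    Put put = new Put(Bytes.toBytes("r1"));  // Mutation ts = LATEST_TIMESTAMP
> >    put.add(Bytes.toBytes("f"), Bytes.toBytes("t"), 12345L, Bytes.toBytes("v"));
> >    // false, since the KeyValue carries 12345 rather than LATEST_TIMESTAMP:
> >    put.has(Bytes.toBytes("f"), Bytes.toBytes("t"),
> >            HConstants.LATEST_TIMESTAMP, Bytes.toBytes("v"));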
> >
> > On Sat, Mar 24, 2012 at 3:37 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> >
> > > hbase.mapreduce.TableOutputFormat is used by HBaseStorage.  The Put
> > > reaches the region server and ends up in HRegion.doMiniBatchPut(), where
> > > I see:
> > >
> > >    if (coprocessorHost != null) {
> > >      for (int i = 0; i < batchOp.operations.length; i++) {
> > >        Pair<Put, Integer> nextPair = batchOp.operations[i];
> > >        Put put = nextPair.getFirst();
> > >        if (coprocessorHost.prePut(put, walEdit, put.getWriteToWAL())) {
> > >
> > > Was your code in prePut()?
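> > >
> > > For reference, a region observer's put hook in 0.92 is shaped like this
> > > (sketch only; the class name is made up):
> > >
> > >    public class MyObserver extends BaseRegionObserver {
> > >      @Override
> > >      public void prePut(ObserverContext<RegionCoprocessorEnvironment> e,
> > >          Put put, WALEdit edit, boolean writeToWAL) throws IOException {
> > >        // hook logic goes here
> > >      }
> > >    }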
> > >
> > > Cheers
> > >
> > > On Sat, Mar 24, 2012 at 11:19 AM, Nick Pinckernell <nap@illx.org> wrote:
> > >
> > > > Hi, I posted this over at the pig forums and Dmitriy suggested I ask
> > > > on the hbase list as well (original post here:
> > > > http://mail-archives.apache.org/mod_mbox/pig-user/201203.mbox/ajax/%3CCABsY1jQFaiw%3Dbirw3ZukmdwKmY6EV9z75%2BxSTU_%2BmZsyBwsB2A%40mail.gmail.com%3E
> > > > )
> > > >
> > > > I'm having a possible issue with a simple pig load that writes to an
> > > > HBase table.  The issue is that when I run the test pig script it does
> > > > not invoke the region observer coprocessor on the table.  I have
> > > > verified that my coprocessor executes when I use the HBase client API
> > > > to do a simple put().
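> > > >
> > > > The client-API put that does trigger it is roughly the following
> > > > (made-up row/value, and 'conf' is an already-configured Configuration):
> > > >
> > > >    HTable table = new HTable(conf, "test");
> > > >    Put put = new Put(Bytes.toBytes("r1"));
> > > >    put.add(Bytes.toBytes("f"), Bytes.toBytes("t"), Bytes.toBytes("v"));
> > > >    table.put(put);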
> > > >
> > > > Simple pig script is as follows (test.pig):
> > > > register /dev/hbase-0.92.0/hbase-0.92.0.jar;
> > > > register /dev/hbase-0.92.0/lib/zookeeper-3.4.2.jar;
> > > > register /dev/hbase-0.92.0/lib/guava-r09.jar;
> > > > A = load '/tmp/testdata.csv' using PigStorage(',');
> > > > store A into 'hbase://test' using
> > > > org.apache.pig.backend.hadoop.hbase.HBaseStorage ('f:t');
> > > >
> > > > Using the following environment variables and command:
> > > > export HADOOP_HOME=/dev/hadoop-1.0.0
> > > > export PIG_CLASSPATH=/dev/hadoop-1.0.0/conf
> > > > export HBASE_HOME=/dev/hbase-0.92.0/
> > > > export PIG_CLASSPATH="`${HBASE_HOME}/bin/hbase classpath`:$PIG_CLASSPATH"
> > > > /dev/pig-0.9.2/bin/pig -x local -f test.pig
> > > >
> > > > I have also tried 'pig -x mapreduce' and it still does not seem to
> > > > invoke the coprocessor.  Looking through the HBaseStorage class, it
> > > > appears that the RecordWriter is getting HBase Put objects and that
> > > > those are ultimately getting flushed, so I'm not sure why the
> > > > coprocessor is not executing.
> > > >
> > > > Is this by design, or am I missing something about how the output
> > > > from the pig job is being loaded into the HBase table?
> > > > Thank you
> > > >
> > >
> >
>
