hbase-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Anoop John <anoop.hb...@gmail.com>
Subject Re: Coprocessors causes data access delay?
Date Mon, 18 Apr 2016 10:36:14 GMT
>>store.getSmallestReadPoint()

This is the problem.   At the time of write some other readers (opened
in past) were present, then all those are present readPnts and u will
get min of all such read points.

Its a pity that we are not passing the read point to the CP pre hook.
That is obtained in RegionScannerImpl.

If u can change the HBase code and build,  can just try passing the
readPnt also to CP and use that to open this Scanner and test once?

I refer below code where we call CP pre hook

public KeyValueScanner getScanner(Scan scan,
      final NavigableSet<byte []> targetCols, long readPt) throws IOException {
    lock.readLock().lock();
    try {
      KeyValueScanner scanner = null;
      if (this.getCoprocessorHost() != null) {
        scanner = this.getCoprocessorHost().preStoreScannerOpen(this,
scan, targetCols);
      }
      scanner = createScanner(scan, targetCols, readPt, scanner);
      return scanner;
    } finally {
      lock.readLock().unlock();
    }
  }



-Anoop-

On Mon, Apr 18, 2016 at 12:56 PM, Jozef Vilcek <jozo.vilcek@gmail.com> wrote:
> I have new information after some more debugging:
>
> I am sure coprocessor is not filtering out values in get/put hooks.
> The observed behavior is:
>   We create non-existing cell via PUT and immediately after the call
>   returns, we are trying to get the created cell value via GET. The GET
>   is done every second in a loop until the cell is returned (the result
>   is not empty).
>   For most of the time the first GET request succeeds, but there are
>   some periods of time when the obtained results are empty and the cell
>   appears just after some time ranging from few seconds to something
>   like 20 minutes.
>   We do such test every minute in a separate process and log the results.
>   Excerpt from such log:
>
>     [put-time 20:19:03] [successful-get-time 20:21:03 = after-120.0-secs]
>     [put-time 20:20:03] [successful-get-time 20:21:03 = after-59.6-secs]
>     [put-time 23:01:02] [successful-get-time 23:05:50 = after-288.2-secs]
>     [put-time 23:02:02] [successful-get-time 23:05:50 = after-228.1-secs]
>     [put-time 23:03:02] [successful-get-time 23:05:50 = after-167.9-secs]
>     [put-time 23:04:02] [successful-get-time 23:05:50 = after-107.7-secs]
>     [put-time 23:05:02] [successful-get-time 23:05:50 = after-47.5-secs]
>     [put-time 23:08:02] [successful-get-time 23:31:59 = after-1436.9-secs]
>     [put-time 23:09:02] [successful-get-time 23:31:59 = after-1376.8-secs]
>
>   Note that the cells seem to appear in "waves", multiple at the same time.
>   Unsuccessful GETs (those between the PUT and the final successful GET)
>   don't timeout and return just empty result without any cells.
>
>   Coprocessor is logging "trafic" in pre/post get/put hooks for tested
> keys.
>   PUTs and GETs are always observed immediately client send requests, for
>   for problematic cases, coprocessor does not see any data, yet, for GETs
>
>
> Problem is somehow related to coprocessor itself. When I switch to
> coprocessor which does only logging, observed data delays disappear.
> However, observed delays are happening only sporadically for small period
> of time. Like it is triggered by some other action (compaction, hbase being
> under load, ...?).
>
> I can not share the full code, but "simplified" coprocessor, which also
> produced this behavior looks like this:
>
> RegionObserver coprocessor, which just creates a new scanner with dummy
> filter in its preStoreScannerOpen, preFlushScannerOpen and
> preCompactScannerOpen:
>
>   preStoreScannerOpen:
>     return new StoreScanner(store, store.getScanInfo(),
>       scan.setFilter(new DummyFilter()), targetCols,
>  store.getSmallestReadPoint());
>
>   preFlushScannerOpen:
>     return new StoreScanner(store, store.getScanInfo(), new
> Scan().setFilter(new DummyFilter()),
>       Collections.singletonList(memstoreScanner),
>       ScanType.COMPACT_RETAIN_DELETES, store.getSmallestReadPoint(),
>       HConstants.OLDEST_TIMESTAMP);
>
>   preCompactScannerOpen:
>     return new StoreScanner(store, store.getScanInfo(),
>       new Scan().setFilter(new DummyFilter()), scanners, scanType,
>       store.getSmallestReadPoint(), earliestPutTs);
>
> Besides some logging, the DummyFilter efectively just returns
> ReturnCode.INCLUDE
> in its filterKeyValue() method.
>
> Shooting blind. Can it be that key under some circumstances can not be seen
> by StoreScanner until it is flushed somewhere?
>
> Many thanks in advance,
> Jozo
>
> On Fri, Apr 1, 2016 at 10:47 AM, Anoop John <anoop.hbase@gmail.com> wrote:
>
>> Can you provide us the code u have written for the pre hooks + the
>> custom filter code.  You sure the filter is not filtering out this
>> recent cell?  Add some logs there and check?
>>
>> -Anoop-
>>
>> On Thu, Mar 31, 2016 at 12:57 PM, Jozef Vilcek <jozo.vilcek@gmail.com>
>> wrote:
>> > Test was done via hbase shell prompt. I am not sure on it's behavior. Can
>> > it hide some timeouts and return nothing? I hope not.
>> > I will try to reproduce it with java API.
>> >
>> > Coprocessor does not do anything asynchronous in it's hooks. It is hooked
>> > only on preStoreScannerOpen, preFlushScannerOpen, preCompactScannerOpen.
>> > What it does is, create a StoreScanner with custom filter, which skips
>> > certain cells based on fixed rules line qualifier name pattern or how old
>> > cell is.
>> >
>> >
>> >> Do you wait for the Put to return before you issue the Get?  Or have you
>> >> disabled auto flushing on your table instance --
>> HTable.setAutoFlush(false)?
>> >
>> >> Coprocessor hooks executed on the Put and Get paths are blocking, so
>> unless
>> >> you're overriding normal processing in the prePut() hook and doing
>> >> something asynchronous, the Put should be there by the time it returns
>> to
>> >> the client.  But it all depends on what is your coprocessor actually
>> >> doing.  Can you tell us more about that?
>> >
>> >> On Tue, Mar 29, 2016 at 6:22 AM Ted Yu <yuzhihong@gmail.com> wrote:
>> >
>> >> > Is it possible to come up with unit test showing this behavior using>
>> > simplified coprocessor ?> >> > Thanks> >> > > On
Mar 29, 2016, at 5:42
>> AM, Jozef Vilcek <jozo.vilcek@gmail.com> wrote:> > >> > >
Hi,> > >> > > I
>> need a help to shed some light on following observation:> > >> > >
Through
>> hbase shell I do PUT to a non existing cell (create) and> > immediate> >
>
>> GET.> > > When doing this against hbase table without a coprocessor, every
>> GET> > shows> > > the value immediately. When I enable a coprocessor
on the
>> table and> > > perform> > > the same check, I do not get a value
for every
>> GET. Sometimes GET returns> > > nothing, like cell is not there. If I delay
>> GET few minutes after PUT,> > then> > > data are correctly retrieved.>
> >>
>> > > I want to understand what is going on.> > > Is coprocessor is
doing
>> something wrong? Is this somehow normal? or ... ?> > >> > > Coprocessor
is
>> a RegionObserver doing data filtering / expiration based> > on> > >
>> predefined rules.> > > HBase version is 1.0.0 from cloudera cdh5.5.1
>> distibution.> > >> > > Thanks,> > > Jozo
>> >>
>>

Mime
View raw message