hbase-user mailing list archives

From Jean-Marc Spaggiari <jean-m...@spaggiari.org>
Subject Re: Batch returned value and exception handling.
Date Fri, 15 Mar 2013 00:14:26 GMT
ReplicationSink.batch() calls HTable.batch(list) without looking at
the results. The idea is: should we allow something like
HTableInterface.batch(List<>, null) for the cases where we don't need
to retrieve the results of the calls?

Today, if results is null, HConnectionManager.processBatchCallback()
throws an NPE right at the first line
"if (results.length != list.size())".

If I have 1000 increments to send and have nothing planned in case
they fail, do I really want to create an array of 1000 objects for
nothing, iterate over it, etc. when it might have been possible to
simply drop it?
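
To make the cost concrete, here is a rough sketch of what a caller has to do today: allocate one slot per action, pass the array in, then scan it. This is not HBase code. fakeBatch and countFailures are hypothetical stand-ins, the actions are plain strings instead of Increment objects, and treating a null slot as a failure is an assumption about the contract.

```java
import java.util.ArrayList;
import java.util.List;

class BatchResultScan {
    // Hypothetical stand-in for table.batch(actions, results); not HBase code.
    // Fills one slot per action: a result on success, a Throwable on failure.
    static void fakeBatch(List<String> actions, Object[] results) {
        if (results.length != actions.size()) {
            throw new IllegalArgumentException("results.length != actions.size()");
        }
        for (int i = 0; i < actions.size(); i++) {
            String action = actions.get(i);
            if (action.startsWith("bad")) {
                // Simulated failure, e.g. a column family that does not exist.
                results[i] = new RuntimeException("failed: " + action);
            } else {
                results[i] = "ok: " + action;
            }
        }
    }

    // Counts the slots that hold null or a Throwable (assumed to mean failure).
    static int countFailures(Object[] results) {
        int failures = 0;
        for (Object r : results) {
            if (r == null || r instanceof Throwable) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        List<String> actions = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            actions.add(i % 100 == 0 ? "bad-row" + i : "row" + i);
        }
        // The array the caller must allocate even if it never reads it.
        Object[] results = new Object[actions.size()];
        fakeBatch(actions, results);
        System.out.println(countFailures(results) + " of " + results.length + " failed");
    }
}
```

If the caller never looks at the array, both the allocation and the scan are wasted work.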

If results == null, in HConnectionManager.processBatchCallback we can
use workingList in step 3 instead of iterating again in step 4.
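
A minimal sketch of the null-tolerant behaviour, under stated assumptions: this is not the real processBatchCallback (no retries, no workingList, no region servers), the class and method names are hypothetical, and returning a failure count is only an illustration of how a caller could still learn that something went wrong.

```java
import java.util.Arrays;
import java.util.List;

class NullableResultsSketch {
    // Hypothetical stand-in for a batch call that accepts results == null.
    // Returns the number of failed actions (an assumption, not the HBase API).
    static int batch(List<String> actions, Object[] results) {
        // Only validate the array when the caller actually supplied one,
        // so a null array no longer triggers an NPE on results.length.
        if (results != null && results.length != actions.size()) {
            throw new IllegalArgumentException("argument numbers mismatch");
        }
        int failures = 0;
        for (int i = 0; i < actions.size(); i++) {
            String action = actions.get(i);
            boolean failed = action.startsWith("bad"); // simulated failure
            if (failed) {
                failures++;
            }
            // Skip all per-action bookkeeping when nobody wants the results.
            if (results != null) {
                if (failed) {
                    results[i] = new RuntimeException("failed: " + action);
                } else {
                    results[i] = action;
                }
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        List<String> actions = Arrays.asList("row1", "bad-row2", "row3");
        // Fire and forget: no array allocated, nothing to iterate afterwards.
        System.out.println("failures without results array: " + batch(actions, null));
        // Callers that care can still pass an array and inspect it.
        Object[] results = new Object[actions.size()];
        System.out.println("failures with results array: " + batch(actions, results));
    }
}
```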

I will try to explain that a bit more in the JIRA.

JM


2013/3/14 Ted Yu <yuzhihong@gmail.com>:
> bq.  Should we mark it as deprecated in the interface too?
>
> Yes. That was my intention.
>
> I am not clear about your second suggestion, though.
>
> Cheers
>
> On Thu, Mar 14, 2013 at 3:36 PM, Jean-Marc Spaggiari <
> jean-marc@spaggiari.org> wrote:
>
>> I agree.
>>
>> This method is also in the interface declaration. Should we mark it as
>> deprecated in the interface too?
>>
>> Also, if someone doesn't want to get the results, should we find a way
>> to allow the user to pass null for results?
>>
>> 2013/3/14 Ted Yu <yuzhihong@gmail.com>:
>> > Looking at this batch() method in HTable:
>> >
>> >   Object[] batch(final List<? extends Row> actions)
>> >       throws IOException, InterruptedException;
>> >
>> > I think the above method should be deprecated due to the issue raised by
>> > Amit.
>> > The following method is more reliable:
>> >
>> >   void batch(final List<? extends Row> actions, final Object[] results)
>> >       throws IOException, InterruptedException;
>> >
>> > I plan to raise a JIRA for deprecating the first method, if I don't hear
>> > objections.
>> >
>> > Cheers
>> >
>> > On Thu, Mar 14, 2013 at 11:55 AM, Jean-Marc Spaggiari <
>> > jean-marc@spaggiari.org> wrote:
>> >
>> >> Amit, do it this way:
>> >>
>> >>       Object[] res = new Object[batch.size()];
>> >>       try {
>> >>         table.batch(batch, res);
>> >>       } catch (RetriesExhaustedWithDetailsException e) {
>> >>         // res is still populated here: results and Throwables
>> >>       }
>> >>
>> >> Then res will contain the results and the exception, even if you
>> >> catch a RetriesExhaustedWithDetailsException because your batch hit
>> >> one.
>> >>
>> >> JM
>> >>
>> >> 2013/3/14 Jean-Marc Spaggiari <jean-marc@spaggiari.org>:
>> >> > Can you paste the complete stacktrace here with the causes too?
>> >> >
>> >> > I will try your piece of code locally to try to reproduce.
>> >> >
>> >> > JM
>> >> >
>> >> > 2013/3/14 Amit Sela <amits@infolinks.com>:
>> >> >> I did look at HConnectionManager and that is the reason I expected the
>> >> >> scenario you just described but running the test I ran from the
>> >> >> development environment (IntelliJ IDEA) I did not get any returned
>> >> >> value, instead the exception is thrown and after I catch it the
>> >> >> result is null...
>> >> >>
>> >> >> Object[] res = null;
>> >> >> try {
>> >> >>   res = table.batch(batch);
>> >> >> } catch (RetriesExhaustedWithDetailsException retriesExhaustedWithDetailsException) {
>> >> >>   retriesExhaustedWithDetailsException.printStackTrace();
>> >> >> }
>> >> >> if (res == null) {
>> >> >>   System.out.println("No results - returned null.");
>> >> >>   return;
>> >> >> }
>> >> >>
>> >> >>
>> >> >>
>> >> >> On Thu, Mar 14, 2013 at 7:52 PM, Jean-Marc Spaggiari <
>> >> >> jean-marc@spaggiari.org> wrote:
>> >> >>
>> >> >>> Hi Amit,
>> >> >>>
>> >> >>> Just take a look at the processBatchCallback method in
>> >> >>> HConnectionManager.
>> >> >>>
>> >> >>> There you will see how the result is populated, and when an exception
>> >> >>> is returned.
>> >> >>>
>> >> >>> In your example below, if you look at the content of the returned
>> >> >>> array, you should see one cell with the result of the increment, and
>> >> >>> one cell with a Throwable in it.
>> >> >>>
>> >> >>> JM
>> >> >>>
>> >> >>> 2013/3/14 Amit Sela <amits@infolinks.com>:
>> >> >>> > Hi all,
>> >> >>> >
>> >> >>> > I did some testing with HTableInterface#batch() for batching
>> >> >>> > Increments and I was wondering about the returned value Object[].
>> >> >>> >
>> >> >>> > As I understand (or would expect), the returned value would be:
>> >> >>> >
>> >> >>> > null - all batch of increments failed.
>> >> >>> > An object in the array is null / is Exception - that increment has
>> >> >>> > failed.
>> >> >>> >
>> >> >>> > So I ran some tests and executed a batch of two Increment Objects on
>> >> >>> > two different row keys, where one of them is valid and the other one
>> >> >>> > has a family that does not exist.
>> >> >>> > When calling HTableInterface#batch() I get
>> >> >>> > RetriesExhaustedWithDetailsException but looking at the counter in
>> >> >>> > HBase it looks like the valid increment was executed.
>> >> >>> >
>> >> >>> > Shouldn't I get an Object[2] where one of the objects is null
>> >> >>> > / RetriesExhaustedWithDetailsException ?
>> >> >>> >
>> >> >>> > How can I know # of success/failures in the batch ? What is the
>> >> >>> > "contract" here ?
>> >> >>> >
>> >> >>> > Thanks,
>> >> >>> >
>> >> >>> > Amit.
>> >> >>>
>> >>
>>
