Date: Mon, 23 Mar 2015 21:04:43 +0100
Subject: Re: HBase TableOutputFormat
From: Fabian Hueske <fhueske@gmail.com>
To: user@flink.apache.org

Creating a JIRA issue never hurts.
Have you tried to add your code snippet to the HadoopOutputFormatBase.configure() method? That seems to me the right place for it.

Do you want to open a PR for that?

2015-03-23 16:01 GMT+01:00 Flavio Pompermaier <pompermaier@okkam.it>:
> Any news about this? Could someone look into the problem, or should I open
> a ticket in JIRA?
>
> On Sun, Mar 22, 2015 at 12:09 PM, Flavio Pompermaier <pompermaier@okkam.it>
> wrote:
>
>> Hi Stephan,
>> the problem occurs when you try to write into HBase with the
>> HadoopOutputFormat.
>> Unfortunately, the RecordWriter of the HBase TableOutputFormat requires a
>> Table object to be instantiated through the setConf() method (otherwise
>> you get a NullPointerException), and setConf() also sets other parameters
>> in the passed configuration. So I think that a Hadoop OutputFormat
>> implementing Configurable should be initialized somewhere with a call to
>> setConf(), as I tried to do.
>> Moreover, there's the problem that the Flink Hadoop OutputFormat requires
>> a property to be set in the configuration (mapred.output.dir) that is not
>> possible to set right now. Or am I doing something wrong? Could you try
>> to write some data to an HBase TableOutputFormat and verify this problem?
>>
>> Thanks again,
>> Flavio
>>
>>
>> On Sat, Mar 21, 2015 at 8:16 PM, Stephan Ewen <sewen@apache.org> wrote:
>>
>>> Hi Flavio!
>>>
>>> The issue that abstract classes and interfaces are not supported is
>>> definitely fixed in 0.9.
>>>
>>> Your other fix (adding the call for configuring the output format) - is
>>> that always needed, or just important in a special case? How has the
>>> output format worked before?
>>>
>>> If this is critical to the functionality, would you open a pull request
>>> with this patch?
>>>
>>> Greetings,
>>> Stephan
>>>
>>>
>>> On Fri, Mar 20, 2015 at 6:28 PM, Flavio Pompermaier <
>>> pompermaier@okkam.it> wrote:
>>>
>>>> 0.8.1
>>>>
>>>> On Fri, Mar 20, 2015 at 6:11 PM, Stephan Ewen <sewen@apache.org> wrote:
>>>>
>>>>> Hi Flavio!
>>>>>
>>>>> Is this on Flink 0.9-SNAPSHOT or 0.8.1?
>>>>>
>>>>> Stephan
>>>>>
>>>>>
>>>>> On Fri, Mar 20, 2015 at 6:03 PM, Flavio Pompermaier <
>>>>> pompermaier@okkam.it> wrote:
>>>>>
>>>>>> To make it work I had to clone the Flink repo, import the flink-java
>>>>>> project, and modify HadoopOutputFormatBase to call, in open() and
>>>>>> finalizeGlobal():
>>>>>>
>>>>>> if (this.mapreduceOutputFormat instanceof Configurable) {
>>>>>>     ((Configurable) this.mapreduceOutputFormat).setConf(this.configuration);
>>>>>> }
>>>>>>
>>>>>> otherwise the "mapred.output.dir" property was always null :(
>>>>>>
>>>>>> On Fri, Mar 20, 2015 at 10:27 AM, Flavio Pompermaier <
>>>>>> pompermaier@okkam.it> wrote:
>>>>>>
>>>>>>> Hi guys,
>>>>>>>
>>>>>>> I was trying to insert into an HBase table with Flink 0.8.1, and it
>>>>>>> seems to be impossible without creating a custom version of the
>>>>>>> HBase TableOutputFormat that specializes Mutation with Put.
>>>>>>> This is my code using the standard Flink APIs:
>>>>>>>
>>>>>>> myds.output(new HadoopOutputFormat<Text, Put>(new
>>>>>>> TableOutputFormat<Text>(), job));
>>>>>>>
>>>>>>> and this is the exception I get:
>>>>>>>
>>>>>>> Exception in thread "main"
>>>>>>> org.apache.flink.api.common.functions.InvalidTypesException: Interfaces and
>>>>>>> abstract classes are not valid types: class
>>>>>>> org.apache.hadoop.hbase.client.Mutation
>>>>>>> at
>>>>>>> org.apache.flink.api.java.typeutils.TypeExtractor.privateGetForClass(TypeExtractor.java:885)
>>>>>>> at
>>>>>>> org.apache.flink.api.java.typeutils.TypeExtractor.privateGetForClass(TypeExtractor.java:877)
>>>>>>> ....
>>>>>>>
>>>>>>> So I had to copy the TableOutputFormat, rename it to
>>>>>>> HBaseTableOutputFormat, and change Mutation to Put as the
>>>>>>> TableOutputFormat type argument.
>>>>>>> However, the table field is not initialized because setConf() is not
>>>>>>> called. Is this a bug in the HadoopOutputFormat wrapper, which does
>>>>>>> not check whether the outputFormat is an instance of Configurable
>>>>>>> and call setConf() (as happens for the InputSplit)?
>>>>>>>
>>>>>>> Best,
>>>>>>> Flavio
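[Editor's note] The pattern discussed in this thread — having the Flink wrapper check whether the wrapped Hadoop output format implements Configurable and, if so, pass the job configuration in via setConf() — can be sketched in isolation. The types below (Configurable, TableOutputFormatStub, a plain Map standing in for the Hadoop Configuration) are simplified stand-ins, not the real Hadoop/Flink classes:

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for org.apache.hadoop.conf.Configurable (simplified: a Map
// replaces the real Hadoop Configuration object).
interface Configurable {
    void setConf(Map<String, String> conf);
}

// Stand-in for an HBase-style output format that only works after setConf()
// has been called, mirroring the behavior Flavio describes.
class TableOutputFormatStub implements Configurable {
    Map<String, String> conf; // stays null unless setConf() is invoked

    public void setConf(Map<String, String> conf) {
        this.conf = conf;
    }
}

public class SetConfSketch {
    // The check from the thread: if the wrapped format is Configurable,
    // hand it the configuration before it is used. Without this call,
    // properties like "mapred.output.dir" remain null inside the format.
    static void configureIfPossible(Object outputFormat, Map<String, String> conf) {
        if (outputFormat instanceof Configurable) {
            ((Configurable) outputFormat).setConf(conf);
        }
    }

    public static void main(String[] args) {
        TableOutputFormatStub format = new TableOutputFormatStub();
        Map<String, String> conf = new HashMap<>();
        conf.put("mapred.output.dir", "/tmp/out");

        configureIfPossible(format, conf);
        System.out.println(format.conf.get("mapred.output.dir"));
    }
}
```

In the real wrapper the same instanceof check would run against the wrapped org.apache.hadoop.mapreduce.OutputFormat, whether placed in configure() (as Fabian suggests) or in open()/finalizeGlobal() (as in Flavio's patch).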