flink-dev mailing list archives

From Fabian Hueske <fhue...@apache.org>
Subject Re: HBase 0.98 addon for Flink 0.8
Date Fri, 14 Nov 2014 20:24:34 GMT
What exactly is required to configure the TableInputFormat?
Would it be easier and more flexible to just set the hostname of the HBase
master, the table name, etc., directly as strings in the InputFormat?
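
For illustration, a minimal sketch of that idea. The class is hypothetical
(not the actual Flink API), but "hbase.zookeeper.quorum" is a standard HBase
client key, and HTable/HBaseConfiguration are the HBase 0.98 client classes:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;

    // Hypothetical InputFormat that receives its connection settings as
    // plain strings instead of reading an hbase-site.xml from the classpath.
    public class ParameterizedTableInputFormat {

        private final String zkQuorum;   // e.g. "zk-host1,zk-host2"
        private final String tableName;

        public ParameterizedTableInputFormat(String zkQuorum, String tableName) {
            this.zkQuorum = zkQuorum;
            this.tableName = tableName;
        }

        // Runs on the worker: builds the HBase config from the strings, so
        // no config file has to be visible to any particular classloader.
        private HTable connect() throws IOException {
            Configuration conf = HBaseConfiguration.create();
            conf.set("hbase.zookeeper.quorum", zkQuorum);
            return new HTable(conf, tableName);
        }
    }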

2014-11-14 15:34 GMT+01:00 Flavio Pompermaier <pompermaier@okkam.it>:

> Both from the shell with the run command and from the web client.
> On Nov 14, 2014 2:32 PM, "Fabian Hueske" <fhueske@apache.org> wrote:
> >
> > In this case, the initialization happens when the InputFormat is
> > instantiated at the submission client, and the table info is serialized as
> > part of the InputFormat and shipped out to all TaskManagers for execution.
> > However, if the initialization is done within configure(), it happens on
> > each TaskManager when initializing the InputFormat.
> > These are two separate JVMs in a distributed setting, with different
> > classpaths.
> >
> > How do you submit your job for execution?
> >
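A compact sketch of the two initialization points described above (the class
and field names are made up for illustration):

    import java.io.IOException;
    import java.io.Serializable;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;

    public class MyHBaseInputFormat implements Serializable {

        // (a) Initialized when the format is instantiated at the submission
        //     client: the value is serialized with the format and shipped to
        //     every TaskManager, so the lookup ran in the client JVM.
        private final String tableName = "mytable";

        // (b) transient: skipped during serialization and re-created by
        //     configure(), which runs in each TaskManager JVM with its own
        //     classpath and classloaders.
        private transient HTable table;

        public void configure() throws IOException {
            table = new HTable(HBaseConfiguration.create(), tableName);
        }
    }
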
> > 2014-11-14 13:58 GMT+01:00 Flavio Pompermaier <pompermaier@okkam.it>:
> >
> > > The strange thing is that everything works if I create HTable outside
> > > configure()..
> > > On Nov 14, 2014 10:32 AM, "Stephan Ewen" <sewen@apache.org> wrote:
> > >
> > > > I think that this is a case where the wrong classloader is used:
> > > >
> > > > If the HBase classes are part of the Flink lib directory, they are
> > > > loaded with the system classloader. When they look for anything in the
> > > > classpath, they will do so with the system classloader.
> > > >
> > > > Your configuration is in the user-code jar that you submit, so it is
> > > > only available through the user-code classloader.
> > > >
> > > > Is there any way you can load the configuration yourself and give that
> > > > configuration to HBase?
> > > >
> > > > Stephan
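
For illustration, a sketch along those lines: resolve hbase-site.xml through
the classloader that loaded the user code and hand the result to HBase. The
openTable helper is made up; Configuration.addResource(URL) is the real Hadoop
API:

    import java.io.IOException;
    import java.net.URL;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;

    // Inside a class that itself lives in the user-code jar:
    public HTable openTable(String tableName) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        // Resolve the file through the classloader that loaded this class
        // (the user-code classloader), not the system classloader.
        URL hbaseSite = getClass().getClassLoader().getResource("hbase-site.xml");
        if (hbaseSite == null) {
            throw new IOException("hbase-site.xml not found in the user-code jar");
        }
        conf.addResource(hbaseSite);  // hand the file to HBase explicitly
        return new HTable(conf, tableName);
    }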
> > > > On 13.11.2014 22:06, "Flavio Pompermaier" <pompermaier@okkam.it> wrote:
> > > >
> > > > > The only config files available are within the submitted jar. Things
> > > > > work in Eclipse using the local environment but fail when deploying
> > > > > to the cluster.
> > > > > On Nov 13, 2014 10:01 PM, <fhueske@gmail.com> wrote:
> > > > >
> > > > > > Does the HBase jar in the lib folder contain a config that could
> > > > > > be used instead of the config in the job jar file? Or is simply no
> > > > > > config at all available when the configure method is called?
> > > > > >
> > > > > > --
> > > > > > Fabian Hueske
> > > > > > Phone:      +49 170 5549438
> > > > > > Email:      fhueske@gmail.com
> > > > > > Web:        http://www.user.tu-berlin.de/fabian.hueske
> > > > > >
> > > > > > From: Flavio Pompermaier
> > > > > > Sent: Thursday, 13 November 2014, 21:43
> > > > > > To: dev@flink.incubator.apache.org
> > > > > >
> > > > > > The hbase jar is in the lib directory on each node, while the
> > > > > > config files are within the jar file I submit from the web client.
> > > > > > On Nov 13, 2014 9:37 PM, <fhueske@gmail.com> wrote:
> > > > > >
> > > > > > > Have you added the hbase.jar file with your HBase config to the
> > > > > > > ./lib folders of your Flink setup (JobManager, TaskManager), or
> > > > > > > is it bundled with your job.jar file?
> > > > > > >
> > > > > > > --
> > > > > > > Fabian Hueske
> > > > > > > Phone:      +49 170 5549438
> > > > > > > Email:      fhueske@gmail.com
> > > > > > > Web:        http://www.user.tu-berlin.de/fabian.hueske
> > > > > > >
> > > > > > > From: Flavio Pompermaier
> > > > > > > Sent: Thursday, 13 November 2014, 18:36
> > > > > > > To: dev@flink.incubator.apache.org
> > > > > > >
> > > > > > > Any help with this? :(
> > > > > > >
> > > > > > > On Thu, Nov 13, 2014 at 2:06 PM, Flavio Pompermaier <pompermaier@okkam.it> wrote:
> > > > > > >
> > > > > > > > We definitely discovered that instantiating HTable and Scan in
> > > > > > > > the configure() method of TableInputFormat causes problems in a
> > > > > > > > distributed environment!
> > > > > > > > If you look at my implementation at
> > > > > > > > https://github.com/fpompermaier/incubator-flink/blob/master/flink-addons/flink-hbase/src/main/java/org/apache/flink/addons/hbase/TableInputFormat.java
> > > > > > > > you can see that Scan and HTable were made transient and
> > > > > > > > recreated within configure(), but this causes
> > > > > > > > HBaseConfiguration.create() to fail searching for classpath
> > > > > > > > files... could you help us understand why?
> > > > > > > >
> > > > > > > > On Wed, Nov 12, 2014 at 8:10 PM, Flavio Pompermaier <pompermaier@okkam.it> wrote:
> > > > > > > >
> > > > > > > >> Usually, when I run a MapReduce job on either Spark or Hadoop,
> > > > > > > >> I just put the *-site.xml files into the war I submit to the
> > > > > > > >> cluster and that's it. I think the problem appeared when I
> > > > > > > >> made the HTable a private transient field and the table
> > > > > > > >> instantiation was moved into the configure method.
> > > > > > > >> Could that be a valid reason? We still have to do a deeper
> > > > > > > >> debug, but I'm trying to figure out where to investigate..
> > > > > > > >> On Nov 12, 2014 8:03 PM, "Robert Metzger" <rmetzger@apache.org> wrote:
> > > > > > > >>
> > > > > > > >>> Hi,
> > > > > > > >>> Maybe it's an issue with the classpath? As far as I know,
> > > > > > > >>> Hadoop reads the configuration files from the classpath.
> > > > > > > >>> Maybe the hbase-site.xml file is not accessible through the
> > > > > > > >>> classpath when running on the cluster?
> > > > > > > >>>
> > > > > > > >>> On Wed, Nov 12, 2014 at 7:40 PM, Flavio Pompermaier <pompermaier@okkam.it> wrote:
> > > > > > > >>>
> > > > > > > >>> > Today we tried to execute a job on the cluster instead of
> > > > > > > >>> > on the local executor, and we found that the hbase-site.xml
> > > > > > > >>> > was basically ignored. Is there a reason why the
> > > > > > > >>> > TableInputFormat works correctly in the local environment
> > > > > > > >>> > but not on a cluster?
> > > > > > > >>> > On Nov 10, 2014 10:56 AM, "Fabian Hueske" <fhueske@apache.org> wrote:
> > > > > > > >>> >
> > > > > > > >>> > > I don't think we need to bundle the HBase input and
> > > > > > > >>> > > output format in a single PR.
> > > > > > > >>> > > So, I think we can proceed with the IF only and target
> > > > > > > >>> > > the OF later.
> > > > > > > >>> > > However, the fix for Kryo should be in the master before
> > > > > > > >>> > > merging the PR. Till is currently working on that and
> > > > > > > >>> > > said he expects this to be done by the end of the week.
> > > > > > > >>> > >
> > > > > > > >>> > > Cheers, Fabian
> > > > > > > >>> > >
> > > > > > > >>> > > 2014-11-07 12:49 GMT+01:00 Flavio Pompermaier <pompermaier@okkam.it>:
> > > > > > > >>> > >
> > > > > > > >>> > > > I also fixed the profile for Cloudera CDH5.1.3. You
> > > > > > > >>> > > > can build it with the command:
> > > > > > > >>> > > >
> > > > > > > >>> > > >   mvn clean install -Dmaven.test.skip=true -Dhadoop.profile=2 -Pvendor-repos,cdh5.1.3
> > > > > > > >>> > > >
> > > > > > > >>> > > > However, it would be good to generate the specific jar
> > > > > > > >>> > > > when releasing.. (e.g.
> > > > > > > >>> > > > flink-addons:flink-hbase:0.8.0-hadoop2-cdh5.1.3-incubating)
> > > > > > > >>> > > >
> > > > > > > >>> > > > Best,
> > > > > > > >>> > > > Flavio
> > > > > > > >>> > > >
> > > > > > > >>> > > > On Fri, Nov 7, 2014 at 12:44 PM, Flavio Pompermaier <pompermaier@okkam.it> wrote:
> > > > > > > >>> > > >
> > > > > > > >>> > > > > I've just updated the code on my fork (synced with
> > > > > > > >>> > > > > the current master and applied improvements coming
> > > > > > > >>> > > > > from the comments on the related PR).
> > > > > > > >>> > > > > I still have to understand how to write results back
> > > > > > > >>> > > > > to an HBase Sink/OutputFormat...
> > > > > > > >>> > > > >
> > > > > > > >>> > > > > On Mon, Nov 3, 2014 at 12:05 PM, Flavio Pompermaier <pompermaier@okkam.it> wrote:
> > > > > > > >>> > > > >
> > > > > > > >>> > > > >> Thanks for the detailed answer. So if I run a job
> > > > > > > >>> > > > >> from my machine, I'll have to download all the
> > > > > > > >>> > > > >> scanned data in a table.. right?
> > > > > > > >>> > > > >>
> > > > > > > >>> > > > >> Still regarding the GenericTableOutputFormat, it is
> > > > > > > >>> > > > >> not clear to me how to proceed..
> > > > > > > >>> > > > >> I saw in the hadoop-compatibility addon that it is
> > > > > > > >>> > > > >> possible to get such compatibility using the
> > > > > > > >>> > > > >> HadoopUtils class, so the open method should become
> > > > > > > >>> > > > >> something like:
> > > > > > > >>> > > > >>
> > > > > > > >>> > > > >> @Override
> > > > > > > >>> > > > >> public void open(int taskNumber, int numTasks) throws IOException {
> > > > > > > >>> > > > >>     if (Integer.toString(taskNumber + 1).length() > 6) {
> > > > > > > >>> > > > >>         throw new IOException("Task id too large.");
> > > > > > > >>> > > > >>     }
> > > > > > > >>> > > > >>     TaskAttemptID taskAttemptID = TaskAttemptID.forName("attempt__0000_r_"
> > > > > > > >>> > > > >>             + String.format("%" + (6 - Integer.toString(taskNumber + 1).length()) + "s", " ").replace(" ", "0")
> > > > > > > >>> > > > >>             + Integer.toString(taskNumber + 1)
> > > > > > > >>> > > > >>             + "_0");
> > > > > > > >>> > > > >>     this.configuration.set("mapred.task.id", taskAttemptID.toString());
> > > > > > > >>> > > > >>     this.configuration.setInt("mapred.task.partition", taskNumber + 1);
> > > > > > > >>> > > > >>     // for hadoop 2.2
> > > > > > > >>> > > > >>     this.configuration.set("mapreduce.task.attempt.id", taskAttemptID.toString());
> > > > > > > >>> > > > >>     this.configuration.setInt("mapreduce.task.partition", taskNumber + 1);
> > > > > > > >>> > > > >>     try {
> > > > > > > >>> > > > >>         this.context = HadoopUtils.instantiateTaskAttemptContext(this.configuration, taskAttemptID);
> > > > > > > >>> > > > >>     } catch (Exception e) {
> > > > > > > >>> > > > >>         throw new RuntimeException(e);
> > > > > > > >>> > > > >>     }
> > > > > > > >>> > > > >>     final HFileOutputFormat2 outFormat = new HFileOutputFormat2();
> > > > > > > >>> > > > >>     try {
> > > > > > > >>> > > > >>         this.writer = outFormat.getRecordWriter(this.context);
> > > > > > > >>> > > > >>     } catch (InterruptedException iex) {
> > > > > > > >>> > > > >>         throw new IOException("Opening the writer was interrupted.", iex);
> > > > > > > >>> > > > >>     }
> > > > > > > >>> > > > >> }
> > > > > > > >>> > > > >>
> > > > > > > >>> > > > >> But I'm not sure how to pass the JobConf to the
> > > > > > > >>> > > > >> class, whether to merge config files, where
> > > > > > > >>> > > > >> HFileOutputFormat2 writes the data, and how to
> > > > > > > >>> > > > >> implement the public void writeRecord(Record record)
> > > > > > > >>> > > > >> API.
> > > > > > > >>> > > > >> Could I have a little chat off the mailing list with
> > > > > > > >>> > > > >> the implementor of this extension?
> > > > > > > >>> > > > >>
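One possible shape for writeRecord, as a sketch: it assumes the writer opened
above accepts (ImmutableBytesWritable, KeyValue) pairs, as HFileOutputFormat2
does in HBase 0.98 (KeyValue implements Cell); the field indices and column
names below are made up:

    import java.io.IOException;
    import org.apache.flink.types.Record;
    import org.apache.flink.types.StringValue;
    import org.apache.hadoop.hbase.KeyValue;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.util.Bytes;

    public void writeRecord(Record record) throws IOException {
        // Assume field 0 holds the row key and field 1 the cell value.
        byte[] row = Bytes.toBytes(record.getField(0, StringValue.class).getValue());
        byte[] val = Bytes.toBytes(record.getField(1, StringValue.class).getValue());
        KeyValue kv = new KeyValue(row, Bytes.toBytes("cf"), Bytes.toBytes("col"), val);
        try {
            this.writer.write(new ImmutableBytesWritable(row), kv);
        } catch (InterruptedException e) {
            throw new IOException("Writing the record was interrupted.", e);
        }
    }
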
> > > > > > > >>> > > > >> On Mon, Nov 3, 2014 at 11:51 AM, Fabian Hueske <fhueske@apache.org> wrote:
> > > > > > > >>> > > > >>
> > > > > > > >>> > > > >>> Hi Flavio,
> > > > > > > >>> > > > >>>
> > > > > > > >>> > > > >>> let me try to answer your last question on the
> > > > > > > >>> > > > >>> user's list (to the best of my HBase knowledge):
> > > > > > > >>> > > > >>> "I just wanted to know if and how region splitting
> > > > > > > >>> > > > >>> is handled. Can you explain to me in detail how
> > > > > > > >>> > > > >>> Flink and HBase work together? What is not fully
> > > > > > > >>> > > > >>> clear to me is when computation is done by the
> > > > > > > >>> > > > >>> region servers and when data starts to flow to a
> > > > > > > >>> > > > >>> Flink worker (which in my test job is only my PC),
> > > > > > > >>> > > > >>> and how to better understand the important logged
> > > > > > > >>> > > > >>> info to see if my job is performing well."
> > > > > > > >>> > > > >>>
> > > > > > > >>> > > > >>> HBase partitions its tables into so-called
> > > > > > > >>> > > > >>> "regions" of keys and stores the regions
> > > > > > > >>> > > > >>> distributed in the cluster using HDFS. I think an
> > > > > > > >>> > > > >>> HBase region can be thought of as an HDFS block. To
> > > > > > > >>> > > > >>> make reading an HBase table efficient, region reads
> > > > > > > >>> > > > >>> should be done locally, i.e., an InputFormat should
> > > > > > > >>> > > > >>> primarily read regions that are stored on the same
> > > > > > > >>> > > > >>> machine the IF is running on.
> > > > > > > >>> > > > >>> Flink's InputSplits partition the HBase input by
> > > > > > > >>> > > > >>> regions and add information about the storage
> > > > > > > >>> > > > >>> location of each region. During execution, input
> > > > > > > >>> > > > >>> splits are assigned to InputFormats that can do
> > > > > > > >>> > > > >>> local reads.
> > > > > > > >>> > > > >>>
> > > > > > > >>> > > > >>> Best, Fabian
> > > > > > > >>> > > > >>>
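A simplified sketch of that split creation: getStartEndKeys() and
getRegionLocation() are real HTable methods in HBase 0.98, while the
TableInputSplit constructor shown here is only illustrative:

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.hadoop.hbase.util.Pair;

    public TableInputSplit[] createInputSplits(int minNumSplits) throws IOException {
        // One split per region, each carrying the hostname of the region
        // server, so the scheduler can hand it to a local InputFormat.
        Pair<byte[][], byte[][]> keys = table.getStartEndKeys();
        List<TableInputSplit> splits = new ArrayList<TableInputSplit>();
        for (int i = 0; i < keys.getFirst().length; i++) {
            byte[] startKey = keys.getFirst()[i];
            byte[] endKey = keys.getSecond()[i];
            String host = table.getRegionLocation(startKey).getHostname();
            splits.add(new TableInputSplit(i, new String[] { host }, startKey, endKey));
        }
        return splits.toArray(new TableInputSplit[splits.size()]);
    }
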
> > > > > > > >>> > > > >>> 2014-11-03 11:13 GMT+01:00 Stephan Ewen <sewen@apache.org>:
> > > > > > > >>> > > > >>>
> > > > > > > >>> > > > >>> > Hi!
> > > > > > > >>> > > > >>> >
> > > > > > > >>> > > > >>> > The way of passing parameters through the
> > > > > > > >>> > > > >>> > configuration is very old (the original HBase
> > > > > > > >>> > > > >>> > format dated back to that time). I would simply
> > > > > > > >>> > > > >>> > make the HBase format take those parameters
> > > > > > > >>> > > > >>> > through the constructor.
> > > > > > > >>> > > > >>> >
> > > > > > > >>> > > > >>> > Greetings,
> > > > > > > >>> > > > >>> > Stephan
> > > > > > > >>> > > > >>> >
> > > > > > > >>> > > > >>> > On Mon, Nov 3, 2014 at 10:59 AM, Flavio Pompermaier <pompermaier@okkam.it> wrote:
> > > > > > > >>> > > > >>> >
> > > > > > > >>> > > > >>> > > The problem is that I also removed the
> > > > > > > >>> > > > >>> > > GenericTableOutputFormat, because there is an
> > > > > > > >>> > > > >>> > > incompatibility between hadoop1 and hadoop2 for
> > > > > > > >>> > > > >>> > > the classes TaskAttemptContext and
> > > > > > > >>> > > > >>> > > TaskAttemptContextImpl..
> > > > > > > >>> > > > >>> > > Then it would be nice if the user didn't have
> > > > > > > >>> > > > >>> > > to worry about passing the pact.hbase.jtkey and
> > > > > > > >>> > > > >>> > > pact.job.id parameters..
> > > > > > > >>> > > > >>> > > I think it is probably a good idea to remove
> > > > > > > >>> > > > >>> > > hadoop1 compatibility, keep the HBase addon
> > > > > > > >>> > > > >>> > > enabled only for hadoop2 (as before), and
> > > > > > > >>> > > > >>> > > decide how to manage those 2 parameters..
> > > > > > > >>> > > > >>> > >
> > > > > > > >>> > > > >>> > > On Mon, Nov 3, 2014 at 10:19 AM, Stephan Ewen <sewen@apache.org> wrote:
> > > > > > > >>> > > > >>> > >
> > > > > > > >>> > > > >>> > > > It is fine to remove it, in my opinion.
> > > > > > > >>> > > > >>> > > >
> > > > > > > >>> > > > >>> > > > On Mon, Nov 3, 2014 at 10:11 AM, Flavio Pompermaier <pompermaier@okkam.it> wrote:
> > > > > > > >>> > > > >>> > > >
> > > > > > > >>> > > > >>> > > > > That is one class I removed because it was
> > > > > > > >>> > > > >>> > > > > using the deprecated GenericDataSink API..
> > > > > > > >>> > > > >>> > > > > I can restore it, but then it would be a
> > > > > > > >>> > > > >>> > > > > good idea to remove those warnings (also
> > > > > > > >>> > > > >>> > > > > because, from what I understood, the Record
> > > > > > > >>> > > > >>> > > > > APIs are going to be removed).
> > > > > > > >>> > > > >>> > > > >
> > > > > > > >>> > > > >>> > > > > On Mon, Nov 3, 2014 at 9:51 AM, Fabian Hueske <fhueske@apache.org> wrote:
> > > > > > > >>> > > > >>> > > > >
> > > > > > > >>> > > > >>> > > > > > I'm not familiar with the HBase
> > > > > > > >>> > > > >>> > > > > > connector code, but are you maybe looking
> > > > > > > >>> > > > >>> > > > > > for the GenericTableOutputFormat?
> > > > > > > >>> > > > >>> > > > > >
> > > > > > > >>> > > > >>> > > > > > 2014-11-03 9:44 GMT+01:00 Flavio Pompermaier <pompermaier@okkam.it>:
> > > > > > > >>> > > > >>> > > > > >
> > > > > > > >>> > > > >>> > > > > > > I was trying to modify the example,
> > > > > > > >>> > > > >>> > > > > > > setting hbaseDs.output(new
> > > > > > > >>> > > > >>> > > > > > > HBaseOutputFormat()); but I can't see
> > > > > > > >>> > > > >>> > > > > > > any HBaseOutputFormat class.. maybe we
> > > > > > > >>> > > > >>> > > > > > > should use another class?
> > > > > > > >>> > > > >>> > > > > > >
> > > > > > > >>> > > > >>> > > > > > > On Mon, Nov 3, 2014 at 9:39 AM, Flavio Pompermaier <pompermaier@okkam.it> wrote:
> > > > > > > >>> > > > >>> > > > > > >
> > > > > > > >>> > > > >>> > > > > > > > Maybe that's something I could add
> > > > > > > >>> > > > >>> > > > > > > > to the HBase example and that could
> > > > > > > >>> > > > >>> > > > > > > > be better documented in the Wiki.
> > > > > > > >>> > > > >>> > > > > > > >
> > > > > > > >>> > > > >>> > > > > > > > Since we're talking about the wiki..
> > > > > > > >>> > > > >>> > > > > > > > I was looking at the Java API
> > > > > > > >>> > > > >>> > > > > > > > (http://flink.incubator.apache.org/docs/0.6-incubating/java_api_guide.html)
> > > > > > > >>> > > > >>> > > > > > > > and the link to the KMeans example is
> > > > > > > >>> > > > >>> > > > > > > > not working (where it says "For a
> > > > > > > >>> > > > >>> > > > > > > > complete example program, have a look
> > > > > > > >>> > > > >>> > > > > > > > at KMeans Algorithm").
> > > > > > > >>> > > > >>> > > > > > > >
> > > > > > > >>> > > > >>> > > > > > > > Best,
> > > > > > > >>> > > > >>> > > > > > > > Flavio
> > > > > > > >>> > > > >>> > > > > > > >
> > > > > > > >>> > > > >>> > > > > > > > On Mon, Nov 3, 2014 at 9:12 AM, Flavio Pompermaier <pompermaier@okkam.it> wrote:
> > > > > > > >>> > > > >>> > > > > > > >
> > > > > > > >>> > > > >>> > > > > > > >> Ah ok, perfect! That was the reason
> > > > > > > >>> > > > >>> > > > > > > >> why I removed it :)
> > > > > > > >>> > > > >>> > > > > > > >>
> > > > > > > >>> > > > >>> > > > > > > >> On Mon, Nov 3, 2014 at 9:10 AM, Stephan Ewen <sewen@apache.org> wrote:
> > > > > > > >>> > > > >>> > > > > > > >>
> > > > > > > >>> > > > >>> > > > > > > >>> You do not really need an HBase
> > > > > > > >>> > > > >>> > > > > > > >>> data sink. You can call
> > > > > > > >>> > > > >>> > > > > > > >>> "DataSet.output(new HBaseOutputFormat())"
> > > > > > > >>> > > > >>> > > > > > > >>>
> > > > > > > >>> > > > >>> > > > > > > >>> Stephan
> > > > > > > >>> > > > >>> > > > > > > >>> On 02.11.2014 23:05, "Flavio Pompermaier" <pompermaier@okkam.it> wrote:
> > > > > > > >>> > > > >>> > > > > > > >:
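For context, a minimal sketch of that call in a job. HBaseOutputFormat is the
class under discussion (it did not exist yet), so this would only compile once
it is written; any OutputFormat for the same tuple type would fit in its place:

    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.tuple.Tuple2;

    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    DataSet<Tuple2<String, String>> rows =
            env.fromElements(new Tuple2<String, String>("rowkey-1", "value-1"));
    rows.output(new HBaseOutputFormat());  // hand each tuple to the format
    env.execute();
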
> > > > > > > >>> > > > >>> > > > > > > >>>
> > > > > > > >>> > > > >>> > > > > > > >>> > Just one last thing.. I removed
> > > > > > > >>> > > > >>> > > > > > > >>> > the HbaseDataSink because I think
> > > > > > > >>> > > > >>> > > > > > > >>> > it was using the old APIs.. can
> > > > > > > >>> > > > >>> > > > > > > >>> > someone help me update that
> > > > > > > >>> > > > >>> > > > > > > >>> > class?
> > > > > > > >>> > > > >>> > > > > > > >>> >
> > > > > > > >>> > > > >>> > > > > > > >>> > On Sun, Nov 2, 2014 at 10:55 AM, Flavio Pompermaier <pompermaier@okkam.it> wrote:
> > > > > > > >>> > > > >>> > > > > > > >>> >
> > > > > > > >>> > > > >>> > > > > > > >>> > > Indeed this time the build has
> > > > > > > >>> > > > >>> > > > > > > >>> > > been successful :)
> > > > > > > >>> > > > >>> > > > > > > >>> > >
> > > > > > > >>> > > > >>> > > > > > > >>> > > On Sun, Nov 2, 2014 at 10:29 AM, Fabian Hueske <fhueske@apache.org> wrote:
> > > > > > > >>> > > > >>> > > > > > > >>> > >
> > > > > > > >>> > > > >>> > > > > > > >>> > >> You can also set up Travis to
> > > > > > > >>> > > > >>> > > > > > > >>> > >> build your own GitHub
> > > > > > > >>> > > > >>> > > > > > > >>> > >> repositories by linking it to
> > > > > > > >>> > > > >>> > > > > > > >>> > >> your GitHub account. That way
> > > > > > > >>> > > > >>> > > > > > > >>> > >> Travis can build all your
> > > > > > > >>> > > > >>> > > > > > > >>> > >> branches (and you can also
> > > > > > > >>> > > > >>> > > > > > > >>> > >> trigger rebuilds if something
> > > > > > > >>> > > > >>> > > > > > > >>> > >> fails).
> > > > > > > >>> > > > >>> > > > > > > >>> > >> Not sure if we can manually
> > > > > > > >>> > > > >>> > > > > > > >>> > >> retrigger builds on the Apache
> > > > > > > >>> > > > >>> > > > > > > >>> > >> repository.
> > > > > > > >>> > > > >>> > > > > > > >>> > >>
> > > > > > > >>> > > > >>> > > > > > > >>> > >> Support for Hadoop 1 and 2 is
> > > > > > > >>> > > > >>> > > > > > > >>> > >> indeed a very good addition :-)
> > > > > > > >>> > > > >>> > > > > > > >>> > >>
> > > > > > > >>> > > > >>> > > > > > > >>> > >> For the discussion about the
> > > > > > > >>> > > > >>> > > > > > > >>> > >> PR itself, I would need a bit
> > > > > > > >>> > > > >>> > > > > > > >>> > >> more time to become more
> > > > > > > >>> > > > >>> > > > > > > >>> > >> familiar with HBase. I also do
> > > > > > > >>> > > > >>> > > > > > > >>> > >> not have an HBase setup
> > > > > > > >>> > > > >>> > > > > > > >>> > >> available here.
> > > > > > > >>> > > > >>> > > > > > > >>> > >> Maybe somebody else in the
> > > > > > > >>> > > > >>> > > > > > > >>> > >> community who was involved
> > > > > > > >>> > > > >>> > > > > > > >>> > >> with a previous version of the
> > > > > > > >>> > > > >>> > > > > > > >>> > >> HBase connector could comment
> > > > > > > >>> > > > >>> > > > > > > >>> > >> on your question.
> > > > > > > >>> > > > >>> > > > > > > >>> > >>
> > > > > > > >>> > > > >>> > > > > > > >>> > >> Best, Fabian
> > > > > > > >>> > > > >>> > > > > > > >>> > >>
> > > > > > > >>> > > > >>> > > > > > > >>> > >> 2014-11-02 9:57 GMT+01:00 Flavio Pompermaier <pompermaier@okkam.it>:
> > > > > > > >>> > > > >>> > > > > > > >>> > >>
> > > > > > > >>> > > > >>> > > > > > > >>> > >> > As suggested by Fabian, I
> > > > > > > >>> > > > >>> > > > > > > >>> > >> > moved the discussion to this
> > > > > > > >>> > > > >>> > > > > > > >>> > >> > mailing list.
> > > > > > > >>> > > > >>> > > > > > > >>> > >> >
> > > > > > > >>> > > > >>> > > > > > > >>> > >> > I think what is still to be
> > > > > > > >>> > > > >>> > > > > > > >>> > >> > discussed is how to
> > > > > > > >>> > > > >>> > > > > > > >>> > >> > retrigger the build on
> > > > > > > >>> > > > >>> > > > > > > >>> > >> > Travis (I don't have an
> > > > > > > >>> > > > >>> > > > > > > >>> > >> > account) and whether the PR
> > > > > > > >>> > > > >>> > > > > > > >>> > >> > can be integrated.
> > > > > > > >>> > > > >>> > > > > > > >>> > >> >
> > > > > > > >>> > > > >>> > > > > > > >>> > >> > Maybe what I can do is move
> > > > > > > >>> > > > >>> > > > > > > >>> > >> > the HBase example into the
> > > > > > > >>> > > > >>> > > > > > > >>> > >> > test package (right now I
> > > > > > > >>> > > > >>> > > > > > > >>> > >> > left it in the main folder)
> > > > > > > >>> > > > >>> > > > > > > >>> > >> > so it will force Travis to
> > > > > > > >>> > > > >>> > > > > > > >>> > >> > rebuild. I'll do it within a
> > > > > > > >>> > > > >>> > > > > > > >>> > >> > couple of hours.
> > > > > > > >>> > > > >>> > > > > > > >>> > >> >
> > > > > > > >>> > > > >>> > > > > > > >>> > >> > Another thing I forgot to
> > > > > > > >>> > > > >>> > > > > > > >>> > >> > say is that the hbase
> > > > > > > >>> > > > >>> > > > > > > >>> > >> > extension is now compatible
> > > > > > > >>> > > > >>> > > > > > > >>> > >> > with both hadoop 1 and 2.
> > > > > > > >>> > > > >>> > > > > > > >>> > >> >
> > > > > > > >>> > > > >>> > > > > > > >>> > >> > Best,
> > > > > > > >>> > > > >>> > > > > > > >>> > >> > Flavio
