apex-dev mailing list archives

From Bhupesh Chawda <bhup...@datatorrent.com>
Subject Re: Adding features to HBase Input Operators in Malhar-contrib
Date Wed, 23 Dec 2015 14:17:12 GMT

Thanks for the inputs.
As an input operator, I am targeting just the Scan operation. The Get operation
may be better supported by a generic operator (such as a query operator), which
I can take up later.

-Bhupesh

On Tue, Dec 22, 2015 at 3:48 PM, Mohit Jotwani <mohit@datatorrent.com>
wrote:

> +1
>
> Regards,
> Mohit
>
> On Tue, Dec 22, 2015 at 11:21 AM, Chinmay Kolhatkar <chinmay@datatorrent.com>
> wrote:
>
> > +1 for the above.
> > I see that there is an HbaseGetOperator, but it is abstract and I cannot
> > find any concrete implementation of it.
> > Are you going to implement that too?
> >
> > Maybe the concrete implementation of HbaseGetOperator should have this.
> >
> > Also, I want to mention one thing about scans from my previous experience
> > with HBase. The HBase client is synchronous.
> > This means that when you issue a scan call, the function blocks until a
> > certain number of records have been received at the client end.
> > This causes a lot of problems in the calling thread, as it might be
> > blocked for a long period of time.
> > Plus, there is always network-related latency to add to the problem.
> >
> > The usual way to deal with this is to issue scan-like queries on a
> > separate thread and then consume the results in the main thread.
> >
> > Please take care of this scenario while implementing the scan operator.
> >
> > -Chinmay.
> >
> > On Tue, Dec 22, 2015 at 11:08 AM, Sandeep Deshmukh <sandeep@datatorrent.com>
> > wrote:
> >
> > > +1 for this, Bhupesh.
> > >
> > > Additionally, I would suggest adding support for:
> > > 1. Point query
> > > 2. Returning any row version
> > >
> > > The above two are key features of HBase and should be supported.
> > >
> > > Regards,
> > > Sandeep
> > >
> > > On Fri, Dec 18, 2015 at 4:39 PM, Bhupesh Chawda <bhupesh@datatorrent.com>
> > > wrote:
> > >
> > > > Hi All,
> > > >
> > > > The current HBasePOJOInputOperator has the following limitations:
> > > >
> > > >    1. It does not let us specify a set of "column family: column" pairs
> > > >    and fetch data only for those columns.
> > > >    2. The output format is currently a POJO. We need other output
> > > >    formats in which the "columnFamily:column" representation is
> > > >    supported. Map and CSV are some of the options.
> > > >    3. It does not allow specifying an "end row-key" to stop scanning a
> > > >    table.
> > > >    4. It exposes no metrics.
> > > >
> > > > I am planning to add the above functionality to the HBase input
> > > > operators. These features may go into the HBaseScanOperator /
> > > > HBasePOJOInputOperator.
> > > >
> > > > Please let me know your comments.
> > > >
> > > > Thanks.
> > > >
> > > > Bhupesh
> > > >
> > >
> >
>
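
Editorial sketch (not from the thread): the approach Chinmay describes, running
the blocking scan on a separate thread and consuming results on the main thread,
could look roughly like the following. To stay self-contained it uses no HBase
classes: the Iterator stands in for a blocking scanner, and the class name
ScanPrefetchSketch and the methods emitBatch and isFinished are hypothetical
names chosen for illustration. It is a minimal sketch under those assumptions,
not the Malhar implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical sketch: prefetch rows from a blocking source on a reader thread. */
class ScanPrefetchSketch {
    private final BlockingQueue<String> buffer;   // bounded hand-off between the two threads
    private final Thread reader;
    private volatile boolean done = false;        // set once the source is exhausted or interrupted

    /** blockingSource stands in for a scanner whose next() may block (assumption). */
    ScanPrefetchSketch(Iterator<String> blockingSource, int capacity) {
        buffer = new ArrayBlockingQueue<>(capacity);
        reader = new Thread(() -> {
            try {
                while (blockingSource.hasNext()) {
                    // put() waits while the buffer is full, so the reader
                    // cannot run arbitrarily far ahead of the consumer.
                    buffer.put(blockingSource.next());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                done = true;
            }
        }, "scan-reader");
        reader.setDaemon(true);
    }

    void start() {
        reader.start();
    }

    /** Called from the main (operator) thread; returns at most max rows without waiting. */
    List<String> emitBatch(int max) {
        List<String> out = new ArrayList<>();
        buffer.drainTo(out, max);
        return out;
    }

    /** True once the reader has finished and every buffered row has been drained. */
    boolean isFinished() {
        return done && buffer.isEmpty();
    }

    void stop() {
        reader.interrupt();
    }

    public static void main(String[] args) throws InterruptedException {
        Iterator<String> rows = Arrays.asList("row1", "row2", "row3").iterator();
        ScanPrefetchSketch scan = new ScanPrefetchSketch(rows, 2);
        scan.start();
        while (!scan.isFinished()) {
            for (String row : scan.emitBatch(10)) {
                System.out.println(row);
            }
            Thread.sleep(1);
        }
    }
}
```

In an input operator, emitBatch would be called from whatever callback the
platform invokes to emit tuples, so that callback never waits on the network.
The bounded queue is a design choice: it caps memory use if the consumer falls
behind the scan.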
