beam-commits mailing list archives

From "Solomon Duskis (JIRA)" <>
Subject [jira] [Commented] (BEAM-2955) Create a Cloud Bigtable HBase connector
Date Wed, 13 Sep 2017 22:59:00 GMT


Solomon Duskis commented on BEAM-2955:

It's awesome that you added dynamic rebalancing!  I'm OK with extending HBaseIO, as long
as there aren't any other overriding concerns.  I'd like to explore using templates
(ValueProviders) to configure HBaseIO.

> Create a Cloud Bigtable HBase connector
> ---------------------------------------
>                 Key: BEAM-2955
>                 URL:
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-java-gcp
>            Reporter: Solomon Duskis
>            Assignee: Solomon Duskis
> The Cloud Bigtable (CBT) team has maintained a Dataflow connector in a different
repo for a while. Recently, we reworked the Cloud Bigtable client so that it can
better coexist in the Beam ecosystem, and we also released a Beam connector in our repository
that exposes HBase idioms rather than the Protobuf idioms of BigtableIO.  More information
about the customer experience of the HBase connector can be found here: [].
> The Beam repo is a much better place to house a Cloud Bigtable HBase connector.  There
are a couple of ways we can implement this new connector:
> # The CBT connector depends on artifacts in the io/hbase Maven project.  We can
extend HBaseIO for the purposes of CBT.  We would have to add some features to HBaseIO
to make that work (dynamic rebalancing, and a way for HBase and CBT's size estimation models
to coexist).
> # The BigtableIO connector works well, and we can add an adapter layer on top of it.
 I have a proof of concept of it here: [].
> # We can build a separate CBT HBase connector.
> I'm happy to do the work.  I would appreciate some guidance and discussion about the
right approach.

This message was sent by Atlassian JIRA