hbase-user mailing list archives

From Jignesh Patel <jigneshmpa...@gmail.com>
Subject Re: Hbase delta load
Date Thu, 21 Mar 2013 20:24:18 GMT
We are trying to keep two different databases in sync. In real time we
insert data into both databases (which use totally different formats).
At night we run a batch job that cross-checks them: if db2 (which is
actually HBase) is missing a row or two, we insert it.
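
A minimal sketch of the nightly cross-check described above, using plain Python dicts to stand in for the two stores. This is not the HBase client API; `find_missing_rows`, `backfill`, the `convert` hook, and the sample row keys are hypothetical names chosen for illustration.

```python
def find_missing_rows(source_rows, hbase_rows):
    """Return row keys present in the source store but absent from the HBase copy."""
    return sorted(set(source_rows) - set(hbase_rows))


def backfill(source_rows, hbase_rows, convert=lambda row: row):
    """Insert every missing row into the HBase stand-in and return the keys added.

    `convert` is a placeholder for translating db1's format into db2's format.
    """
    missing = find_missing_rows(source_rows, hbase_rows)
    for key in missing:
        hbase_rows[key] = convert(source_rows[key])
    return missing


# Hypothetical sample data: db1 holds three users, db2 (HBase) missed one write.
db1 = {"user1": {"name": "A"}, "user2": {"name": "B"}, "user3": {"name": "C"}}
db2 = {"user1": {"name": "A"}, "user3": {"name": "C"}}
added = backfill(db1, db2)
```

Running the job a second time adds nothing, so the batch is safe to repeat. With real tables the key comparison would likely be done by scanning both stores rather than loading everything into memory.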

Data Matching:
We need to do user verification, i.e. when a new user is inserted we
check his demographics and, based on that, conclude whether the user
already exists or not.
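
A rough sketch of such a check, assuming exact matching on normalized name, date of birth, and ZIP code. These fields and the function names `match_key` and `is_existing_user` are illustrative assumptions, not something HBase provides.

```python
def match_key(user):
    """Build a normalized key from demographic fields (field choice is an assumption)."""
    # Lowercase the name and collapse runs of whitespace so trivial variants match.
    name = " ".join(user["name"].lower().split())
    return (name, user["dob"], user["zip"].strip())


def is_existing_user(new_user, known_users):
    """Return True when some known user has the same normalized demographics."""
    known_keys = {match_key(u) for u in known_users}
    return match_key(new_user) in known_keys


# Hypothetical sample records.
known = [{"name": "Jane  Doe", "dob": "1980-01-02", "zip": "12345"}]
duplicate = {"name": "jane doe", "dob": "1980-01-02", "zip": "12345 "}
newcomer = {"name": "John Roe", "dob": "1975-05-06", "zip": "54321"}
```

Here `is_existing_user(duplicate, known)` is true and `is_existing_user(newcomer, known)` is false. Real MDM matching is usually fuzzier than this (typos, nicknames, changed addresses), so exact key equality is only a starting point.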


On Thu, Mar 21, 2013 at 12:20 PM, Andrew Purtell <apurtell@apache.org> wrote:

> I think you may need to provide just a bit more information about your
> use case. Could you define a bit more 'delta' and 'data matching'?
> In a sense, every bulk load is a delta: updates for insert into a
> larger table, representing a set of changes as a batch.
> We could consider the existing HBase mechanisms for handling
> multiversioning to be a simple "data matching functionality" via
> simple existence testing by coordinate, although I know that is not
> what you mean (but I don't know what you mean precisely).
> * - coordinate: { row, column, qualifier, timestamp }
> On 3/21/13, Jignesh Patel <jigneshmpatel@gmail.com> wrote:
> > We have a requirement to support data matching while loading deltas to
> > HBase.
> > I see there is a utility to support bulk loading.
> > http://hbase.apache.org/book/arch.bulk.load.html
> >
> > But is there any way to support daily delta loading?
> > Is there any open sourced MDM software which can be integrated with
> > HBase?
> >
> > Does HBase have any data matching functionality?
> >
> > -Jignesh
> >
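
The "existence testing by coordinate" mentioned in the quoted reply can be pictured as a lookup keyed by the coordinate tuple. Below is a toy in-memory sketch, not the HBase API: the `Coordinate` tuple reuses the field names from the reply, while the `cells` dict, `cell_exists`, `missing_cells`, and the sample values are hypothetical.

```python
from collections import namedtuple

# Field names follow the coordinate given in the reply above.
Coordinate = namedtuple("Coordinate", ["row", "column", "qualifier", "timestamp"])


def cell_exists(cells, coord):
    """Return True if a value is already stored at exactly this coordinate."""
    return coord in cells


def missing_cells(cells, delta):
    """Return the entries of a delta batch whose coordinates are not yet stored."""
    return {c: v for c, v in delta.items() if not cell_exists(cells, c)}


# Hypothetical stored data and an incoming delta batch.
cells = {Coordinate("user1", "info", "name", 1363897458): b"A"}
delta = {
    Coordinate("user1", "info", "name", 1363897458): b"A",
    Coordinate("user2", "info", "name", 1363897459): b"B",
}
to_load = missing_cells(cells, delta)
```

Only the `user2` cell ends up in `to_load`, because the `user1` coordinate is already present. Note that this tests for identical coordinates only; it says nothing about whether two different rows describe the same person.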
