hbase-user mailing list archives

From Steven Noels <stev...@outerthought.org>
Subject Re: slow move from rdbm to hadoop/hbase(is there replication strategies for this?)
Date Wed, 08 Dec 2010 13:42:02 GMT
On Tue, Dec 7, 2010 at 10:55 PM, Hiller, Dean  (Contractor)
<dean.hiller@broadridge.com> wrote:
> We are going to move 7 terabytes(set to grow to 35 when our SLA goes
> from 2 years to 10 years of storage) from an RDBMS to hadoop/hbase type
> system and I was wondering if anyone knew of how to get events from
> hbase on persisted/modified entities so that changes can be replicated
> to our RDBMS easily.

Again, I would suggest you take a look at the RowLog library I
mentioned in my previous post. We use it as a message queue to
asynchronously feed SOLR indices, which sounds like what you
need during your transition week. The RowLog processor scans a WAL of
row update entries at regular intervals, which keeps your RDBMS
replication queue up to date in near real time. Nothing works out of the
box, though: you'll have some code to write, but at the very least the
tricky bits are already solved for you.
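
To make the polling idea concrete, here is a minimal Java sketch of one
processing pass over a log of row updates. It does not use the RowLog API:
ReplicationPoller, RowUpdate and Sink are hypothetical names made up for
illustration, and the in-memory queue only stands in for the persistent WAL.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical sketch: none of these names come from the RowLog library.
class ReplicationPoller {

    /** One logged row update; the fields are illustrative assumptions. */
    public static final class RowUpdate {
        public final String rowKey;
        public final long seq;

        public RowUpdate(String rowKey, long seq) {
            this.rowKey = rowKey;
            this.seq = seq;
        }
    }

    /** Stand-in for the code that writes one change to the RDBMS. */
    public interface Sink {
        /** Returns true if the update was applied, false if it should be retried. */
        boolean apply(RowUpdate update);
    }

    // In-memory stand-in for the persistent log of row updates.
    private final Queue<RowUpdate> log = new ArrayDeque<>();

    public void append(RowUpdate update) {
        log.add(update);
    }

    /**
     * One polling pass: applies entries in log order, at most maxBatch of
     * them, and stops at the first failure so that entry stays at the head
     * of the log and is retried on the next pass.
     */
    public int pollOnce(Sink sink, int maxBatch) {
        int done = 0;
        while (done < maxBatch) {
            RowUpdate next = log.peek();
            if (next == null || !sink.apply(next)) {
                break;
            }
            log.remove();
            done++;
        }
        return done;
    }

    public int pending() {
        return log.size();
    }
}
```

Calling pollOnce from a ScheduledExecutorService (scheduleWithFixedDelay)
would give the "at regular intervals" behaviour. Removing an entry only
after the sink has accepted it means a failed write is retried rather than
lost, at the cost of possibly applying an update twice, so the RDBMS writes
should be idempotent.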

As Todd suggests, you will need to be careful not to overload
your RDBMS, but a more tightly integrated mechanism than a mass
M/R job might reduce that risk.
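
One simple way to keep the load on the RDBMS bounded is to pace the writes.
The Java sketch below is only an illustration under my own assumptions:
WriteThrottle and the rows-per-second cap are hypothetical, not something
RowLog or HBase provides.

```java
// Hypothetical sketch: paces batch writes so the average rate stays under a cap.
class WriteThrottle {

    private final int maxRowsPerSecond;

    public WriteThrottle(int maxRowsPerSecond) {
        if (maxRowsPerSecond <= 0) {
            throw new IllegalArgumentException("maxRowsPerSecond must be positive");
        }
        this.maxRowsPerSecond = maxRowsPerSecond;
    }

    /**
     * Returns how many milliseconds to pause after a batch of rowsWritten
     * rows that took elapsedMillis to write, so that the batch plus the
     * pause does not exceed the configured rate.
     */
    public long pauseMillis(int rowsWritten, long elapsedMillis) {
        // Minimum time the batch is allowed to occupy at the capped rate
        // (integer ceiling of rowsWritten * 1000 / maxRowsPerSecond).
        long minDuration = (rowsWritten * 1000L + maxRowsPerSecond - 1) / maxRowsPerSecond;
        return Math.max(0L, minDuration - elapsedMillis);
    }
}
```

The replication loop would call Thread.sleep(throttle.pauseMillis(...))
after each batch. If the RDBMS is already slow, the pause drops to zero and
the database itself sets the pace.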

Kind regards,

Steven Noels
Open Source Content Applications
Makers of Kauri, Daisy CMS and Lily
