hbase-user mailing list archives

From stack <st...@duboce.net>
Subject Re: Postgresql replication into hbase.
Date Fri, 25 Jul 2008 21:38:30 GMT
Tim Sell wrote:
> It would be handy to be able to easily dump data from postgresql
> straight to hbase. Then keep the data in hbase up to date.
> I've made a simple Python tool called hbreplic (I'm very willing to
> come up with an easier-to-type name).
How do you pronounce that?

> It has two main modes: bootstrap, which copies columns from
> postgresql tables to hbase, and play, which processes incoming
> insert, update and delete events on the postgresql tables and
> updates hbase with them.
> The hbase table/family/column layout is whatever you want it to be.
> The hbase row keys at the moment are taken from a specified postgresql
> column (presumably the primary key, but not enforced), with an
> optional prefix.
> It handles schema changes, in that it doesn't care what the table
> looks like as long as the table has the columns that you specify in an
> ini file.
> It makes use of PgQ which is part of skytools (a bunch of postgresql
> database tools released by skype).
> PgQ is a queuing management thing for events.
> It depends on python, skytools, and thrift.
> It's pretty rudimentary at the moment, but easy to use.
> We'd like to open source it and make it better.
> Would people be interested in this?
> Is there some kind of hbase contrib we could potentially add this to?
> On Monday we'll probably make the source available somewhere with instructions.

It sounds excellent, Tim.  A nice contrib.  If you want to add it, add it
to a JIRA and I'll add it under hbase/contrib.  Add a bit of doc. so
browsers can figure out what it is -- especially since the current name
gives no clue what it does (smile).
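
To make the design quoted above concrete, here is a rough sketch of the row-key and event-handling scheme as described: row keys come from one configured postgresql column plus an optional prefix, and insert/update/delete events are applied to the mapped columns. It is not hbreplic's code. The ini layout, the function names, and the event names are all hypothetical, and a plain dict stands in for the hbase table so the sketch runs without thrift or PgQ.

```python
import configparser

# Hypothetical ini layout; the real hbreplic config format is not shown in this thread.
EXAMPLE_INI = """
[users]
key_column = id
key_prefix = user-
columns = name:info:name, email:info:email
"""


def load_mapping(ini_text, table):
    """Read the (assumed) per-table mapping: key column, prefix, column map."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    section = parser[table]
    columns = {}
    for item in section["columns"].split(","):
        # Each entry is pg_column:family:qualifier (an assumed convention).
        pg_column, family, qualifier = item.strip().split(":")
        columns[pg_column] = f"{family}:{qualifier}"
    return {
        "key_column": section["key_column"],
        "key_prefix": section.get("key_prefix", ""),
        "columns": columns,
    }


def row_key(mapping, pg_row):
    """Build the row key from the configured column plus the optional prefix."""
    return mapping["key_prefix"] + str(pg_row[mapping["key_column"]])


def apply_event(store, mapping, event_type, pg_row):
    """Apply one insert/update/delete event to `store` (a dict standing in for hbase)."""
    key = row_key(mapping, pg_row)
    if event_type == "delete":
        store.pop(key, None)
        return
    if event_type not in ("insert", "update"):
        raise ValueError(f"unknown event type: {event_type}")
    cells = store.setdefault(key, {})
    # Only configured columns are copied; other columns in the row are ignored,
    # which is why unrelated schema changes do not matter.
    for pg_column, hbase_column in mapping["columns"].items():
        if pg_column in pg_row:
            cells[hbase_column] = str(pg_row[pg_column])


if __name__ == "__main__":
    mapping = load_mapping(EXAMPLE_INI, "users")
    store = {}
    apply_event(store, mapping, "insert", {"id": 7, "name": "tim", "extra": "ignored"})
    print(store)
```

In this sketch, bootstrap would amount to calling apply_event with "insert" for every existing row, and play would call it once per queued event.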
