hbase-user mailing list archives

From "Billy Pearson" <sa...@pearsonwholesale.com>
Subject Re: Replicating data into HBase
Date Sat, 18 Apr 2009 01:10:17 GMT
If your data is not too complex (multiple fields, etc.), you could try using
the MySQL bin logs. Use mysqlbinlog
(http://dev.mysql.com/doc/refman/5.0/en/mysqlbinlog.html) to generate a
text version of the logs, then process that text with a map and reduce it
into the table. This would not provide live data, but you could run a
simple shell script that processes the bin logs and then deletes or moves
them. If you need to sync up, you could tell MySQL to start a new bin log.
The shell script could be run as a cron job; each run would pick up the
latest bin logs and start the job.

I would use the Linux command
find /binlog/location/*.bin -mmin +5
to find the logs that are ready to process.
That will give you all the bin logs that have not been modified in the
last 5 minutes.
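
A rough sketch of such a script, assuming GNU find and a POSIX shell. Only
find -mmin +5 and mysqlbinlog come from the message above; the function
names, the three directory arguments and the DECODER variable are made up
for illustration, and the *.bin pattern is simply kept from the find
command above.

```sh
#!/bin/sh
# Sketch only: function names, directory arguments and DECODER are hypothetical.

# Print the bin logs in directory $1 that were last modified more than
# 5 minutes ago, sorted by name.
ready_logs() {
    find "$1" -maxdepth 1 -type f -name '*.bin' -mmin +5 | sort
}

# Decode each idle bin log to text, then move the log aside so the next
# cron run does not process it again.
process_logs() {
    binlog_dir=$1
    out_dir=$2
    done_dir=$3
    mkdir -p "$out_dir" "$done_dir"
    ready_logs "$binlog_dir" | while IFS= read -r log; do
        name=$(basename "$log")
        # DECODER defaults to mysqlbinlog; it can be overridden for a dry run.
        if "${DECODER:-mysqlbinlog}" "$log" > "$out_dir/$name.sql"; then
            mv "$log" "$done_dir/"
        else
            # Leave the log in place and drop the partial output.
            rm -f "$out_dir/$name.sql"
            echo "decode failed: $log" >&2
        fi
    done
}
```

A cron job could then call something like
process_logs /binlog/location /some/out/dir /some/done/dir
where the last two paths are placeholders. Running mysqladmin flush-logs
beforehand is one way to make MySQL start a new bin log, so that the
previous one stops changing.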

If your insert/update queries are not too complex to process, it would be
simple.
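
For simple statement-based logs, pulling the insert/update statements out
of the decoded text might be a short filter like the one below. It is a
sketch under two assumptions: that each statement in the mysqlbinlog text
is followed by a line beginning with /*!*/; (the delimiter mysqlbinlog
normally sets), and that only plain INSERT/UPDATE/DELETE statements
matter. extract_statements is a hypothetical name.

```sh
#!/bin/sh
# Sketch only: reads mysqlbinlog text on stdin and prints one statement per
# line. Multi-line statements are joined with single spaces.
extract_statements() {
    awk '
        # A statement starts on a line beginning with INSERT, UPDATE or DELETE.
        !collecting && toupper($0) ~ /^(INSERT|UPDATE|DELETE)[ \t]/ {
            collecting = 1
            stmt = $0
            next
        }
        # A line beginning with the delimiter ends the current statement.
        collecting && index($0, "/*!*/;") == 1 {
            print stmt
            collecting = 0
            next
        }
        # Any other line while collecting is a continuation of the statement.
        collecting { stmt = stmt " " $0 }
    '
}
```

Each output line could then serve as one input record for the map step,
for example extract_statements < decoded.sql > statements.txt, where both
file names are placeholders.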

Billy



"Brian Forney" <bforney@integral7.com> wrote in 
message news:FDE7BB03-3A6B-41E3-B31B-E5FE577B1589@integral7.com...
> Ryan,
>
> Thanks. Yep, I've read the Bigtable paper (now and in 2006) and
> understand that HBase and Bigtable are essentially large maps and do not
> use the relational model.
>
> Still interested in hearing if others have successfully done this. (I'm
> mostly looking for ways to speed up the implementation of a one-way
> replication: from a relational DB to HBase.)
>
> Thanks,
> Brian
>
> On Apr 17, 2009, at 5:45 PM, Ryan Rawson wrote:
>
>> HBase is not a relational database, so many things that are in a SQL
>> database don't exist.
>>
>> eg:
>> - sequences
>> - secondary declarative keys
>> - joins
>> - advanced query features such as order by, group by
>> - operators of any kind
>>
>> Given conventions (eg: naming of index tables), it might be possible to
>> semi-automatically convert data, but it might not efficiently take
>> advantage of HBase's unique schema-less design.
>>
>> I suggest you have a look at Google's Bigtable paper, as it has the
>> same underlying model that HBase does.
>>
>> Good luck!
>>
>>
>> On Fri, Apr 17, 2009 at 3:30 PM, Brian Forney 
>> <bforney@integral7.com> wrote:
>>
>>> Hi all,
>>>
>>> I'd like to replicate a large dataset from a relational database into
>>> HBase for better throughput of MapReduce jobs. Has anyone had success
>>> replicating from a relational database (in my case SQL Server) to
>>> HBase?
>>>
>>> Thanks,
>>> Brian
>>>
>
> 


