hbase-user mailing list archives

From stack <st...@duboce.net>
Subject Re: Replicating data into HBase
Date Sat, 18 Apr 2009 16:52:11 GMT
You might take a look at Tim Sells' postgres to hbase uploader scripts here
for ideas:
http://svn.apache.org/viewvc/hadoop/hbase/trunk/src/examples/uploaders/
St.Ack

2009/4/18 Billy Pearson <sales@pearsonwholesale.com>

> If your data is not too complex (multi fields etc.), you could try using
> the MySQL bin logs. Use
> mysqlbinlog http://dev.mysql.com/doc/refman/5.0/en/mysqlbinlog.html to
> generate a text version of the bin logs, then process that text with a map
> and reduce it into the table. This would not provide live data, but you
> could run a simple shell script that processes the bin logs and then
> deletes or moves them. If you needed to sync up, you could tell MySQL to
> start a new bin log. The shell script could be run as a cron job; it would
> pick up the latest bin log and start the job.
>
> I would use the Linux command
> find /binlog/location/*.bin -mmin +5
> to find the logs that are ready to process.
> That will give you all the bin logs that have not been modified in the
> last 5 minutes.
>
> If your insert/update queries are not too complex to process, it would be
> simple.
>
> Billy
>
>
>
> "Brian Forney" <bforney@integral7.com> wrote in message
> news:FDE7BB03-3A6B-41E3-B31B-E5FE577B1589@integral7.com...
>
>  Ryan,
>>
>> Thanks. Yep, I've read the Bigtable paper (now and in 2006) and understand
>> that HBase and Bigtable are essentially large maps and do not use the
>> relational model.
>>
>> Still interested in hearing if others have successfully done this. (I'm
>> mostly looking for ways to speed up the implementation of a one-way
>> replication: from a relational DB to HBase.)
>>
>> Thanks,
>> Brian
>>
>> On Apr 17, 2009, at 5:45 PM, Ryan Rawson wrote:
>>
>>  HBase is not a relational database, so many things that are in a SQL
>>> database don't exist.
>>>
>>> eg:
>>> - sequences
>>> - secondary declarative keys
>>> - joins
>>> - advanced query features such as order by, group by
>>> - operators of any kind
>>>
>>> Given conventions (eg: naming of index tables), it might be possible to
>>> convert data semi-automatically, but the result might not take efficient
>>> advantage of HBase's unique schema-less design.
>>>
>>> I suggest you have a look at Google's Bigtable paper, as it describes the
>>> same underlying model that HBase uses.
>>>
>>> Good luck!
>>>
>>>
>>> On Fri, Apr 17, 2009 at 3:30 PM, Brian Forney <bforney@integral7.com>
>>> wrote:
>>>
>>>  Hi all,
>>>>
>>>> I'd like to replicate a large dataset from a relational database  into
>>>> HBase
>>>> for better throughput of MapReduce jobs. Has anyone had success
>>>> replicating
>>>> from a relational database (in my case SQL Server) to HBase?
>>>>
>>>> Thanks,
>>>> Brian
>>>>
>>>>
>>
>>
>
>
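
The bin-log batch job Billy describes in the quoted message could be sketched roughly as follows. This is only an illustration under stated assumptions, not a script from the thread: the directory paths, the variable names (BINLOG_DIR, OUT_DIR, DONE_DIR, MIN_IDLE, DECODER) and the function name process_binlogs are all hypothetical, the `*.bin` pattern simply follows the message and may not match your server's log naming, and GNU find is assumed for `-maxdepth` and `-mmin`. It only covers decoding and archiving the logs; the MapReduce step that loads the text into HBase is left out.

```sh
#!/bin/sh
# Sketch: decode MySQL bin logs that have been idle for a while into text,
# then move the originals aside so they are not processed twice.
# All paths and names below are hypothetical placeholders.
BINLOG_DIR="${BINLOG_DIR:-/binlog/location}"   # where the server writes bin logs
OUT_DIR="${OUT_DIR:-/binlog/text}"             # decoded text for the MapReduce job
DONE_DIR="${DONE_DIR:-/binlog/processed}"      # archive of handled bin logs
MIN_IDLE="${MIN_IDLE:-5}"                      # minutes without modification
DECODER="${DECODER:-mysqlbinlog}"              # program that prints a log as text

process_binlogs() {
    mkdir -p "$OUT_DIR" "$DONE_DIR" || return 1
    # Select only logs not modified for more than MIN_IDLE minutes,
    # oldest name first so statements stay in order.
    find "$BINLOG_DIR" -maxdepth 1 -type f -name '*.bin' -mmin +"$MIN_IDLE" |
    sort |
    while IFS= read -r log; do
        name=$(basename "$log")
        if "$DECODER" "$log" > "$OUT_DIR/$name.sql"; then
            mv "$log" "$DONE_DIR/$name"
        else
            # Leave the bin log in place so the next run retries it.
            rm -f "$OUT_DIR/$name.sql"
            echo "failed to decode $log" >&2
        fi
    done
}

# Run only when asked, so the file can also be sourced by other scripts.
if [ "${1:-}" = "--run" ]; then
    process_binlogs
fi
```

Saved under a hypothetical name such as process_binlogs.sh, this could be run from cron with the `--run` argument every few minutes. To force a sync point as the message suggests, `mysqladmin flush-logs` makes the server start a new bin log, after which the previous one stops changing and becomes eligible once it has been idle for MIN_IDLE minutes.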
