hadoop-mapreduce-user mailing list archives

From 임정택 <kabh...@gmail.com>
Subject Re: Can I configure multiple M/Rs and normal processes to one workflow?
Date Wed, 04 Feb 2015 22:30:25 GMT
This cluster serves OLTP workloads (HBase), so I'm looking for a simpler
solution that doesn't require modifying the cluster.

Can Oozie handle this workflow?
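For context, the shape I have in mind is init → three parallel M/R jobs → finalize. If Oozie fits, that could presumably be expressed as a fork/join workflow; a minimal sketch (all action, script, and path names here are hypothetical placeholders, not from the actual setup):

```xml
<workflow-app name="cp-aggregation" xmlns="uri:oozie:workflow:0.4">
  <start to="init-mysql"/>

  <!-- Step 1: single process initializing MySQL (hypothetical shell action) -->
  <action name="init-mysql">
    <shell xmlns="uri:oozie:shell-action:0.2">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <exec>init_mysql.sh</exec>
      <file>${appPath}/init_mysql.sh</file>
    </shell>
    <ok to="fork-cps"/>
    <error to="fail"/>
  </action>

  <!-- Step 2: three M/R jobs run in parallel, one per CP -->
  <fork name="fork-cps">
    <path start="aggregate-cp1"/>
    <path start="aggregate-cp2"/>
    <path start="aggregate-cp3"/>
  </fork>

  <action name="aggregate-cp1">
    <map-reduce>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <!-- per-CP mapper/reducer configuration goes here -->
    </map-reduce>
    <ok to="join-cps"/>
    <error to="fail"/>
  </action>
  <!-- aggregate-cp2 and aggregate-cp3 are analogous -->

  <!-- join blocks until all three M/R actions succeed -->
  <join name="join-cps" to="finalize-mysql"/>

  <!-- Step 3: single process finalizing the aggregation -->
  <action name="finalize-mysql">
    <shell xmlns="uri:oozie:shell-action:0.2">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <exec>finalize_mysql.sh</exec>
      <file>${appPath}/finalize_mysql.sh</file>
    </shell>
    <ok to="end"/>
    <error to="fail"/>
  </action>

  <kill name="fail">
    <message>Aggregation failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <end name="end"/>
</workflow-app>
```

The fork/join pair gives the required ordering: step 3 only starts after all three step-2 jobs complete, and step 2 only starts after step 1 succeeds.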
On Thu, Feb 5, 2015 at 5:03 AM Ted Yu <yuzhihong@gmail.com> wrote:

> Have you considered using Apache Phoenix ?
> That way all your data is stored in one place.
>
> See http://phoenix.apache.org/
>
> Cheers
>
> On Tue, Feb 3, 2015 at 6:44 PM, 임정택 <kabhwan@gmail.com> wrote:
>
>> Hello all.
>>
>> We periodically scan HBase tables to aggregate statistics and store
>> them in MySQL.
>>
>> We have 3 kinds of CP (a kind of data source); each has one Channel table
>> and one Article table.
>> (Channel : Article is a 1:N relation.)
>>
>> Each CP's table schema differs slightly, so we have to apply different
>> aggregation logic for each, joining Channel and Article.
>>
>> I've thought about a workflow like this, but I wonder whether it makes sense:
>>
>> 1. Run a single process that initializes MySQL (creating tables, deleting
>> rows, etc.).
>> 2. Run 3 M/Rs simultaneously to aggregate statistics for each CP, and
>> insert rows per Channel into MySQL.
>> 3. Run a single process that finalizes the whole aggregation - runs an
>> aggregation query in MySQL to insert new rows, rolls tables, etc.
>>
>> Steps 1, 2, and 3 must definitely run in sequence.
>>
>> Any help is really appreciated!
>> Thanks.
>>
>> Regards.
>> Jungtaek Lim (HeartSaVioR)
>>
>
