hadoop-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Can I configure multiple M/Rs and normal processes to one workflow?
Date Wed, 04 Feb 2015 22:35:00 GMT
bq. Can Oozie handle this workflow?

I think so.
Better to confirm on the Oozie mailing list.

Cheers

On Wed, Feb 4, 2015 at 2:30 PM, 임정택 <kabhwan@gmail.com> wrote:

> This cluster is in service for OLTP workloads (HBase), so I'm looking for a
> simpler solution that doesn't require modifying the cluster.
>
> Can Oozie handle this workflow?
>
> On Thu, Feb 5, 2015 at 5:03 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>
>> Have you considered using Apache Phoenix?
>> That way all your data is stored in one place.
>>
>> See http://phoenix.apache.org/
>>
>> Cheers
>>
>> On Tue, Feb 3, 2015 at 6:44 PM, 임정택 <kabhwan@gmail.com> wrote:
>>
>>> Hello all.
>>>
>>> We periodically scan HBase tables to aggregate statistics and store
>>> them in MySQL.
>>>
>>> We have 3 kinds of CP (a kind of data source), each of which has one
>>> Channel table and one Article table.
>>> (Channel : Article is a 1:N relation.)
>>>
>>> The table schemas differ slightly between CPs, so aggregation requires
>>> different logic for each CP, joining Channel and Article.
>>>
>>> I've thought about a workflow like this, but I wonder whether it makes sense.
>>>
>>> 1. Run a single process that initializes MySQL by creating tables,
>>> deleting rows, etc.
>>> 2. Run 3 M/R jobs simultaneously to aggregate statistics for each
>>> CP and insert rows per Channel into MySQL.
>>> 3. Run a single process that finalizes the whole aggregation: it runs an
>>> aggregation query against MySQL to insert new rows into MySQL, rolls tables, etc.
>>>
>>> Steps 1, 2, and 3 must definitely run in sequence.
>>>
>>> Any help is really appreciated!
>>> Thanks.
>>>
>>> Regards.
>>> Jungtaek Lim (HeartSaVioR)
>>>
>>
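
The three-step workflow described in the original message (initialize MySQL, run one aggregation per CP in parallel, then finalize) has a fork/join shape, which can be sketched in plain Python as below. This only illustrates the control flow; it is not the poster's implementation and not an Oozie definition. The CP names, the function names, and the use of a thread pool are all assumptions, and each stub stands in for a real MySQL step or M/R job submission.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical CP names; the original message does not name them.
CPS = ["cp_a", "cp_b", "cp_c"]


def init_mysql(log):
    """Step 1 (stub): create tables, delete stale rows, etc."""
    log.append("init")


def aggregate_cp(cp):
    """Step 2 (stub): stands in for submitting one M/R job and waiting for it."""
    return f"aggregated:{cp}"


def finalize(log, results):
    """Step 3 (stub): run the final aggregation query, roll tables, etc."""
    log.append("finalize")
    return sorted(results)


def run_workflow(cps=CPS):
    log = []
    init_mysql(log)  # step 1 runs alone, before anything else
    with ThreadPoolExecutor(max_workers=len(cps)) as pool:
        # Step 2: fork one task per CP. list() waits for all of them (the
        # join) and re-raises the first failure, so step 3 is skipped on error.
        results = list(pool.map(aggregate_cp, cps))
    log.append("aggregated")
    summary = finalize(log, results)  # step 3 runs only after every job succeeded
    return log, summary


if __name__ == "__main__":
    print(run_workflow())
```

Oozie provides fork and join control nodes for this same shape, so the three aggregations would sit between a fork and a join, with the initialization action before the fork and the finalization action after the join. Whether that fits this particular cluster is best confirmed on the Oozie list, as suggested above.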
