incubator-chukwa-user mailing list archives

From Eric Yang <ey...@yahoo-inc.com>
Subject Re: How to set up HDFS -> MySQL from trunk?
Date Wed, 17 Mar 2010 04:08:37 GMT
Chukwa's use case is probably not affected by the decision on MAPREDUCE-1126.
The Chukwa key is composed of a Long (time partition), a String (primary key),
and a Long (timestamp).  The value is an Avro blob.  I'd like to try out using
Avro to serialize the comparator, but it makes no difference in the Chukwa use
case because I will likely have to write my own comparator for TFile anyway.
I agree with what Chris Douglas and Tom White said: the Avro serializing
comparator should be optional.
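For illustration, a custom comparator over that composite key ordering could look
something like the following sketch (the class and field names here are
hypothetical, not Chukwa's actual classes):

```java
import java.util.Comparator;

// Hypothetical composite key: (Long timePartition, String primaryKey, Long timestamp).
class ChukwaKey {
    final long timePartition;
    final String primaryKey;
    final long timestamp;

    ChukwaKey(long timePartition, String primaryKey, long timestamp) {
        this.timePartition = timePartition;
        this.primaryKey = primaryKey;
        this.timestamp = timestamp;
    }
}

// Order by time partition first, then primary key, then timestamp.
class ChukwaKeyComparator implements Comparator<ChukwaKey> {
    @Override
    public int compare(ChukwaKey a, ChukwaKey b) {
        int c = Long.compare(a.timePartition, b.timePartition);
        if (c != 0) return c;
        c = a.primaryKey.compareTo(b.primaryKey);
        if (c != 0) return c;
        return Long.compare(a.timestamp, b.timestamp);
    }
}
```

A TFile-backed store would plug an equivalent ordering in as its raw comparator.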

I like Tim's example:

Schema keySchema = ...
AvroGenericData.setMapOutputKeySchema(job, keySchema);

Hope this helps.

Regards,
Eric

On 3/16/10 2:56 PM, "Jeff Hammerbacher" <hammer@cloudera.com> wrote:

> Hey Eric,
> 
> Could you chime in on MAPREDUCE-815 with your potential use case? We're
> currently blocked on other issues, but getting more use cases on the table
> will be helpful.
> 
> Thanks,
> Jeff
> 
> On Mon, Mar 15, 2010 at 7:41 PM, Eric Yang <eyang@yahoo-inc.com> wrote:
>> Hi Kirk,
>> 
>> The Avro + Tfile plan depends on
>> https://issues.apache.org/jira/browse/MAPREDUCE-815.  The work can start
>> once the Avro Input/Output format patch is included in a release build of
>> Hadoop.  Hence, I would project that completing this migration will take at
>> least six months from when Avro MapReduce is ready.  It's a fairly big chunk
>> of work, and it would be great if people want to pitch in to build the aggregator piece to
>> control the workflow.  See https://issues.apache.org/jira/browse/CHUKWA-444
>> for reference.
>> 
>> Regards,
>> Eric
>> 
>> On 3/15/10 3:03 PM, "Kirk True" <kirk@mustardgrain.com> wrote:
>> 
>>> Hi Eric,
>>> 
>>> Any notion as to the ETA for completion of the migration?
>>> 
>>> Thanks,
>>> Kirk
>>> 
>>> Eric Yang wrote:
>>>> 
>>>> Hi Kirk,
>>>> 
>>>> I am working on a design which removes MySQL from Chukwa.  I am making this
>>>> departure from MySQL because the MDL framework was for prototyping purposes.
>>>> It will not scale in a production system where Chukwa could be hosted on a
>>>> large Hadoop cluster.  HICC will serve data directly from HDFS in the future.
>>>> 
>>>> Meanwhile, the dbAdmin.sh from Chukwa 0.3 is still compatible with the trunk
>>>> version of Chukwa.  You can load ChukwaRecords using the
>>>> org.apache.hadoop.chukwa.dataloader.MetricDataLoader class or mdl.sh from
>>>> Chukwa 0.3.
>>>> 
>>>> The MetricDataLoader class will be marked as deprecated, and it will not be
>>>> supported once we make the transition to Avro + TFile.
>>>> 
>>>> Regards,
>>>> Eric
>>>> 
>>>> On 3/15/10 11:56 AM, "Kirk True" <kirk@mustardgrain.com> wrote:
>>>> 
>>>> 
>>>> 
>>>>> 
>>>>> Hi all,
>>>>> 
>>>>> I recently switched to trunk as I was experiencing a lot of issues with
>>>>> 0.3.0. In 0.3.0, there was a dbAdmin.sh script that would run and try to
>>>>> stick data in MySQL from HDFS. However, that script is gone and when I
>>>>> run the system as built from trunk, nothing is ever populated in the
>>>>> database. Where are the instructions for setting up the HDFS -> MySQL
>>>>> data migration for HICC?
>>>>> 
>>>>> Thanks,
>>>>> Kirk
>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>> 
> 
> 

