incubator-chukwa-user mailing list archives

From Jerome Boulon <jbou...@netflix.com>
Subject Re: Making TsProcessor's date format configurable
Date Tue, 06 Apr 2010 17:57:05 GMT
Hi, 
When you create a Jira for that, can you create a separate one for
JobConf?
I'll submit a patch for it.
Thanks,
  /Jerome


On 4/6/10 10:50 AM, "Jerome Boulon" <jboulon@netflix.com> wrote:

> Hi,
> 
> Could you also make sure that you force sdf to be GMT?
> sdf.setTimeZone(TimeZone.getTimeZone("GMT"));
> 
> - Instead of an if/then/else, you could pass the default format as the
> default value in conf.get(key, defaultVal).
> 
> - You can load the JobConf directly from the mapper/reducer, but you will have
> to add a new method to the AbstractProcessor/Reducer so that any parser/reducer
> class has access to it. We don't need a distributed cache to do that.
> 
> Thanks,
>   /Jerome.
> 
> On 4/6/10 10:18 AM, "Eric Yang" <eyang@yahoo-inc.com> wrote:
> 
>> Hi Bill,
>> 
>> We can introduce some configuration like this in chukwa-demux-conf.xml:
>> 
>> <property>
>>   <name>TsProcessor.time.format.some_data_type</name>
>>   <value>yyyy-MM-dd HH:mm:ss,SSS</value>
>> </property>
>> 
>> Move the SimpleDateFormat outside of the constructor.
>> 
>> StringBuilder format = new StringBuilder();
>> format.append("TsProcessor.time.format.");
>> format.append(chunk.getDataType());
>> if (conf.get(format.toString()) != null) {
>>   sdf = new SimpleDateFormat(conf.get(format.toString()));
>> } else {
>>   sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss,SSS");
>> }
>> 
>> It will require changing the MapperFactory class to include the running
>> JobConf as a HashMap, or loading the JobConf from the distributed cache.
>> 
>> Regards,
>> Eric
>> 
>> On 4/6/10 9:55 AM, "Bill Graham" <billgraham@gmail.com> wrote:
>> 
>>> Hi,
>>> 
>>> I'd like to be able to configure the date format for TsProcessor. Looking at
>>> the code, others have had the same thought:
>>> 
>>>   public TsProcessor() {
>>>     // TODO move that to config
>>>     sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss,SSS");
>>>   }
>>> 
>>> I can write a patch to support this change, but how do we want to make the
>>> date configurable? Currently there is a single config (AFAIK) that binds the
>>> processor class to a data type in chukwa-demux-conf.xml that looks like
>>> this:
>>> 
>>>   <property>
>>>     <name>some_data_type</name>
>>>     <value>org.apache.hadoop.chukwa.extraction.demux.processor.mapper.TsProcessor</value>
>>>     <description>Parser for some_data_type</description>
>>>   </property>
>>> 
>>> 
>>> Any suggestions for how we'd incorporate date format into that config? Or
>>> perhaps it would be a separate conf. Are there any examples in the code of
>>> processors that take configurations currently?
>>> 
>>> As a side note, I'd also like to add a configuration for what the default
>>> processor should be, since I'd prefer to change ours from DefaultProcessor
>>> to TsProcessor. Maybe 'chukwa.demux.default.processor'? Thoughts?
>>> 
>>> thanks,
>>> Bill
>>> 
>> 
>> 
> 
> 
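
The suggestions in this thread (a per-data-type format key, a default
supplied through the lookup itself rather than an if/else, and a GMT time
zone) can be sketched in Java as follows. This is only an illustration under
stated assumptions: a plain Map stands in for the Hadoop configuration
object, and the class and method names (TsFormatSketch, formatFor,
parseMillis) are hypothetical and not part of Chukwa.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.HashMap;
import java.util.Map;
import java.util.TimeZone;

public class TsFormatSketch {
    // Key prefix and default pattern taken from the thread; note the trailing dot.
    static final String KEY_PREFIX = "TsProcessor.time.format.";
    static final String DEFAULT_FORMAT = "yyyy-MM-dd HH:mm:ss,SSS";

    // Hypothetical helper: a Map stands in for the job configuration here.
    static SimpleDateFormat formatFor(Map<String, String> conf, String dataType) {
        // Lookup with a default value replaces the if/else on null.
        String pattern = conf.getOrDefault(KEY_PREFIX + dataType, DEFAULT_FORMAT);
        SimpleDateFormat sdf = new SimpleDateFormat(pattern);
        // Force GMT so parsing does not depend on the machine's time zone.
        sdf.setTimeZone(TimeZone.getTimeZone("GMT"));
        return sdf;
    }

    // Hypothetical helper: parse a timestamp to epoch milliseconds.
    static long parseMillis(Map<String, String> conf, String dataType, String text) {
        try {
            return formatFor(conf, dataType).parse(text).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable timestamp: " + text, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(KEY_PREFIX + "some_data_type", "yyyy/MM/dd HH:mm:ss");

        // Uses the configured pattern for some_data_type.
        System.out.println(parseMillis(conf, "some_data_type", "1970/01/01 00:00:01"));
        // No key for other_type, so the default pattern applies.
        System.out.println(parseMillis(conf, "other_type", "1970-01-01 00:00:00,500"));
    }
}
```

With a real Hadoop configuration object, the lookup would presumably become
conf.get(key, defaultVal), as suggested in the thread. SimpleDateFormat is
not thread-safe, so the sketch builds a new instance per call rather than
sharing one.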

