incubator-chukwa-user mailing list archives

From Jerome Boulon <jbou...@netflix.com>
Subject Re: Making TsProcessor's date format configurable
Date Tue, 06 Apr 2010 17:50:45 GMT
Hi,

Could you also make sure that you force sdf to be GMT?
sdf.setTimeZone(TimeZone.getTimeZone("GMT"));
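
A minimal sketch of why this matters, using only the JDK. The class and method names (GmtFormatDemo, parseMillisGmt) are hypothetical and not part of Chukwa. Without the setTimeZone call, the same log line would parse to a different epoch value depending on the JVM's default time zone.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

// Hypothetical demo class, not part of Chukwa.
class GmtFormatDemo {
    // Parses text with the given pattern, interpreting the timestamp as GMT,
    // and returns milliseconds since the epoch.
    static long parseMillisGmt(String pattern, String text) {
        SimpleDateFormat sdf = new SimpleDateFormat(pattern);
        sdf.setTimeZone(TimeZone.getTimeZone("GMT"));
        try {
            return sdf.parse(text).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable timestamp: " + text, e);
        }
    }

    public static void main(String[] args) {
        // One and a half seconds after the epoch, read as GMT.
        System.out.println(parseMillisGmt("yyyy-MM-dd HH:mm:ss,SSS", "1970-01-01 00:00:01,500"));
    }
}
```

With the time zone forced to GMT, the example prints 1500 on any machine.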

- Instead of an If/then/else you could use the default value in
conf.get(key,defaultVal) to set the default format.

- You can load jobConf directly from the mapper/reducer, but you will have
to add a new method to the AbstractProcessor/Reducer so that any parser/reducer
class will have access to it. We don't need a distributed cache to do that.
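
A rough sketch of the default-value lookup, assuming the per-data-type key naming from Eric's message. To keep it runnable without Hadoop on the classpath, java.util.Properties.getProperty(key, defaultValue) stands in for conf.get(key, defaultVal); the class and method names (TsFormatLookup, patternFor, formatFor) and the access_log data type are hypothetical.

```java
import java.text.SimpleDateFormat;
import java.util.Properties;
import java.util.TimeZone;

// Hypothetical helper, not part of Chukwa. Properties stands in for JobConf.
class TsFormatLookup {
    static final String KEY_PREFIX = "TsProcessor.time.format.";
    static final String DEFAULT_FORMAT = "yyyy-MM-dd HH:mm:ss,SSS";

    // Returns the configured pattern for a data type, or the default if unset.
    static String patternFor(Properties conf, String dataType) {
        return conf.getProperty(KEY_PREFIX + dataType, DEFAULT_FORMAT);
    }

    // Builds a GMT formatter from the configured (or default) pattern.
    static SimpleDateFormat formatFor(Properties conf, String dataType) {
        SimpleDateFormat sdf = new SimpleDateFormat(patternFor(conf, dataType));
        sdf.setTimeZone(TimeZone.getTimeZone("GMT"));
        return sdf;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(KEY_PREFIX + "access_log", "dd/MMM/yyyy:HH:mm:ss");
        System.out.println(patternFor(conf, "access_log"));     // configured pattern
        System.out.println(patternFor(conf, "some_data_type")); // falls back to default
    }
}
```

The single lookup with a default replaces the if/else branch, so the fallback pattern lives in one place.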

Thanks,
  /Jerome.

On 4/6/10 10:18 AM, "Eric Yang" <eyang@yahoo-inc.com> wrote:

> Hi Bill,
> 
> We can introduce some configuration like this in chukwa-demux-conf.xml:
> 
> <property>
>   <name>TsProcessor.time.format.some_data_type</name>
>   <value>yyyy-MM-dd HH:mm:ss,SSS</value>
> </property>
> 
> Move the SimpleDateFormat creation out of the constructor.
> 
> StringBuilder format = new StringBuilder();
> format.append("TsProcessor.time.format.");
> format.append(chunk.getDataType());
> if (conf.get(format.toString()) != null) {
>   sdf = new SimpleDateFormat(conf.get(format.toString()));
> } else {
>   sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss,SSS");
> }
> 
> It will require changes to the MapperFactory class to include the running
> JobConf as a HashMap, or to load the JobConf from the distributed cache.
> 
> Regards,
> Eric
> 
> On 4/6/10 9:55 AM, "Bill Graham" <billgraham@gmail.com> wrote:
> 
>> Hi,
>> 
>> I'd like to be able to configure the date format for TsProcessor. Looking at
>> the code, others have had the same thought:
>> 
>>   public TsProcessor() {
>>     // TODO move that to config
>>     sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss,SSS");
>>   }
>> 
>> I can write a patch to support this change, but how do we want to make the
>> date configurable? Currently there is a single config (AFAIK) that binds the
>> processor class to a data type in chukwa-demux-conf.xml that looks like this:
>> 
>>   <property>
>>     <name>some_data_type</name>
>>     <value>org.apache.hadoop.chukwa.extraction.demux.processor.mapper.TsProcessor</value>
>>     <description>Parser for some_data_type</description>
>>   </property>
>> 
>> 
>> Any suggestions for how we'd incorporate date format into that config? Or
>> perhaps it would be a separate conf. Are there any examples in the code of
>> processors that take configurations currently?
>> 
>> As a side note, I'd also like to add a configuration for what the default
>> processor should be, since I'd prefer to change ours from DefaultProcessor to
>> TsProcessor. Maybe 'chukwa.demux.default.processor'? Thoughts?
>> 
>> thanks,
>> Bill
>> 
> 
> 

