chukwa-user mailing list archives

From Bill Graham <billgra...@gmail.com>
Subject Re: Making TsProcessor's date format configurable
Date Tue, 06 Apr 2010 18:48:54 GMT
Sure, thanks Jerome. I assigned you the JobConf work:
https://issues.apache.org/jira/browse/CHUKWA-471

And I've got the JIRA for the TsProcessor date format:
https://issues.apache.org/jira/browse/CHUKWA-472
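
For the date format, here is a minimal sketch of what the lookup could look like, combining the per-data-type key, the default-value lookup, and the GMT time zone suggested further down the thread. It is only an illustration: java.util.Properties stands in for the job configuration so the snippet runs on its own, and the class and method names (TsDateFormatSketch, formatFor, parseMillis) are made up for this example, not existing Chukwa code.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Properties;
import java.util.TimeZone;

// Hypothetical helper, not part of Chukwa: picks the date format for a data type.
class TsDateFormatSketch {
    // Key prefix from the thread; the data type name is appended after the dot.
    static final String KEY_PREFIX = "TsProcessor.time.format.";
    // The format currently hard-coded in the TsProcessor constructor.
    static final String DEFAULT_FORMAT = "yyyy-MM-dd HH:mm:ss,SSS";

    // Assumption: Properties stands in for the real configuration object.
    static SimpleDateFormat formatFor(Properties conf, String dataType) {
        // getProperty(key, default) replaces the explicit if/else null check.
        String pattern = conf.getProperty(KEY_PREFIX + dataType, DEFAULT_FORMAT);
        SimpleDateFormat sdf = new SimpleDateFormat(pattern);
        // Force GMT so the result does not depend on the JVM's default time zone.
        sdf.setTimeZone(TimeZone.getTimeZone("GMT"));
        return sdf;
    }

    // Convenience wrapper that turns the checked ParseException into an unchecked one.
    static long parseMillis(Properties conf, String dataType, String text) {
        try {
            return formatFor(conf, dataType).parse(text).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable timestamp: " + text, e);
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(KEY_PREFIX + "some_data_type", "yyyy/MM/dd HH:mm:ss");
        // Configured data type uses its own pattern; any other type uses the default.
        System.out.println(parseMillis(conf, "some_data_type", "2010/04/06 18:48:54"));
        System.out.println(parseMillis(conf, "other_type", "2010-04-06 18:48:54,000"));
    }
}
```

A SimpleDateFormat instance is not thread-safe, so the sketch builds a new one per call; a real processor would probably create it once per data type and reuse it.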

As well as making the default processor configurable:
https://issues.apache.org/jira/browse/CHUKWA-473

For this last one, how about a config like this:

<property>
 <name>chukwa.demux.default.processor</name>
 <value>org.apache.hadoop.chukwa.extraction.demux.processor.mapper.DefaultProcessor</value>
</property>
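
To make the intent concrete, here is a rough sketch of how the lookup might behave: an explicit data type binding wins, then the new property, then the built-in DefaultProcessor. As an assumption for illustration, java.util.Properties stands in for the demux configuration, and DefaultProcessorSketch and processorClassFor are hypothetical names, not existing Chukwa code.

```java
import java.util.Properties;

// Hypothetical sketch, not Chukwa code: resolves which processor class to use.
class DefaultProcessorSketch {
    // Property name proposed in this thread.
    static final String DEFAULT_PROCESSOR_KEY = "chukwa.demux.default.processor";
    // Fallback used when the property is not set at all.
    static final String BUILT_IN_DEFAULT =
        "org.apache.hadoop.chukwa.extraction.demux.processor.mapper.DefaultProcessor";

    // Assumption: Properties stands in for the real configuration object.
    static String processorClassFor(Properties conf, String dataType) {
        // An explicit "data type -> processor class" binding takes precedence.
        String bound = conf.getProperty(dataType);
        if (bound != null) {
            return bound;
        }
        // Otherwise use the configured default, or the built-in one if unset.
        return conf.getProperty(DEFAULT_PROCESSOR_KEY, BUILT_IN_DEFAULT);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(DEFAULT_PROCESSOR_KEY,
            "org.apache.hadoop.chukwa.extraction.demux.processor.mapper.TsProcessor");
        // No binding exists for this data type, so the configured default is returned.
        System.out.println(processorClassFor(conf, "unbound_data_type"));
    }
}
```

The sketch only resolves the class name; loading and instantiating the processor is left out.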


On Tue, Apr 6, 2010 at 10:57 AM, Jerome Boulon <jboulon@netflix.com> wrote:

> Hi,
> When you create a Jira for that, can you create a separate one for
> JobConf?
> I'll submit a patch for it.
> Thanks,
>   /Jerome
>
>
> On 4/6/10 10:50 AM, "Jerome Boulon" <jboulon@netflix.com> wrote:
>
> > Hi,
> >
> > Could you also make sure that you force sdf to be GMT?
> > sdf.setTimeZone(TimeZone.getTimeZone("GMT"));
> >
> > - Instead of an if/then/else, you could use the default value in
> > conf.get(key, defaultVal) to set the default format.
> >
> > - You can load the JobConf directly from the mapper/reducer, but you will
> > have to add a new method to the AbstractProcessor/Reducer; then any
> > parser/reducer class will have access to it. We don't need a distributed
> > cache to do that.
> >
> > Thanks,
> >   /Jerome.
> >
> > On 4/6/10 10:18 AM, "Eric Yang" <eyang@yahoo-inc.com> wrote:
> >
> >> Hi Bill,
> >>
> >> We can introduce some configuration like this in chukwa-demux-conf.xml:
> >>
> >> <property>
> >>   <name>TsProcessor.time.format.some_data_type</name>
> >>   <value>yyyy-MM-dd HH:mm:ss,SSS</value>
> >> </property>
> >>
> >> Move the SimpleDateFormat creation out of the constructor:
> >>
> >> // Build the per-data-type key, e.g. TsProcessor.time.format.some_data_type
> >> StringBuilder format = new StringBuilder();
> >> format.append("TsProcessor.time.format.");
> >> format.append(chunk.getDataType());
> >> if (conf.get(format.toString()) != null) {
> >>   sdf = new SimpleDateFormat(conf.get(format.toString()));
> >> } else {
> >>   sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss,SSS");
> >> }
> >>
> >> It will require changing the MapperFactory class to include the running
> >> JobConf as a HashMap, or loading the JobConf from the distributed cache.
> >>
> >> Regards,
> >> Eric
> >>
> >> On 4/6/10 9:55 AM, "Bill Graham" <billgraham@gmail.com> wrote:
> >>
> >>> Hi,
> >>>
> >>> I'd like to be able to configure the date format for TsProcessor.
> >>> Looking at the code, others have had the same thought:
> >>>
> >>>   public TsProcessor() {
> >>>     // TODO move that to config
> >>>     sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss,SSS");
> >>>   }
> >>>
> >>> I can write a patch to support this change, but how do we want to make
> >>> the date configurable? Currently there is a single config (AFAIK) that
> >>> binds the processor class to a data type in chukwa-demux-conf.xml that
> >>> looks like this:
> >>>
> >>>   <property>
> >>>     <name>some_data_type</name>
> >>>     <value>org.apache.hadoop.chukwa.extraction.demux.processor.mapper.TsProcessor</value>
> >>>     <description>Parser for some_data_type</description>
> >>>   </property>
> >>>
> >>>
> >>> Any suggestions for how we'd incorporate date format into that config?
> >>> Or perhaps it would be a separate conf. Are there any examples in the
> >>> code of processors that take configurations currently?
> >>>
> >>> As a side note, I'd also like to add a configuration for what the
> >>> default processor should be, since I'd prefer to change ours from
> >>> DefaultProcessor to TsProcessor. Maybe 'chukwa.demux.default.processor'?
> >>> Thoughts?
> >>>
> >>> thanks,
> >>> Bill
> >>>
> >>
> >>
> >
> >
>
>
