chukwa-dev mailing list archives

From Eric Yang <ey...@yahoo-inc.com>
Subject Re: PipelineStageWriter doesn't work as expected
Date Fri, 18 Dec 2009 16:59:27 GMT
I'd like to tee the incoming data: one writer goes into HDFS, and another
writer enables real-time pub/sub to monitor the data.  In my case, the data
is mirrored, not filtered.  However, I am not getting the right result,
because it seems the data isn't getting written into HDFS regardless of the
ordering of the writers.

Regards,
Eric

On 12/17/09 9:53 PM, "Ariel Rabkin" <asrabkin@gmail.com> wrote:

> What's the use case for this?
> 
> The original motivation for pipelined writers was so that we could do
> things like filtering before data got written.  Then it occurred to me
> that SocketTeeWriter fit fairly naturally into a pipeline.
> 
> Putting it "after" seq file writer wouldn't be too bad --
> SeqFileWriter.add() would need to call next.add().  But I would be
> hesitant to commit that change, without a really clear use case.
> 
> --Ari
> 
> On Thu, Dec 17, 2009 at 8:39 PM, Eric Yang <eyang@yahoo-inc.com> wrote:
>> It works fine after I put SocketTeeWriter first.  What needs to be
>> implemented in SeqFileWriter to be able to pipe correctly?
>> 
>> Regards,
>> Eric
>> 
>> On 12/17/09 5:26 PM, "asrabkin@gmail.com" <asrabkin@gmail.com> wrote:
>> 
>>> Put the SocketTeeWriter first.
>>> 
>>> sent from my iPhone; please excuse typos and brevity.
>>> 
>>> On Dec 17, 2009, at 8:12 PM, Eric Yang <eyang@yahoo-inc.com> wrote:
>>> 
>>>> Hi all,
>>>> 
>>>> I set up SocketTeeWriter by itself, and had data streaming to the next
>>>> socket reader program.  When I tried to configure two writers, i.e.,
>>>> SeqFileWriter followed by SocketTeeWriter, it didn't work because
>>>> SeqFileWriter doesn't extend PipelineableWriter.  I went ahead and made
>>>> SeqFileWriter extend PipelineableWriter, implemented the setNextStage
>>>> method, and configured the collector with:
>>>> 
>>>>  <property>
>>>>    <name>chukwaCollector.writerClass</name>
>>>>    <value>org.apache.hadoop.chukwa.datacollection.writer.PipelineStageWriter</value>
>>>>  </property>
>>>> 
>>>>  <property>
>>>>    <name>chukwaCollector.pipeline</name>
>>>>    <value>org.apache.hadoop.chukwa.datacollection.writer.SeqFileWriter,org.apache.hadoop.chukwa.datacollection.writer.SocketTeeWriter</value>
>>>>  </property>
>>>> 
>>>> SeqFileWriter writes the data correctly, but when I connect to
>>>> SocketTeeWriter, no data is visible there.  Commands work fine, but
>>>> data streaming doesn't happen.  How do I configure the collector and
>>>> PipelineStageWriter to write data to multiple writers?  Is there
>>>> something in SeqFileWriter that could prevent this from working?
>>>> 
>>>> Regards,
>>>> Eric
>>>> 
>> 
>> 
> 
> 
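
The forwarding behavior discussed in this thread (a stage has to call
next.add() itself, otherwise stages after it never see the data) can be
illustrated with a small self-contained sketch.  The Stage interface and
RecordingStage class below are hypothetical stand-ins written for this
example; they are not the real Chukwa ChukwaWriter or PipelineableWriter
types, and the real signatures may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PipelineSketch {

    /** Hypothetical stand-in for a pipeline stage; NOT the real Chukwa interface. */
    interface Stage {
        void add(List<String> chunks);
        void setNextStage(Stage next);
    }

    /** Records every chunk it receives; passes chunks on only if 'forwards' is true. */
    static class RecordingStage implements Stage {
        final String name;
        final boolean forwards;
        final List<String> seen = new ArrayList<>();
        private Stage next;

        RecordingStage(String name, boolean forwards) {
            this.name = name;
            this.forwards = forwards;
        }

        @Override
        public void setNextStage(Stage next) {
            this.next = next;
        }

        @Override
        public void add(List<String> chunks) {
            seen.addAll(chunks);            // this stage's own work
            if (forwards && next != null) {
                next.add(chunks);           // the step a non-forwarding stage skips
            }
        }
    }

    /** Links the stages in the given order and returns the head of the chain. */
    static Stage link(Stage... stages) {
        for (int i = 0; i + 1 < stages.length; i++) {
            stages[i].setNextStage(stages[i + 1]);
        }
        return stages[0];
    }

    public static void main(String[] args) {
        List<String> chunks = Arrays.asList("chunk-1", "chunk-2");

        // Case 1: the file-like stage is first and does not forward.
        RecordingStage file1 = new RecordingStage("file", false);
        RecordingStage tee1 = new RecordingStage("tee", true);
        link(file1, tee1).add(chunks);
        System.out.println("file first, no forwarding: tee saw " + tee1.seen.size());

        // Case 2: the tee stage is first and forwards to the file-like stage.
        RecordingStage tee2 = new RecordingStage("tee", true);
        RecordingStage file2 = new RecordingStage("file", false);
        link(tee2, file2).add(chunks);
        System.out.println("tee first: file saw " + file2.seen.size()
                + ", tee saw " + tee2.seen.size());
    }
}
```

In this sketch the first case prints "file first, no forwarding: tee saw 0"
and the second prints "tee first: file saw 2, tee saw 2", which mirrors the
advice to put SocketTeeWriter first unless the file-writing stage is changed
to forward its input.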

