chukwa-dev mailing list archives

From Ariel Rabkin <asrab...@gmail.com>
Subject Re: PipelineStageWriter doesn't work as expected
Date Fri, 18 Dec 2009 05:53:29 GMT
What's the use case for this?

The original motivation for pipelined writers was so that we could do
things like filtering before data got written.  Then it occurred to me
that SocketTeeWriter fit fairly naturally into a pipeline.

Putting it "after" seq file writer wouldn't be too bad --
SeqFileWriter.add() would need to call next.add().  But I would be
hesitant to commit that change, without a really clear use case.
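
To make the forwarding idea concrete, here is a rough, self-contained
sketch. The types below (Writer, PipelineableStage, ArchiveStage, TeeStage)
are hypothetical stand-ins, not the real Chukwa classes or signatures, and
chunks are plain strings to keep it short. The only point is that each
stage's add() has to hand the chunks to the next stage, or everything after
it in the pipeline sees nothing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a writer interface; NOT the real Chukwa API.
interface Writer {
    void add(List<String> chunks);
}

// Hypothetical base class for a stage that can have a successor.
abstract class PipelineableStage implements Writer {
    protected Writer next; // null when this is the last stage

    void setNextStage(Writer next) {
        this.next = next;
    }

    // Hand the chunks to the following stage, if there is one.
    protected void forward(List<String> chunks) {
        if (next != null) {
            next.add(chunks);
        }
    }
}

// Stand-in for a sink-like stage (the role SeqFileWriter plays in the thread).
class ArchiveStage extends PipelineableStage {
    final List<String> written = new ArrayList<>();

    @Override
    public void add(List<String> chunks) {
        written.addAll(chunks); // do this stage's own work
        forward(chunks);        // without this call, later stages get no data
    }
}

// Stand-in for a tee-like stage (the role SocketTeeWriter plays in the thread).
class TeeStage extends PipelineableStage {
    final List<String> seen = new ArrayList<>();

    @Override
    public void add(List<String> chunks) {
        seen.addAll(chunks);
        forward(chunks);
    }
}

class PipelineSketch {
    public static void main(String[] args) {
        ArchiveStage archive = new ArchiveStage();
        TeeStage tee = new TeeStage();
        archive.setNextStage(tee); // archive first, tee "after" it

        archive.add(Arrays.asList("chunk-1", "chunk-2"));
        System.out.println("tee saw: " + tee.seen);
    }
}
```

If the forward(chunks) call in ArchiveStage.add() is removed, the tee's list
stays empty even though setNextStage was called, which matches the symptom
described below.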

--Ari

On Thu, Dec 17, 2009 at 8:39 PM, Eric Yang <eyang@yahoo-inc.com> wrote:
> It works fine after I put SocketTeeWriter first.  What needs to be
> implemented in SeqFileWriter to be able to pipe correctly?
>
> Regards,
> Eric
>
> On 12/17/09 5:26 PM, "asrabkin@gmail.com" <asrabkin@gmail.com> wrote:
>
>> Put the SocketTeeWriter first.
>>
>> sent from my iPhone; please excuse typos and brevity.
>>
>> On Dec 17, 2009, at 8:12 PM, Eric Yang <eyang@yahoo-inc.com> wrote:
>>
>>> Hi all,
>>>
>>> I'd set up SocketTeeWriter by itself and had data streaming to the next
>>> socket reader program.  When I tried to configure two writers, i.e.,
>>> SeqFileWriter followed by SocketTeeWriter, it didn't work because
>>> SeqFileWriter doesn't extend PipelineableWriter.  I went ahead and made
>>> SeqFileWriter extend PipelineableWriter, implemented the setNextStage
>>> method, and configured the collector with:
>>>
>>>  <property>
>>>    <name>chukwaCollector.writerClass</name>
>>>    <value>org.apache.hadoop.chukwa.datacollection.writer.PipelineStageWriter</value>
>>>  </property>
>>>
>>>  <property>
>>>    <name>chukwaCollector.pipeline</name>
>>>    <value>org.apache.hadoop.chukwa.datacollection.writer.SeqFileWriter,org.apache.hadoop.chukwa.datacollection.writer.SocketTeeWriter</value>
>>>  </property>
>>>
>>> SeqFileWriter writes the data correctly, but when I connect to
>>> SocketTeeWriter, no data is visible there.  Commands work fine, but
>>> data streaming doesn't happen.  How do I configure the collector and
>>> PipelineStageWriter to write data to multiple writers?  Is there
>>> something in SeqFileWriter that could prevent this from working?
>>>
>>> Regards,
>>> Eric
>>>
>
>



-- 
Ari Rabkin asrabkin@gmail.com
UC Berkeley Computer Science Department
