hadoop-common-dev mailing list archives

From "Jerome Boulon (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (HADOOP-5018) Chukwa should support pipelined writers
Date Tue, 13 Jan 2009 23:58:59 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663552#action_12663552 ]

jboulon edited comment on HADOOP-5018 at 1/13/09 3:57 PM:
----------------------------------------------------------------

My point is that the PipelineWriter should be an implementation of the ChukwaWriter interface,
and that's really the only thing the collector should be aware of.
So, to be able to do what you want:

1) The collector should instantiate one writer implementation based on its configuration
2) The writer should be able to get the collector configuration from somewhere (current design),
or should have an init method that takes a Configuration parameter
3) The contract from the collector's point of view stays the same: call one method on the writer
class, and the result is success if no exception is thrown
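The contract in point 3 can be sketched as a minimal interface. This is an illustrative stand-in only: the real ChukwaWriter, Chunk, and WriterException types live in org.apache.hadoop.chukwa, and the CountingWriter below is invented just to show the collector-side call sequence:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for the real Chukwa types (illustration only).
class Chunk {
    final String data;
    Chunk(String data) { this.data = data; }
}

class WriterException extends Exception {
    WriterException(String msg) { super(msg); }
}

// The whole contract the collector sees: init once, then add() per batch.
// Success is simply "no exception was thrown".
interface ChukwaWriter {
    void init() throws WriterException;
    void add(List<Chunk> chunks) throws WriterException;
}

// A trivial writer to demonstrate the call sequence.
class CountingWriter implements ChukwaWriter {
    int written = 0;
    public void init() { }
    public void add(List<Chunk> chunks) { written += chunks.size(); }
}
```

The collector never needs to know which concrete class it is driving; it only calls init() and add() through the interface.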

The delta with your implementation is:

- Remove the code starting at:     if (conf.get("chukwaCollector.pipeline") != null) ...
- Replace it with something like:

String writerClassName = conf.get("chukwaCollector.writer",
    "org.apache.hadoop.chukwa.datacollection.writer.SeqFileWriter");
Class<?> writerClass = conf.getClassByName(writerClassName);
ChukwaWriter writer = (ChukwaWriter) writerClass.newInstance();
writer.init();
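The lookup above can be exercised with plain reflection. A minimal, self-contained sketch, where Class.forName stands in for Hadoop's conf.getClassByName, and the Writer/SeqFileWriter names are simplified stand-ins for the real Chukwa classes:

```java
// Illustrative sketch: plain reflection standing in for Hadoop's
// conf.getClassByName(). The Writer interface and SeqFileWriter class
// here are hypothetical stand-ins, not the real Chukwa types.
interface Writer {
    void init() throws Exception;
}

class SeqFileWriter implements Writer {
    boolean initialized = false;
    public void init() { initialized = true; }
}

class CollectorBootstrap {
    static Writer loadWriter(String className) throws Exception {
        // Look the class up by its configured name and instantiate it;
        // the collector only ever sees the Writer interface.
        Class<?> cls = Class.forName(className);
        Writer w = (Writer) cls.getDeclaredConstructor().newInstance();
        w.init();
        return w;
    }
}
```

Swapping writer implementations then becomes purely a configuration change; no collector code is touched.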

- Remove all writer initialization from CollectorStub.java
- Move all the pipeline-creation code out of ServletCollector.java and into the init method
of the PipelineWriter class

That way the writer interface stays simple, the collector class stays very simple, and this
does not prevent anybody from providing a specific writer implementation.
So in the end you have:

public class PipelineWriter implements ChukwaWriter {

  public void init() throws WriterException {
    if (conf.get("chukwaCollector.pipeline") != null) {
      String pipeline = conf.get("chukwaCollector.pipeline");
      try {
        String[] classes = pipeline.split(",");
        ArrayList<PipelineStageWriter> stages = new ArrayList<PipelineStageWriter>();
        [...]
  }

  public void add(List<Chunk> chunks) throws WriterException {
    // call all PipelineStageWriter in sequence
  }
}
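The add() body elided above just forwards each batch through the configured stages in order. A self-contained sketch of that dispatch, where the Stage interface and the log list are invented here for illustration; the real stages would implement PipelineStageWriter:

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in chunk type (illustration only).
class Chunk { }

// Hypothetical single-method stage, standing in for PipelineStageWriter.
// The log parameter is only here so a test can observe the call order.
interface Stage {
    void add(List<Chunk> chunks, List<String> log);
}

// PipelineWriter-style dispatch: forward each batch through the
// configured stages in sequence; any stage may filter or transform
// the data before the next stage runs.
class Pipeline {
    private final List<Stage> stages = new ArrayList<>();
    private final List<String> log = new ArrayList<>();

    void addStage(Stage s) { stages.add(s); }

    void add(List<Chunk> chunks) {
        for (Stage s : stages) {
            s.add(chunks, log);
        }
    }

    List<String> log() { return log; }
}
```

Because each stage sees the batch in turn, pass-through features (filtering, compression, metrics) slot in as new stages without any collector changes.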



> Chukwa should support pipelined writers
> ---------------------------------------
>
>                 Key: HADOOP-5018
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5018
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/chukwa
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>         Attachments: pipeline.patch
>
>
> We ought to support chaining together writers; this will radically increase flexibility
> and make it practical to add new features without major surgery by putting them in pass-through
> or filter classes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

