samza-dev mailing list archives

From "Yan Fang" <yanfang...@gmail.com>
Subject Re: Review Request 35445: SAMZA-693: Very basic HDFS Producer service for Samza
Date Thu, 25 Jun 2015 18:35:54 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35445/#review89390
-----------------------------------------------------------



samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsSystemProducer.scala (line 47)
<https://reviews.apache.org/r/35445/#comment141999>

    make this configurable as well?



samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsSystemProducer.scala (line 58)
<https://reviews.apache.org/r/35445/#comment142000>

    Agree to make it pluggable.



samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsSystemProducer.scala (line 62)
<https://reviews.apache.org/r/35445/#comment141978>

    I would recommend that all the log messages follow the same format as other Samza code by removing the "s".
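
For illustration only, and assuming the "s" above refers to Scala's string-interpolator prefix, the two styles can be compared in a minimal sketch. The object name, values, and message text here are hypothetical and are not taken from the patch:

```scala
object LogFormatSketch {
  // Hypothetical values, used only to build an example message.
  val systemName = "hdfs-system"
  val source = "task-0"

  // Interpolated style: the string literal carries the "s" prefix.
  val interpolated: String = s"Registering $source with $systemName"

  // Format style: a plain string literal with %s placeholders, no "s" prefix.
  val formatted: String = "Registering %s with %s".format(source, systemName)

  def main(args: Array[String]): Unit = {
    // Both styles build the same message text.
    assert(interpolated == formatted)
    println(formatted)
  }
}
```

Both forms produce the same string, so switching between them would only change how the log call is written, not what it logs.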



samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsSystemProducer.scala (line 76)
<https://reviews.apache.org/r/35445/#comment141972>

    We can have the debug information there.



samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsSystemProducer.scala (line 81)
<https://reviews.apache.org/r/35445/#comment141974>

    same as above



samza-hdfs/src/test/resources/samza-hdfs-test-job.properties (line 1)
<https://reviews.apache.org/r/35445/#comment141998>

    Is this used anywhere? We can put it in the testing class since it only has one property.


- Yan Fang


On June 14, 2015, 10:17 p.m., Eli Reisman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35445/
> -----------------------------------------------------------
> 
> (Updated June 14, 2015, 10:17 p.m.)
> 
> 
> Review request for samza.
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> SAMZA-693: Very basic HDFS Producer service for Samza
> 
> 
> Diffs
> -----
> 
>   build.gradle a5f54106a822dc91ff82270df27217a8765a0d80
>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsConfig.scala PRE-CREATION
>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsSystemAdmin.scala PRE-CREATION
>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsSystemFactory.scala PRE-CREATION
>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsSystemProducer.scala PRE-CREATION
>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsSystemProducerMetrics.scala PRE-CREATION
>   samza-hdfs/src/test/org/apache/samza/system/hdfs/TestHdfsSystemProducer.scala PRE-CREATION
>   samza-hdfs/src/test/resources/samza-hdfs-test-job.properties PRE-CREATION
>   settings.gradle bb07a3b84b14dcef94da1bb166eab6aa3d0026bb
> 
> Diff: https://reviews.apache.org/r/35445/diff/
> 
> 
> Testing
> -------
> 
> New unit test, but it's fairly rudimentary. Passes "./gradlew test" and "./gradlew check"
> 
> This only supplies an HDFS Producer, and this producer only writes SequenceFiles of ByteWritables so far. If the patch were accepted as-is, I'd suggest future tickets for a matching HDFS Consumer, and a pluggable set of output formats, configurable via HdfsConfig settings.
> 
> On the upside, this patch has been tested on a real cluster with real data, using several serdes, with good results.
> 
> 
> Thanks,
> 
> Eli Reisman
> 
>

