incubator-crunch-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Matthias Friedrich" <m...@mafr.de>
Subject Re: Review Request: Latest take on CRUNCH-97, text parsing lib for Crunch
Date Sat, 24 Nov 2012 13:32:22 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8151/#review13700
-----------------------------------------------------------



crunch/src/main/java/org/apache/crunch/lib/PTables.java
<https://reviews.apache.org/r/8151/#comment29361>

    This method doesn't seem to be used anywhere. Remove?



crunch/src/main/java/org/apache/crunch/lib/text/AbstractCompositeExtractor.java
<https://reviews.apache.org/r/8151/#comment29362>

    Missing javadoc



crunch/src/main/java/org/apache/crunch/lib/text/AbstractCompositeExtractor.java
<https://reviews.apache.org/r/8151/#comment29363>

    This class generates lots of warnings in Eclipse; using Extractor<?> instead of
just Extractor will shut Eclipse up.



crunch/src/main/java/org/apache/crunch/lib/text/AbstractSimpleExtractor.java
<https://reviews.apache.org/r/8151/#comment29364>

    Needs javadoc describing the contract. I think GoF suggests doExtract() for template methods.



crunch/src/main/java/org/apache/crunch/lib/text/Extractors.java
<https://reviews.apache.org/r/8151/#comment29365>

    I think all inner classes here should be private, like in Aggregators. Otherwise we'll
add a lot of new classes to our javadocs. Also, this class needs some more stating-the-obvious
javadoc ;-)



crunch/src/main/java/org/apache/crunch/lib/text/Parse.java
<https://reviews.apache.org/r/8151/#comment29366>

    Final, with private constructor?



crunch/src/main/java/org/apache/crunch/lib/text/Parse.java
<https://reviews.apache.org/r/8151/#comment29367>

    Does this need to be public?



crunch/src/main/java/org/apache/crunch/lib/text/ScannerFactory.java
<https://reviews.apache.org/r/8151/#comment29369>

    I think a fluent Builder pattern would be more appropriate than a Factory. This would
solve the problem of bulky method names like forDelimiterAnSkip().



crunch/src/main/java/org/apache/crunch/lib/text/ScannerFactory.java
<https://reviews.apache.org/r/8151/#comment29368>

    As per classic Code Conventions from Sun, attributes should be at the top of the class.


Pretty cool, looking forward to the next round.

It was too late already when I noticed that you can annotate multiple lines, but otherwise
I think I could get used to RB ;-)

- Matthias Friedrich


On Nov. 21, 2012, 2:31 a.m., Josh Wills wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8151/
> -----------------------------------------------------------
> 
> (Updated Nov. 21, 2012, 2:31 a.m.)
> 
> 
> Review request for crunch.
> 
> 
> Description
> -------
> 
> Latest and greatest rev of the extraction library for text parsing. I ended up refactoring
the approach so that we could support nested parsing (e.g., using different Scanner instances
for different parts of a line) and collections of items on a single line.
> 
> 
> This addresses bug CRUNCH-97.
>     https://issues.apache.org/jira/browse/CRUNCH-97
> 
> 
> Diffs
> -----
> 
>   crunch/src/main/java/org/apache/crunch/lib/PTables.java e788656 
>   crunch/src/main/java/org/apache/crunch/lib/text/AbstractCompositeExtractor.java PRE-CREATION

>   crunch/src/main/java/org/apache/crunch/lib/text/AbstractSimpleExtractor.java PRE-CREATION

>   crunch/src/main/java/org/apache/crunch/lib/text/Extractor.java PRE-CREATION 
>   crunch/src/main/java/org/apache/crunch/lib/text/ExtractorStats.java PRE-CREATION 
>   crunch/src/main/java/org/apache/crunch/lib/text/Extractors.java PRE-CREATION 
>   crunch/src/main/java/org/apache/crunch/lib/text/Parse.java PRE-CREATION 
>   crunch/src/main/java/org/apache/crunch/lib/text/ScannerFactory.java PRE-CREATION 
>   crunch/src/test/java/org/apache/crunch/lib/text/ParseTest.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/8151/diff/
> 
> 
> Testing
> -------
> 
> Unit tests so far, still gathering feedback on the approach.
> 
> 
> Thanks,
> 
> Josh Wills
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message