crunch-dev mailing list archives

From "Josh Wills" <jwi...@cloudera.com>
Subject Re: Review Request: Latest take on CRUNCH-97, text parsing lib for Crunch
Date Sun, 25 Nov 2012 19:21:19 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8151/
-----------------------------------------------------------

(Updated Nov. 25, 2012, 7:21 p.m.)


Review request for crunch.


Changes
-------

Incorporated feedback from Matthias and Gabriel; added a bunch of javadoc.


Description
-------

Latest and greatest rev of the extraction library for text parsing. I ended up refactoring
the approach so that we could support nested parsing (e.g., using different Scanner instances
for different parts of a line) and collections of items on a single line.
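
To make the "nested parsing" idea concrete, here is a minimal sketch using only java.util.Scanner. It is not the API from this patch: the class NestedScanSketch, the Row type, and the tab/comma line layout are all hypothetical, chosen only to illustrate using one Scanner for the whole line and a second, differently delimited Scanner for a collection of items inside one field.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical illustration only; none of these names come from the patch under review.
public class NestedScanSketch {

  /** Hypothetical record type: a name plus a collection of integer scores. */
  static final class Row {
    final String name;
    final List<Integer> scores;

    Row(String name, List<Integer> scores) {
      this.name = name;
      this.scores = scores;
    }
  }

  /** Parses an assumed layout of "name<TAB>n1,n2,n3" into a Row. */
  static Row parse(String line) {
    // The outer Scanner splits the line into tab-separated fields.
    try (Scanner outer = new Scanner(line).useDelimiter("\t")) {
      String name = outer.next();
      String rest = outer.hasNext() ? outer.next() : "";

      // The inner Scanner uses a different delimiter for the second field,
      // which holds a collection of items on the same line.
      List<Integer> scores = new ArrayList<>();
      try (Scanner inner = new Scanner(rest).useDelimiter(",")) {
        while (inner.hasNextInt()) {
          scores.add(inner.nextInt());
        }
      }
      return new Row(name, scores);
    }
  }

  public static void main(String[] args) {
    Row row = parse("alice\t1,2,3");
    System.out.println(row.name + " " + row.scores); // prints "alice [1, 2, 3]"
  }
}
```

The sketch stops reading scores at the first non-integer token and does not track parse errors; the actual patch may handle both differently.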


This addresses bug CRUNCH-97.
    https://issues.apache.org/jira/browse/CRUNCH-97


Diffs (updated)
-----

  crunch/src/main/java/org/apache/crunch/lib/PTables.java e788656
  crunch/src/main/java/org/apache/crunch/lib/text/AbstractCompositeExtractor.java PRE-CREATION
  crunch/src/main/java/org/apache/crunch/lib/text/AbstractSimpleExtractor.java PRE-CREATION
  crunch/src/main/java/org/apache/crunch/lib/text/Extractor.java PRE-CREATION
  crunch/src/main/java/org/apache/crunch/lib/text/ExtractorStats.java PRE-CREATION
  crunch/src/main/java/org/apache/crunch/lib/text/Extractors.java PRE-CREATION
  crunch/src/main/java/org/apache/crunch/lib/text/Parse.java PRE-CREATION
  crunch/src/main/java/org/apache/crunch/lib/text/ScannerFactory.java PRE-CREATION
  crunch/src/test/java/org/apache/crunch/lib/text/ParseTest.java PRE-CREATION

Diff: https://reviews.apache.org/r/8151/diff/


Testing
-------

Unit tests only so far; still gathering feedback on the approach.


Thanks,

Josh Wills

