incubator-crunch-dev mailing list archives

From "Josh Wills" <jwi...@cloudera.com>
Subject Review Request: Latest take on CRUNCH-97, text parsing lib for Crunch
Date Wed, 21 Nov 2012 02:31:28 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8151/
-----------------------------------------------------------

Review request for crunch.


Description
-------

Latest and greatest rev of the extraction library for text parsing. I ended up refactoring
the approach so that we could support nested parsing (e.g., using different Scanner instances
for different parts of a line) and collections of items on a single line.
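
To make the nested-parsing idea concrete, here is a minimal sketch that uses only java.util.Scanner. It does not use the Extractor classes from this patch. The class and method names (NestedScanSketch, parseLine, parseInts) and the "key:1,2,3" input format are hypothetical, chosen for illustration only. An outer Scanner splits the line on whitespace, and a second Scanner with a different delimiter reads the collection of items inside each field.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Scanner;

// Hypothetical illustration; not part of the Crunch patch under review.
public class NestedScanSketch {

  // Inner parse: a separate Scanner with its own delimiter reads a
  // comma-separated collection of ints from one part of the line.
  public static List<Integer> parseInts(String field) {
    List<Integer> values = new ArrayList<>();
    try (Scanner inner = new Scanner(field).useDelimiter(",")) {
      while (inner.hasNextInt()) {
        values.add(inner.nextInt());
      }
    }
    return values;
  }

  // Outer parse: a whitespace-delimited Scanner walks the line and hands
  // the part after ':' in each token to the inner Scanner.
  public static Map<String, List<Integer>> parseLine(String line) {
    Map<String, List<Integer>> result = new LinkedHashMap<>();
    try (Scanner outer = new Scanner(line)) {
      while (outer.hasNext()) {
        String token = outer.next();
        int sep = token.indexOf(':');
        if (sep < 0) {
          continue; // skip tokens that do not match the assumed key:values shape
        }
        result.put(token.substring(0, sep), parseInts(token.substring(sep + 1)));
      }
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(parseLine("alice:1,2,3 bob:4,5"));
  }
}
```

Keeping the inner Scanner separate from the outer one means each part of the line can have its own delimiter without affecting how the rest of the line is tokenized.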


This addresses bug CRUNCH-97.
    https://issues.apache.org/jira/browse/CRUNCH-97


Diffs
-----

  crunch/src/main/java/org/apache/crunch/lib/PTables.java e788656
  crunch/src/main/java/org/apache/crunch/lib/text/AbstractCompositeExtractor.java PRE-CREATION
  crunch/src/main/java/org/apache/crunch/lib/text/AbstractSimpleExtractor.java PRE-CREATION
  crunch/src/main/java/org/apache/crunch/lib/text/Extractor.java PRE-CREATION
  crunch/src/main/java/org/apache/crunch/lib/text/ExtractorStats.java PRE-CREATION
  crunch/src/main/java/org/apache/crunch/lib/text/Extractors.java PRE-CREATION
  crunch/src/main/java/org/apache/crunch/lib/text/Parse.java PRE-CREATION
  crunch/src/main/java/org/apache/crunch/lib/text/ScannerFactory.java PRE-CREATION
  crunch/src/test/java/org/apache/crunch/lib/text/ParseTest.java PRE-CREATION

Diff: https://reviews.apache.org/r/8151/diff/


Testing
-------

Unit tests so far; still gathering feedback on the approach.


Thanks,

Josh Wills

