incubator-crunch-dev mailing list archives

From "Josh Wills" <jwi...@cloudera.com>
Subject Review Request: Add helpers for parsing PCollection<String> instances
Date Mon, 03 Dec 2012 00:00:59 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8310/
-----------------------------------------------------------

Review request for crunch.


Description
-------

We should make it easier to parse delimited text files into specific data types (e.g.,
ints or floats) or into combinations of types, such as pairs of strings and ints or a
Tuple3 of booleans.
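
To make the goal concrete, here is a minimal plain-Java sketch of the kind of parsing
described above: splitting delimited lines into a (String, int) record and counting
lines that fail to parse. It does not use the classes from this patch. The names
DelimitedParseSketch, NameCount, Result, and parseAll are hypothetical, and counting
skipped lines is an assumption about what error tracking might look like, not a
description of ExtractorStats.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class DelimitedParseSketch {

    /** Hypothetical holder for a (String, int) record; not a Crunch type. */
    static final class NameCount {
        final String name;
        final int count;

        NameCount(String name, int count) {
            this.name = name;
            this.count = count;
        }
    }

    /** Parsed records plus the number of lines that could not be parsed. */
    static final class Result {
        final List<NameCount> records = new ArrayList<>();
        int errors = 0;
    }

    /** Splits each line on a literal delimiter and parses the second field as an int. */
    static Result parseAll(List<String> lines, String delimiter) {
        Result result = new Result();
        // Quote the delimiter so characters such as '|' are not treated as regex syntax.
        String quoted = Pattern.quote(delimiter);
        for (String line : lines) {
            // A limit of -1 keeps trailing empty fields instead of dropping them.
            String[] tokens = line.split(quoted, -1);
            if (tokens.length != 2) {
                result.errors++;
                continue;
            }
            try {
                int count = Integer.parseInt(tokens[1].trim());
                result.records.add(new NameCount(tokens[0].trim(), count));
            } catch (NumberFormatException e) {
                // Malformed number: count the line as an error and move on.
                result.errors++;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("apple,3", "pear,seven", "plum, 12");
        Result r = parseAll(lines, ",");
        for (NameCount nc : r.records) {
            System.out.println(nc.name + " -> " + nc.count);
        }
        System.out.println("skipped: " + r.errors);
    }
}
```

Skipping and counting bad lines, rather than throwing, is one possible design choice
for large text inputs where a few malformed records should not fail the whole job.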


This addresses bug CRUNCH-97.
    https://issues.apache.org/jira/browse/CRUNCH-97


Diffs
-----

  crunch/src/main/java/org/apache/crunch/lib/text/AbstractCompositeExtractor.java PRE-CREATION
  crunch/src/main/java/org/apache/crunch/lib/text/AbstractSimpleExtractor.java PRE-CREATION
  crunch/src/main/java/org/apache/crunch/lib/text/Extractor.java PRE-CREATION
  crunch/src/main/java/org/apache/crunch/lib/text/ExtractorStats.java PRE-CREATION
  crunch/src/main/java/org/apache/crunch/lib/text/Extractors.java PRE-CREATION
  crunch/src/main/java/org/apache/crunch/lib/text/Parse.java PRE-CREATION
  crunch/src/main/java/org/apache/crunch/lib/text/Tokenizer.java PRE-CREATION
  crunch/src/main/java/org/apache/crunch/lib/text/TokenizerFactory.java PRE-CREATION
  crunch/src/test/java/org/apache/crunch/lib/text/ParseTest.java PRE-CREATION

Diff: https://reviews.apache.org/r/8310/diff/


Testing
-------

Unit tests.


Thanks,

Josh Wills

