hadoop-mapreduce-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1176) Contribution: FixedLengthInputFormat and FixedLengthRecordReader
Date Sat, 14 Nov 2009 07:06:40 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777832#action_12777832 ]

Hadoop QA commented on MAPREDUCE-1176:
--------------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12424931/MAPREDUCE-1176-v2.patch
  against trunk revision 836063.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/244/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/244/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/244/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/244/console

This message is automatically generated.

> Contribution: FixedLengthInputFormat and FixedLengthRecordReader
> ----------------------------------------------------------------
>
>                 Key: MAPREDUCE-1176
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1176
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>    Affects Versions: 0.20.1, 0.20.2
>         Environment: Any
>            Reporter: BitsOfInfo
>            Priority: Minor
>         Attachments: MAPREDUCE-1176-v1.patch, MAPREDUCE-1176-v2.patch
>
>
> Hello,
> I would like to contribute the following two classes for incorporation into the mapreduce.lib.input package. These classes can be used to read data from files containing fixed-length (fixed-width) records. Such files have no CR/LF (or any combination thereof) and no delimiters; each record has a fixed length, and unused space is padded with spaces. The data is one gigantic line within a file.
> The two classes provided are FixedLengthInputFormat and its corresponding FixedLengthRecordReader. When creating a job that specifies this input format, the job must have the "mapreduce.input.fixedlengthinputformat.record.length" property set as follows:
> myJobConf.setInt("mapreduce.input.fixedlengthinputformat.record.length", [myFixedRecordLength]);
> OR
> myJobConf.setInt(FixedLengthInputFormat.FIXED_RECORD_LENGTH, [myFixedRecordLength]);
> This input format overrides computeSplitSize() to ensure that InputSplits do not contain any partial records, since with fixed-length records there is no way to determine where a record begins if that were to occur. Each InputSplit passed to the FixedLengthRecordReader will start at the beginning of a record, and the last byte in the InputSplit will be the last byte of a record. The override of computeSplitSize() delegates to FileInputFormat's compute method and then adjusts the returned split size as follows: (Math.floor(fileInputFormatsComputedSplitSize / fixedRecordLength) * fixedRecordLength)
> This suite of fixed-length input format classes does not support compressed files.
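
The split-size adjustment described in the issue can be sketched in plain Java, without Hadoop on the classpath. This is only an illustration of the arithmetic, not the code from the attached patch: the class name FixedLengthSplitSketch and the method alignSplitSize are hypothetical, and the rejection of non-positive record lengths is an assumption added here.

```java
// Hypothetical sketch of the rounding described in the issue; not taken from the patch.
class FixedLengthSplitSketch {

    /**
     * Rounds a computed split size down to a whole number of records,
     * i.e. floor(computedSplitSize / recordLength) * recordLength.
     */
    static long alignSplitSize(long computedSplitSize, int recordLength) {
        if (recordLength <= 0) {
            // Assumption: a non-positive record length is treated as a configuration error.
            throw new IllegalArgumentException("record length must be positive");
        }
        // Integer division on non-negative longs already floors the quotient.
        return (computedSplitSize / recordLength) * recordLength;
    }

    public static void main(String[] args) {
        long computed = 64L * 1024 * 1024; // e.g. a 64 MB split size
        int recordLength = 300;            // example record length in bytes
        System.out.println(alignSplitSize(computed, recordLength)); // prints 67108800
    }
}
```

With a 300-byte record, a 64 MB split (67108864 bytes) is trimmed by 64 bytes so that every split begins and ends on a record boundary. Note that a computed split size smaller than one record rounds down to 0 in this sketch; how the patch handles that case is not stated in the description.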

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

