hadoop-common-dev mailing list archives

From "Sharad Agarwal (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-3828) Write skipped records' bytes to DFS
Date Tue, 19 Aug 2008 14:05:44 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-3828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sharad Agarwal updated HADOOP-3828:
-----------------------------------

    Attachment: 3828_v1.patch

The patch works as follows:
Each skipped record (key, value) is written to a SequenceFile.
By default, skipped records are written to the "_skip" folder in the output dir. This
is configurable using SkipBadRecords.setSkipOutputPath.
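
To illustrate the default-versus-configured behaviour described above, here is a minimal sketch in plain Java. It is not the code in the patch: the class SkipPathSketch and the method resolveSkipOutputPath are hypothetical names made up for this illustration, and paths are modelled as plain strings rather than Hadoop Path objects. The signature of SkipBadRecords.setSkipOutputPath is not shown in this message, so the sketch does not call it.

```java
// Hypothetical illustration only; this is not the Hadoop implementation.
public final class SkipPathSketch {

    // Default folder name for skipped records, as stated in the message.
    public static final String DEFAULT_SKIP_DIR = "_skip";

    private SkipPathSketch() {
    }

    /**
     * Returns the explicitly configured skip path if one is set;
     * otherwise returns the "_skip" folder inside the job output dir.
     */
    public static String resolveSkipOutputPath(String outputDir, String configuredSkipPath) {
        if (configuredSkipPath != null && !configuredSkipPath.isEmpty()) {
            return configuredSkipPath;
        }
        // Drop a single trailing slash so the result has no doubled separator.
        String base = outputDir.endsWith("/")
                ? outputDir.substring(0, outputDir.length() - 1)
                : outputDir;
        return base + "/" + DEFAULT_SKIP_DIR;
    }

    public static void main(String[] args) {
        // No override configured: falls back to the "_skip" folder.
        System.out.println(resolveSkipOutputPath("/user/job/out", null));
        // Override configured: the explicit path is used as-is.
        System.out.println(resolveSkipOutputPath("/user/job/out", "/user/job/bad-records"));
    }
}
```

In a real job the override would presumably be set once on the job configuration before submission; the sketch only shows how the two cases resolve.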

- The patch also fixes a corner case by initializing the variable "skipping" in TaskInProgress.
- It also changes SortedRanges: the class is now cloneable, and serialization of a
member variable is fixed.
- Cleanup in MapTask: normal mode (skipping=false) now uses a different implementation of RecordReader.

> Write skipped records' bytes to DFS
> -----------------------------------
>
>                 Key: HADOOP-3828
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3828
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Sharad Agarwal
>            Assignee: Sharad Agarwal
>         Attachments: 3828_v1.patch
>
>
> This is an incremental step over HADOOP-153, which provides the base skipping functionality.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

