hadoop-mapreduce-issues mailing list archives

From "Olga Natkovich (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1210) standalone \r is treated as new line by RecordLineReader
Date Wed, 11 Nov 2009 19:08:39 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776579#action_12776579 ]

Olga Natkovich commented on MAPREDUCE-1210:
-------------------------------------------

Thanks, Owen, for the clarification. It seems the most flexible solution would be to let the
caller tell RLR which character(s) to treat as the line terminator.
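
To make the suggestion concrete, here is a minimal sketch of a splitter whose terminators are
supplied by the caller. It is written in Python for brevity and is not the RLR implementation;
the function name split_records and its terminators parameter are hypothetical and exist only
for illustration.

```python
def split_records(data: bytes, terminators: list[bytes]) -> list[bytes]:
    """Split data into records, ending a record only at a caller-chosen terminator.

    Hypothetical helper for illustration; not part of Hadoop or Pig.
    """
    if not terminators or any(not t for t in terminators):
        raise ValueError("terminators must be non-empty byte strings")
    # Try longer terminators first so b"\r\n" is consumed as one unit.
    ordered = sorted(terminators, key=len, reverse=True)
    records = []
    start = 0
    pos = 0
    while pos < len(data):
        match = next((t for t in ordered if data.startswith(t, pos)), None)
        if match is None:
            pos += 1
            continue
        records.append(data[start:pos])
        pos += len(match)
        start = pos
    # Keep a final record that has no trailing terminator.
    if start < len(data):
        records.append(data[start:])
    return records
```

With the input b"a\rb\nc\r\nd", passing [b"\n", b"\r\n"] keeps the lone \r inside the first
record, while adding b"\r" to the list splits on it as well.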

> standalone \r is treated as new line by RecordLineReader
> --------------------------------------------------------
>
>                 Key: MAPREDUCE-1210
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1210
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Olga Natkovich
>
> In Pig 0.6.0 we are switching to RecordLineReader from our own implementation. We are
> seeing differences in record counts that were traced to the fact that a standalone \r is
> treated as a line end. I don't think there is any precedent for this, and we would like to get
> this resolved so that we can use RLR without breaking backward compatibility. (This problem was
> detected with real user data.)
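
The record-count difference described above can be reproduced with a small sketch. It is written
in Python rather than against the Hadoop or Pig code, and the two function names and the sample
bytes are made up for illustration. One function treats a lone \r as a line end, the other
treats it as ordinary data.

```python
import re


def count_records_cr_is_terminator(data: bytes) -> int:
    # bytes.splitlines() ends a line at \n, \r\n, and also at a lone \r.
    return len(data.splitlines())


def count_records_cr_is_data(data: bytes) -> int:
    # Only \n or \r\n end a record; a lone \r stays inside the record.
    parts = re.split(rb"\r?\n", data)
    if parts and parts[-1] == b"":
        parts.pop()  # drop the empty piece after a final terminator
    return len(parts)


# Illustrative input: the first record contains a standalone \r.
sample = b"id=1\tnote=line one\rstill line one\nid=2\tnote=ok\r\n"
print(count_records_cr_is_terminator(sample), count_records_cr_is_data(sample))
```

On this sample the first count is 3 and the second is 2, which is the kind of mismatch that
shows up as differing record counts between two readers.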

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

