hadoop-common-dev mailing list archives

From "Sameer Paranjpye (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (HADOOP-439) Streaming does not work for text data if the records don't fit in a short UTF8 [2^16/3 characters]
Date Tue, 17 Oct 2006 23:01:39 GMT
     [ http://issues.apache.org/jira/browse/HADOOP-439?page=all ]

Sameer Paranjpye resolved HADOOP-439.
-------------------------------------

    Fix Version/s: 0.6.0
       Resolution: Duplicate

Resolved as part of HADOOP-499

> Streaming does not work for text data if the records don't fit in a short UTF8 [2^16/3 characters]
> --------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-439
>                 URL: http://issues.apache.org/jira/browse/HADOOP-439
>             Project: Hadoop
>          Issue Type: Bug
>          Components: contrib/streaming
>    Affects Versions: 0.5.0
>            Reporter: Dick King
>         Assigned To: Hairong Kuang
>            Priority: Critical
>             Fix For: 0.6.0
>
>
> The streaming code internally reads the input data into a UTF8. This causes truncated
> data to be shipped to the mapper when the input exceeds about 21000 characters, with no notice
> to the user except possibly in individual tasks' machines' logs, which people would not normally
> read for apparently successful jobs.
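
The "2^16/3 characters" figure in the subject can be worked through as a small sketch. It assumes the string is stored behind an unsigned 16-bit byte-length prefix and that each character may need up to 3 bytes when encoded; that gives 65535 // 3 = 21845 characters, which matches the "about 21000" in the report. The functions below (`max_chars_worst_case`, `write_short_prefixed`, `read_short_prefixed`) are hypothetical illustrations written for this note, not Hadoop's actual UTF8 class, and the silent truncation is modeled only on the behavior the report describes.

```python
import struct

# Largest byte count an unsigned 16-bit length prefix can hold (assumption:
# the record is stored as a 2-byte length followed by the encoded bytes).
MAX_ENCODED_BYTES = 0xFFFF


def max_chars_worst_case(bytes_per_char: int = 3) -> int:
    """Characters guaranteed to fit if every character needs bytes_per_char bytes."""
    return MAX_ENCODED_BYTES // bytes_per_char


def write_short_prefixed(text: str) -> bytes:
    """Hypothetical writer: encode text as UTF-8 behind a 2-byte length.

    Characters that would push the payload past the prefix limit are
    dropped without any error, mimicking the silent truncation reported.
    """
    payload = bytearray()
    for ch in text:
        encoded = ch.encode("utf-8")
        if len(payload) + len(encoded) > MAX_ENCODED_BYTES:
            break  # no exception, no warning: the rest of the record is lost
        payload += encoded
    return struct.pack(">H", len(payload)) + bytes(payload)


def read_short_prefixed(data: bytes) -> str:
    """Read back a string produced by write_short_prefixed."""
    (length,) = struct.unpack(">H", data[:2])
    return data[2:2 + length].decode("utf-8")


if __name__ == "__main__":
    record = "\u20ac" * 30000  # the euro sign takes 3 bytes in UTF-8
    round_tripped = read_short_prefixed(write_short_prefixed(record))
    print(max_chars_worst_case(), len(record), len(round_tripped))
```

In this sketch a record of pure ASCII survives up to 65535 characters, so the point at which data starts to go missing depends on the text itself; 21845 is only the worst-case floor under the 3-bytes-per-character assumption.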

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
