hadoop-common-dev mailing list archives

From "Dick King (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-439) Streaming does not work for text data if the records don't fit in a short UTF8 [2^16/3 characters]
Date Thu, 10 Aug 2006 01:50:13 GMT
Streaming does not work for text data if the records don't fit in a short UTF8 [2^16/3 characters]
--------------------------------------------------------------------------------------------------

                 Key: HADOOP-439
                 URL: http://issues.apache.org/jira/browse/HADOOP-439
             Project: Hadoop
          Issue Type: Bug
            Reporter: Dick King
            Priority: Critical


The streaming code internally reads each input record into a UTF8. As a result, truncated data
is shipped to the mapper whenever a record exceeds about 21,000 characters (2^16/3). The user
gets no notice, except possibly in the logs on the individual task machines, which people would
not normally read for apparently successful jobs.
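
The arithmetic behind the limit can be sketched as follows. This is a rough illustration only, not Hadoop code: it assumes the "short UTF8" in the subject line means a 16-bit length field with a worst case of 3 encoded bytes per character, and the names truncate_like_short_utf8 and find_oversized_records are hypothetical helpers for checking input before submitting a job.

```python
# Hypothetical sketch, not Hadoop code. It models the limit named in the
# subject line, where a record is cut to 2^16/3 characters.
MAX_ENCODED_BYTES = 2**16 - 1        # largest value of a 16-bit length field (assumed)
MAX_CHARS = MAX_ENCODED_BYTES // 3   # worst case of 3 bytes per character


def truncate_like_short_utf8(record: str) -> str:
    """Return the record cut to MAX_CHARS characters, as the report describes."""
    return record[:MAX_CHARS]


def find_oversized_records(lines):
    """Yield (line_number, length) for every record that would be truncated."""
    for number, line in enumerate(lines, start=1):
        record = line.rstrip("\n")
        if len(record) > MAX_CHARS:
            yield number, len(record)


if __name__ == "__main__":
    sample = ["short record\n", "x" * 30000 + "\n"]
    for number, length in find_oversized_records(sample):
        print(f"record {number}: {length} characters exceeds {MAX_CHARS}")
```

Running a check like this over the input would flag affected records up front, since the job itself reports success.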


        
