hadoop-common-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-8655) In TextInputFormat, while specifying textinputformat.record.delimiter the character/character sequences in data file similar to starting character/starting character sequence in delimiter were found missing in certain cases in the Map Output
Date Thu, 23 Aug 2012 09:11:42 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-8655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440166#comment-13440166 ]

Hadoop QA commented on HADOOP-8655:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12542076/HADOOP-8655%20%282%29.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 1 new or modified test files.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests in hadoop-common-project/hadoop-common:

                  org.apache.hadoop.ha.TestZKFailoverController

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/1348//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/1348//console

This message is automatically generated.
                
> In TextInputFormat, while specifying textinputformat.record.delimiter the character/character sequences in data file similar to starting character/starting character sequence in delimiter were found missing in certain cases in the Map Output
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-8655
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8655
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.20.2
>         Environment: Linux- Ubuntu 10.04
>            Reporter: Arun A K
>              Labels: hadoop, mapreduce, textinputformat, textinputformat.record.delimiter
>         Attachments: HADOOP-8655 (2).patch, HADOOP-8655.patch, HADOOP-8655.patch, HADOOP-8655.patch, MAPREDUCE-4519.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Set textinputformat.record.delimiter to "</entity>".
> Suppose the input is a text file with the following content:
> <entity><id>1</id><name>User1</name></entity><entity><id>2</id><name>User2</name></entity><entity><id>3</id><name>User3</name></entity><entity><id>4</id><name>User4</name></entity><entity><id>5</id><name>User5</name></entity>
> The Mapper was expected to receive the following values:
> Value 1 - <entity><id>1</id><name>User1</name>
> Value 2 - <entity><id>2</id><name>User2</name>
> Value 3 - <entity><id>3</id><name>User3</name>
> Value 4 - <entity><id>4</id><name>User4</name>
> Value 5 - <entity><id>5</id><name>User5</name>
> Because of this bug, the Mapper instead receives values such as:
> Value 1 - entity><id>1</id><name>User1</name>
> Value 2 - <entity>id>2</id><name>User2</name>
> Value 3 - <entity><id>3id><name>User3</name>
> Value 4 - <entity><id>4</id><name>User4name>
> Value 5 - <entity><id>5</id><name>User5</name>
> The pattern shown above need not occur in values 1, 2 and 3 specifically. The bug occurs at seemingly random positions in the map input.
>  
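
The expected behaviour described above can be illustrated without Hadoop. What follows is a minimal sketch in plain Java, not the code from the attached patches. The names DelimiterSplitSketch, splitRecords and endsWith are hypothetical, and the small read buffer is an assumption, chosen so that delimiter prefixes such as "<" or "</" regularly fall at the end of a buffer. The sketch keeps every character in the current record until the full delimiter has been seen, so a partial match that later fails cannot drop data.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; it is not part of Hadoop.
class DelimiterSplitSketch {

    // Splits the stream into records separated by a multi-character delimiter.
    // bufferSize is an assumed tuning knob used to vary where buffer ends fall.
    static List<String> splitRecords(Reader in, String delimiter, int bufferSize)
            throws IOException {
        if (delimiter.isEmpty() || bufferSize <= 0) {
            throw new IllegalArgumentException("delimiter and bufferSize must be non-empty/positive");
        }
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        char[] buffer = new char[bufferSize];
        int n;
        while ((n = in.read(buffer)) != -1) {
            for (int i = 0; i < n; i++) {
                // Always keep the character; it is removed again only if it
                // turns out to complete a full delimiter.
                current.append(buffer[i]);
                if (endsWith(current, delimiter)) {
                    current.setLength(current.length() - delimiter.length());
                    records.add(current.toString());
                    current.setLength(0);
                }
            }
        }
        if (current.length() > 0) {
            records.add(current.toString()); // trailing record without a delimiter
        }
        return records;
    }

    // True if sb ends with suffix; the check does not depend on buffer position.
    static boolean endsWith(StringBuilder sb, String suffix) {
        int start = sb.length() - suffix.length();
        if (start < 0) {
            return false;
        }
        for (int i = 0; i < suffix.length(); i++) {
            if (sb.charAt(start + i) != suffix.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Same shape of input as in the issue description, shortened to two entities.
        String input = "<entity><id>1</id><name>User1</name></entity>"
                + "<entity><id>2</id><name>User2</name></entity>";
        for (String record : splitRecords(new StringReader(input), "</entity>", 7)) {
            System.out.println(record);
        }
    }
}
```

Because the delimiter check looks only at the accumulated record, the result should be the same for any buffer size, which is the property the reported output violates.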

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
