hadoop-mapreduce-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-6558) multibyte delimiters with compressed input files generate duplicate records
Date Fri, 13 May 2016 14:32:13 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15282735#comment-15282735 ]

Jason Lowe commented on MAPREDUCE-6558:
---------------------------------------

Apparently Jenkins is having trouble posting to JIRA.  The precommit build was an overall
+1 for the .3 patch.  From https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6495/console:
{noformat}

+1 overall

| Vote |      Subsystem |  Runtime   | Comment
============================================================================
|   0  |        reexec  |  0m 13s    | Docker mode activated. 
|  +1  |       @author  |  0m 0s     | The patch does not contain any @author 
|      |                |            | tags.
|  +1  |    test4tests  |  0m 0s     | The patch appears to include 3 new or 
|      |                |            | modified test files.
|  +1  |    mvninstall  |  6m 39s    | trunk passed 
|  +1  |       compile  |  0m 20s    | trunk passed with JDK v1.8.0_91 
|  +1  |       compile  |  0m 24s    | trunk passed with JDK v1.7.0_101 
|  +1  |    checkstyle  |  0m 17s    | trunk passed 
|  +1  |       mvnsite  |  0m 30s    | trunk passed 
|  +1  |    mvneclipse  |  0m 13s    | trunk passed 
|  +1  |      findbugs  |  1m 1s     | trunk passed 
|  +1  |       javadoc  |  0m 20s    | trunk passed with JDK v1.8.0_91 
|  +1  |       javadoc  |  0m 26s    | trunk passed with JDK v1.7.0_101 
|  +1  |    mvninstall  |  0m 24s    | the patch passed 
|  +1  |       compile  |  0m 18s    | the patch passed with JDK v1.8.0_91 
|  +1  |         javac  |  0m 18s    | the patch passed 
|  +1  |       compile  |  0m 22s    | the patch passed with JDK v1.7.0_101 
|  +1  |         javac  |  0m 22s    | the patch passed 
|  +1  |    checkstyle  |  0m 15s    | the patch passed 
|  +1  |       mvnsite  |  0m 27s    | the patch passed 
|  +1  |    mvneclipse  |  0m 11s    | the patch passed 
|  +1  |    whitespace  |  0m 0s     | Patch has no whitespace issues. 
|  +1  |      findbugs  |  1m 12s    | the patch passed 
|  +1  |       javadoc  |  0m 18s    | the patch passed with JDK v1.8.0_91 
|  +1  |       javadoc  |  0m 23s    | the patch passed with JDK v1.7.0_101 
|  +1  |          unit  |  1m 53s    | hadoop-mapreduce-client-core in the 
|      |                |            | patch passed with JDK v1.8.0_91.
|  +1  |          unit  |  2m 15s    | hadoop-mapreduce-client-core in the 
|      |                |            | patch passed with JDK v1.7.0_101.
|  +1  |    asflicense  |  0m 18s    | Patch does not generate ASF License 
|      |                |            | warnings.
|      |                |  19m 38s   | 


|| Subsystem || Report/Notes ||
============================================================================
| Docker |  Image:yetus/hadoop:cf2ee45 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12803799/MAPREDUCE-6558.3.patch |
| JIRA Issue | MAPREDUCE-6558 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  |
| uname | Linux 2b492d4fc64c 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 3c5c57a |
| Default Java | 1.7.0_101 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_91 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_101 |
| findbugs | v3.0.0 |
| JDK v1.7.0_101  Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6495/testReport/ |
| modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core |
| Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6495/console |
| Powered by | Apache Yetus 0.2.0   http://yetus.apache.org |
{noformat}

+1, patch looks good to me as well.  Committing this.

> multibyte delimiters with compressed input files generate duplicate records
> ---------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6558
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6558
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv1, mrv2
>    Affects Versions: 2.7.2
>            Reporter: Wilfred Spiegelenburg
>            Assignee: Wilfred Spiegelenburg
>         Attachments: MAPREDUCE-6558.1.patch, MAPREDUCE-6558.2.patch, MAPREDUCE-6558.3.patch
>
>
> This is the follow-up to MAPREDUCE-6549. Compressed files cause record duplication,
> as shown in different junit tests. The number of duplicated records changes with the splitsize:
> Unexpected number of records in split (splitsize = 10)
> Expected: 41051
> Actual: 45062
> Unexpected number of records in split (splitsize = 100000)
> Expected: 41051
> Actual: 41052
> Test passes with splitsize = 147445, which is the compressed file length. The file is a
> bzip2 file with 100k blocks and a total of 11 blocks.
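
The figures above amount to 45062 - 41051 = 4011 extra records at splitsize = 10 and 41052 - 41051 = 1 extra record at splitsize = 100000. The invariant those junit tests check can be illustrated with a minimal Python sketch. It is not Hadoop's reader and does not reproduce the bug: it works on uncompressed in-memory bytes and assumes, for illustration only, that a record belongs to the split containing its first byte. All function names are hypothetical.

```python
def record_starts(data, delim):
    """Return the offset at which each record begins.

    Delimiters are found with a left-to-right, non-overlapping scan,
    so a multibyte delimiter such as b"+++" is matched as a whole.
    """
    starts = [0] if data else []
    pos = data.find(delim)
    while pos != -1:
        nxt = pos + len(delim)
        if nxt < len(data):  # a trailing delimiter starts no new record
            starts.append(nxt)
        pos = data.find(delim, nxt)
    return starts


def records_per_split(data, delim, split_size):
    """Count records per split; a record is owned by the split that
    contains its first byte (an assumption of this sketch)."""
    starts = record_starts(data, delim)
    counts = []
    for begin in range(0, len(data), split_size):
        end = begin + split_size
        counts.append(sum(1 for s in starts if begin <= s < end))
    return counts


def surplus_by_split_size(data, delim, split_sizes):
    """Map each split size to (records summed over splits) - (expected).

    A correct reader yields 0 for every split size; a positive value
    means duplicated records, a negative one means dropped records.
    """
    expected = len(record_starts(data, delim))
    return {
        size: sum(records_per_split(data, delim, size)) - expected
        for size in split_sizes
    }
```

For example, `surplus_by_split_size(b"a+++b+++c+++", b"+++", [1, 2, 5, 12])` reports a surplus of 0 for every split size, which is the behaviour the tests expect from the real reader regardless of where split boundaries fall.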



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


