hadoop-mapreduce-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Allen Wittenauer (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (MAPREDUCE-6417) MapReduceClient's primitives.h is toxic and should be extirpated
Date Tue, 24 Nov 2015 20:03:11 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15025240#comment-15025240
] 

Allen Wittenauer edited comment on MAPREDUCE-6417 at 11/24/15 8:02 PM:
-----------------------------------------------------------------------

test-patch goes through a bunch of different ways to get it to apply before finally giving
up, including using git apply and patch.  Here's what trying to do a git apply manually says:

I did a fresh checkout and it definitely fails to apply:

{code}
dhcp-210:test aw$ git apply MAPREDUCE-6417.001.patch 
error: patch failed: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/lib/commons.h:42
error: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/lib/commons.h:
patch does not apply
error: patch failed: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/lib/primitives.h:1
error: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/lib/primitives.h:
patch does not apply
{code}

Manually using patch also fails:

{code}
dhcp-210:test aw$ patch -p1 < MAPREDUCE-6417.001.patch 
patching file hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/CMakeLists.txt
patching file hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/handler/BatchHandler.h
patching file hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/lib/Buffers.cc
patching file hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/lib/Buffers.h
Hunk #3 succeeded at 291 (offset 1 line).
Hunk #4 succeeded at 479 (offset 1 line).
patching file hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/lib/Iterator.cc
patching file hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/lib/NativeObjectFactory.cc
patching file hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/lib/commons.h
Hunk #1 FAILED at 42.
Hunk #2 succeeded at 48 (offset -1 lines).
1 out of 2 hunks FAILED -- saving rejects to file hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/lib/commons.h.rej
patching file hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/lib/primitives.h
Reversed (or previously applied) patch detected!  Assume -R? [n] 
Apply anyway? [n] 
Skipping patch.
1 out of 1 hunk ignored -- saving rejects to file hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/lib/primitives.h.rej
patching file hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/test/TestPrimitives.cc
patching file hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/test/TestSort.cc
{code}


was (Author: aw):
test-patch goes through a bunch of different ways to get it to apply before finally giving
up, including using git apply and patch.  Here's what trying to do a git apply manually says:

I did a fresh checkout and it definitely fails to apply:

{code}
dhcp-210:test aw$ git apply MAPREDUCE-6417.001.patch 
error: patch failed: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/lib/commons.h:42
error: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/lib/commons.h:
patch does not apply
error: patch failed: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/lib/primitives.h:1
error: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/lib/primitives.h:
patch does not apply
{code}

{code}
dhcp-210:test aw$ patch -p1 < MAPREDUCE-6417.001.patch 
patching file hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/CMakeLists.txt
patching file hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/handler/BatchHandler.h
patching file hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/lib/Buffers.cc
patching file hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/lib/Buffers.h
Hunk #3 succeeded at 291 (offset 1 line).
Hunk #4 succeeded at 479 (offset 1 line).
patching file hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/lib/Iterator.cc
patching file hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/lib/NativeObjectFactory.cc
patching file hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/lib/commons.h
Hunk #1 FAILED at 42.
Hunk #2 succeeded at 48 (offset -1 lines).
1 out of 2 hunks FAILED -- saving rejects to file hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/lib/commons.h.rej
patching file hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/lib/primitives.h
Reversed (or previously applied) patch detected!  Assume -R? [n] 
Apply anyway? [n] 
Skipping patch.
1 out of 1 hunk ignored -- saving rejects to file hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/lib/primitives.h.rej
patching file hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/test/TestPrimitives.cc
patching file hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/test/TestSort.cc
{code}

> MapReduceClient's primitives.h is toxic and should be extirpated
> ----------------------------------------------------------------
>
>                 Key: MAPREDUCE-6417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6417
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: client
>    Affects Versions: 2.7.0
>            Reporter: Alan Burlison
>            Assignee: Alan Burlison
>            Priority: Blocker
>         Attachments: MAPREDUCE-6417.001.patch
>
>
> MapReduceClient's primitives.h attempts to provide optimised versions of standard library
memory copy and comparison functions. It has been the subject of several portability-related
bugs:
> * HADOOP-11505 hadoop-mapreduce-client-nativetask uses bswap where be32toh is needed,
doesn't work on non-x86
> * HADOOP-11665 Provide and unify cross platform byteorder support in native code
> * MAPREDUCE-6397 MAPREDUCE makes many endian-dependent assumptions
> * HADOOP-11484 hadoop-mapreduce-client-nativetask fails to build on ARM AARCH64 due to
x86 asm statements
> At present it only works on x86 and ARM64 as it lacks definitions for bswap and bswap64
for any platforms other than those.
> However it has even more serious problems on non-x86 architectures, for example on SPARC
simple_memcpy simply doesn't work at all:
> {code}
> $ cat bang.cc
> #include <string.h>
> #define SIMPLE_MEMCPY
> #include "primitives.h"
> int main(int argc, char **argv)
> {
>     char b1[9];
>     char b2[9];
>     simple_memcpy(b2, b1, sizeof(b1));
> }
> $ gcc -o bang bang.cc && ./bang
> Bus Error (core dumped)
> {code}
> That's because simple_memcpy does pointer fiddling that results in misaligned accesses,
which are illegal on SPARC.
> fmemcmp is also broken. Even if a definition of bswap is provided, on big-endian architectures
the result is simply wrong because of its unconditional use of bswap:
> {code}
> $ cat thud.cc
> #include <stdio.h>
> #include <strings.h>
> #include "primitives.h"
> int main(int argc, char **argv)
> {
>     char a[] = { 0,1,2,0 };
>     char b[] = { 0,2,1,0 };
>     printf("%lld %d\n", fmemcmp(a, b, sizeof(a), memcmp(a, b, sizeof(a))));
> }
> $ g++ -o thud thud.cc && ./thud
> 65280 -1
> {code}
> And in addition fmemcmp suffers from the same misalignment issues as simple_memcpy and
coredumps on SPARC when asked to compare odd-sized buffers.
> primitives.h provides the following functions:
> * bswap - used in 12 files in MRC but as HADOOP-11505 points out, mostly incorrectly
as it takes no account of platform endianness
> * bswap64 - used in 4 files in MRC, same comments as per bswap apply
> * simple_memcpy - used in 3 files in MRC, should be replaced with the standard memcpy
> * fmemcmp - used in 1 file, should be replaced with the standard memcmp
> * fmemeq - used in 1 file, should be replaced with the standard memcmp
> * frmemeq - not used at all, should just be removed
> *Summary*: primitives.h should simply be deleted and replaced with the standard memory
copy & compare functions, or with thin wrappers around them where the APIs are different.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message