hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "James Clampffer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11807) libhdfs++: Get minidfscluster tests running under valgrind
Date Wed, 08 Nov 2017 15:24:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16244137#comment-16244137

James Clampffer commented on HDFS-11807:

Looks a lot better.  I noticed a few smaller things in the recent patch:

-check the return code on the socketpair() call

-You can't assume a file descriptor is going to be in the range \[0,255\]; you'll often get
this for a process that has just spawned but that's just some luck.  If the upper bit is set
it's also going to be explicitly sign extended when it's promoted to a wider type which can
make things a little confusing to debug.  I'd also convert the binary value to a string or
hex encoding because if the least significant byte of an int is 0 that's going to treated
as a null terminator.
      // The argument contains child socket
      fd[childsocket] = (int)argv[1][0];

-Since we're sharing this test with libhdfs which can build on windows we can't unconditionally
include headers that windows won't have.  Since this test also leans heavily on *nix style
process management e.g. fork() it might be best to just avoid building this test on windows.
  #include <unistd.h>

Nitpicky stuff, not blockers but would clean things up a little:

-Don't need an extra cast when calling free()
Same idea when writing httpHost over the socket
      ASSERT_INT64_EQ(read(fd[parentsocket], (char*)httpHost, hostSize), hostSize);

-I'd change "fd[]" so "fds" or something plural to make it's clear that it's an array since
a lot of C examples will name a single descriptor "fd".  You can also just make a single int
variable at the top of main() that gets assigned the appropriate side of the pair once you've
forked just to avoid indexing (you'll still need the array to pass to socketpair).

> libhdfs++: Get minidfscluster tests running under valgrind
> ----------------------------------------------------------
>                 Key: HDFS-11807
>                 URL: https://issues.apache.org/jira/browse/HDFS-11807
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>            Reporter: James Clampffer
>            Assignee: Anatoli Shein
>         Attachments: HDFS-11807.HDFS-8707.000.patch, HDFS-11807.HDFS-8707.001.patch,
HDFS-11807.HDFS-8707.002.patch, HDFS-11807.HDFS-8707.003.patch, HDFS-11807.HDFS-8707.004.patch,
HDFS-11807.HDFS-8707.005.patch, HDFS-11807.HDFS-8707.006.patch, HDFS-11807.HDFS-8707.007.patch
> The gmock based unit tests generally don't expose race conditions and memory stomps.
 A good way to expose these is running libhdfs++ stress tests and tools under valgrind and
pointing them at a real cluster.  Right now the CI tools don't do that so bugs occasionally
slip in and aren't caught until they cause trouble in applications that use libhdfs++ for
HDFS access.
> The reason the minidfscluster tests don't run under valgrind is because the GC and JIT
compiler in the embedded JVM do things that look like errors to valgrind.  I'd like to have
these tests do some basic setup and then fork into two processes: one for the minidfscluster
stuff and one for the libhdfs++ client test.  A small amount of shared memory can be used
to provide a place for the minidfscluster to stick the hdfsBuilder object that the client
needs to get info about which port to connect to.  Can also stick a condition variable there
to let the minidfscluster know when it can shut down.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

View raw message