hadoop-common-dev mailing list archives

From Alan Burlison <Alan.Burli...@oracle.com>
Subject Re: DomainSocket issues on Solaris
Date Mon, 05 Oct 2015 23:34:35 GMT
On 05/10/15 18:30, Colin P. McCabe wrote:

> 1. Don't get DomainSocket working on Solaris.  Rely on the legacy
> short-circuit read instead.  It has poorer security guarantees, but
> doesn't require domain sockets.  You can add a line of code to the
> failing junit tests to skip them on Solaris.

I really don't want to do that as it relegates Solaris to only ever 
being a second-class citizen.

> 2. Use a separate "timer wheel" thread which implements coarse-grained
> timeouts by calling shutdown() on domain sockets that have been active
> for too long.  This thread could be global (one per JVM).

From what I can tell that won't stop all the test failures, as they 
are written with the assumption that per-socket timeouts are available 
and that they time out exactly when expected.

> 3. Implement the poll/select loop you discussed earlier.  As Steve
> commented, it would be easier to do this by adding new functions,
> rather than by changing existing ones.  I don't think "ifdef skid
> marks" are necessary since poll and select are supported on Linux and
> so forth as well as Solaris.  You would just need some code in
> DomainSocket.java to select the appropriate implementation at runtime
> based on the OS.

I could switch the implementation over to use poll() everywhere, but I 
haven't done that - Linux still uses socket timeouts. The issue is that 
in order to make poll() work I need to maintain the read/write timeouts 
alongside the filehandle - I can't store a timeout 'inside' the 
filehandle using setsockopt(). That means the filehandle and the 
timeouts have to be stored together somewhere, and the logical place to 
put the timeouts is in the same DomainSocket instance that holds the 
filehandle. If the DomainSocket JNI methods were all instance methods 
there wouldn't be a problem, but they aren't: they are static methods 
to which the integer filehandle is passed as a parameter. Adding the 
timeouts to the native method parameter lists wouldn't work either, as 
the timeouts need to be read/write. The only non-vile way I can come up 
with to do this is to convert the JNI methods from static to instance 
methods. Even if that's the only change I make and I still pass in the 
filehandle as a parameter, the signatures will still have changed, as 
the second parameter would now be an object reference rather than a 
class reference.
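
To make that signature change concrete, here is a minimal sketch - not 
the actual Hadoop sources, and the "readTimeoutMs" field name is 
hypothetical - of an instance-style JNI method. Because the second 
parameter is now the DomainSocket object rather than its class, the 
native code can fetch the timeouts stored alongside the filehandle:

    #include <jni.h>
    #include <poll.h>
    #include <unistd.h>

    JNIEXPORT jint JNICALL
    Java_DomainSocket_read0(JNIEnv *env, jobject self, jint fd)
    {
        /* Read the timeout field from the same object that holds the
         * filehandle; a static method receives a jclass and has no
         * instance to read fields from. */
        jclass cls = (*env)->GetObjectClass(env, self);
        jfieldID fid = (*env)->GetFieldID(env, cls, "readTimeoutMs", "I");
        jint timeout = (*env)->GetIntField(env, self, fid);

        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        if (poll(&pfd, 1, timeout) <= 0)
            return -1;  /* timed out or poll error */

        unsigned char b;
        return read(fd, &b, 1) == 1 ? b : -1;
    }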

The other option is to write what is effectively a complete 
Solaris-only replacement for DomainSocket; whether the switch between 
that and the current implementation happens at compile time or at run 
time isn't really the point. There's a fairly even split between the 
Java and JNI components of DomainSocket, so whichever way it's done 
there will be significant duplication of the overall logic, and most 
likely of the code as well. That means bug fixes in one place would 
have to be exactly mirrored in the other, which is unlikely to be 
sustainable.

My goal has been to keep the current logic as unchanged as possible. My 
prototype does that by literally prefixing each libc socket operation 
with a poll() call to check that the filehandle is ready; the rest of 
the logic in DomainSocket is completely unchanged. That means the 
behaviour on Linux and Solaris should be as close to identical as 
possible.
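
As a minimal sketch of that approach - assuming the read timeout is 
kept in a plain variable next to the filehandle rather than set on it 
with setsockopt(), and with "timed_read" being my name, not the 
prototype's:

    #include <errno.h>
    #include <poll.h>
    #include <unistd.h>

    /* Wait for readability, then fall through to the unchanged libc
     * read() - the surrounding logic stays exactly as it was. */
    ssize_t timed_read(int fd, void *buf, size_t len, int timeout_ms)
    {
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        int rc = poll(&pfd, 1, timeout_ms);
        if (rc == 0) { errno = EAGAIN; return -1; }  /* timed out */
        if (rc < 0)  return -1;                      /* poll failed */
        return read(fd, buf, len);
    }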

> Since you commented that Solaris is implementing timeout support in
> the future, approaches #1 or #2 could be placeholders until that's
> finished.

Unfortunately I can't predict when that might happen, though. My 
prototype probes for working timeouts at configure time, so when they 
do become available they'll be used automatically.
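
For illustration, a probe along those lines might look like the 
following (the real configure test in my prototype may differ - this 
just shows the idea). It exits 0 only if SO_RCVTIMEO actually makes a 
read on an AF_UNIX socket time out:

    #include <sys/socket.h>
    #include <sys/time.h>
    #include <unistd.h>

    int main(void)
    {
        int fds[2];
        char c;
        struct timeval tv = { 0, 100000 };  /* 100ms timeout */

        if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
            return 1;
        if (setsockopt(fds[0], SOL_SOCKET, SO_RCVTIMEO,
                       &tv, sizeof(tv)) != 0)
            return 1;  /* option rejected outright */
        /* Safety net: kill the probe via SIGALRM if the timeout is
         * silently ignored and the read blocks forever. */
        alarm(2);
        /* Nothing is ever written to fds[1], so with a working
         * timeout this read() fails instead of blocking. */
        return read(fds[0], &c, 1) == -1 ? 0 : 1;
    }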

> I agree that there is no formal libhadoop.so compatibility policy and
> that is frustrating.  This has been an issue for those who want to run
> jars compiled against multiple different versions of hadoop through
> the same YARN instance.  We've discussed it in the past, but never
> really come up with a great solution.  The best approach really would
> be to bundle libhadoop.so inside the hadoop jar files, so that it
> could be integral to the Hadoop version itself.  However, nobody has
> done the work to make that happen.  The second-best approach would be
> to include the Hadoop version in the libhadoop name itself (so we'd
> have libhadoop28.so for hadoop 2.8, and so forth.)  Anyway, I think we
> can solve this particular issue without going down that rathole...

As I said, I believe that ship has long since sailed. Changes that have 
already been let in have, I believe, broken the backwards binary 
compatibility of the Java/JNI interface. Broken is broken; arguing that 
this proposal shouldn't be allowed in because it simply adds more 
brokenness to the existing brokenness is really missing the point. As 
far as I can tell, there already is no backwards compatibility.

-- 
Alan Burlison
