accumulo-dev mailing list archives

From Hayden Marchant <>
Subject Running Accumulo on the IBM JVM
Date Thu, 19 Jun 2014 10:17:58 GMT
Hi there,

I have been working on getting Accumulo running on the IBM JDK, in preparation 
for including Accumulo in an upcoming version of BigInsights (IBM's Hadoop 
distribution). I have come across a number of issues, to which I have made 
some local fixes in my own environment. Since I'm a newbie in Accumulo, I 
wanted to make sure that the approach that I have taken for resolving 
these issues is aligned with the design intent of Accumulo.

Some of the issues are real defects, and some are instances in which the 
assumption that the Sun/Oracle JDK is the JVM in use is hard-coded into the 

I have grouped the issues into 2 sections: unit test failures and 
Sun-specific dependencies (though there is some overlap).

1. Unit Test failures - should run consistently no matter which OS, Java 
vendor/version etc...
. This fails on IBM JRE, since the test asserts the order of elements in 
a HashMap. This consistently passes on Sun/Oracle, and consistently fails on 
IBM. Proposal: Change ShardedTableDistributionFormatter.countsByDay to 
        This test assumes a max heap of about 1GB. This fails on IBM JRE, 
since the default max heap is not specified, and on IBM JRE this depends 
on the OS (see
        Proposal: add -Xmx1g to the surefire maven plugin reference in 
parent maven pom.
        c. Both & 
org.apache.accumulo.core.file.rfile.RFileTest have lots of failures due to 
calls to SecureRandom with the Random Number Generator provider hard-coded as 
Sun. The IBM JRE has its own built-in RNG provider called IBMJCE. There are 2 
issues: hard-coded calls to SecureRandom.getInstance(<algo>,"SUN"), and 
the default value in the Property class is also "SUN". 
        Proposal: Add a mechanism to override a default Property through a 
System property, via a new annotator in the Property class. Only usage will 
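The provider problem in item c can be illustrated with a minimal sketch. This is not the patch described above; the class name PortableSecureRandom and the system property name example.rng.provider are hypothetical, chosen only for illustration. It assumes that falling back to the JVM's default SecureRandom is acceptable when the requested provider is not installed, which may not hold for every use in Accumulo.

```java
import java.security.NoSuchAlgorithmException;
import java.security.NoSuchProviderException;
import java.security.SecureRandom;
import java.security.Security;

final class PortableSecureRandom {
  // Hypothetical property name; not an existing Accumulo setting.
  static final String PROVIDER_PROP = "example.rng.provider";

  // Try the requested provider first; if it is not installed on this JVM
  // (for example "SUN" on an IBM JRE), fall back to the platform default.
  static SecureRandom create(String algorithm, String provider) {
    if (provider != null && Security.getProvider(provider) != null) {
      try {
        return SecureRandom.getInstance(algorithm, provider);
      } catch (NoSuchAlgorithmException | NoSuchProviderException e) {
        // The provider exists but does not offer this algorithm: fall through.
      }
    }
    return new SecureRandom();
  }

  // Provider name comes from a system property, defaulting to "SUN".
  static SecureRandom create(String algorithm) {
    return create(algorithm, System.getProperty(PROVIDER_PROP, "SUN"));
  }
}
```

With something like this, a test run on the IBM JRE could pass -Dexample.rng.provider=IBMJCE, or rely on the fallback, instead of failing on the hard-coded "SUN".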
2. Environment/Configuration
        a. The generated configuration files contain references to GC 
params that are specific to the Sun JVM. In, the 
ACCUMULO_TSERVER_OPTS contains -XX:NewSize and -XX:MaxNewSize, and also 
-XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 are used.
        b. In bin/accumulo, I get a ClassNotFoundException due to the 
specification of the JAXP Document Builder:

        The Sun implementation of Document Builder Factory does not exist 
in the IBM JDK, so a ClassNotFoundException is thrown on running accumulo 
        c. MiniAccumuloCluster - in the MiniAccumuloClusterImpl, 
Sun-specific GC params are passed as params to the java process (similar 
to section a.)
        Single proposal for solving all three above issues:
        Enhance with request to select Java vendor. 
Selecting this will set the correct values for the GC params (they differ 
between IBM and Sun) and the inclusion/omission of the JAXP setting. The 
MiniAccumuloClusterImpl can read the same env variable that was set in 
code for the GC Params, and use it in the exec command.
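To make the vendor selection concrete, here is a rough sketch of how Java code such as MiniAccumuloClusterImpl might pick GC params. It is not the actual patch: the class VendorGcOptions and the env variable EXAMPLE_JAVA_VENDOR are hypothetical names, and the IBM options shown (-Xgcpolicy:gencon and -Xmn) are my assumption of reasonable equivalents, not values taken from the message above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class VendorGcOptions {
  // True when the vendor string looks like an IBM JVM (e.g. "IBM Corporation").
  static boolean isIbm(String vendor) {
    return vendor != null && vendor.toUpperCase(Locale.ROOT).contains("IBM");
  }

  // Return vendor-appropriate GC options for a given new-generation size.
  static List<String> gcOptions(String vendor, String newSize) {
    if (isIbm(vendor)) {
      // Assumed IBM J9 equivalents: generational concurrent policy, fixed nursery size.
      return Arrays.asList("-Xgcpolicy:gencon", "-Xmn" + newSize);
    }
    // The Sun/Oracle options quoted in section a.
    return Arrays.asList(
        "-XX:NewSize=" + newSize,
        "-XX:MaxNewSize=" + newSize,
        "-XX:+UseConcMarkSweepGC",
        "-XX:CMSInitiatingOccupancyFraction=75");
  }

  // Prefer a (hypothetical) env variable set by the configuration script,
  // falling back to the running JVM's own vendor string.
  static String detectVendor() {
    String fromEnv = System.getenv("EXAMPLE_JAVA_VENDOR");
    return fromEnv != null ? fromEnv : System.getProperty("java.vendor");
  }
}
```

A caller would then build its command line with VendorGcOptions.gcOptions(VendorGcOptions.detectVendor(), "256m") rather than hard-coding the -XX flags.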
 So far, my work has been focused on getting unit tests working for all 
Java vendors in a clean manner. I have not yet run intensive testing of 
real clusters following these changes, and would be happy to get pointers 
to what else might need treatment.
 I would also like to hear if these changes make sense, and if so, should 
I go ahead and create some JIRAs, and attach my patches for commit 
 Looking forward to hearing feedback!
 Hayden Marchant
 Software Architect
 IBM BigInsights, IBM