accumulo-dev mailing list archives

From Vicky Kak <vicky....@gmail.com>
Subject Re: Running Accumulo on the IBM JVM
Date Thu, 19 Jun 2014 11:14:23 GMT
Hi Hayden,

Most of the recommendations look okay to me. Since there are many changes
to be made, I think you should go ahead and create a main JIRA with
multiple subtasks addressing all the changes.
I am almost sure that you would run into similar issues if you ran
other Java-based NoSQL distributions, e.g. HBase/Cassandra, on the IBM JDK. I
personally had surprises in API calls related to ordering in my application
a long time ago. Your observations look reasonable to me.

Regards,
Vicky


On Thu, Jun 19, 2014 at 3:47 PM, Hayden Marchant <HAYDEN@il.ibm.com> wrote:

> Hi there,
>
> I have been working on getting Accumulo running on the IBM JDK, in preparation
> for including Accumulo in an upcoming version of BigInsights (IBM's Hadoop
> distribution). I have come across a number of issues, to which I have made
> some local fixes in my own environment. Since I'm a newbie in Accumulo, I
> wanted to make sure that the approach that I have taken for resolving
> these issues is aligned with the design intent of Accumulo.
>
> Some of the issues are real defects, and some are instances in which the
> assumption of Sun/Oracle JDK being the used JVM is hard-coded into the
> source-code.
>
> I have grouped the issues into 2 sections -  Unit test failures and
> Sun-specific dependencies (though there is an overlap)
>
> 1. Unit Test failures - should run consistently no matter which OS, Java
> vendor/version etc...
>         a. org.apache.accumulo.core.util.format.ShardedTableDistributionFormatterTest.testAggregate.
> This fails on IBM JRE, since the test is asserting the order of elements in
> a HashMap, whose iteration order is not guaranteed. This consistently passes
> on Sun, and consistently fails on IBM.
>         Proposal: Change ShardedTableDistributionFormatter.countsByDay to
> TreeMap.
>
>         b. org.apache.accumulo.core.security.crypto.BlockedIOStreamTest.testGiantWrite.
>         This test assumes a max heap of about 1GB. This fails on IBM JRE,
> since no max heap is specified for the tests, and the IBM JRE default
> depends on the OS (see
> http://www-01.ibm.com/support/knowledgecenter/SSYKE2_6.0.0/com.ibm.java.doc.diagnostics.60/diag/appendixes/defaults.html?lang=en
> ).
>         Proposal: add -Xmx1g to the surefire maven plugin reference in
> the parent maven pom.
>
>         c. Both org.apache.accumulo.core.security.crypto.CryptoTest &
> org.apache.accumulo.core.file.rfile.RFileTest have lots of failures due to
> calls to SecureRandom with the Random Number Generator Provider hard-coded as
> Sun. The IBM JRE has its own built-in RNG Provider called IBMJCE. There are 2
> issues: hard-coded calls to SecureRandom.getInstance(<algo>,"SUN"), and
> the default value in the Property class, which is also "SUN".
>         Proposal: Add a mechanism to override a default Property through a
> System property, through a new annotator in the Property class. The only usage
> will be by Property.CRYPTO_SECURE_RNG_PROVIDER.
>
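The TreeMap proposal in item 1a can be illustrated with a minimal sketch. The class and method below are hypothetical stand-ins, not the actual ShardedTableDistributionFormatter code; they only show that a TreeMap iterates in sorted key order on every JVM, whereas HashMap iteration order is unspecified and may differ between vendors.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SortedCounts {
    // Hypothetical per-day counter (an assumption for illustration only).
    // TreeMap keeps keys sorted, so iteration order is the same on any JVM.
    static Map<String, Integer> countsByDay(List<String> days) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String day : days) {
            counts.merge(day, 1, Integer::sum); // add 1, or start at 1
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts =
            countsByDay(Arrays.asList("20140619", "20140617", "20140619", "20140618"));
        // prints [20140617, 20140618, 20140619]
        System.out.println(new ArrayList<>(counts.keySet()));
    }
}
```

A test can then assert on the key order safely, because the order comes from the keys themselves rather than from the hash implementation.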
>
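For item 1c, one possible shape of a vendor-neutral lookup is sketched below. This is not the Property-based mechanism proposed above; the class name and the system property name are assumptions made for this example. It tries a configured provider first and falls back to whatever provider the running JVM offers.

```java
import java.security.NoSuchAlgorithmException;
import java.security.NoSuchProviderException;
import java.security.SecureRandom;

public class RngFactory {
    // Hypothetical system property name, used only in this sketch.
    static final String PROVIDER_PROPERTY = "example.crypto.rng.provider";

    static SecureRandom create(String algorithm, String provider) {
        if (provider != null && !provider.isEmpty()) {
            try {
                // Honor an explicit provider (e.g. "SUN" or "IBMJCE") if present.
                return SecureRandom.getInstance(algorithm, provider);
            } catch (NoSuchAlgorithmException | NoSuchProviderException e) {
                // Provider or algorithm not available on this JVM; fall through.
            }
        }
        try {
            // Let the JVM pick any installed provider for the algorithm.
            return SecureRandom.getInstance(algorithm);
        } catch (NoSuchAlgorithmException e) {
            // Last resort: the platform default SecureRandom.
            return new SecureRandom();
        }
    }

    public static void main(String[] args) {
        SecureRandom rng = create("SHA1PRNG", System.getProperty(PROVIDER_PROPERTY));
        byte[] bytes = new byte[16];
        rng.nextBytes(bytes);
        System.out.println("provider in use: " + rng.getProvider().getName());
    }
}
```

Whether a silent fallback is acceptable for cryptographic code is a design decision; failing fast with a clear error may be preferable in production.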
> 2. Environment/Configuration
>         a. The generated configuration files contain references to GC
> params that are specific to Sun JVM. In accumulo-env.sh, the
> ACCUMULO_TSERVER_OPTS contains -XX:NewSize and -XX:MaxNewSize , and also
> in ACCUMULO_GENERAL_OPTS,
> -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 are used.
>         b. In bin/accumulo, a ClassNotFoundException is thrown due to the
> specification of the JAXP Document Builder:
>
> -Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl
>
>         The Sun implementation of the Document Builder Factory does not exist
> in the IBM JDK, so the exception is thrown when running the accumulo
> script.
>
>         c. MiniAccumuloCluster - in the MiniAccumuloClusterImpl,
> Sun-specific GC params are passed as params to the java process (similar
> to item a.)
>
>         Single proposal for solving all three of the above issues:
>         Enhance bootstrap_config.sh with a prompt to select the Java vendor.
> The selection will set the correct values for the GC params (they differ
> between IBM and Sun) and control inclusion/omission of the JAXP setting. The
> MiniAccumuloClusterImpl can read the same env variable that was set
> for the GC params, and use it in the exec command.
>
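The last part of that proposal (reading GC params from an environment variable when launching a child JVM) might look roughly like the sketch below. The class name and the EXAMPLE_GC_OPTS variable are assumptions for illustration, and the vendor check is a fallback added for this example, not something described in the message. The Sun flags are the ones quoted in item 2a; no IBM-specific flags are assumed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class GcOptions {
    // Sun/Oracle-specific flags quoted from accumulo-env.sh in the message.
    static final List<String> SUN_GC_OPTS = Arrays.asList(
        "-XX:+UseConcMarkSweepGC", "-XX:CMSInitiatingOccupancyFraction=75");

    // envValue: contents of a hypothetical env variable holding GC params.
    // javaVendor: value of the standard "java.vendor" system property.
    static List<String> gcOptions(String envValue, String javaVendor) {
        if (envValue != null && !envValue.trim().isEmpty()) {
            // An explicit setting wins; split on whitespace into separate args.
            return Arrays.asList(envValue.trim().split("\\s+"));
        }
        if (javaVendor != null && javaVendor.toUpperCase(Locale.ROOT).contains("IBM")) {
            // Assumption: on an IBM JVM, simply omit the Sun-specific flags.
            return Collections.emptyList();
        }
        return SUN_GC_OPTS;
    }

    public static void main(String[] args) {
        List<String> command = new ArrayList<>();
        command.add("java");
        command.addAll(gcOptions(System.getenv("EXAMPLE_GC_OPTS"),
                                 System.getProperty("java.vendor")));
        command.add("-version");
        // The list could be handed to ProcessBuilder to start the child JVM.
        System.out.println(command);
    }
}
```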
>
>  So far, my work has been focused on getting unit tests working for all
> Java vendors in a clean manner. I have not yet run intensive testing of
> real clusters following these changes, and would be happy to get pointers
> to what else might need treatment.
>
>  I would also like to hear if these changes make sense, and if so, should
> I go ahead and create some JIRAs, and attach my patches for commit
> approval?
>
>  Looking forward to hearing feedback!
>
>  Regards,
>  Hayden Marchant
>  Software Architect
>  IBM BigInsights, IBM
>
