accumulo-notifications mailing list archives

From "Hayden Marchant (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (ACCUMULO-2942) org.apache.accumulo.core.util.format.ShardedTableDistributionFormatterTest.testAggregate failure
Date Tue, 24 Jun 2014 13:56:25 GMT

     [ https://issues.apache.org/jira/browse/ACCUMULO-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hayden Marchant updated ACCUMULO-2942:
--------------------------------------

    Description: 
org.apache.accumulo.core.util.format.ShardedTableDistributionFormatterTest.testAggregate
fails on the IBM JRE because the test asserts the order of elements in a HashMap. It consistently
passes on Sun, and consistently fails on Oracle.

The ShardedTableDistributionFormatter inherits from AggregatingFormatter, which has 2 methods
that subclasses override: aggregateStats and getStats. In the ShardedTableDistributionFormatter
implementation, aggregateStats prepares a list based on the HashMap, and getStats creates a
string by serializing the values in the HashMap.

Because HashMap iteration order is unpredictable across Java versions (even between versions
from the same vendor), the getStats() output is inconsistent. This is not a problem in itself.
However, since we are asserting on the content of getStats, we must either make the getStats
output consistent, or do some refactoring and write 2 tests: one test on the structure that
getStats is serializing, and another test that asserts the output of getStats based on a
predictable structure.
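
As a rough illustration of an assertion that does not depend on iteration order, the sketch
below compares the rendered output as a set of lines. It is not the Accumulo code: the class
name, the render and lines helpers, and the "day<TAB>count" line format are all assumptions
made for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper class, not part of Accumulo.
final class OrderInsensitiveCheck {

    // Stand-in for getStats(): serializes the map in whatever order it iterates.
    // The "day<TAB>count" format is assumed for illustration only.
    static String render(Map<String, Integer> countsByDay) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Integer> e : countsByDay.entrySet()) {
            sb.append(e.getKey()).append('\t').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    // Splits the rendered text into a set of lines, so line order no longer matters.
    static Set<String> lines(String rendered) {
        return new HashSet<String>(Arrays.asList(rendered.split("\n")));
    }
}
```

A test could then assert that lines(render(map)) equals the expected set of lines, which holds
on any JVM regardless of how the HashMap happens to order its entries.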

Some people expressed concern about changing the underlying structure from a HashMap to a TreeMap
due to performance considerations. The question is whether this code is ever executed in an
environment where that would be a concern.

Alternatively, we could change only the getStats method, which runs after the 'heavy lifting'
of iterating over all entries. The stats that are calculated are aggregates per day, so
this will not be a large structure and could be sorted before being output.
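
A minimal sketch of that alternative, under assumptions: aggregation keeps writing to a HashMap,
and only the output step copies the per-day aggregates into a TreeMap to get a stable key order.
The class name, the aggregate method, the Integer value type, and the output format are
hypothetical; only the countsByDay field name comes from the issue.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the formatter, not the Accumulo implementation.
final class SortedStatsSketch {

    // Aggregation still uses a HashMap, so the per-entry path is unchanged.
    private final Map<String, Integer> countsByDay = new HashMap<String, Integer>();

    // Stand-in for aggregateStats(): counts one entry for the given day.
    void aggregate(String day) {
        Integer old = countsByDay.get(day);
        countsByDay.put(day, old == null ? 1 : old + 1);
    }

    // Stand-in for getStats(): sorts by key only at output time by copying
    // the (small, per-day) map into a TreeMap before serializing it.
    String getStats() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Integer> e : new TreeMap<String, Integer>(countsByDay).entrySet()) {
            sb.append(e.getKey()).append('\t').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }
}
```

The sort cost is paid once per getStats call and scales with the number of distinct days rather
than with the number of entries scanned.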


  was:
org.apache.accumulo.core.util.format.ShardedTableDistributionFormatterTest.testAggregate .
This fails on IBM JRE, since the test is asserting order of elements in a HashMap. This consistently
passes on Sun , and consistently fails on Oracle. 

Proposal: Change ShardedTableDistributionFormatter.countsByDay to TreeMap, or use non-ordered
comparison 


> org.apache.accumulo.core.util.format.ShardedTableDistributionFormatterTest.testAggregate
failure
> ------------------------------------------------------------------------------------------------
>
>                 Key: ACCUMULO-2942
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2942
>             Project: Accumulo
>          Issue Type: Sub-task
>          Components: tserver
>    Affects Versions: 1.6.0
>         Environment: IBM JVM
>            Reporter: Hayden Marchant
>             Fix For: 1.6.1, 1.7.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>



--
This message was sent by Atlassian JIRA
(v6.2#6252)
