lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Yonik Seeley (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (SOLR-3688) Spellchecker or JSONTestUtil has a bug with Java 8 (or possibly Java 7u6+) - randomly failing spellchecker test
Date Mon, 30 Jul 2012 16:09:34 GMT

     [ https://issues.apache.org/jira/browse/SOLR-3688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Yonik Seeley resolved SOLR-3688.
--------------------------------

       Resolution: Fixed
    Fix Version/s: 4.0

The issue was a test bug in DummyCustomParamSpellChecker that iterated over map parameters
and generated dummy startOffset/endOffset numbers from the param order.

http://svn.apache.org/viewvc?view=revision&revision=1367125
                
> Spellchecker or JSONTestUtil has a bug with Java 8 (or possibly Java 7u6+) - randomly
failing spellchecker test
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-3688
>                 URL: https://issues.apache.org/jira/browse/SOLR-3688
>             Project: Solr
>          Issue Type: Bug
>          Components: spellchecker
>    Affects Versions: 4.0-ALPHA
>            Reporter: Uwe Schindler
>              Labels: Java8
>             Fix For: 4.0
>
>
> The Java 8 builds of trunk and branch4x fail >50% of all times with:
> {noformat}
> java.lang.RuntimeException: mismatch: '0'!='2' @ spellcheck/suggestions/bar/startOffset
> 	at __randomizedtesting.SeedInfo.seed([9B1635CDE22895EA:5C18C165EA4480A2]:0)
> 	at org.apache.solr.SolrTestCaseJ4.assertJQ(SolrTestCaseJ4.java:573)
> 	at org.apache.solr.SolrTestCaseJ4.assertJQ(SolrTestCaseJ4.java:521)
> 	at org.apache.solr.handler.component.SpellCheckComponentTest.testPerDictionary(SpellCheckComponentTest.java:102)
> {noformat}
> I am scared by JSONTestUtil so I have no idea what causes this. As a lot of other JSON-related
things were buggy with Java 8, I think this has same cause. The JSONTestUtil seems to depend
on the order of some HashMap/HashSet created by noggit. The order of hashed collections changed
in Java 8 in an incompatible way. Later versions of Java 7 (u6+) will do the same with larger
collections:
> {quote}
> From: Mike Duigou <mike.duigou@oracle.com>
> Date: Sat, Jul 14, 2012 at 1:09 AM
> Subject: Update on alternative hashing for String keys with hash-based Maps
> To: core-libs-dev Libs <core-libs-dev@openjdk.java.net>, jdk7u-dev@openjdk.java.net
> Hello all;
> About a month ago a significant change was made to the Java SE 7 and 8 hash based Map
implementations. This change was previously proposed and discussed on this list[1]. The change
affects all of the hashing based Map implementations (HashMap, Hashtable, WeakHashMap, LinkedHashMap,
ConcurrentHashMap), the derived Set implementations (HashSet, LinkedHashSet,
> etc.) and other classes which depend upon these classes (Properties, UIDefaults). The
change to hash based Maps improves performance when a large number of key hash code collisions
are encountered. This is accomplished by altering the handling of String keys to use a different
(better) hash function.
> As initially proposed, the alternative hashing behaviour was planned to apply to all
hash based Maps larger than 512 elements. Smaller maps would continue to use the existing
hashing approach.
> ConcurrentHashMap, because reasons related to the complexity of it's implementation,
would always use the improved approach regardless of map capacity. Providing capacity based
triggering of the alternative hashing is intended to improve compatibility by ensuring that
the order in which elements are iterated does not change. Specifically, at less than the threshold
capacity Map elements will be iterated in the same order as today.
> At or above the threshold, the iteration order will be different than the current order.
Testing and evaluation of existing Java applications has shown us that some applications have
explicit or implicit dependence upon the order that keys, values or entries will be iterated.
The vast majority of iteration order dependent cases involve small maps--once a map contains
hundreds of elements generally incorrect assumptions about iteration order will have already
been found and resolved. We believed that triggering the alternative hashing behaviour at
512 element capacity would protect iteration order in cases which depended upon the existing
iteration order.
> Java SE 7 and Java SE 8 have different policies. Java SE 8 is intended to always use
alternative hashing of String keys regardless of the capacity of the Map.
> After  integration a number of cases of iteration order dependence were encountered within
the OpenJDK code itself, in tests and in user code. Some of these faults were easily diagnosed
and correct. Some other cases, because iteration order is not consistent from run-to-run under
alternative hashing, proved difficult to isolate and resolve.
> Following this testing and and consultation with Java licensees and developers it was
decided disable the alternative hashing behaviour for Java SE 7.  To ensure the greatest degree
of compatibility with existing applications it seems best to only enable alternative hashing
by explicit control. In Java SE 7u6 it will be necessary to set the system property jdk.map.althashing.threshold
in order to enable alternative hashing. It is also still possible that the feature may be
enabled by default in future Java SE 7 releases but this will only happen if further testing
indicates compatibility can be reasonably assured.
> Because Java 8 is unreleased and we still wish to shake out iteration order dependencies
alternative hashing remains enabled in Java SE 8.
> Alternative hashing is very likely to be enabled by default in Java SE 8.
> Doug Lea has been investigating further improvements to handling of key hash code collisions
and his design is extremely likely to be the basis for all Java SE 8 hash based Map implementations.[2]
> Developers and deployers are strongly encouraged to test their applications by enabling
the alternative String key hashing feature in Java SE 7u6 or later and/or testing with Java
SE 8 builds. BE WARNED:
> it will probably not be possible to disable alternative hashing in Java SE 8. Applications
MUST remove dependencies upon iteration order before they can be deployed with Java SE 8.
> Thanks,
> Mike
> TL;DR:
> - Java SE 7 and 8 both now support alternative hashing for String keys with hash based
Maps
> - Alternative hashing improves performance when many String key hash codes collide
> - Alternative hashing impacts key, value and element iteration order
> - Alternative hashing is currently DISABLED by default for Java SE 7
> - Future Java SE 7 releases may enable alternative hashing for "large"
> (>512 capacity) maps
> - Developers can enable the eature in Java SE 7 for testing and deployment with a system
property
> - Alternative hashing is ENABLED for all maps in Java SE 8
> - It will probably not be possible to disable alternative hashing in Java SE
> 8
> - Hash map key, value and element iteration order WILL be different and unpredictable
in Java SE 8
> - Different implementation approaches are still being investigated for Java SE 8 and
remain subject to change
> [1] http://mail.openjdk.java.net/pipermail/core-libs-dev/2012-May/010238.html
> [2] http://cs.oswego.edu/pipermail/concurrency-interest/2012-June/009505.html
> {quote}
> If this is not the case, spellchecker has a problem with Java 8.
> There is nothing I am able to do because of the complexity of this test framework for
JSON, but I really would like to fix this bug! So please: Somebody please, please, please
give me a hint, I can then try to provide a patch!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message