lucene-solr-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Fuad Efendi (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost
Date Mon, 28 Jul 2008 22:48:31 GMT

    [ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617529#action_12617529
] 

funtick edited comment on SOLR-665 at 7/28/08 3:46 PM:
-----------------------------------------------------------

concerns are probably because of misunderstanding of  some _contract_... 

{code}
    /**
     * The table, resized as necessary. Length MUST Always be a power of two.
     */
    transient Entry[] table;

    void resize(int newCapacity) {
        Entry[] oldTable = table;
        int oldCapacity = oldTable.length;
        if (oldCapacity == MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return;
        }

        Entry[] newTable = new Entry[newCapacity];
        transfer(newTable);
        table = newTable;
        threshold = (int)(newCapacity * loadFactor);
    }


    public V get(Object key) {
	if (key == null)
	    return getForNullKey();
        int hash = hash(key.hashCode());
        for (Entry<K,V> e = table[indexFor(hash, table.length)];
             e != null;
             e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
                return e.value;
        }
        return null;
    }

{code}


- in worst case we will have pointer to _old_ table and even with _new_ one of smaller size
we won't get _any_ ArrayIndexOutOfBounds.
- There is no any _contract_ requiring synchronization on get() of HashMap or LinkedHashMap;
it IS application specific.
- we will never have _wrong_ results because Entry is immutable

{code}
    /** 
     * Transfer all entries from current table to newTable.
     */
    void transfer(Entry[] newTable) {
        Entry[] src = table;
        int newCapacity = newTable.length;
        for (int j = 0; j < src.length; j++) {
            Entry<K,V> e = src[j];
            if (e != null) {
                src[j] = null;
                do {
                    Entry<K,V> next = e.next;
                    int i = indexFor(e.hash, newCapacity);  
                    e.next = newTable[i];
                    newTable[i] = e;
                    e = next;
                } while (e != null);
            }
        }
    }
{code}

- We won't have even any NullPointerException after src[j] = null.

P.S.
Of course, I agree - it is Java internals, and it is not public Map interface-_contract_ -
should we avoid to use implementation then? I believe it is specified somewhere in JSR too...
{code}
 * @author  Doug Lea
 * @author  Josh Bloch
 * @author  Arthur van Hoff
 * @author  Neal Gafter
 * @version 1.65, 03/03/05
{code}

P.P.S.
Do not forget to look at the top of this discussion:
{code}
description: xxx Cache(maxSize=10000000, initialSize=1000) 
size : 2668705
cumulative_inserts : 4246246
{code}

- _cumulative_inserts_ is almost double of _size_ which shows that double-inserts are real

- I checked catalina_out: no any NullPointerException, ArrayIndexOutOfBoundsException, and
etc.
- I don't think we should be worried _too much_ about possible change of Map implementation
by SUN :P... in this case we should use neither java.lang.String nor java.util.Date (some
are placed in wrong packages).
- about thread safety: some participants are wrongly guessing that making object access totally
synchronized will make their code thread-safe... deadlock.

      was (Author: funtick):
    concerns are probably because of misunderstanding of  some _contract_... 

{code}
    /**
     * The table, resized as necessary. Length MUST Always be a power of two.
     */
    transient Entry[] table;

    void resize(int newCapacity) {
        Entry[] oldTable = table;
        int oldCapacity = oldTable.length;
        if (oldCapacity == MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return;
        }

        Entry[] newTable = new Entry[newCapacity];
        transfer(newTable);
        table = newTable;
        threshold = (int)(newCapacity * loadFactor);
    }


    public V get(Object key) {
	if (key == null)
	    return getForNullKey();
        int hash = hash(key.hashCode());
        for (Entry<K,V> e = table[indexFor(hash, table.length)];
             e != null;
             e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
                return e.value;
        }
        return null;
    }

{code}


- in worst case we will have pointer to _old_ table and even with _new_ one of smaller size
we won't get _any_ ArrayIndexOutOfBounds.
- There is no any _contract_ requiring synchronization on get() of HashMap or LinkedHashMap;
it IS application specific.
- we will never have _wrong_ results because Entry is immutable

{code}
    /** 
     * Transfer all entries from current table to newTable.
     */
    void transfer(Entry[] newTable) {
        Entry[] src = table;
        int newCapacity = newTable.length;
        for (int j = 0; j < src.length; j++) {
            Entry<K,V> e = src[j];
            if (e != null) {
                src[j] = null;
                do {
                    Entry<K,V> next = e.next;
                    int i = indexFor(e.hash, newCapacity);  
                    e.next = newTable[i];
                    newTable[i] = e;
                    e = next;
                } while (e != null);
            }
        }
    }
{code}

- We won't have even any NullPointerException after src[j] = null.

P.S.
Of course, I agree - it is Java internals, and it is not public Map interface-_contract_ -
should we avoid to use implementation then? I believe it is specified somewhere in JSR too...
{code}
 * @author  Doug Lea
 * @author  Josh Bloch
 * @author  Arthur van Hoff
 * @author  Neal Gafter
 * @version 1.65, 03/03/05
{code}

P.P.S.
Do not forget to look at the top of this discussion:
{code}
description: xxx Cache(maxSize=10000000, initialSize=1000) 
size : 2668705
cumulative_inserts : 4246246
{code}

- _cumulative_inserts_ is almost double of _size_ which shows that double-inserts are real

- I checked catalina_out: no any NullPointerException, ArrayIndexOutOfBoundsException, and
etc.
- I don't think we should be worried _too much_ about possible change of Map implementation
by SUN :P... in this case we should use neither java.lang.String nor java.util.Date (some
are placed in wrong packages).
  
> FIFO Cache (Unsynchronized): 9x times performance boost
> -------------------------------------------------------
>
>                 Key: SOLR-665
>                 URL: https://issues.apache.org/jira/browse/SOLR-665
>             Project: Solr
>          Issue Type: Improvement
>    Affects Versions: 1.3
>         Environment: JRockit R27 (Java 6)
>            Reporter: Fuad Efendi
>         Attachments: FIFOCache.java
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Attached is modified version of LRUCache where 
> 1. map = new LinkedHashMap(initialSize, 0.75f, false) - so that "reordering"/true (performance
bottleneck of LRU) is replaced to "insertion-order"/false (so that it became FIFO)
> 2. Almost all (absolutely unneccessary) synchronized statements commented out
> See discussion at http://www.nabble.com/LRUCache---synchronized%21--td16439831.html
> Performance metrics (taken from SOLR Admin):
> LRU
> Requests: 7638
> Average Time-Per-Request: 15300
> Average Request-per-Second: 0.06
> FIFO:
> Requests: 3355
> Average Time-Per-Request: 1610
> Average Request-per-Second: 0.11
> Performance increased 9 times which roughly corresponds to a number of CPU in a system,
http://www.tokenizer.org/ (Shopping Search Engine at Tokenizer.org)
> Current number of documents: 7494689
> name: 	 filterCache  
> class: 	org.apache.solr.search.LRUCache  
> version: 	1.0  
> description: 	LRU Cache(maxSize=10000000, initialSize=1000)  
> stats: 	lookups : 15966954582
> hits : 16391851546
> hitratio : 0.102
> inserts : 4246120
> evictions : 0
> size : 2668705
> cumulative_lookups : 16415839763
> cumulative_hits : 16411608101
> cumulative_hitratio : 0.99
> cumulative_inserts : 4246246
> cumulative_evictions : 0 
> Thanks

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message