ignite-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Konstantin Margorin (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (IGNITE-640) Implement IgniteMultimap data structures
Date Tue, 01 Mar 2016 14:20:18 GMT

    [ https://issues.apache.org/jira/browse/IGNITE-640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173717#comment-15173717
] 

Konstantin Margorin edited comment on IGNITE-640 at 3/1/16 2:20 PM:
--------------------------------------------------------------------

Some thoughts about possible multimap  implementation.

According to above suggestion we shell use cache with specially constructed key:

{code}
class MultiKey<K> {
    @CacheAffinityMapped
    private K key;
 
    int index;
}
{code}
 
Method {{put(K key, V value)}} should created MultiKey for a new value. We should know new
index value for MultiKey construction in the {{put}} method. The simplest implementation will
use another {{Cache<K,int> indexCache}}   to keep count of items with key K.
 
So, pseudo-code for put will look like

{code} 
put(K key, V value)
{
    try (Transaction tx = Ignition.ignite().transactions().txStart()) { 
        newIndex = indexCache.get(key) + 1;
        indexCache.put(key, newIndex);
        multikeyCache.put(new MultiKey(key, newIndex), value);
        tx.commit()
    }
}
{code}
 
The problem here is that we use:

    - {{transaction}}
    - one {{get}}
    - two {{put}}

cache operations for one Multikey {{put}} operation. It looks like big overhead.

Dmitriy suggested another way:
{quote}
An alternative way would be to store Key to List<V>, i. e. to store Lists directly in
cache. Then we can use EntryProcessor to add/remove elements to the list. The only problem
here is that we will not be able to update the existing lists, but rather clone the lists,
so we can add/remove elements to it, but I still think it will be cheaper than using transactions.
{quote}


was (Author: ruskim):
Some thoughts possible multimap about implementation.

According to above suggestion we shell use cache with specially constructed key:

{code}
class MultiKey<K> {
    @CacheAffinityMapped
    private K key;
 
    int index;
}
{code}
 
Method {{put(K key, V value)}} should created MultiKey for a new value. We should know new
index value for MultiKey construction in the {{put}} method. The simplest implementation will
use another {{Cache<K,int> indexCache}}   to keep count of items with key K.
 
So, pseudo-code for put will look like

{code} 
put(K key, V value)
{
    try (Transaction tx = Ignition.ignite().transactions().txStart()) { 
        newIndex = indexCache.get(key) + 1;
        indexCache.put(key, newIndex);
        multikeyCache.put(new MultiKey(key, newIndex), value);
        tx.commit()
    }
}
{code}
 
The problem here is that we use:

    - {{transaction}}
    - one {{get}}
    - two {{put}}

cache operations for one Multikey {{put}} operation. It looks like big overhead.

Dmitriy suggested another way:
{quote}
An alternative way would be to store Key to List<V>, i. e. to store Lists directly in
cache. Then we can use EntryProcessor to add/remove elements to the list. The only problem
here is that we will not be able to update the existing lists, but rather clone the lists,
so we can add/remove elements to it, but I still think it will be cheaper than using transactions.
{quote}

> Implement IgniteMultimap data structures
> ----------------------------------------
>
>                 Key: IGNITE-640
>                 URL: https://issues.apache.org/jira/browse/IGNITE-640
>             Project: Ignite
>          Issue Type: Sub-task
>          Components: data structures
>            Reporter: Dmitriy Setrakyan
>            Assignee: Konstantin Margorin
>
> We need to add {{IgniteMultimap}} data structure in addition to other data structures
provided by Ignite. {{IgniteMultiMap}} should have similar API to {{java.util.Map}} class
in JDK, but support the semantics of multiple values per key, similar to [Guava Multimap|http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/collect/Multimap.html].

> However, unlike in Guava, our multi-map should work with Lists, not Collections. Lists
should make it possible to support the following methods:
> {code}
> // Gets value at a certain index for a key.
> V get(K, index);
> // Gets all values for a collection of keys at a certain index.
> Map<K, V> getAll(Collection<K>, index);
> // Gets values for specified indexes for a key.
> List<V> get(K, Iterable<Integer> indexes);
> // Gets all values for a collection of keys at specified indexes.
> Map<K, Collection<V>> getAll(Collection<K>, Iterable<Integer>
indexes);
> // Gets values for specified range of indexes, between min and max.
> List<V> get(K, int min, int max);
> // Gets all values for a collection of keys for a specified index range, between min
and max.
> Map<K, Collection<V>> getAll(Collection<K>, int min, int max);
> // Gets all values for a specific key.
> List<V> get(K);
> // Gets all values for a collection of keys.
> Map<K, List<V>> getAll(Collection<K>);
> // Iterate through all elements with a certain index.
> Iterator<Map.Entry<K, V>> iterate(int idx);
> // Do we need this?
> Collection<IgniteTuple<Integer V>> get(K, IgniteBiPredicate<Integer, V>)
> {code}
> Multimap should also support colocated and non-colocated modes, similar to [IgniteQueue|https://github.com/apache/incubator-ignite/blob/master/modules/core/src/main/java/org/apache/ignite/IgniteQueue.java]
and its implementation, [GridAtomicCacheQueueImpl|https://github.com/apache/incubator-ignite/blob/master/modules/core/src/main/java/org/apache/ignite/internal/processors/datastructures/GridAtomicCacheQueueImpl.java].
> h2. Design Details
> The most natural way to implement such map, would be to store every value under a separate
key in an Ignite cache. For example, let's say that we have a key {{K}} with multiple values:
{{V0, V1, V2, ...}}. Then the cache should end up with the following values {{K0, V0}}, {{K1,
V1}}, {{K2, V2}}, etc. This means that we need to wrap user key into our own, internal key,
which will also have {{index}} field. 
> Also note that we need to collocate all the values for the same key on the same node,
which means that we need to define user key K as the affinity key, like so:
> {code}
> class MultiKey<K> {
>     @CacheAffinityMapped
>     private K key;
>     int index;
> }
> {code}
> Look ups of values at specific indexes becomes very simple. Just attach a specific index
to a key and do a cache lookup. Look ups for all values for a key should work as following:
> {code}
> MultiKey key;
> V v = null;
> int index = 0;
> List<V> res = new LinkedList<>();
> do {
>     v = cache.get(MultiKey(K, index));
>     if (v != null)
>         res.add(v);
>     index++;
> }
> while (v != null);
> return res;
> {code}
> We could also use batching for performance reason. In this case the batch size should
be configurable.
> {code}
> int index = 0;
> List<V> res = new LinkedList<>();
> while (true) {
>     List<Key> batch = new ArrayList<>(batchSize);
>     // Populate batch.
>     for (; index < batchSize; index++)
>         batch.add(new MultiKey(K, index % batchSize);
>     Map<Key, V> batchRes = cache.getAll(batch);
>     // Potentially need to properly sort values, based on the key order,
>     // if the returning map does not do it automatically.
>     res.addAll(batchRes.values());
>     if (res.size() < batch.size())
>         break;
> }
> return res;
> {code}
> h2. Evictions
> Evictions in the {{IgniteMultiMap}} should have 2 levels: maximum number of keys, and
maximum number of values for a key. The maximum number of keys should be controlled by Ignite
standard eviction policy. The maximum number of values for a key should be controlled by the
implementation of the multi-map. Either eviction parameter should be configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message