ignite-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Amir Akhmedov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (IGNITE-640) Implement IgniteMultimap data structures
Date Sun, 09 Oct 2016 18:51:20 GMT

    [ https://issues.apache.org/jira/browse/IGNITE-640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15560441#comment-15560441

Amir Akhmedov commented on IGNITE-640:

I had a question about implementation details of this ticket.

In details it's said "The most natural way to implement such map, would be to store every
value under a separate key in an Ignite cache"

1. How about memory footprint in this approach? We need to create a separate cache key for
every value in multimap rather backing all the value for key under some data structure?
2. Retrieving the data by index for key will be much difficult in case of value removal e.g.
{{multimap.put(key, "one");
multimap.put(key, "two");
multimap.put(key, "three");
multimap.remove(key, "two");}}
In this case value "three" will have index 1 by Multimap API but internally (Ignite side)
it has index 2 and need to take extra steps to handle such cases which hurt performance

For me, backing structure for values in ArrayList will be optimal way to go. Please, advise?

> Implement IgniteMultimap data structures
> ----------------------------------------
>                 Key: IGNITE-640
>                 URL: https://issues.apache.org/jira/browse/IGNITE-640
>             Project: Ignite
>          Issue Type: Sub-task
>          Components: data structures
>            Reporter: Dmitriy Setrakyan
>            Assignee: Amir Akhmedov
> We need to add {{IgniteMultimap}} data structure in addition to other data structures
provided by Ignite. {{IgniteMultiMap}} should have similar API to {{java.util.Map}} class
in JDK, but support the semantics of multiple values per key, similar to [Guava Multimap|http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/collect/Multimap.html].

> However, unlike in Guava, our multi-map should work with Lists, not Collections. Lists
should make it possible to support the following methods:
> {code}
> // Gets value at a certain index for a key.
> V get(K, index);
> // Gets all values for a collection of keys at a certain index.
> Map<K, V> getAll(Collection<K>, index);
> // Gets values for specified indexes for a key.
> List<V> get(K, Iterable<Integer> indexes);
> // Gets all values for a collection of keys at specified indexes.
> Map<K, Collection<V>> getAll(Collection<K>, Iterable<Integer>
> // Gets values for specified range of indexes, between min and max.
> List<V> get(K, int min, int max);
> // Gets all values for a collection of keys for a specified index range, between min
and max.
> Map<K, Collection<V>> getAll(Collection<K>, int min, int max);
> // Gets all values for a specific key.
> List<V> get(K);
> // Gets all values for a collection of keys.
> Map<K, List<V>> getAll(Collection<K>);
> // Iterate through all elements with a certain index.
> Iterator<Map.Entry<K, V>> iterate(int idx);
> // Do we need this?
> Collection<IgniteTuple<Integer V>> get(K, IgniteBiPredicate<Integer, V>)
> {code}
> Multimap should also support colocated and non-colocated modes, similar to [IgniteQueue|https://github.com/apache/incubator-ignite/blob/master/modules/core/src/main/java/org/apache/ignite/IgniteQueue.java]
and its implementation, [GridAtomicCacheQueueImpl|https://github.com/apache/incubator-ignite/blob/master/modules/core/src/main/java/org/apache/ignite/internal/processors/datastructures/GridAtomicCacheQueueImpl.java].
> h2. Design Details
> The most natural way to implement such map, would be to store every value under a separate
key in an Ignite cache. For example, let's say that we have a key {{K}} with multiple values:
{{V0, V1, V2, ...}}. Then the cache should end up with the following values {{K0, V0}}, {{K1,
V1}}, {{K2, V2}}, etc. This means that we need to wrap user key into our own, internal key,
which will also have {{index}} field. 
> Also note that we need to collocate all the values for the same key on the same node,
which means that we need to define user key K as the affinity key, like so:
> {code}
> class MultiKey<K> {
>     @CacheAffinityMapped
>     private K key;
>     int index;
> }
> {code}
> Look ups of values at specific indexes becomes very simple. Just attach a specific index
to a key and do a cache lookup. Look ups for all values for a key should work as following:
> {code}
> MultiKey key;
> V v = null;
> int index = 0;
> List<V> res = new LinkedList<>();
> do {
>     v = cache.get(MultiKey(K, index));
>     if (v != null)
>         res.add(v);
>     index++;
> }
> while (v != null);
> return res;
> {code}
> We could also use batching for performance reason. In this case the batch size should
be configurable.
> {code}
> int index = 0;
> List<V> res = new LinkedList<>();
> while (true) {
>     List<Key> batch = new ArrayList<>(batchSize);
>     // Populate batch.
>     for (; index < batchSize; index++)
>         batch.add(new MultiKey(K, index % batchSize);
>     Map<Key, V> batchRes = cache.getAll(batch);
>     // Potentially need to properly sort values, based on the key order,
>     // if the returning map does not do it automatically.
>     res.addAll(batchRes.values());
>     if (res.size() < batch.size())
>         break;
> }
> return res;
> {code}
> h2. Evictions
> Evictions in the {{IgniteMultiMap}} should have 2 levels: maximum number of keys, and
maximum number of values for a key. The maximum number of keys should be controlled by Ignite
standard eviction policy. The maximum number of values for a key should be controlled by the
implementation of the multi-map. Either eviction parameter should be configurable.

This message was sent by Atlassian JIRA

View raw message