ignite-user mailing list archives

From Vladimir Ozerov <voze...@gridgain.com>
Subject Re: Map-reduce proceesing
Date Wed, 20 Apr 2016 12:07:44 GMT

If you broadcast the job and iterate over the cache inside it, make sure
that each job iterates only over local entries (e.g.
IgniteCache.localEntries(), ScanQuery.setLocal(true), etc.). Otherwise every
node will scan the whole cache, the jobs will duplicate each other's work,
and performance will suffer.
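
To make the duplication concrete, here is a minimal plain-Java model of a broadcast job. It does not use the Ignite API: the `Node` class, the key-modulo partitioning, and the age filter are all assumptions made for illustration. Each simulated node runs the same job once, as a broadcast would. Iterating only the local partition visits every entry once cluster-wide, while iterating the whole data set from every node visits every entry once per node and returns duplicate hits.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BroadcastModel {
    /** Hypothetical stand-in for a cluster node holding its share of the cache. */
    static final class Node {
        final Map<Integer, Integer> local = new HashMap<>(); // key -> age
    }

    /** Number of entries visited across all nodes during the last run. */
    static int visited;

    /** Splits keys 0..entries-1 across nodes by key modulo (an assumed partitioning). */
    static List<Node> cluster(int nodes, int entries) {
        List<Node> result = new ArrayList<>();
        for (int i = 0; i < nodes; i++) result.add(new Node());
        for (int key = 0; key < entries; key++) {
            result.get(key % nodes).local.put(key, key % 100);
        }
        return result;
    }

    /** Map step: the job one node runs over the entries it was given. */
    static List<Integer> job(Map<Integer, Integer> entries, int minAge) {
        List<Integer> hits = new ArrayList<>();
        for (Map.Entry<Integer, Integer> e : entries.entrySet()) {
            visited++;
            if (e.getValue() >= minAge) hits.add(e.getKey());
        }
        return hits;
    }

    /** Runs the job once per node, then reduces by concatenating the per-node results. */
    static List<Integer> broadcast(List<Node> nodes, boolean localOnly, int minAge) {
        visited = 0;
        Map<Integer, Integer> all = new HashMap<>();
        for (Node n : nodes) all.putAll(n.local);

        List<Integer> reduced = new ArrayList<>();
        for (Node n : nodes) {
            reduced.addAll(job(localOnly ? n.local : all, minAge));
        }
        return reduced;
    }

    public static void main(String[] args) {
        List<Node> nodes = cluster(4, 1000);

        int localHits = broadcast(nodes, true, 90).size();
        System.out.println("local: visited=" + visited + " hits=" + localHits);
        // prints "local: visited=1000 hits=100"

        int fullHits = broadcast(nodes, false, 90).size();
        System.out.println("full:  visited=" + visited + " hits=" + fullHits);
        // prints "full:  visited=4000 hits=400"
    }
}
```

In Ignite terms, the `localOnly` branch corresponds to what IgniteCache.localEntries() or ScanQuery.setLocal(true) gives each job.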

Also note that the returned result set may be incomplete if one of the
nodes fails during job processing. If that matters to you, either implement
failover yourself or use Ignite's built-in queries (ScanQuery, SqlQuery),
which already take care of it.

In any case, I strongly recommend focusing on SqlQuery first. You can
configure indexes on the cache, and they can give you a significant boost:
instead of iterating over the whole cache, Ignite will use the indexes for
fast data lookup.
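
As a rough illustration of why an index helps, the following plain-Java sketch compares a full scan with a lookup through a sorted secondary index. It is a model only and says nothing about how Ignite implements its indexes: the `IndexModel` class, its fields, and the age predicate are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class IndexModel {
    /** Primary data: key -> age. */
    final Map<Integer, Integer> data = new HashMap<>();
    /** Sorted secondary index: age -> keys that currently have that age. */
    final TreeMap<Integer, List<Integer>> byAge = new TreeMap<>();
    /** Number of entries examined by the last query. */
    int examined;

    /** Stores the entry and keeps the index in sync, including on updates. */
    void put(int key, int age) {
        Integer old = data.put(key, age);
        if (old != null) byAge.get(old).remove(Integer.valueOf(key));
        byAge.computeIfAbsent(age, a -> new ArrayList<>()).add(key);
    }

    /** Full scan: every entry is examined, whether it matches or not. */
    List<Integer> scan(int minAge) {
        examined = 0;
        List<Integer> hits = new ArrayList<>();
        for (Map.Entry<Integer, Integer> e : data.entrySet()) {
            examined++;
            if (e.getValue() >= minAge) hits.add(e.getKey());
        }
        return hits;
    }

    /** Indexed lookup: only the matching range of the sorted index is walked. */
    List<Integer> indexed(int minAge) {
        examined = 0;
        List<Integer> hits = new ArrayList<>();
        for (List<Integer> keys : byAge.tailMap(minAge, true).values()) {
            examined += keys.size();
            hits.addAll(keys);
        }
        return hits;
    }

    public static void main(String[] args) {
        IndexModel model = new IndexModel();
        for (int key = 0; key < 1000; key++) model.put(key, key % 100);

        int scanHits = model.scan(99).size();
        System.out.println("scan:    examined=" + model.examined + " hits=" + scanHits);
        // prints "scan:    examined=1000 hits=10"

        int indexHits = model.indexed(99).size();
        System.out.println("indexed: examined=" + model.examined + " hits=" + indexHits);
        // prints "indexed: examined=10 hits=10"
    }
}
```

Both queries return the same keys; the difference is how many entries are touched to find them, which is the effect to look for when comparing SQL with and without indexes.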


On Wed, Apr 20, 2016 at 12:31 PM, dmreshet <dmreshet@gmail.com> wrote:

> Yes, I know.
> I want to compare the performance of SQL, SQL with indexes, and a MapReduce
> job.
> I have found that I can use broadcast to guarantee that my MapReduce job
> will be executed on each node exactly once.
> So now my job uses this code:
> Collection<List<Person>> result =
> ignite.compute(ignite.cluster()).broadcast((IgniteCallable<List<Person>>)
> () -> {...});
> And then I will reduce the result.
> Is that the best practice for implementing a MapReduce job when I need to
> process data from the cache?
> --
> View this message in context:
> http://apache-ignite-users.70518.x6.nabble.com/Map-reduce-proceesing-tp4357p4364.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
