ignite-user mailing list archives

From Neeraj Vaidya <neeraj.vai...@yahoo.co.in>
Subject Re: Real-time computing
Date Fri, 24 Feb 2017 00:08:11 GMT
Thanks Andrey,

1) If I understand correctly, the failover feature of the ComputeGrid mitigates an SPOF
for the compute jobs, i.e. the Callables/Runnables/closures which are distributed to multiple
nodes. But my goal is also to mitigate failure of the client node, which is responsible
for reading the external files and creating the collection of compute jobs. Can the FailoverSpi
handle that as well? My pseudo-code for the client node is as follows:
- Check for files in the filesystem.
- If any are present, then for each line in the file, create a compute job. (If I understand
correctly, this is the piece of code which falls under the scope of the FailoverSpi.)
- Finally, loop back to wait for any more files.
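
For concreteness, the client loop above can be sketched in plain Java. This is only a rough illustration under stated assumptions: a local ExecutorService stands in for the grid, and the class name CdrClientLoopSketch, the method names and the field-counting job are all made up for the example; none of it is Ignite API. Files are deleted only after all of their jobs have completed, so if the client dies mid-pass the files are still on disk for a restarted (or second) client to pick up, at the cost of possibly processing some records twice.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

class CdrClientLoopSketch {
    /** Hypothetical per-record job: it just counts the comma-separated fields in a line. */
    static Callable<Integer> jobFor(String cdrLine) {
        return () -> cdrLine.split(",").length;
    }

    /** One pass of the loop: create a job per line of every file in dir and run them all. */
    static List<Integer> processOnce(Path dir, ExecutorService pool)
            throws IOException, InterruptedException, ExecutionException {
        List<Path> files = new ArrayList<>();
        List<Callable<Integer>> jobs = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path file : stream) {
                files.add(file);
                for (String line : Files.readAllLines(file)) {
                    jobs.add(jobFor(line));
                }
            }
        }
        // invokeAll blocks until every job is done and keeps the submission order.
        List<Integer> results = new ArrayList<>();
        for (Future<Integer> f : pool.invokeAll(jobs)) {
            results.add(f.get());
        }
        // Remove the inputs only once every job has succeeded.
        for (Path file : files) {
            Files.delete(file);
        }
        return results;
    }
}
```

The "loop back" step would simply call processOnce repeatedly with a pause in between.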

2) Coming to my second question: let's say I cache the CDR file records/entries in a certain
cache, e.g. "CDRFileCache". I then run 5 nodes, each with a listener waiting for new entries
to be added to this cache.
- If I stream 3 entries into this cache, one after another, will every listener process all
3 entries, i.e. will entries 1, 2 and 3 each be processed by listeners 1, 2, 3, 4 and 5?
- Or is it that if listener1 is processing entry1, then entry1 will not be processed by any
other listener, because listener1 has already started processing it?
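
To make the two alternatives concrete, here is a small plain-Java sketch of both delivery semantics. It is not a statement about what Ignite actually does, and nothing in it is Ignite API; the class ListenerSemanticsSketch and its methods are invented for the illustration. The second variant assumes each listener tries to atomically "claim" an entry before working on it, here with ConcurrentHashMap.putIfAbsent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ListenerSemanticsSketch {
    /** Alternative 1: every listener handles every entry. Returns "listener:entry" pairs. */
    static List<String> broadcast(List<String> listeners, List<String> entries) {
        List<String> handled = new ArrayList<>();
        for (String entry : entries) {
            for (String listener : listeners) {
                handled.add(listener + ":" + entry);
            }
        }
        return handled;
    }

    /** Alternative 2: every listener sees every entry, but only the first claimant handles it. */
    static List<String> claimOnce(List<String> listeners, List<String> entries) {
        Map<String, String> owner = new ConcurrentHashMap<>();
        List<String> handled = new ArrayList<>();
        for (String entry : entries) {
            for (String listener : listeners) {
                // putIfAbsent is atomic; a null return means this listener won the claim.
                if (owner.putIfAbsent(entry, listener) == null) {
                    handled.add(listener + ":" + entry);
                }
            }
        }
        return handled;
    }
}
```

With 5 listeners and 3 entries, broadcast yields 15 handled pairs and claimOnce yields 3. The loops here are sequential, so the first listener in the list always wins the claim; with truly concurrent listeners the winner would be arbitrary.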

Regards,
Neeraj

--------------------------------------------
On Fri, 24/2/17, Andrey Mashenkov <andrey.mashenkov@gmail.com> wrote:

 Subject: Re: Real-time computing
 To: user@ignite.apache.org, "Neeraj Vaidya" <neeraj.vaidya@yahoo.co.in>
 Date: Friday, 24 February, 2017, 2:20 AM
 
 Hi Neeraj,
 
 1. Why do you want to use Zookeeper to mitigate an SPOF instead of the Ignite
 ComputeGrid failover features?
 2. If you need to reuse data then caching makes sense. For processing new
 entries you can use Events or Continuous Queries. You are free to choose the
 number of nodes for your grid, and you can choose which nodes will hold data
 and which nodes will be used for computations.
 
 I'm not sure I understand the last question. Could you please describe the
 last use case in more detail?
 
 On Thu, Feb 23, 2017 at 3:23 AM, Neeraj Vaidya <neeraj.vaidya@yahoo.co.in> wrote:
 Hi,
 
 I have a use case where I need to perform computation on
 records in files (specifically files containing telecom
 CDRs).
 
 On this, I have a few questions:
 
 1) Should I have just one client node which reads these
 records and creates Callable compute jobs for each record?
 With just 1 client node, I suppose this will be a
 single point of failure. I could use Zookeeper to manage a
 cluster of such nodes, thus possibly mitigating an SPOF.
 
 2) Or should I stream/load these records, using a client,
 into a cache and then have other cluster nodes read this
 cache for new entries and let them perform the
 computation? In this case, is there a way by which I can
 have only one node get hold of computing each record?
 
 Regards,
 Neeraj
 
 -- 
 Best regards,
 Andrey V. Mashenkov
