Thanks for the clarification. 

That's an interesting approach. Reusing the same prototype could make sense here, as it introduces the dynamic behaviour that you need. 

However, from what I could reproduce, one of the nodes simply runs into a deadlock: the PE cannot place a local event into its local "processing" queue because the queue is full, and it cannot take an event from the processing queue because it is blocked trying to write into that same queue.

This is due to a cycle in the PE graph, and there is no simple way to avoid the issue without losing the event. For S4 0.6 - which we are working on - we provide load-shedding stream processing executors so that events can be dropped automatically, and you also have the possibility to override and finely tune this behaviour. You might want to give it a try, though we still need to push the updated documentation to the web server. With S4 0.5 you'd need a design that prevents you from getting into that deadlock situation, and that probably means removing the cycle, thus differing from your original design.
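As a generic illustration of the load-shedding idea (this is a plain java.util.concurrent sketch, not the actual S4 executor API): a blocking put() on a full bounded queue is exactly what deadlocks in the cycle above, whereas a non-blocking offer() sheds the event and lets the node keep making progress.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class LoadSheddingSketch {
    public static void main(String[] args) {
        // A bounded queue of capacity 2 stands in for a PE's processing queue.
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(2);

        int dropped = 0;
        for (int i = 0; i < 5; i++) {
            // put() would block forever once the queue is full -- the deadlock
            // described above. offer() returns false instead, so we can shed
            // the event and keep the node making progress.
            if (!queue.offer("event-" + i)) {
                dropped++;
            }
        }
        System.out.println("queued=" + queue.size() + " dropped=" + dropped);
        // queued=2 dropped=3
    }
}
```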

Hope this helps, and thanks for the feedback!

Matthieu






On Feb 25, 2013, at 08:24 , Gowtham S wrote:

Hi Matthieu,

That was a sample design of the system, but I am trying to build a system design that changes dynamically at execution time. For example, if I give an initial number of ProcessingPEs of 10, it has to map dynamically from 10 -> 8, 8 -> 4, 4 -> 2, 2 -> 1. Whatever the initial number of ProcessingPEs given, the code will compute the mapping and send to the corresponding PE. Does S4 support this kind of dynamic mapping at execution time? The sorting example I gave was a sample with 4 -> 2, 2 -> 1. So, finally, I am trying to achieve log2(n) levels of the same PE (ProcessingPE), given n at the Details Receiver PE stage. Will S4 map this kind of system design?
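For reference, a runtime-computed key progression of this sort can be sketched as follows; nextKey is a hypothetical helper (plain integer halving, giving roughly log2(n) levels), not an S4 API:

```java
public class KeyHalvingSketch {
    // Hypothetical helper (not an S4 API): a ProcessingPE with key id i
    // forwards its partial result to the PE keyed i / 2 at the next level,
    // so each level roughly halves the number of active keys.
    static int nextKey(int id) {
        return id / 2;
    }

    public static void main(String[] args) {
        int id = 7;
        while (id > 0) {
            System.out.print(id + " -> ");
            id = nextKey(id);
        }
        System.out.println(id);
        // prints: 7 -> 3 -> 1 -> 0
    }
}
```

Since the target key is just a value computed at run time, the mapping does not have to be fixed in advance.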




On Mon, Feb 25, 2013 at 2:53 AM, Matthieu Morel <mmorel@apache.org> wrote:
Hi,

I had a quick look, and I wonder why you are not using different PE prototypes for the different layers in your system. That looks like a more adequate design, since I don't think you need to keep the state from layer 1 in layers 2 or 3: you send aggregated info through messages instead.

Regards,

Matthieu

On Feb 24, 2013, at 12:14 , Gowtham S wrote:


This is the input file for the above application.


On Sun, Feb 24, 2013 at 12:25 AM, Gowtham S <seldomgowtham@gmail.com> wrote:
Hi Matthieu,

I have no problem in the app itself. Three nodes are processing the whole data, while one has stopped after some time.

I have attached a sample project of our application. Most of the description of the application is given in the attachment; please check it if possible. I ran it on a single machine with 4 nodes, as said in the previous message.


I will describe what is actually happening:

I have a file of 18237 lines, where each line has 4 numbers. I am just trying to sort the four numbers in each line, based on the design of PEs given in the document.

Same result: one node stopped printing in the terminal while the other three continue their work.

Please see the following after reading the document.

In the document you will see the PE graph. In it, the ProcessingPE with key ID "1" and the ProcessingPE with key ID "10" are running on the same node.

The ProcessingPE with key ID "10" should receive events from the ProcessingPEs with IDs "0" and "1", but it mainly receives events from the PE with key ID "1" and not from the one with key ID "0".

So I can't get alternating events from both PEs, but I should receive events from both PEs one after the other, as I need to sort the 4 numbers of a single line. A PE in one node does not send events to PEs in other nodes as quickly as to a PE in the same node.

So the ProcessingPE with ID "10" receives more events from the PE with key ID "1", which is on the same node. What do I have to do in order to get events from both PEs, "0" and "1", one after the other, rather than mostly from "1" first and only at the end from "0"? (For example, if "10" should receive 100 events from each of "0" and "1": the first 100 are received from "1", which is on the same node as "10", while only 2 or 3 arrive from "0"; then the remaining events from "0" are received. But I should receive one event from "0", then one from "1", and so on.)

("0" and "1" are the ProcessingPEs with key IDs "0" and "1".)







On Fri, Feb 22, 2013 at 7:03 PM, Matthieu Morel <mmorel@apache.org> wrote:
Thanks for the info,

With the cluster status that you show - which looks OK - is there any problem with the app?

Also, I would again recommend checking whether all data is processed as expected, regardless of location. Can you check that?

Thanks,

Matthieu


On Feb 22, 2013, at 14:02 , Gowtham S wrote:

Sorry for the wrong thread and the GIMP format.

Here I have attached the screenshot in .png format, which shows that all 4 nodes are active.

Also, I didn't modify the hash function. Whenever I run it on 4 nodes, it always maps the 4 differently keyed PEs to four different nodes. I didn't manually change the hash function.



On Thu, Feb 21, 2013 at 1:30 AM, Matthieu Morel <mmorel@apache.org> wrote:
Hi Jihyoun,

PE graphs with cycles can certainly get a bit tricky sometimes, but mainly because they can lead to deadlocks.

I don't see how you'd lose events in the case you present if you don't lose or reconfigure nodes. In the very specific case of sending an event from PE instance a1 to the same PE instance a1, the event is simply put in an in-memory blocking queue.

It would be great to understand in which conditions you can see this issue. Could you be more specific? (I can't reproduce that with simple tests, but I don't have your own settings/code/environment).

Thanks,

Matthieu

On Feb 20, 2013, at 03:04 , JiHyoun Park wrote:

Dear Matthieu

I also experienced the same problem as Gowtham.
Unlike him, I created a downstream from a PE to itself with only one key. (It was like a flag to be turned on/off.)
I tested it and found that sometimes one PE instance does not receive the event.
I remember that it didn't depend on a specific node.

At the time, I also wondered why.
Since it's very simple logic, you can test it yourself without much effort.
It would be really appreciated if you could tell us the solution, or at least the reason.

Best Regards
Jihyoun



On Tue, Feb 19, 2013 at 11:38 PM, Matthieu Morel <mmorel@apache.org> wrote:
Hi,

Can you provide more detail about a node not running correctly? What does that mean to you - that it does not receive messages? Maybe that's normal because of the distribution of keys, or maybe you only have 3 active nodes in the cluster. (You can check that with the s4 status tool.)

The distribution of keys across nodes is a simple hash + mod, which you can override if needed.
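As a sketch of that scheme (not the actual S4 implementation), a hash + mod partitioner looks like this; note that the single-character keys "0".."3" happen to land on 4 distinct nodes out of 4, consistent with one keyed PE instance per node:

```java
public class PartitionSketch {
    // Sketch of the hash + mod scheme described above
    // (not the actual S4 code).
    static int partition(String key, int numNodes) {
        return Math.abs(key.hashCode() % numNodes);
    }

    public static void main(String[] args) {
        int numNodes = 4;
        for (String key : new String[] {"0", "1", "2", "3"}) {
            System.out.println("key " + key + " -> node " + partition(key, numNodes));
        }
        // key 0 -> node 0
        // key 1 -> node 1
        // key 2 -> node 2
        // key 3 -> node 3
    }
}
```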

I would recommend checking whether all data is processed as expected, regardless of location.

Let us know,

thanks,

Matthieu


On Feb 19, 2013, at 16:27 , Gowtham S wrote:

Hi S4 community,

I have been working with S4 piper for 3 months. Now I am trying to make a downstream from a PE to the same PE with a different key ID. 

I have a scenario with a unique PE whose downstream is also that same PE, but each time the key ID of the targeted PE may vary. This continues until a condition is satisfied, at which point it just stops.

I started implementing this on a single machine with 4 nodes, as I had 4 different keys for that unique PE (say 0, 1, 2, 3). So now every node has a clone of that PE with one individual key ID from the available key IDs. One PE may send events to PEs in other nodes, until a certain condition is reached and the process stops.
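As an aside, the "same prototype, varying key" loop can be sketched with a hypothetical targetKey rule; the cycling rule below is made up purely for illustration, the real rule would come from the application logic:

```java
import java.util.ArrayList;
import java.util.List;

public class SelfDownstreamSketch {
    // Hypothetical rule for "the key ID of the targeted PE may vary":
    // here we simply cycle through the 4 available keys.
    static int targetKey(int key) {
        return (key + 1) % 4;
    }

    public static void main(String[] args) {
        List<Integer> visited = new ArrayList<>();
        int key = 0;
        // Re-"send" the event to the same prototype under a new key until
        // a stop condition is reached (here: 8 hops).
        for (int hops = 0; hops < 8; hops++) {
            visited.add(key);
            key = targetKey(key);
        }
        System.out.println(visited);
        // [0, 1, 2, 3, 0, 1, 2, 3]
    }
}
```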

Now three of the nodes are running correctly, but one node is not. I wonder why, because all the nodes have the same computational part to execute. 

Please help me understand how S4 assigns a single PE to different nodes with different key IDs.



 


On Tue, Feb 19, 2013 at 8:22 PM, Gowtham S <seldomgowtham@gmail.com> wrote:
Hi S4 community,






<s4image.png>



<textfile.txt>