From: Amit Shah
Date: Thu, 28 Jan 2016 21:58:17 +0530
Subject: Re: What-if analysis with apex
To: users@apex.incubator.apache.org
Reply-To: users@apex.incubator.apache.org
Mailing-List: contact users-help@apex.incubator.apache.org; run by ezmlm
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=001a114329e2231a45052a676961

--001a114329e2231a45052a676961
Content-Type: text/plain; charset=UTF-8

Please see my responses below.

> 1. Loading values for unmodified cells
> What is the source of these unmodified cells?

Table values. Taking an example from the diagram: if the user modifies the
cell with identifier (table 1, row 1, column 1), we would have to load the
values of the unmodified cells (table 2, row 2, column 2) and (table 4,
row 4, column 4) to recalculate the values of the other cells.

> 3. Execute the cells in parallel (if possible)
> Which cells are you referring to? Table1, row 1, column 1 (that is, the
> cells that are changed and will trigger recalculation of dependent cells)
> or the two dependent cells?

Modifying the cell with identifier (table 1, row 1, column 1) would trigger
recalculation of the cell values (table 3, row 3, column 3) and (table 6,
row 6, column 6). In this example we cannot do parallel evaluations, but
you could imagine a case where parallel calculations are possible.

Thanks,
Amit.

On Thu, Jan 28, 2016 at 9:20 PM, Sandeep Deshmukh wrote:

> Thanks Amit. We have a better understanding of your requirements now.
>
> It is not necessary that each cell will be one operator. Please don't get
> biased by that assumption.
>
> Here are a few more queries.
>
> 1. Loading values for unmodified cells
> What is the source of these unmodified cells?
>
> 3. Execute the cells in parallel (if possible)
> Which cells are you referring to? Table1, row 1, column 1 (that is, the
> cells that are changed and will trigger recalculation of dependent cells)
> or the two dependent cells?
>
> Regards
> Sandeep
> On 28-Jan-2016 8:20 pm, "Amit Shah" wrote:
>
>> Thanks Sandeep for the follow up. I have tried responding to your
>> queries. Kindly let me know if that gives you an idea of what I am trying
>> to achieve.
>>
>>> how you will be representing your dependencies in a graph
>>
>> Attached a sample dependency graph.
>> I was assuming each cell to be represented as an operator in Apex terms
>> so that they could be executed in parallel.
>>
>>> How many such dependency graphs will be there?
>>
>> The total number of graphs would be approximately equal to the number of
>> rows that could be modified by the user (considering the worst case). The
>> number should be in the 1000's.
>>
>>> Do you have one graph per change of cell defining its dependent cells? So,
>>> for the example you mentioned, do you define O1 and its dependent cells as
>>> one graph? Then there is another graph which defines what values are
>>> updated if some other cell O7 is updated.
>>
>> Yes, approximately one graph per cell. The dependency graph I have tried
>> presenting in the attached diagram could be executed if any of the cell
>> values in table 1, 2 or 4 are updated. For simplicity I have picked
>> cells from distinct tables.
>>
>> In my view, once the user sees the tables on the UI, we could create the
>> dependency graphs in the background. Once he/she updates a cell value, our
>> application would figure out its corresponding dependency graph and start
>> its execution by
>> 1. Loading values for unmodified cells
>> 2. Determining the cells (or operators) that are to be recalculated. For
>>    example, if the cell with identifier table1, row 1, column 1 is updated,
>>    the application would determine that 2 cell values are to be updated.
>> 3. Executing the cells in parallel (if possible)
>> 4. Rendering the updated values in real time to the user.
>>
>> Thanks,
>> Amit.
>>
>> On Thu, Jan 28, 2016 at 7:28 PM, Sandeep Deshmukh <
>> sandeep@datatorrent.com> wrote:
>>
>>> Hi Amit,
>>>
>>> Your concern is that a change of one cell is going to trigger updates for
>>> a large number of cells, and you are interested in doing this in parallel
>>> to get a real-time response. This can be very well achieved using Apex.
>>>
>>> I think we are still not very clear on your use case and hence what we
>>> have proposed may not match what you are looking for.
>>>
>>> We would like to know how you will be representing your dependencies in
>>> a graph. How many such dependency graphs will be there? Do you have one
>>> graph per change of cell defining its dependent cells? So, for the example
>>> you mentioned, do you define O1 and its dependent cells as one graph? Then
>>> there is another graph which defines what values are updated if some other
>>> cell O7 is updated.
>>>
>>> Once we fully understand your requirements, we should be able to guide
>>> you better.
>>>
>>> Regards,
>>> Sandeep
>>>
>>> On Thu, Jan 28, 2016 at 2:56 PM, Amit Shah wrote:
>>>
>>>> Ashwin, below are follow up queries that I have based on your response.
>>>>
>>>>> The store I mentioned is just an abstraction. It can be in memory
>>>>> store, or a cache backed lookup from a database.
>>>>
>>>> Yes, I understand the term store, but I didn't follow the need for it.
>>>>
>>>>> How does your UI interact with your server today?
>>>>
>>>> Our UI is built over angularjs, so it communicates with the server
>>>> through REST APIs.
>>>>
>>>>> You don't have to create a new DAG for each cell you are changing. You
>>>>> can have a single DAG running and send across your query with the cell
>>>>> changes in the schema you define. You can perform all corresponding changes
>>>>> for other cells/table rows in the store operator.
>>>>
>>>> I was under the impression that by defining one operator per column
>>>> index I could take advantage of Apex running individual operators on
>>>> individual JVMs, and hence get parallel writes with real-time or near
>>>> real-time response time. If we have a single static DAG that accepts the
>>>> cell identifier (row id, column index and table id) as parameters, then we
>>>> would not be able to concurrently update cell values, right?
>>>> If your understanding is different from the flow I explained in my
>>>> previous mail, what do I gain by using Apex?
>>>>
>>>> Thanks,
>>>> Amit.
>>>>
>>>> On Thu, Jan 28, 2016 at 12:51 AM, Ashwin Chandra Putta <
>>>> ashwinchandrap@gmail.com> wrote:
>>>>
>>>>> Amit,
>>>>>
>>>>> The store I mentioned is just an abstraction. It can be in memory
>>>>> store, or a cache backed lookup from a database.
>>>>>
>>>>> For the query/query response, when interacting with a UI - you can
>>>>> send your queries to the query operator and listen for the response from
>>>>> the query response operator. Historically we have used JSON over
>>>>> websockets to interact from the browser. How does your UI interact with
>>>>> your server today?
>>>>>
>>>>> You don't have to create a new DAG for each cell you are changing. You
>>>>> can have a single DAG running and send across your query with the cell
>>>>> changes in the schema you define. You can perform all corresponding changes
>>>>> for other cells/table rows in the store operator.
>>>>>
>>>>> If you still want to depend completely on your existing server for
>>>>> loading initial data, then you can load it to a cache in store and do your
>>>>> analysis on that data in memory.
>>>>>
>>>>> Regards,
>>>>> Ashwin.
>>>>>
>>>>> On Wed, Jan 27, 2016 at 7:42 AM, Amol Kekre wrote:
>>>>>
>>>>>> Amit,
>>>>>> Here are some answers
>>>>>> - Logic that you want to run can be coded as a utility that is then
>>>>>>   invoked by any other operator
>>>>>> - populateDAG() is today part of roll out of the app, i.e. it is
>>>>>>   similar to "compileTime" and not "runTime". You could do runTime, but
>>>>>>   then you will need to go through dtcli. Today runTime changes via dtcli
>>>>>>   will need a lot more coding. A very early version of runTime changes
>>>>>>   (based on system metrics) exists, but the ask is for changes based on
>>>>>>   application data. That ask is in the roadmap of module rollout (phase II?)
>>>>>>   and others can comment on the roadmap for runTime populateDAG.
>>>>>> - Outputs of many operators can be streamed as input to one operator
>>>>>>   in the following ways
>>>>>>    - Each output having a different schema will mean different input
>>>>>>      ports on that operator, as port schema is fixed. This is fine, but
>>>>>>      will clutter the DAG
>>>>>>    - If the schema of these output ports is the same, there is a merge
>>>>>>      operator that does that (
>>>>>>      https://github.com/apache/incubator-apex-malhar/blob/master/library/src/main/java/com/datatorrent/lib/stream/StreamMerger.java).
>>>>>>      You can write one for Nx1 merge by extending the above class.
>>>>>>
>>>>>> Thks,
>>>>>> Amol
>>>>>>
>>>>>> On Wed, Jan 27, 2016 at 6:03 AM, Amit Shah wrote:
>>>>>>
>>>>>>> Thanks Ashwin for the follow up.
>>>>>>> I am not sure if I completely follow the query -> store -> query
>>>>>>> pattern. What does query mean here? Why would we need an in-memory store?
>>>>>>> Trying to list down the flow, I came up with the points below
>>>>>>>
>>>>>>> 1. We need to build a DAG after we get to know the cell (table,
>>>>>>>    row and column index) that is modified by the user.
>>>>>>> 2. Once we receive user input (i.e. once the user modifies a
>>>>>>>    value in a table) the populateDAG() method should be called.
>>>>>>> 3. The populateDAG() implementation would
>>>>>>>    1. Determine what cells should be updated across all tables
>>>>>>>    2. Create an Operator per cell that is affected by the
>>>>>>>       change. From the demo code I see the dag.addOperator method
>>>>>>>       instantiating an operator. Since the logic to update a cell
>>>>>>>       would be the same across tables, how do we create new operators
>>>>>>>       per cell to have a graph that looks like what Bhupesh envisioned
>>>>>>>       in his last email reply?
>>>>>>>       In my view the graph would look like
>>>>>>>
>>>>>>>       O1 (for user modified cell) -> O2 (table X, row Y, column index 2) -> O5 (table E, row F, column index 10000)
>>>>>>>                                      O3 (table M, row N, column index 3)                                            -> O6 (update UI)
>>>>>>>                                      O4 (table P, row Q, column index 1)
>>>>>>>
>>>>>>>    3. We want the DAG to be evaluated instantly once the
>>>>>>>       populateDAG() method finishes. How do we do it?
>>>>>>>    4. Can outputs from many operators be streamed as an
>>>>>>>       input to one operator? In the above example, outputs from O3, O4
>>>>>>>       and O5 need to go to O6.
>>>>>>>
>>>>>>> I appreciate your inputs on this.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Amit.
>>>>>>>
>>>>>>> On Wed, Jan 27, 2016 at 1:49 PM, Ashwin Chandra Putta <
>>>>>>> ashwinchandrap@gmail.com> wrote:
>>>>>>>
>>>>>>>> Amit,
>>>>>>>>
>>>>>>>> Thanks for the response. You can use the query --> store --> query
>>>>>>>> result pattern to do the real time updates and lookups for what-if
>>>>>>>> analysis.
>>>>>>>>
>>>>>>>> And you can also ingest your real time input data to the store
>>>>>>>> operator. input --> store.
>>>>>>>>
>>>>>>>> That way, you can keep ingesting your data into the store operator
>>>>>>>> where you will keep your OLAP dimensions and measures.
>>>>>>>>
>>>>>>>> For the query/query result pattern example, see this demo:
>>>>>>>>
>>>>>>>> https://github.com/apache/incubator-apex-malhar/blob/master/demos/mobile/src/main/java/com/datatorrent/demos/mobile/Application.java
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Ashwin.
>>>>>>>>
>>>>>>>> On Tue, Jan 26, 2016 at 9:52 PM, Amit Shah wrote:
>>>>>>>>
>>>>>>>>> Appreciate the discussion we are having on this topic.
>>>>>>>>>
>>>>>>>>> Bhupesh, if I understand the flow correctly, we would have to
>>>>>>>>> define one DAG per cell in the table that could be modified by the user.
>>>>>>>>> Given this, it would be right to define the DAG only when the table is
>>>>>>>>> presented to the user on the UI (not at definition time, since there
>>>>>>>>> would be many tables). Would it be possible to define the DAG at runtime,
>>>>>>>>> i.e. defining & wiring the operators at runtime?
>>>>>>>>>
>>>>>>>>> Ashwin, I am glad to answer these questions
>>>>>>>>>
>>>>>>>>> 1. We are extending our OLTP based application by introducing
>>>>>>>>>    analytical features that include what-if kind of analysis. Other
>>>>>>>>>    features include performing OLAP kind of operations like aggregation,
>>>>>>>>>    slice & dice, drill down/up, pivoting. Our first milestone is to target
>>>>>>>>>    what-if kind of analysis. We don't have any implementation so far. We
>>>>>>>>>    are exploring solutions to these requirements
>>>>>>>>> 2. The technical challenges we have include having an in-memory
>>>>>>>>>    calculation engine that supports parallel writes and provides real
>>>>>>>>>    time or near real time response time.
>>>>>>>>>
>>>>>>>>> Hope that answers your queries.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Amit.
>>>>>>>>>
>>>>>>>>> On Mon, Jan 25, 2016 at 10:26 PM, Ashwin Chandra Putta <
>>>>>>>>> ashwinchandrap@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Amit,
>>>>>>>>>>
>>>>>>>>>> I have a couple of questions, if it's not too much.
>>>>>>>>>>
>>>>>>>>>> 1. What is the current implementation?
>>>>>>>>>> 2. What are the challenges you are facing?
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Ashwin.
>>>>>>>>>>
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> I am trying to evaluate Apache Apex for building an application
>>>>>>>>>> that provides what-if analysis support to users. This correlates closely
>>>>>>>>>> with Excel-like functionality where changing a value in one cell
>>>>>>>>>> triggers changes in other cell values. In our case we would have multiple
>>>>>>>>>> rows in various tables getting updated when the user changes a row value.
>>>>>>>>>> The response needs to be in real-time or near real-time.
>>>>>>>>>>
>>>>>>>>>> Does Apex fit such a use case? If so, what would be some of the
>>>>>>>>>> initial steps to evaluate it for this use case?
>>>>>>>>>>
>>>>>>>>>> Thanks!
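
The four-step flow discussed in this thread (load the unmodified values, determine the affected cells, evaluate them in parallel where possible, return the updated values) could be sketched against a single long-lived dependency graph instead of one DAG per cell. The block below is a minimal plain-Java illustration and does not use the Apex API. The class `WhatIfSheet`, its methods, the cell identifiers and the two formulas are all hypothetical, since the attached diagram is not part of this message; only the dependency shape (cell 1 feeds cell 3, which feeds cell 6, with cells 2 and 4 unmodified) follows the example above.

```java
import java.util.*;
import java.util.function.Function;

/** Hypothetical sketch: one static dependency graph, recalculated per cell edit. */
class WhatIfSheet {
    /** A derived cell: the cells it reads and the formula combining them. */
    static final class Formula {
        final List<String> inputs;
        final Function<Map<String, Double>, Double> fn;
        Formula(List<String> inputs, Function<Map<String, Double>, Double> fn) {
            this.inputs = inputs;
            this.fn = fn;
        }
    }

    private final Map<String, Double> store = new HashMap<>();      // current cell values
    private final Map<String, Formula> formulas = new TreeMap<>();  // derived cells only

    void put(String cell, double value) { store.put(cell, value); }

    void define(String cell, List<String> inputs, Function<Map<String, Double>, Double> fn) {
        formulas.put(cell, new Formula(inputs, fn));
    }

    /** Groups the transitive dependents of the changed cell into evaluation levels. */
    List<List<String>> levels(String changed) {
        Set<String> affected = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(changed);
        while (!queue.isEmpty()) {
            String c = queue.poll();
            for (Map.Entry<String, Formula> e : formulas.entrySet()) {
                if (e.getValue().inputs.contains(c) && affected.add(e.getKey())) {
                    queue.add(e.getKey());
                }
            }
        }
        List<List<String>> result = new ArrayList<>();
        Set<String> pending = new LinkedHashSet<>(affected);
        while (!pending.isEmpty()) {
            List<String> ready = new ArrayList<>();
            for (String c : pending) {
                // Ready when none of its inputs is still waiting to be recalculated.
                if (Collections.disjoint(formulas.get(c).inputs, pending)) ready.add(c);
            }
            if (ready.isEmpty()) throw new IllegalStateException("cyclic dependency");
            Collections.sort(ready);
            pending.removeAll(ready);
            result.add(ready);
        }
        return result;
    }

    /** Applies a user edit and returns only the cells that were recalculated. */
    Map<String, Double> update(String cell, double value) {
        store.put(cell, value);
        Map<String, Double> changed = new LinkedHashMap<>();
        for (List<String> level : levels(cell)) {
            // Cells in one level read only unmodified cells or earlier levels,
            // so they are candidates for parallel evaluation.
            for (String c : level) {
                double v = formulas.get(c).fn.apply(store);
                store.put(c, v);
                changed.put(c, v);
            }
        }
        return changed;
    }

    /** Builds the example from this thread with assumed values and formulas. */
    static WhatIfSheet demo() {
        WhatIfSheet s = new WhatIfSheet();
        s.put("t1.r1.c1", 10.0);
        s.put("t2.r2.c2", 5.0);
        s.put("t4.r4.c4", 2.0);
        s.define("t3.r3.c3", Arrays.asList("t1.r1.c1", "t2.r2.c2"),
                 m -> m.get("t1.r1.c1") + m.get("t2.r2.c2"));
        s.define("t6.r6.c6", Arrays.asList("t3.r3.c3", "t4.r4.c4"),
                 m -> m.get("t3.r3.c3") * m.get("t4.r4.c4"));
        return s;
    }

    public static void main(String[] args) {
        WhatIfSheet sheet = demo();
        System.out.println(sheet.levels("t1.r1.c1"));
        System.out.println(sheet.update("t1.r1.c1", 20.0));
    }
}
```

In this example the two affected cells form a chain, so each level holds a single cell and nothing runs in parallel, which matches the remark above. Logic of this kind could plausibly be packaged as a utility and invoked from a store operator in a single running DAG, as suggested earlier in the thread, though that wiring is not shown here.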
--001a114329e2231a45052a676961--