apex-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chinmay Kolhatkar <chin...@datatorrent.com>
Subject Re: APEXCORE-1972: ExpressionEvaluator - quasi-Java Expression Language
Date Thu, 28 Jan 2016 15:52:01 GMT
Hi Sandeep/Vlad,

1. The purpose of {} or ${} is to give hints to parser about what is the
operand and what elements should be resolved to its accessible format.
I suggest for first iteration, we change the ExpressionEvaluator to support
{}, i.e. syntax similar to POJOUtils.

2. +1 for what sandeep said. We surely can add support for SQL in next
iteration. This will ofcourse be an evolving utility.

3. +1 for no default import. I'll remove the existing imports.

Finally, I want to say that this is the first iteration of Expression
Evaluator and Expression Parser.
The interface that Tim suggested allows us to take things forward now as
well as keep the doors open for further improvement.
We can keep improving in this utility as per usecase basis.

Thanks,
Chinmay.


On Thu, Jan 21, 2016 at 8:52 PM, Sandeep Deshmukh <sandeep@datatorrent.com>
wrote:

> After going through the mail chain, here are my thoughts on respective
> discussion points.
>
> 1. Removing ${} could be explored and as Vlad suggested, the expression
> then can be externally validated. +1 on this.
>
> 2. Supporting SQL is possible by extending the interface. Many of the Apex
> users are going to be Java developers and hence supporting quasi-Java
> expression language could be useful to them. Additionally, support for SQL
> for non-java developers can be considered in the next iteration of
> expression evaluator.
>
> 3. No default import to be provided by evaluator. +1 on that.
>
> 4. We can have a use case where multiple POJOs are required. For example,
> the user need to identify out of order tuples. The expression would be :
> current.id < previous.id , current and previous are place holders for the
> current tuple and the tuple processed just before the current one
> respectively.
>
> 5. Expression evaluator interface as suggested by Tim is an excellent
> feedback. This enables the user to implement the interface for SQL or for
> that matter, any other language. If we attempt to extend POJOUtils, this
> door may be closed. In addition to this, if we go ahead with POJOUtils,
> using that syntax will contradict point 1 to remove ${} as POJOUtils uses
> {} . Modifying POJOUtils may land us into backward compatibility issue.
>
> Regards
> Sandeep
>
>
>
> Regards,
> Sandeep
>
> On Tue, Jan 19, 2016 at 11:00 PM, Vlad Rozov <v.rozov@datatorrent.com>
> wrote:
>
> > 1. [CK] Please see my first mail on this thread. If there is a single
> > object type registered with parser, one can directly call field1 and not
> > need to say it input1.field1
> > [VR] My concern is not using "input1.field1", but the mandatory
> > requirement to use ${...}, so both ${input1.field1} and ${field1} have
> the
> > same problem, namely lack of IDEs support and ability to validate/debug
> > errors in the expressions. Without ${...} it will be possible to test
> Java
> > expression outside of Apex and copy/paste it to an operator for the
> > execution inside Apex platform.
> >
> > 2. [CK] Also, for a developer its easier to write an
> > expression as it is really a java language.
> > [VR] Problem is that it is not actually java language and the
> > functionality more likely to be used by not java developers.
> >
> > 3. [CK] I'm open to narrowing down the exact list of default
> classes/static
> > methods to import.
> > [VR] I vote to require all imports to be specified explicitly or delegate
> > the decision of default imports to an operator.
> >
> > 4. [CK] SO here is the thought process I had when I added multiple POJO
> > support. More than one object can come together to create a new object.
> All
> > I wanted to do was to expose such a functionality where the data is
> present
> > in two different object and expression               can work on both the
> > object to create some new field. I think this is a good addition to
> > expression language. Please let me know if there is any technical reason
> to
> > remove this.
> > [VR] Don't forget that it is not a single tuple/POJO. Operators consume
> > stream of tuples or sets (taking window into consideration) of tuples. An
> > expression defines a function or a transformation that should be applied
> to
> > each tuple. I don't see how expression that operates on two or more
> > streams/sets of tuples can be defined without defining how streams/sets
> are
> > joined together and for joins SQL is a de-facto standard language. In
> case
> > join creates a pair, triple or any similar object, it is still single
> tuple
> > and I don't need multiple POJO syntax to refer to (pair.key, pair.value)
> or
> > (triple.field1, triple.field2, triple.field3).
> >
> > 5. [CK] I think it might be early to say whether ExpressionEvaluator is
> > extension to PojoUtils. PojoUtils is being used at a number of places
> and I
> > did not want to disturb that usage. Do you think it is feasible to
> consume
> > the ExpressionEvaluator now and maybe in the future merge PojoUtils and
> > ExpressionEvaluator, if that's what you're talking about?
> > [VR] I prefer to have a single class that covers extended functionality
> > over multiple classes with overlapping functionality. Other than support
> > for multiple POJOs, PojoUtils may be extended to support expression
> > evaluation and actually it already supports expressions.
> >
> >
> > On 1/19/16 06:54, Chinmay Kolhatkar wrote:
> >
> >> Hi Vlad,
> >>
> >> Thank you for your feedback. My answers/comments inline further.
> >>
> >> Thanks,
> >> Chinmay.
> >>
> >> On Tue, Jan 19, 2016 at 4:09 AM, Vlad Rozov <v.rozov@datatorrent.com>
> >> wrote:
> >>
> >> Hi Chinmay,
> >>>
> >>> Thank you for providing answers. Please see my comments in-line.
> >>>
> >>> Thank you,
> >>>
> >>> Vlad
> >>>
> >>>
> >>>
> >>> On 1/18/16 01:00, Chinmay Kolhatkar wrote:
> >>>
> >>> Hi Vlad,
> >>>>
> >>>> Please find the answers inline.
> >>>>
> >>>> Thanks,
> >>>> Chinmay.
> >>>>
> >>>>
> >>>> On Mon, Jan 18, 2016 at 1:26 AM, Vlad Rozov <v.rozov@datatorrent.com>
> >>>> wrote:
> >>>>
> >>>> I have few concerns regarding the proposal, so my vote is -0 till they
> >>>> are
> >>>>
> >>>>> resolved
> >>>>>
> >>>>> 1. Usage of quasi-Java proprietary language. Even though most of
the
> >>>>> syntax adheres to Java, usage of ${} invalidates ability to use
> general
> >>>>> IDEs to develop expressions. Debugging and verifying correctness
of
> any
> >>>>> complex expression becomes a non trivial task for application
> >>>>> developers.
> >>>>>
> >>>>>      [CK] ${} is used so as to identify the operands and replace
it
> >>>>> when
> >>>>>
> >>>> correct variables/getter methods. Though, the interface is created in
> >>>> such
> >>>> a way that one can override this. So top level interfaces are:
> >>>>       Expression - Object of executable expression.
> >>>>       ExpressionParser - One which parses expression.  Comcrete
> example
> >>>> is
> >>>> JavaExpressionParser
> >>>>       ExpressionEvaluator - One which converts parsed expression into
> an
> >>>> executable expression. ExpressionParser goes as a plugin to
> >>>> ExpressionEvaluator. Concrete impl is JavaExpressionEvaluator.
> >>>>
> >>>>      Because of this interface one is free to override the parsing and
> >>>> evaluation logic.
> >>>>      Also JavaExpressionParser has a setter method using which one can
> >>>> override the regex for ${}.
> >>>>
> >>>> [VR] My concerns are mostly for concrete Expression language proposal
> -
> >>> namely quasi-Java expression language as I did not have a chance to
> look
> >>> into ExpressionParser/ExpressionEvaluator interface proposal.
> >>>
> >>> Is it mandatory to have ${...} to identify the operands? I understand
> >>> that
> >>> in some cases it is necessary to provide hints to a parser, but in all
> >>> your
> >>> examples usage of ${...} seems to be unnecessary. For example it should
> >>> be
> >>> equally possible to parse input1.field1 and ${input1.field1}. In
> >>> addition,
> >>> if there is no need to support multiple POJOs, "input1" is not
> necessary
> >>> as
> >>> well.
> >>> [CK] Yes, even I want to remove the use of ${..}... I'll see how I can
> >>> get
> >>> it done. The expression language supports s number of ways to reference
> >>> operands. Please see my first mail on this thread. If there is a single
> >>> object type registered with parser, one can directly call field1 and
> not
> >>> need to say it input1.field1.
> >>>
> >>
> >>
> >> 2. What other languages were considered for expressions? Why quasi-Java
> >>>> was
> >>>>
> >>>> selected in favor of for example SQL syntax?
> >>>>>
> >>>>>     [CK] The aim of choosing quasi-Java language is to provide and
> >>>>>
> >>>> complete
> >>>> functionality as well as performance.
> >>>>       As per the interface defined above, one can very well chose to
> >>>> override
> >>>> the ExpressionParser and ExpressionEvalator to even have a SQL based
> >>>> implementation for expression.
> >>>>
> >>>> [VR] OK, let me rephrase the question. Why are we starting with
> >>> quasi-Java
> >>> language implementation and not with SQL like language?
> >>>
> >>      [CK] quasi-Java language gives a quick way to parse the expression
> >> and
> >> compile using janino. Also, for a developer its easier to write an
> >> expression as it is really a java language. But as I said, because of
> the
> >> way interface is created, we can create a SQL type language. I'll be
> happy
> >> to work on SQL type language as a next iteration.
> >>
> >> 3. Separating imports from expressions seems as inconvenience to
> >>>>
> >>>>> expression developers to me. Why imports can't be specified as part
> of
> >>>>> expression definition?
> >>>>>
> >>>>>     [CK] The purpose of imports is to have some public static method
> >>>>> to be
> >>>>>
> >>>> part of expression. For eg. java.util.Math.* is statically imported
by
> >>>> default. This gives expression writer a freedom to use the static
> >>>> methods
> >>>> present in java.util.Math  out of the box in expression.
> >>>>        One can very well chose to have imports in expression itself.
> >>>>        For eg. Following is a valid expression and respective import
> in
> >>>> expression will also be considered:
> >>>>
> >>>>        import org.apache.x.y.z;
> >>>>        <expression>
> >>>>
> >>>> [VR] What was the selection criteria to include certain
> classes/packages
> >>> into default import? Why for example
> org.apache.commons.lang3.StringUtils
> >>> is included, but org.apache.commons.lang3.math.NumberUtils is not. Note
> >>> that this is just an example and for any set of default imports
> somebody
> >>> will always raise a question unless there is clear selection criteria.
> >>>
> >>      [CK] I agree, the list of defaults are mainly to include the
> default
> >> basic functionality. This is mainly influenced from spreadsheet
> >> applications to include some common function in expression. StringUtils
> &
> >> StringEscapeUtils are included to provide some basic string based
> >> functionality. NumberUtils are not included because method names in
> >> NumberUtils overlap with java.lang.Math and because they're imported as
> >> static imports the compilation failure of expression happens when the
> >> method is used.
> >>       I'm open to narrowing down the exact list of default
> classes/static
> >> methods to import.
> >>
> >>
> >> 4. Where multiple POJOs come from? If assumption is that different POJOs
> >>>
> >>>> come from different input ports, how tuples from different ports are
> >>>>> synchronized?
> >>>>>
> >>>>>     [CK] Honestly, I don't have a usecase where multiple POJOs will
> >>>>> become
> >>>>>
> >>>> part of a single expression. I've done this for a purpose of
> completion
> >>>> of
> >>>> expression language. Of course, I would love to get some feedback from
> >>>> community if there is a possible usecase for this.
> >>>>
> >>>> [VR] OK, so I guess we don't need this functionality at least for now.
> >>>
> >>      [CK] SO here is the thought process I had when I added multiple
> POJO
> >> support. More than one object can come together to create a new object.
> >> All
> >> I wanted to do was to expose such a functionality where the data is
> >> present
> >> in two different object and expression               can work on both
> the
> >> object to create some new field. I think this is a good addition to
> >> expression language. Please let me know if there is any technical reason
> >> to
> >> remove this.
> >>
> >> 5. Are we supporting both POJOUtils and ExpressionEvaluator? If not,
> what
> >>>>
> >>>>> is the plan to deprecate POJOUtils? If yes, syntax of POJOUtils
and
> >>>>> ExpressionEvaluator don't seem to be compatible.
> >>>>>
> >>>>>     [CK] I haven't though about PojoUtils deprecation yet. In my
> >>>>> opinion,
> >>>>>
> >>>> it
> >>>> should not be done. PojoUtils has its own purpose. It deals with POJOs
> >>>> and
> >>>> simple operation which require faster interface to access variables
> from
> >>>> POJO. If that is the usecase one should use PojoUtils.
> >>>> ExpressionEvaluator
> >>>> is to support expression based execution.
> >>>>      But I'm open to thoughts here.
> >>>>      About incompatibility, we can make it compatible. Can you please
> >>>> share
> >>>> where it is different? Is it the operand syntax you're talking about
> ${}
> >>>> of
> >>>> Expression vs {}  of PojoUtils. I'll happy to make Expression to be
> >>>> compatible with PojoUtils for the same.
> >>>>
> >>>> [VR] PojoUtils provide similar functionality and I don't see much
> >>> difference in a way how PojoUtils and ExpressionEvaluator may be used.
> >>> They
> >>> clearly have an overlapping use cases and overlapping functionality.
> IMO,
> >>> it will be better to have the same syntax for both, or to be more
> precise
> >>> extend PojoUtils to support not only getters and setters, but also
> >>> expressions.
> >>> [CK] We can make the syntax similar for both. But that look
> contradictory
> >>> to me with your last comment in first questions i.e. removing ${...}
> and
> >>> accessing operands just by input1.field1 OR field1.
> >>>
> >>>       I think it might be early to say whether ExpressionEvaluator is
> >> extension to PojoUtils. PojoUtils is being used at a number of places
> and
> >> I
> >> did not want to disturb that usage. Do you think it is feasible to
> consume
> >> the ExpressionEvaluator now and maybe in the future merge PojoUtils and
> >> ExpressionEvaluator, if that's what you're talking about?
> >>
> >>
> >> 6. POJOUtils supports both getters and setters, what is the plan to
> >>>> support
> >>>>
> >>>> setters in ExpressionEvaluator?
> >>>>>
> >>>>>     [CK]  I'm not sure functionality of ExpressionEvalutor can be
> >>>>>
> >>>> encompassed
> >>>> within getter/setter. ExpressionEvaluator lets one execute a given
> >>>> expression and provide a value returned by Expression.
> >>>>
> >>>>
> >>>> Thank you,
> >>>>
> >>>>> Vlad
> >>>>>
> >>>>> On 1/13/16 22:52, Chinmay Kolhatkar wrote:
> >>>>>
> >>>>> Hi All,
> >>>>>
> >>>>>> I'm working on APEXCORE-1972 which adds support for a quasi-Java
> >>>>>> Expression
> >>>>>> Language and its expression evaluator.
> >>>>>>
> >>>>>> All the detailed functionality and design details along with
> examples
> >>>>>> are
> >>>>>> present in Jira.
> >>>>>>
> >>>>>> I've summarized the ExpressionEvaluator feature & Expression
> language
> >>>>>> below.
> >>>>>>
> >>>>>> The pull request created for this at here:
> >>>>>> https://github.com/apache/incubator-apex-malhar/pull/170
> >>>>>>
> >>>>>> Please share your thought on this.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Chinmay.
> >>>>>>
> >>>>>>
> >>>>>> *Summary of functionality of ExpressionEvaluator:*
> >>>>>>
> >>>>>>       1. Support quasi-Java Expression which defines a single
line
> >>>>>> executable
> >>>>>>       expression
> >>>>>>       2. Support for quasi-Java expression based function code,
> which
> >>>>>> will
> >>>>>> be
> >>>>>>       compiled on the fly and made available for execution.
> >>>>>>       3. Should support accessing multiple fields from multiples
> input
> >>>>>> POJOs
> >>>>>>       while addressing the conversion of private variables to
public
> >>>>>> getter
> >>>>>>       method for all levels.
> >>>>>>       4. Should support nested field support
> >>>>>>       5. quasi-Java expressions should support operands to be
> >>>>>> mentioned
> >>>>>> in
> >>>>>>       following ways:
> >>>>>>          - ${input.fieldName}
> >>>>>>             - Access fieldName via a object name.
> >>>>>>          - ${fieldName}
> >>>>>>             - Accessing fieldName directly when a single object
is
> >>>>>> registered
> >>>>>>             for operation.
> >>>>>>          - ${input}
> >>>>>>             - Accessing object variable directly
> >>>>>>          - ${input.fieldName.internalField}
> >>>>>>             - Accessing nested fields
> >>>>>>          6. There should be some predefined function provided
to
> >>>>>> expression
> >>>>>>       writer which one can directly use in expression for invoking
> >>>>>> certain
> >>>>>>       functionality.
> >>>>>>       7. These are simple String based, Date time based etc
> functions.
> >>>>>>       8. On-need-basic one should be able to easily update
> Expression
> >>>>>>       Evaluator to easily add new predefined functions to be
made
> >>>>>> available
> >>>>>> for
> >>>>>>       expression writer.
> >>>>>>       9. User of ExpressionEvaluator should be able to add a
custom
> >>>>>> method
> >>>>>>       externally to be made available to in expression.
> >>>>>>       10. Though operands are defined, placeholder for the operand
> in
> >>>>>>       expression should be allowed to be overridden. By default,
> >>>>>> expression
> >>>>>>       language should support bash type syntax for operand -
{…}
> >>>>>>       11. The library should not introduce any serialization
related
> >>>>>> issues.
> >>>>>>       12. All the java operators should be supported.
> >>>>>>
> >>>>>> ​
> >>>>>>
> >>>>>> *The examples of quasi-Java Expression:*
> >>>>>>
> >>>>>>       - ${inp.field1}
> >>>>>>          - Will return value of field1 from registered input
POJO.
> >>>>>>       - ${inp.field1} + ${inp.field2}
> >>>>>>          - Will return sum of field1 & field2 from given
POJO
> >>>>>>       - ${field1} + ${field2}
> >>>>>>          - Equivalent to above
> >>>>>>       - ${inpA.field1} > ${inpA.field2} ? ${inpA.field3}
:
> >>>>>> ${inpB.field3}
> >>>>>>          - Executes ternary expression and returns value
> accordingly.
> >>>>>> Works
> >>>>>> on
> >>>>>>          2 POJOs. inpA & inpB are two placeholder registered
for
> given
> >>>>>> POJO
> >>>>>> with
> >>>>>>          ExpressionEvaluator library.
> >>>>>>       - pow(${inpA.field1}, ${inpB.field2})
> >>>>>>          - Executes pow function coming from java.lang.Math
library.
> >>>>>> This
> >>>>>> and
> >>>>>>          other with lot other basic functions is available to
> >>>>>> expression
> >>>>>> writer out
> >>>>>>          of the box to use in expression.
> >>>>>>       - ${inpA.field1} > 0 ? ${inpB.innerPOJO.field3} :
> >>>>>>       ${inpA.innerPOJO.field3}
> >>>>>>          - Shows how nested POJOs can be accessed in expression.
The
> >>>>>> variables
> >>>>>>          will evaluate to correct public getter method is required.
> >>>>>>       - ${inp.firstname} + “ “ + ${inp.lastname}
> >>>>>>          - Generate the full name as per given expression from
> >>>>>> firstname
> >>>>>> and
> >>>>>>          lastname field.
> >>>>>>       - long retVal=1; for (int i=0; ${inpF.value1}; i++) {retVal
=
> >>>>>> retVal *
> >>>>>>       ${inpF.value1}; } return retVal;
> >>>>>>          - This tells a complete method content to
> >>>>>> ExpressionEvaluator.
> >>>>>> The
> >>>>>>          library create an executable and compiled method using
this
> >>>>>> expression.
> >>>>>>
> >>>>>> ​
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message