cayenne-user mailing list archives

From Andrus Adamchik <and...@objectstyle.org>
Subject Re: Extracting tokens from an expression and matching an object against that expression without parsing twice
Date Mon, 17 Nov 2014 12:13:09 GMT
> It's not easy to explain properly why I need the tokens; the general reason is that the
> preexisting application, written long ago by several other persons, is designed to use them,
> and changing its design would be too big an undertaking.

Yeah, I still don't understand why the code would care to poke inside the parser and deal
directly with tokens.

> I will see if I can use Andrus' pointers to extract the tokens from the Expression instance.

I am afraid you won't find any *tokens* in an Expression instance. Expression is just a tree
of objects that can be used to evaluate stuff. If you need it to match an object, it can do that.
But a parsed expression is devoid of any links to the original lexical structure.
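
Since a parsed expression can be matched against any number of objects, one way to avoid re-parsing is to keep the parsed result keyed by its source string. The following is only a minimal sketch, assuming Java 8 and assuming the parsed object is not mutated after parsing; `ParseCache` is a hypothetical helper, not part of Cayenne.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical helper (not a Cayenne class): parses each distinct string once and reuses the result. */
final class ParseCache<T> {
    private final Map<String, T> cache = new ConcurrentHashMap<>();
    private final Function<String, T> parser;

    ParseCache(Function<String, T> parser) {
        this.parser = parser;
    }

    /** Returns the cached parse result; the parser runs only on the first request for a given string. */
    T get(String source) {
        return cache.computeIfAbsent(source, parser);
    }
}
```

With Cayenne on the classpath, this could presumably be created as `new ParseCache<Expression>(Expression::fromString)`, and matching would then be `cache.get(where).match(object)`.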

Andrus



> On Nov 17, 2014, at 11:46 AM, Davide Vecchi <dv@amc.dk> wrote:
> 
> Thanks for your inputs.
> 
> I'm probably showing my technological age here, but I certainly admit that I have this
> tendency to avoid repeating complex operations as a matter of principle when it's known in
> advance that the second process will produce exactly the same result as the first one. When
> I catch myself doing that I always feel that my design is not OK.
> 
> However in this case I am quite sure I need to get rid of the double parsing, although
> I did not demonstrate in a particularly strict way that that's the cause of the slowdown.
> It's more like a qualified (in my opinion) guess, reinforced by the fact that method Expression.fromString(String)
> has a TODO saying "TODO: cache expression strings, since this operation is pretty slow" (I'm
> using version 3.0.2). So it looks like the Cayenne coders too had reasons to worry to some
> extent about optimization in this area.
> 
> I just used JVisualVM to profile the execution and two of the methods where by far most
> of the time is spent are Expression.fromString(String) and ExpressionParser.getNextToken().
> Since I have to cut down the processing time I do have to focus on them first.
> 
> The situation here is that I modified a preexisting application which was doing some
> basic parsing, and after creating the tokens from the parsing it was using them to match the
> expression against objects. That parsing is basic in that it can only parse simple expressions,
> e.g. it doesn't support grouping with parentheses.
> 
> My changes consisted of removing that parsing code from the application and replacing
> it with calls to Cayenne, because we need real parsing. Of course the parsing done by Cayenne
> is way more powerful and that might be the real and fair reason why it takes longer, but even
> if this is the case it's important for me not to do that parsing twice.
> 
> It's not easy to explain properly why I need the tokens; the general reason is that the
> preexisting application, written long ago by several other persons, is designed to use them,
> and changing its design would be too big an undertaking. Since all that needs to be improved
> is the parsing and matching I thought I'd just use a powerful tool to replace only those parts.
> 
> I will see if I can use Andrus' pointers to extract the tokens from the Expression instance.
> 
> 
> 
> -----Original Message-----
> From: Andrus Adamchik [mailto:andrus@objectstyle.org] 
> Sent: Sunday, November 16, 2014 14:57
> To: user@cayenne.apache.org
> Subject: Re: Extracting tokens from an expression and matching an object against that
> expression without parsing twice
> 
> I second John's assessment. 
> 
> BTW, what are the tokens for? Do you actually need to have access to the lexical structure
> of the String? Of course, a parsed Expression object is a tree itself and gives you access
> to its own structure either directly ('getOperand(int)') or via the 'traverse' and 'transform'
> methods.
> 
> Andrus
> 
>> On Nov 14, 2014, at 9:54 PM, John Huss <johnthuss@gmail.com> wrote:
>> 
>> This looks like a serious micro optimization.  Is the performance for 
>> this really that critical?  Have you demonstrated that this is your 
>> application's crucial hot spot?
>> 
>> On Fri, Nov 14, 2014 at 7:35 AM, Davide Vecchi <dv@amc.dk> wrote:
>> 
>>> Hi all,
>>> 
>>> I have an expression in a string, and I use Cayenne to parse the 
>>> expression into tokens, which are needed for a specific purpose.
>>> 
>>> However in addition to having the tokens I also need to evaluate an 
>>> object against that expression, to see if that object matches the expression.
>>> 
>>> My problem is that the way I'm doing it causes the parsing to be done 
>>> twice on the same expression, and I would like to avoid parsing the 
>>> same expression twice.
>>> 
>>> I'm creating the tokens like this:
>>> 
>>> -----------------------------------
>>> String where = "myField=0";
>>> 
>>> Reader reader = new StringReader(where);
>>> 
>>> ExpressionParser parser = new ExpressionParser(reader);
>>> 
>>> List<Token> tokens = new ArrayList<>();
>>> 
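>>> Token token = parser.getNextToken();
>>> 
>>> // JavaCC-generated parsers normally signal end of input with an EOF token
>>> // (kind == ExpressionParserConstants.EOF) rather than null.
>>> while (token != null && token.kind != ExpressionParserConstants.EOF) {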
>>> 
>>>    tokens.add(token);
>>> 
>>>    token = parser.getNextToken();
>>> }
>>> -----------------------------------
>>> 
>>> I'm matching the object like this:
>>> 
>>> -----------------------------------
>>> String where = "myField=0";
>>> 
>>> Expression expression = Expression.fromString(where);
>>> 
>>> boolean matches = expression.match(object);
>>> -----------------------------------
>>> 
>>> The call to Expression.fromString made in the object matching 
>>> operation performs a parsing, but the parsing of the same expression 
>>> had already been done in the token creation operation.
>>> 
>>> Is there a way to redesign this process in order to get the tokens 
>>> and also match an object against the expression without parsing the 
>>> same expression twice?
>>> 
>>> For example, I believe that the call to Expression.fromString must 
>>> have created the tokens, because it has parsed the string. So I 
>>> thought I could reverse the order and do the object matching first, 
>>> keep the Expression instance created in that process and use it to 
>>> extract the tokens. But I can't see how to extract the tokens from an 
>>> Expression instance instead of from an ExpressionParser instance as I'm currently
>>> doing.
>>> 
>>> Or another possibility could be that I keep creating the tokens 
>>> first, and then I match my object against them, instead of against 
>>> the string expression that generated those tokens. But I can't see 
>>> how to match an object against tokens.
>>> 
>>> So I'm looking for some ideas.
>>> 
>>> Thanks in advance.
>>> 
>>> Davide Vecchi
>>> 
> 
> 
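
For the original question of obtaining both tokens and a matcher from one pass over the string, the general idea can be sketched as follows. This is a toy illustration only: the grammar (`field=value` clauses joined by `and`), the class `SinglePassParse`, and the use of a `Map` as the matched object are all assumptions made for the example and are not Cayenne's API or behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Toy single-pass parser (hypothetical, not Cayenne): keeps the tokens and builds a matcher together. */
final class SinglePassParse {
    final List<String> tokens;                     // lexical tokens, retained for code that needs them
    final Predicate<Map<String, String>> matcher;  // evaluable form built during the same pass

    private SinglePassParse(List<String> tokens, Predicate<Map<String, String>> matcher) {
        this.tokens = tokens;
        this.matcher = matcher;
    }

    /** Parses "field=value" comparisons joined by "and", for example "a=1 and b=2". */
    static SinglePassParse parse(String source) {
        List<String> tokens = new ArrayList<>();
        Predicate<Map<String, String>> matcher = row -> true;
        for (String clause : source.trim().split("\\s+and\\s+")) {
            String[] parts = clause.split("=", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("Expected field=value but got: " + clause);
            }
            String field = parts[0].trim();
            String value = parts[1].trim();
            if (!tokens.isEmpty()) {
                tokens.add("and");                 // record the connective between clauses
            }
            tokens.add(field);
            tokens.add("=");
            tokens.add(value);
            // Extend the matcher with this clause; every clause must hold for a match.
            matcher = matcher.and(row -> value.equals(row.get(field)));
        }
        return new SinglePassParse(tokens, matcher);
    }
}
```

The design point is that the tokens are recorded at the moment they are consumed, because, as noted in the reply above, the evaluable form on its own no longer carries the lexical structure.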

