db-derby-user mailing list archives

From Rick Hillegas <Richard.Hille...@Sun.COM>
Subject Re: R: Using derby to parse an SQL statement
Date Fri, 14 Nov 2008 15:25:40 GMT
Hi Flavio,

I have attached a TreeWalker class to 
https://issues.apache.org/jira/browse/DERBY-3946. This may help you 
explore the AST.
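
For readers who have not used the visitor pattern, here is a minimal,
self-contained sketch of the idea behind such a walker. It does not use
Derby's classes: Node and Visitor below are simplified, hypothetical
stand-ins for QueryTreeNode and org.apache.derby.iapi.sql.compile.Visitor,
whose real signatures differ, and the node labels and tree shape are only
illustrative.

```java
import java.util.ArrayList;
import java.util.List;

public class TreeWalkSketch {

    // Hypothetical stand-in for Derby's Visitor interface (not the real signature).
    public interface Visitor {
        void visit(Node node, int depth);
    }

    // Hypothetical stand-in for an AST node; real Derby nodes extend QueryTreeNode.
    public static class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        public Node(String label, Node... kids) {
            this.label = label;
            for (Node kid : kids) {
                children.add(kid);
            }
        }

        // accept() is where a node reveals what it considers its subnodes:
        // it hands itself to the visitor, then forwards the visitor to each child.
        public void accept(Visitor visitor, int depth) {
            visitor.visit(this, depth);
            for (Node child : children) {
                child.accept(visitor, depth + 1);
            }
        }
    }

    // Walks the tree and returns one line per node, indented two spaces per level.
    public static String render(Node root) {
        StringBuilder out = new StringBuilder();
        root.accept((node, depth) -> {
            for (int i = 0; i < depth; i++) {
                out.append("  ");
            }
            out.append(node.label).append('\n');
        }, 0);
        return out.toString();
    }

    public static void main(String[] args) {
        // Illustrative tree only; a real tree would come from the parser.
        Node tree = new Node("CursorNode",
                new Node("SelectNode",
                        new Node("ResultColumnList"),
                        new Node("FromList")));
        System.out.print(render(tree));
    }
}
```

Against the real AST, the same depth-tracking idea would go into a class
that implements Derby's Visitor interface; check the javadoc of your Derby
version for the exact methods it requires.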

Hope this helps,
-Rick

Rick Hillegas wrote:
> Hi Flavio,
>
> I don't think that we have a good primer on the AST nodes. However, 
> you can get some sense of what the nodes mean by building the Derby 
> javadoc and looking at the javadoc for the package 
> org.apache.derby.impl.sql.compile.
>
> Now that you've gotten over the hard hurdle of building the Derby 
> classes, building the javadoc is easy:
>
>  ant -quiet javadoc
>
> This will build the javadoc into the directory javadoc/engine. Once 
> you have built the engine javadoc, browse to the 
> org.apache.derby.impl.sql.compile package. The header comments on the 
> classes are usually pretty helpful. I also recommend taking a look at 
> the tree view of that package. You will see that the AST nodes are all 
> of the classes indented under QueryTreeNode.
>
> Understanding the AST graph takes some patience. There are two useful 
> techniques for figuring out how the nodes snap together into a graph:
>
> 1) The nodes themselves implement 
> org.apache.derby.iapi.sql.compile.Visitable. That interface has one 
> method, accept(). Not all AST nodes directly implement accept() so you 
> may have to inspect their superclasses. The accept() method shows you 
> what each node thinks its subnodes are. You can write your own tool to 
> explore the graph by coding your implementation of the graph walker, 
> org.apache.derby.iapi.sql.compile.Visitor.
>
> 2) The nodes also implement a pretty-printing method, treePrint(). The 
> treePrint() method is another example of what a node thinks its 
> subnodes are. This is the method which ASTParser calls on the top node.
>
> That should get you started. The devil is in the details, but keep 
> posting questions and I think you'll get to the bottom of it.
>
> Hope this helps,
> -Rick
>
> Flavio Palumbo wrote:
>> Hi guys,
>>
>> I followed this thread with great interest, because I posted a similar
>> question a while ago.
>>
>> Now I'd like to test the work submitted by Rick, but I don't know Derby
>> well enough to grasp everything I have to do.
>>
>> As I understand it, I have to:
>>
>> - download the Derby sources (from where?)
>> - apply the patch suggested by Rick (I have seen the classes but did not
>> follow the changed lines of code)
>> - recompile Derby (which libraries/jars do I need?)
>> - use ASTParser as suggested
>>
>> Can somebody give me a hint? Thanks a lot.
>>
>> Flavio
>>
>>  
>>> -----Original Message-----
>>> From: news [mailto:news@ger.gmane.org] On behalf of Christian Riedel
>>> Sent: Thursday, 13 November 2008 6:37
>>> To: derby-user@db.apache.org
>>> Subject: Re: Using derby to parse an SQL statement
>>>
>>>
>>> Hi Rick,
>>>
>>> First of all, thank you very much for your efforts so far. At first
>>> glance, your changes to the code seem to be exactly what we want. I will
>>> check whether it works as soon as possible.
>>>
>>> To test it with the Derby libs I'd have to work on the current trunk,
>>> apply the patch and then compile Derby manually, right?
>>>
>>> I'll keep you updated
>>>
>>> Thanks for your help
>>>
>>> Christian
>>>
>>> Rick Hillegas wrote:
>>>    
>>>> Hi Christian,
>>>>
>>>> I have created a JIRA to track this issue:
>>>> https://issues.apache.org/jira/browse/DERBY-3946
>>>>
>>>> I have attached to the JIRA a small patch which exposes the AST produced
>>>> by the parser. I have also attached a simple program, ASTParser, which
>>>> shows how to retrieve the AST from Derby. I am inclined to check this
>>>> patch in to the trunk. Please let me know if you find this useful and if
>>>> you would like me to port this patch to another Derby branch.
>>>>
>>>> Hope this helps,
>>>> -Rick
>>>>
>>>> Christian Riedel wrote:
>>>>      
>>>>> Hi Rick,
>>>>>
>>>>> first of all thanks for your answer ... now the relations have become
>>>>> a lot clearer ...
>>>>>
>>>>> You are right, there are a lot of things to be done that we probably
>>>>> don't want to go through. You asked why we cannot take the whole Derby
>>>>> engine and use it ... well, there is no real reason not to do so. The
>>>>> only "problem" I see is that Derby is a DBMS, if I am not mistaken,
>>>>> and we only have an SQL statement that we extract from a text file and
>>>>> want to analyze in order to extract some metadata from it.
>>>>>
>>>>> So if we take the Derby engine as it is, how can we avoid having to
>>>>> set up a "dummy" DB in order to be able to actually use the
>>>>> parsing feature ....
>>>>>
>>>>> I hope you see my point.
>>>>>
>>>>> We could live with setting up a dummy DB ... and I do think that the
>>>>> derby AST offers all information we need. It's just that I don't see
>>>>> how we can set this thing up. So having a dummy DB is necessary to be
>>>>> able to intercept the parsing process to get hold of the AST? Can we
>>>>> actually access the AST if we choose to set up a dummy DB? I think
>>>>> that would be something we could live with ;-)
>>>>>
>>>>>
>>>>> Thanks for your support
>>>>>
>>>>> Christian
>>>>>
>>>>>
>>>>> Rick Hillegas wrote:
>>>>>        
>>>>>> Hi Christian,
>>>>>>
>>>>>> I think you will have difficulty isolating the Parser from the rest
>>>>>> of the SQL interpreter. In theory, you should be able to isolate the
>>>>>> compiler from the execution engine and the storage layer--but that is
>>>>>> an untested theory.
>>>>>>
>>>>>> The Parser wants to turn out abstract syntax trees (AST). Ideally,
>>>>>> the Parser would just need to ask a NodeFactory for AST nodes and you
>>>>>> could supply your own NodeFactory. But I think that there is a fair
>>>>>> amount of coupling between the Parser and Derby's concrete
>>>>>> implementation of NodeFactory. I think that you could uncouple the
>>>>>> two, but you may not want to spend your time on that.
>>>>>>
>>>>>> So the Parser is going to force you to pull in the AST nodes. Once
>>>>>> you do that, you will end up with the whole compiler. In particular,
>>>>>> the AST nodes (and the Parser itself) expect that you will supply an
>>>>>> implementation of LanguageConnectionContext, the master state
>>>>>> variable for the whole SQL interpreter. Untangling that requirement
>>>>>> is another chunk of work you may not want to do.
>>>>>>
>>>>>> Then there is the Monitor. It has been a while since I was in that
>>>>>> code but I seem to recall that fairly early on the Monitor wants to
>>>>>> fault in a storage layer. In theory you ought to be able to supply
>>>>>> the Monitor a list of modules that doesn't include a storage layer.
>>>>>> But since no-one runs in this configuration, there are probably a lot
>>>>>> of undocumented surprises that you may not want to fix either.
>>>>>>
>>>>>> Can I ask you what breaks if you just pull in the whole Derby engine?
>>>>>> Are you concerned that you will fault in too much code that you
>>>>>> barely use? Are you concerned that you'll end up with a dummy
>>>>>> database that you don't need? Are Derby's AST nodes not a usable
>>>>>> representation of statement syntax?
>>>>>>
>>>>>> Thanks,
>>>>>> -Rick
>>>>>>
>>>>>> Christian Riedel wrote:
>>>>>>          
>>>>>>> Hi there,
>>>>>>>
>>>>>>> we are working on a small project where we need to analyze an SQL
>>>>>>> statement that can be of any kind: very simple, with inner selects,
>>>>>>> complex joins, etc.
>>>>>>>
>>>>>>> We figured it would be inappropriate to start writing our own parser
>>>>>>> when there are other projects, like Derby, out there that can do it
>>>>>>> much better than we possibly could ... so this was our idea:
>>>>>>>
>>>>>>> Can we use Derby to create an instance of Parser
>>>>>>> (org.apache.derby.iapi.sql.compile.Parser.class) and let our SQL
>>>>>>> statement be parsed by calling the parse() method on this instance?
>>>>>>> What we want to have is a syntax tree of the statement that allows
>>>>>>> us to see which tables and which fields are accessed / included in
>>>>>>> the statement (including any "renames" à la SELECT
>>>>>>> street AS "ADDRESS" FROM USER_DATA ).
>>>>>>>
>>>>>>> The problem is that we are stuck ... we have now spent several days
>>>>>>> trying to find the proper way to create an instance of the Parser. Is
>>>>>>> it possible at all without having to set up a running Derby system?
>>>>>>>
>>>>>>> Is the Monitor class the right entry point? How can we create a
>>>>>>> CompilerContext so that a Parser instance can be created?
>>>>>>>
>>>>>>>
>>>>>>> This sure is off-topic but we don't see any way through all this.
>>>>>>> Can you help us?
>>>>>>>
>>>>>>>
>>>>>>> Thanks in advance
>>>>>>>
>>>>>>> Christian
>>>>>>>
>>>>>>>             
>>>>>>           
>>>>       
>>> -- 
>>> To reply to this posting directly use the following address and
>>> remove the 'NO-SPAM' part: Riedel.Christian.NO-SPAM@gmx.net
>>>
>>>     
>>
>>
>>
>>   
>

