db-derby-user mailing list archives

From Rick Hillegas <Richard.Hille...@Sun.COM>
Subject Re: Using derby to parse an SQL statement
Date Wed, 12 Nov 2008 21:36:17 GMT
Hi Christian,

I have created a JIRA to track this issue: 

I have attached to the JIRA a small patch which exposes the AST produced 
by the parser. I have also attached a simple program, ASTParser, which 
shows how to retrieve the AST from Derby. I am inclined to check this 
patch in to the trunk. Please let me know if you find this useful and if 
you would like me to port this patch to another Derby branch.

Hope this helps,

Christian Riedel wrote:
> Hi Rick,
> first of all thanks for your answer ... now the relations have become 
> a lot clearer ...
> You are right, there are a lot of things to be done that we probably 
> don't want to go through. You asked why we cannot take the whole derby 
> engine and use it ... well, there is no real reason not to do so. The 
> only "problem" I see is that derby is a dbms - if I am not mistaken - 
> and we only have an SQL statement that we extract from a text file and 
> want to analyze in order to extract some metadata from it.
> So if we take the derby engine as it is, how can I avoid having to 
> set up a "dummy" DB in order to be able to actually use the 
> parsing feature ....
> I hope you see my point.
> We could live with setting up a dummy DB ... and I do think that the 
> derby AST offers all information we need. It's just that I don't see 
> how we can set this thing up. So having a dummy DB is necessary to be 
> able to intercept the parsing process to get hold of the AST? Can we 
> actually access the AST if we choose to set up a dummy DB? I think 
> that would be something we could live with ;-)
> Thanks for your support
> Christian
> Rick Hillegas wrote:
>> Hi Christian,
>> I think you will have difficulty isolating the Parser from the rest 
>> of the SQL interpreter. In theory, you should be able to isolate the 
>> compiler from the execution engine and the storage layer--but that is 
>> an untested theory.
>> The Parser wants to turn out abstract syntax trees (AST). Ideally, 
>> the Parser would just need to ask a NodeFactory for AST nodes and you 
>> could supply your own NodeFactory. But I think that there is a fair 
>> amount of coupling between the Parser and Derby's concrete 
>> implementation of NodeFactory. I think that you could uncouple the 
>> two, but you may not want to spend your time on that.
>> So the Parser is going to force you to pull in the AST nodes. Once 
>> you do that, you will end up with the whole compiler. In particular, 
>> the AST nodes (and the Parser itself) expect that you will supply an 
>> implementation of LanguageConnectionContext, the master state 
>> variable for the whole SQL interpreter. Untangling that requirement 
>> is another chunk of work you may not want to do.
>> Then there is the Monitor. It has been a while since I was in that 
>> code but I seem to recall that fairly early on the Monitor wants to 
>> fault in a storage layer. In theory you ought to be able to supply 
>> the Monitor a list of modules that doesn't include a storage layer. 
>> But since no-one runs in this configuration, there are probably a lot 
>> of undocumented surprises that you may not want to fix either.
>> Can I ask you what breaks if you just pull in the whole Derby engine? 
>> Are you concerned that you will fault in too much code that you 
>> barely use? Are you concerned that you'll end up with a dummy 
>> database that you don't need? Are Derby's AST nodes not a usable 
>> representation of statement syntax?
>> Thanks,
>> -Rick
>> Christian Riedel wrote:
>>> Hi there,
>>> we are working on a small project where we need to analyze an SQL 
>>> statement that can be of any kind: very simple, with inner selects, 
>>> complex join etc.
>>> We figured it would be inappropriate to start writing our own parser 
>>> when there are other projects, like derby, out there that can do it 
>>> much better than we possibly could ... so this was our idea:
>>> Can we use derby to create an instance of Parser 
>>> (org.apache.derby.iapi.sql.compile.Parser.class) and let our SQL 
>>> statement be parsed by calling the parse() method on this instance? 
>>> What we want to have is a syntax tree of the statement that allows 
>>> us to see which tables and which fields are accessed / included in 
>>> the statement (including any possibly done "renames" à la SELECT 
>>> The problem is that we are stuck ... we have spent several days now 
>>> trying to find the proper way to create an instance of the Parser. Is 
>>> it possible at all without having to set up a running derby system?
>>> Is the Monitor class the right entry point? How can we create a 
>>> CompilerContext so that a Parser instance can be created?
>>> This sure is off-topic but we don't see any way through all this. 
>>> Can you help us?
>>> Thanks in advance
>>> Christian
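
The NodeFactory idea from Rick's earlier reply (a Parser that only asks a factory for AST nodes, so that a caller could supply its own factory) can be sketched in plain Java. This is a toy illustration under stated assumptions, not Derby code: the class NodeFactoryDemo, its NodeFactory interface, the Collector class, and the tiny "SELECT ... FROM ..." grammar are all hypothetical names made up for this sketch, and none of them correspond to Derby's actual Parser or NodeFactory API.

```java
import java.util.ArrayList;
import java.util.List;

public class NodeFactoryDemo {

    /** Hypothetical factory: the parser asks it for nodes instead of constructing them itself. */
    public interface NodeFactory<N> {
        N column(String name, String alias);   // alias is null when there is no AS clause
        N table(String name);
        N select(List<N> columns, N from);
    }

    /** Toy parser for "SELECT col [AS alias], ... FROM table". This is NOT Derby's grammar. */
    public static <N> N parse(String sql, NodeFactory<N> factory) {
        String[] halves = sql.trim().split("(?i)\\s+FROM\\s+", 2);
        if (halves.length != 2 || !halves[0].matches("(?i)SELECT\\s+.*")) {
            throw new IllegalArgumentException("unsupported statement: " + sql);
        }
        String selectList = halves[0].replaceFirst("(?i)^SELECT\\s+", "");
        List<N> columns = new ArrayList<>();
        for (String item : selectList.split(",")) {
            // Split "name AS alias" into at most two parts.
            String[] parts = item.trim().split("(?i)\\s+AS\\s+", 2);
            String alias = parts.length == 2 ? parts[1].trim() : null;
            columns.add(factory.column(parts[0].trim(), alias));
        }
        return factory.select(columns, factory.table(halves[1].trim()));
    }

    /** A caller-supplied factory that records metadata instead of building a full tree. */
    public static class Collector implements NodeFactory<String> {
        public final List<String> tables = new ArrayList<>();
        public final List<String> columns = new ArrayList<>();
        public final List<String> renames = new ArrayList<>();

        @Override
        public String column(String name, String alias) {
            columns.add(name);
            if (alias != null) {
                renames.add(name + "->" + alias);
            }
            return name;
        }

        @Override
        public String table(String name) {
            tables.add(name);
            return name;
        }

        @Override
        public String select(List<String> cols, String from) {
            return "SELECT(" + String.join(",", cols) + ") FROM " + from;
        }
    }

    public static void main(String[] args) {
        Collector collector = new Collector();
        String tree = parse("SELECT a AS x, b FROM t", collector);
        System.out.println(tree);
        System.out.println("tables=" + collector.tables
                + " columns=" + collector.columns
                + " renames=" + collector.renames);
    }
}
```

Because the parser depends only on the interface, swapping Collector for a factory that builds real tree nodes would not require touching parse(). As Rick notes, Derby's Parser is more tightly coupled to its concrete NodeFactory than this sketch suggests.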
