Date: Thu, 4 Sep 2003 16:35:41 -0400
Subject: Re: Lucene app to index Java code
From: Erik Hatcher
To: "Lucene Users List"
In-Reply-To: <3F5776A5.9080207@newsmonster.org>
Message-Id: <59C80DAA-DF17-11D7-9C0C-000393A564E6@ehatchersolutions.com>

On Thursday, September 4, 2003, at 01:30 PM, Kevin A. Burton wrote:

>> - XDoclet could be used to sweep through Java code and build a
>> text/XML file as richly as you'd like from the information there
>> (complete with JavaDoc tags, which Zapata will miss :)), and then run
>> Lucene on the generated files. On a related note, the XDoclet2
>> architecture would streamline this even further by eliminating the
>> middle textual representation (QDox/XJavadoc reads Java as a "meta
>> data provider" and then a Lucene "plugin" indexes things). It could
>> be done without the intermediate text representation even in XDoclet
>> 1.2, but it would require coding a custom subtask and be slightly out
>> of the norm for XDoclet subtasks (but would work just fine).
>
> It would be faster to write a native doclet as this would remove the
> XML parse overhead... The whole point of this thing is that it needs
> to be fast!

Do you mean the Ant build file parsing?
That would be the only XML parsing in the equation I'm proposing, unless
you did it the clunkiest XDoclet 1.2 way of having an intermediate XML
file.

As for speed... QDox, I've heard, is the fastest option. javadoc is the
slowest parser of the three I know of (javadoc, xjavadoc, qdox).

	Erik
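
The "no intermediate file" idea from the thread can be sketched roughly as below. This is only an illustration under stated assumptions: it uses neither QDox nor Lucene. The names `DirectIndexSketch`, `ClassInfo`, `parse`, and `index` are hypothetical, the regex is a naive stand-in for a real Java parser, and the in-memory map is a stand-in for a real index. The point is just that parsed metadata goes straight to the indexer, with no XML written or re-read along the way.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the pipeline discussed in the thread: a "metadata
// provider" hands parsed class info straight to an indexer, with no XML file
// in between. Nothing here is QDox or Lucene API; both halves are toys.
final class DirectIndexSketch {

    /** Minimal metadata record: a class name plus its Javadoc comment text. */
    static final class ClassInfo {
        final String name;
        final String comment;

        ClassInfo(String name, String comment) {
            this.name = name;
            this.comment = comment;
        }
    }

    // Matches a Javadoc block immediately followed by a class declaration.
    // Deliberately naive: a real parser handles far more than this.
    private static final Pattern CLASS_WITH_DOC = Pattern.compile(
        "/\\*\\*(.*?)\\*/\\s*(?:public\\s+)?(?:final\\s+)?class\\s+(\\w+)",
        Pattern.DOTALL);

    /** Toy "metadata provider": extracts documented classes from source text. */
    static List<ClassInfo> parse(String source) {
        List<ClassInfo> out = new ArrayList<>();
        Matcher m = CLASS_WITH_DOC.matcher(source);
        while (m.find()) {
            // Strip the leading '*' decoration from each comment line.
            String comment = m.group(1).replaceAll("(?m)^\\s*\\*", " ").trim();
            out.add(new ClassInfo(m.group(2), comment));
        }
        return out;
    }

    /** Toy "indexer plugin": maps each comment term to class names, in memory. */
    static Map<String, Set<String>> index(List<ClassInfo> classes) {
        Map<String, Set<String>> inverted = new TreeMap<>();
        for (ClassInfo c : classes) {
            for (String term : c.comment.toLowerCase().split("\\W+")) {
                if (term.isEmpty()) {
                    continue;
                }
                inverted.computeIfAbsent(term, k -> new TreeSet<>()).add(c.name);
            }
        }
        return inverted;
    }

    public static void main(String[] args) {
        String src = "/** Parses query strings. */\npublic class QueryParser {}\n"
                   + "/** Writes index segments. */\nclass SegmentWriter {}\n";
        // Parse and index in one pass; no intermediate file is produced.
        Map<String, Set<String>> idx = index(parse(src));
        System.out.println(idx.get("index")); // prints [SegmentWriter]
    }
}
```

In a real setup the `parse` step would be whichever Java parser is chosen (javadoc, xjavadoc, or QDox) and the `index` step would add documents to a Lucene index; the shape of the hand-off is the only thing this sketch is meant to show.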