db-derby-dev mailing list archives

From Daniel John Debrunner <...@debrunners.com>
Subject Re: Query Regarding use of lucene with derby database
Date Thu, 29 Sep 2005 16:39:45 GMT
abhijeet mahesh wrote:

> Hello Sir/Ma'm,
> I am greatly influenced by Apache Derby Project and its features. I am
> using it with Eclipse 3.0.
> I am quite satisfied with its performance.
> I would like to explore its functionality more so that i am integrating
> it with *Lucene API*.
> I am finding some difficulty so i would like to know how to integrate
> Apache Derby and lucene?

I've been thinking about this. I previously had a similar text search
engine loosely integrated with Cloudscape through its virtual table
interface. I'm moving towards adding a framework to support virtual
tables in Derby, so that anyone could provide an implementation of a
virtual table and plug it into Derby. This would use the
java.sql.ResultSet and PreparedStatement interfaces: an application
would implement these interfaces to present a data source as a
ResultSet, and the framework would then allow such tables to be used in
queries. See http://issues.apache.org/jira/browse/DERBY-571.
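
As a rough illustration of presenting a data source as a ResultSet, the
sketch below wraps an in-memory list of search hits in a
java.lang.reflect.Proxy, so that only a handful of ResultSet methods need
code. The class name SearchHitsResultSet, the Hit type, and the column
names DOC_ID and RANK are assumptions made for this example; they are not
part of Derby, Lucene, or DERBY-571.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: exposes in-memory search hits as a forward-only ResultSet. */
final class SearchHitsResultSet {

    /** One search hit: a document key and its rank (1 = best). Illustrative only. */
    static final class Hit {
        final int docId;
        final int rank;
        Hit(int docId, int rank) { this.docId = docId; this.rank = rank; }
    }

    /** Builds a ResultSet proxy; methods not handled below throw UnsupportedOperationException. */
    static ResultSet of(final List<Hit> hits) {
        InvocationHandler handler = new InvocationHandler() {
            private int cursor = -1; // index of the current row; -1 means before the first row

            @Override
            public Object invoke(Object proxy, Method m, Object[] args) throws Throwable {
                if (m.getDeclaringClass() == Object.class) {
                    return m.invoke(this, args); // toString/hashCode/equals go to the handler
                }
                switch (m.getName()) {
                    case "next":
                        if (cursor < hits.size()) cursor++;
                        return cursor < hits.size();
                    case "getInt":
                        return getInt(args[0]); // argument is a column index or a column label
                    case "close":
                        return null;
                    default:
                        throw new UnsupportedOperationException(m.getName());
                }
            }

            private int getInt(Object column) throws SQLException {
                if (cursor < 0 || cursor >= hits.size()) {
                    throw new SQLException("no current row");
                }
                Hit hit = hits.get(cursor);
                if (column.equals(1) || "DOC_ID".equalsIgnoreCase(String.valueOf(column))) {
                    return hit.docId;
                }
                if (column.equals(2) || "RANK".equalsIgnoreCase(String.valueOf(column))) {
                    return hit.rank;
                }
                throw new SQLException("unknown column: " + column);
            }
        };
        return (ResultSet) Proxy.newProxyInstance(
                SearchHitsResultSet.class.getClassLoader(),
                new Class<?>[] { ResultSet.class },
                handler);
    }

    public static void main(String[] args) throws SQLException {
        try (ResultSet rs = of(Arrays.asList(new Hit(42, 1), new Hit(7, 2)))) {
            while (rs.next()) {
                System.out.println(rs.getInt("RANK") + " -> document " + rs.getInt("DOC_ID"));
            }
        }
    }
}
```

A real virtual table would need far more of the interface (metadata, other
column types, proper close semantics); the proxy only keeps the sketch short.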

The idea is that a virtual table backed by Lucene would then present its
results as a ResultSet, including attributes such as rank and document.

With Lucene (or any text index) there are several things to think about:

A loose integration would use Lucene and triggers to add items to the
Lucene index. Is this sufficient, or is a more integrated approach
required? With the loose integration there are issues with transactions,
though that may not be a major issue for such an index.
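
To make the transaction concern concrete, here is a small simulation of
loose integration. A HashMap stands in for table T and another map stands
in for the Lucene index; neither Derby nor Lucene is involved, and
afterInsert is a hypothetical hook of the kind a row-level insert trigger
might invoke. Because the hook updates the index immediately, undoing the
row does not undo the index entry.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of a loosely integrated text index fed by a trigger-style hook. */
final class LooseIndexSketch {

    // Stand-in for the text index: term -> ids of the rows containing that term.
    static final Map<String, Set<Integer>> INDEX = new HashMap<>();

    /** Hook an insert trigger might call; it indexes the row's text right away. */
    static void afterInsert(int id, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                INDEX.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
            }
        }
    }

    /** Returns the ids of rows whose text contained the given term. */
    static Set<Integer> search(String term) {
        return INDEX.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.<Integer>emptySet());
    }

    public static void main(String[] args) {
        Map<Integer, String> table = new HashMap<>(); // stand-in for table T

        // Simulated transaction: insert a row, fire the hook, then roll back.
        table.put(1, "Derby text search");
        afterInsert(1, table.get(1));
        table.remove(1); // the simulated rollback undoes the row only

        System.out.println("row present: " + table.containsKey(1));
        System.out.println("index hits for 'derby': " + search("derby"));
    }
}
```

After the simulated rollback the row is gone but the index still returns
its id, which is the kind of stale entry a loose integration has to
tolerate or clean up.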

Lucene implies that it can store its index using JDBC. Can this work
with Derby, and can the Lucene SQL statements be in the same transaction
as the statement requiring the change?

What is the user API to Lucene using Derby? If I have a table T with
CLOB columns summary and article, what is the query?

select l.rank, t.summary, t.article from T, lucene_search(T) as l where
l.rank < 10; -- need some join here?

What would you use in Lucene for the document id from Derby: the
primary key of the table, or something else?
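
One way to picture the join, assuming the table's primary key is used as
the document id: each search hit carries that key and a rank, and the
hits are matched back to rows by key. The sketch below does this in
memory; HitJoinSketch, Hit, and join are hypothetical names, and a map of
summaries stands in for table T.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: joining ranked search hits back to table rows by primary key. */
final class HitJoinSketch {

    /** A search hit whose docId is assumed to be the row's primary key. */
    static final class Hit {
        final int docId;
        final int rank;
        Hit(int docId, int rank) { this.docId = docId; this.rank = rank; }
    }

    /**
     * Inner-joins hits to rows on docId = primary key, keeps hits with
     * rank < maxRank, and returns "rank: summary" lines, best rank first.
     */
    static List<String> join(Map<Integer, String> summariesById, List<Hit> hits, int maxRank) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparingInt((Hit h) -> h.rank));
        List<String> out = new ArrayList<>();
        for (Hit h : sorted) {
            String summary = summariesById.get(h.docId);
            if (h.rank < maxRank && summary != null) { // hits without a matching row are dropped
                out.add(h.rank + ": " + summary);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Integer, String> summaries = new HashMap<>();
        summaries.put(1, "Derby overview");
        summaries.put(2, "Text search notes");
        List<Hit> hits = Arrays.asList(new Hit(2, 1), new Hit(1, 12), new Hit(3, 2));
        for (String line : join(summaries, hits, 10)) {
            System.out.println(line);
        }
    }
}
```

In SQL terms this corresponds to joining the search result to T on the
key column and filtering on rank, which is the join the query above is
missing.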

I would love Derby to support text indexing provided by Lucene and am
willing to help out on this effort. A staged effort might be best,
starting with a loose integration and from that seeing how a fully
integrated approach should be taken.

One initial step would be to take Lucene and see if you can get it to
store its text indexes in Derby via JDBC.

