db-derby-dev mailing list archives

From "Rick Hillegas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DERBY-590) How to integrate Derby with Lucene API?
Date Tue, 22 Oct 2013 15:41:43 GMT

    [ https://issues.apache.org/jira/browse/DERBY-590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13801935#comment-13801935 ]

Rick Hillegas commented on DERBY-590:
-------------------------------------

Thanks for that additional information, Andrew. Concerning i, here are some thoughts:

1) I see that this new tool is being parked next to the existing DBMDWrapper and
ForeignDBViews classes. I do agree that all of these tools should be neighbors. Just a heads-up,
however: I think I put DBMDWrapper and ForeignDBViews in the wrong jar file to begin with.
I now think those tools really ought to go into the engine jar, for reasons which I gave in
a 2013-06-13 comment on https://issues.apache.org/jira/browse/DERBY-6256. To summarize: the
point of derbytools.jar is to hold code which runs both client-side and server-side. None
of these tools run client-side. My point is this: I'm not faulting the patch for following
the existing pattern; instead, I'm saying that I put the original tools in the wrong place.
And at some point I may file a follow-on JIRA to move all of these tools into the engine jar.

2) If I understand the patch correctly, the indexTable() procedure really indexes a column.
You could run the procedure multiple times on the same table in order to index different columns.
I think that createDocumentIndex() would be a better name for this procedure.

3) Similarly, I'm not keen on the name luceneUpdateDocument() for several reasons. I think
that the "lucene" prefix can be assumed from the name of the schema which holds this procedure.
I'd also like the procedure names to express the fact that luceneUpdateDocument() refreshes
the index created by indexTable(). So I'd recommend something like updateDocumentIndex(),
akin to createDocumentIndex().

Some more thoughts follow:

4) In my experience, every insert function needs to be matched by corresponding update and
delete functions. Developers expect that. This tool provides an insert function (createDocumentIndex())
and a corresponding update function (updateDocumentIndex()) but no corresponding delete function.
The delete function is really important as developers hack out their schemas in the laboratory.
So I recommend adding a dropDocumentIndex() procedure. I understand your reservations about
deleting a whole directory, but I think that the first enhancement request we'll get is "give
me a way to delete these things."

5) The non-transactional behavior of these procedures needs to be clearly understood by users.
That's probably a documentation issue. But users need to understand that they can't roll back
some of the important effects of calls to createDocumentIndex() and dropDocumentIndex().

6) Developers hacking out a schema in the laboratory will also want tools for introspecting
which columns have been indexed and how current the indexes are. Maybe the best solution would
be a view wrapping a table function which, in turn, exposes metadata from Lucene and/or the
file system. If that's not possible, a table like the following could be useful:

create table LuceneSupport.documentIndexes
(
    id int generated always as identity,
    tableID char( 36 ) not null,
    columnNumber int,
    lastupdated timestamp,
    unique( tableID, columnNumber )
);

I prefer the table function solution because it makes it harder for the user to mess up and
accidentally delete this metadata.
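
For comparison, the table function approach might look roughly like the following. This is only a sketch and is not taken from the patch: the function name listDocumentIndexes and the Java class org.example.LuceneMetadata are hypothetical placeholders, and the column list simply mirrors the table above. It assumes that the LuceneSupport schema already exists.

```sql
-- Hypothetical table function: the Java method named below would return a
-- java.sql.ResultSet built from Lucene and/or file system metadata.
create function LuceneSupport.listDocumentIndexes()
returns table
(
    tableID char( 36 ),
    columnNumber int,
    lastUpdated timestamp
)
language java
parameter style DERBY_JDBC_RESULT_SET
reads sql data
external name 'org.example.LuceneMetadata.listDocumentIndexes';

-- Read-only view over the table function. Because no base table backs it,
-- there is nothing for a user to delete by accident.
create view LuceneSupport.documentIndexes as
select tableID, columnNumber, lastUpdated
from table( LuceneSupport.listDocumentIndexes() ) t;
```

A developer could then inspect the indexes with an ordinary query such as "select * from LuceneSupport.documentIndexes".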

7) At this point, I don't see a need for the syntactic sugar of new SQL statements. I think
that the optional tool approach is fine for this first increment. I recommend deferring any
parser work until later, after we've cleared up the transactional consistency issues.

Thanks!
-Rick


> How to integrate Derby with Lucene API?
> ---------------------------------------
>
>                 Key: DERBY-590
>                 URL: https://issues.apache.org/jira/browse/DERBY-590
>             Project: Derby
>          Issue Type: Improvement
>          Components: Documentation, SQL
>            Reporter: Abhijeet Mahesh
>              Labels: derby_triage10_11
>         Attachments: lucene_demo.diff
>
>
> In order to use derby with lucene API what should be the steps to be taken? 



--
This message was sent by Atlassian JIRA
(v6.1#6144)
