lucene-dev mailing list archives

From "Jason Rutherglen (JIRA)" <>
Subject [jira] Commented: (LUCENE-1473) Implement Externalizable in main top level searcher classes
Date Wed, 03 Dec 2008 19:27:44 GMT


Jason Rutherglen commented on LUCENE-1473:

"This is a hard problem."

I disagree.  It's completely manageable.  Doesn't Hadoop handle versioning inside of Writable?

ScoreDocComparator javadoc: "sortValue(ScoreDoc i) Returns the value used to sort the given
document. The object returned *must implement the* java.io.Serializable interface. This is
used by multisearchers to determine how to collate results from their searchers."

This kind of statement in the code leads one to believe that Lucene supports Serialization.
 Maybe it should be removed from the Javadocs.  

"Thrift and ProtocolBuffers" don't support dynamic class loading.  If one were to create their
own Query class with custom code, serializing is the only way to represent the Query object
and have Java load the additional implementation code.  One easy-to-see use case: if Analyzer
were made Serializable, then indexing over the network and trying different analysis techniques
could be accomplished with ease in a grid computing environment.
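
To make the class-loading point concrete, here is a minimal sketch, not taken from the patch. A Java serialization stream records each object's class name (not its bytecode), and the receiver resolves that name through a class loader; RMI layers remote code loading on top of this. PrefixBoostQuery, LoaderAwareObjectInputStream and QueryShipping are hypothetical names invented for the illustration, and the loader passed in is assumed to be one that can see the custom class.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.io.UncheckedIOException;

class QueryShipping {
    // Hypothetical stand-in for a user-defined query; not a Lucene class.
    static class PrefixBoostQuery implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String field;
        private final String prefix;
        private final float boost;

        PrefixBoostQuery(String field, String prefix, float boost) {
            this.field = field;
            this.prefix = prefix;
            this.boost = boost;
        }

        @Override
        public String toString() {
            return field + ":" + prefix + "*^" + boost;
        }
    }

    // Resolves class names found in the stream through a caller-supplied loader,
    // e.g. one that knows where the custom query's implementation lives.
    static class LoaderAwareObjectInputStream extends ObjectInputStream {
        private final ClassLoader loader;

        LoaderAwareObjectInputStream(InputStream in, ClassLoader loader) throws IOException {
            super(in);
            this.loader = loader;
        }

        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc)
                throws IOException, ClassNotFoundException {
            try {
                return Class.forName(desc.getName(), false, loader);
            } catch (ClassNotFoundException e) {
                // Fall back to the default lookup (covers primitive type names).
                return super.resolveClass(desc);
            }
        }
    }

    static byte[] serialize(Object o) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data, ClassLoader loader) {
        try (ObjectInputStream in =
                 new LoaderAwareObjectInputStream(new ByteArrayInputStream(data), loader)) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A schema-based format would need a separate mapping from each message type to an implementation class, whereas here the stream itself names the class to instantiate.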

"representations for queries independent of Lucene's Query, and map this to Lucene's Query.
Is that not workable in this case?"  

Mike wrote "if we add field X to a class implementing Serializable,
and must bump the SUID, that's a hard break on back compat. "

There need to be "if statements" in readExternal to handle backwards compatibility.  Given
the number of classes and the number of fields, this isn't very much work.  Neither are the
test cases.  I worked on RMI and Jini at Sun and elsewhere.  I am happy to put forth the effort
to maintain and develop this functionality.  It is advantageous to place this functionality
directly into the classes because, in my experience, many of the Lucene classes do not make
all of their field data public, and dedicated serialization code such as the XML query
code is voluminous.  Also, the half support of serialization right now seems to indicate there
really isn't support for it.
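
The "if statements" approach can be sketched roughly as follows. This is an illustration under stated assumptions, not code from LUCENE-1473.patch: VersionedTermQuery, its fields, and the version numbers are hypothetical, and the writeVersion hook exists only so the sketch can imitate a writer from an older release.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical class for illustration; not part of Lucene.
class VersionedTermQuery implements Externalizable {
    // Held constant: the format version below changes, the SUID does not.
    private static final long serialVersionUID = 1L;
    static final int CURRENT_VERSION = 2;

    private String field = "";
    private String text = "";
    private float boost = 1.0f;                  // added in format version 2
    private int writeVersion = CURRENT_VERSION;  // demo-only hook, see lead-in

    // Externalizable requires a public no-arg constructor.
    public VersionedTermQuery() {}

    VersionedTermQuery(String field, String text, float boost, int writeVersion) {
        this.field = field;
        this.text = text;
        this.boost = boost;
        this.writeVersion = writeVersion;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(writeVersion);              // version tag always comes first
        out.writeUTF(field);
        out.writeUTF(text);
        if (writeVersion >= 2) {
            out.writeFloat(boost);
        }
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
        int version = in.readInt();
        if (version < 1 || version > CURRENT_VERSION) {
            throw new IOException("Unsupported format version: " + version);
        }
        field = in.readUTF();
        text = in.readUTF();
        if (version >= 2) {
            boost = in.readFloat();
        } else {
            boost = 1.0f;                        // default for data written before v2
        }
    }

    String getText() { return text; }
    float getBoost() { return boost; }

    // Serializes and deserializes in memory, wrapping checked exceptions.
    static VersionedTermQuery roundTrip(VersionedTermQuery q) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(q);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                     new ByteArrayInputStream(bytes.toByteArray()))) {
                return (VersionedTermQuery) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this layout, adding a field means bumping CURRENT_VERSION and adding one more branch in readExternal, while streams written by older versions remain readable.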

Hoss wrote: "sort of mythical 'Lucene powerhouse'"
Lucene seems to run itself quite differently than other open source Java projects.  Perhaps
it would be good to spell out the reasons for the reluctance to move ahead with features that
developers work on, that work, but do not go in.  The developer contributions seem to be quite
low right now, especially compared to neighbor projects such as Hadoop.  Is this because fewer
people are using Lucene?  Or is it due to the reluctance to work with the developer community?
 Unfortunately, the perception in the eyes of some people who work on search-related projects
is that it is the latter.

Many developers seem to be working outside of Lucene and choosing *not* to open source in
order to avoid going through the current hassles of getting code committed to the project.

> Implement Externalizable in main top level searcher classes
> -----------------------------------------------------------
>                 Key: LUCENE-1473
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Search
>    Affects Versions: 2.4
>            Reporter: Jason Rutherglen
>            Priority: Minor
>         Attachments: LUCENE-1473.patch
> To maintain serialization compatibility between Lucene versions, major classes can implement
> Externalizable.  This will make serialization faster, since no reflection is required, and
> maintain backwards compatibility.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
