From "Wolf Siberski (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1473) Implement standard Serialization across Lucene versions
Date Fri, 12 Dec 2008 09:57:44 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12655944#action_12655944 ]

Wolf Siberski commented on LUCENE-1473:
---------------------------------------

Thanks to Doug and Jason for your constructive feedback. Let me first clarify the purpose
and scope of the patch. IMHO, the discussion about Serialization in Lucene is not clear-cut
at all. My opinion is that moving all distribution-related code out of the core leads to a
cleaner separation of concerns and thus is the better design. On the other hand, by removing
Serializable we limit the Lucene application space at least a bit (e.g., no support for dynamic
class loading), and abandon the advantages default Java serialization offers. Therefore the
patch is to be taken as a contribution to exploring the design space (as Michael's patch on
custom readers explored the Serializable option), and not as a full-fledged solution proposal.

> [Doug] The removal of Serializeable will break compatibility, so must be well-advertised.
Sure. I removed Serializable to catch all related errors; this was not meant as a proposal for
a final patch.

>  [Doug] The Searchable API was designed for remote use and does not include HitCollector-based
> access.
Currently Searchable does include a HitCollector-based search method, although the comment
says that 'HitCollector-based access to remote indexes is discouraged'. The only reason I
provided an implementation is that I wanted to keep the Searchable contract. Is remote access
the only purpose of Searchable/MultiSearcher? Is it OK to break compatibility with respect
to these classes? IMHO a significant fraction of the current clumsiness in the remote package
stems from my attempt to fully preserve the Searchable API.
 
>  [Doug] Weighting, and hence ranking, does not appear to be implemented correctly by
> this patch.
True, I was a bit too fast here. We could either solve it along the lines you propose, or revert
to passing the Weight again instead of the Query. The issue IMHO is orthogonal to the Serializable
discussion and more related to the question of what a good remote search interface and protocol
should look like.

> [Jason] Restricting people to XML will probably not be suitable though.
The patch does not limit serialization to XML. It just requires that encoding to and decoding
from String is implemented, no matter how. I used XML/XStream as a proof-of-concept implementation,
but don't propose to make XML mandatory. The main reason for introducing the Serializer
interface was to emphasize that XML/XStream is just one implementation option. Actually, the
current approach feels like at least one indirection more than required; for a final solution
I would try to come up with a better design.
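
To make the "encoding to and decoding from String, no matter how" idea concrete, here is a
minimal sketch. The interface name `StringSerializer`, its method signatures, and the class
`Base64JavaSerializer` are assumptions made for illustration; they are not taken from the
patch, whose Serializer interface may look different. The implementation deliberately avoids
XML (it wraps plain Java serialization in Base64, which needs Java 8 or later) to show that
any String-based encoding would satisfy such a contract.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Base64;

// Hypothetical contract: name and signatures are assumptions, not the patch's actual API.
interface StringSerializer {
    String encode(Object value) throws IOException;

    Object decode(String text) throws IOException, ClassNotFoundException;
}

// One possible non-XML implementation: Java serialization wrapped in Base64 text.
final class Base64JavaSerializer implements StringSerializer {
    @Override
    public String encode(Object value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // Closing the stream flushes all buffered data into 'bytes'.
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    @Override
    public Object decode(String text) throws IOException, ClassNotFoundException {
        byte[] raw = Base64.getDecoder().decode(text);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        }
    }
}
```

An XStream-backed variant would presumably implement the same two methods by delegating to
XStream's `toXML` and `fromXML`; callers that only see the interface would not need to change.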

> [Jason] It seems the alternative solutions to serialization simply shift the problem
> around but do not really solve
> the underlying issues (speed, versioning, writing custom serialization code, and perhaps
> dynamic classloading).
In a sense, the problem is indeed 'only' shifted around and not yet solved. The good thing
about this shift is that Lucene core becomes decoupled from these issues. The only real limitation
I see is that dynamic classloading can't be realized anymore. 

With respect to speed, I don't think that encoding/decoding is a significant performance factor
in distributed search, but this would need to be benchmarked. With respect to versioning,
my patch still keeps all options open. More importantly, Lucene users can now decide
if they need compatibility between different versions, and roll their own encoding/decoding
if they need it. Of course, if they are willing to contribute and maintain custom serializers
which preserve backward compatibility, they can do it in contrib just as well as they could have
done it in the core. Custom serialization is still possible although the standard Java serialization
framework can't be used anymore for that purpose, and I admit that this is a disadvantage.

> Implement standard Serialization across Lucene versions
> -------------------------------------------------------
>
>                 Key: LUCENE-1473
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1473
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Search
>    Affects Versions: 2.4
>            Reporter: Jason Rutherglen
>            Priority: Minor
>         Attachments: custom-externalizable-reader.patch, LUCENE-1473.patch, LUCENE-1473.patch,
>                      LUCENE-1473.patch, LUCENE-1473.patch, lucene-contrib-remote.patch
>
>   Original Estimate: 8h
>  Remaining Estimate: 8h
>
> To maintain serialization compatibility between Lucene versions, serialVersionUID needs
> to be added to classes that implement java.io.Serializable.  java.io.Externalizable may be
> implemented in classes for faster performance.
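
For readers unfamiliar with the two mechanisms named in the issue description, the following
is a rough sketch of a class that pins its `serialVersionUID` and implements
`java.io.Externalizable`. `FieldTerm` and its helper `copyViaSerialization` are hypothetical
names invented for this example; they are not Lucene classes and do not reflect any of the
attached patches.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical class for illustration only; it is not part of Lucene.
class FieldTerm implements Externalizable {
    // Declared explicitly so the JVM does not derive a new value from the class shape
    // after every change, which would make older streams unreadable.
    private static final long serialVersionUID = 1L;

    private String field;
    private String text;

    // Externalizable requires a public no-arg constructor for deserialization.
    public FieldTerm() {
    }

    FieldTerm(String field, String text) {
        this.field = field;
        this.text = text;
    }

    String field() {
        return field;
    }

    String text() {
        return text;
    }

    // The class writes its own fields, in an order it controls.
    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeUTF(field);
        out.writeUTF(text);
    }

    // Fields must be read back in exactly the order they were written.
    @Override
    public void readExternal(ObjectInput in) throws IOException {
        field = in.readUTF();
        text = in.readUTF();
    }

    // Helper that serializes to memory and reads the object back.
    static FieldTerm copyViaSerialization(FieldTerm original)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (FieldTerm) in.readObject();
        }
    }
}
```

Whether this is actually faster than default serialization for a given class would need to be
measured; the sketch only shows the shape of the code the description refers to.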

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

