lucy-dev mailing list archives

From "Marvin Humphrey (JIRA)" <>
Subject [jira] Updated: (LUCY-104) Similarity placeholder stub classes
Date Sat, 05 Jun 2010 21:43:56 GMT


Marvin Humphrey updated LUCY-104:

    Attachment: Similarity.bp

As discussed on the Java Lucene dev list a little while back, the role of
Similarity will diverge in Lucene and Lucy, and thus so will its location
within the class hierarchy. In Lucene, Similarity lives under; in Lucy, it will live under Lucy::Index.

In Lucene, KinoSearch, and Ferret, the Similarity class has an impact on index
data via its lengthNorm() and encodeNorm() methods.  However, Lucene trunk has
recently moved to storing raw data rather than encoded norms/boosts, instead
calculating them on-the-fly at search-time when opening IndexReaders.  As of
now, Lucene's Similarity has zero impact on index format.
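To make the lengthNorm()/encodeNorm() relationship concrete, here is a minimal sketch. It assumes the commonly used 1/sqrt(numTerms) length norm and a made-up one-byte linear quantization; the function names and the encoding are illustrative only, not the actual encoding used by Lucene, KinoSearch, or Ferret.

```python
import math


def length_norm(num_terms: int) -> float:
    """Length normalization: fields with fewer terms get a larger norm."""
    return 1.0 / math.sqrt(num_terms) if num_terms > 0 else 0.0


def encode_norm(norm: float) -> int:
    """Lossily squeeze a norm into one byte (hypothetical linear scheme)."""
    clamped = min(max(norm, 0.0), 1.0)  # anything outside [0, 1] is clipped
    return int(clamped * 255 + 0.5)     # round to the nearest of 256 levels


def decode_norm(byte: int) -> float:
    """Recover an approximation of the original norm at search time."""
    return byte / 255.0


if __name__ == "__main__":
    for n in (1, 4, 100, 10000):
        b = encode_norm(length_norm(n))
        print(n, b, decode_norm(b))
```

Once bytes like these are written into the index, the choice of Similarity is baked into the index data, which is the coupling described above. Storing the raw term counts instead defers that choice to search time.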

Lucy will be moving in the opposite direction, writing all data at index time
and then mmap'ing the resulting data structures at search-time, as on-the-fly
calculation of norms/boosts is incompatible with Lucy's cheap-searcher model.
Instead, Lucy will seek to be *more* aggressive in lossy compression of index
data, as our fixed-field-spec schema model allows us to know more about what
we can throw away for certain fields.
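The index-time-write, search-time-mmap approach can be sketched roughly as follows. This is an illustration under stated assumptions, not Lucy's file format or API: the one-byte-per-document layout, the quantize() scheme, and the names write_norms and NormsReader are all hypothetical.

```python
import math
import mmap


def quantize(num_terms: int) -> int:
    """Hypothetical lossy encoding: 1/sqrt(num_terms) scaled to one byte."""
    norm = 1.0 / math.sqrt(num_terms) if num_terms > 0 else 0.0
    return int(norm * 255 + 0.5)


def write_norms(path: str, field_lengths) -> None:
    """Index time: do all the math once and write one byte per document."""
    with open(path, "wb") as out:
        out.write(bytes(quantize(n) for n in field_lengths))


class NormsReader:
    """Search time: map the file read-only; opening does no per-doc work."""

    def __init__(self, path: str):
        self._file = open(path, "rb")
        # Length 0 maps the whole file; mmap raises ValueError if it is empty.
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)

    def __len__(self) -> int:
        return len(self._map)

    def norm(self, doc_id: int) -> float:
        # Indexing an mmap object with an int yields the byte as an int.
        return self._map[doc_id] / 255.0

    def close(self) -> None:
        self._map.close()
        self._file.close()
```

Because opening the reader only maps the file, its cost does not grow with the number of documents, which is what keeps a searcher cheap to create. The price is that the encoding is fixed when the index is written.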

> Similarity placeholder stub classes
> -----------------------------------
>                 Key: LUCY-104
>                 URL:
>             Project: Lucy
>          Issue Type: New Feature
>          Components: Core - Index
>            Reporter: Marvin Humphrey
>            Assignee: Marvin Humphrey
>         Attachments: Similarity.bp, Similarity.c
> Add placeholder stubs for the abstract class Lucy::Index::Similarity, the
> default implementation LuceneSimilarity, and the test-only class
> DummySimilarity.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
