lucene-solr-dev mailing list archives

From "Uwe Schindler (JIRA)" <j...@apache.org>
Subject [jira] Commented: (SOLR-940) TrieRange support
Date Wed, 25 Feb 2009 17:45:01 GMT

    [ https://issues.apache.org/jira/browse/SOLR-940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676717#action_12676717 ]

Uwe Schindler commented on SOLR-940:
------------------------------------

{quote}
bq. Just an idea (that came to me...): How about creating a TokenStream that returns the results
of TrieUtils.trieCode[Long|Int]() with TokenIncrement 0. You should be able to search this
with TrieRangeFilter (using the same field name for the highest and lower precision trie fields).

The difficulty is in identifying what type of tokenizer was used (TrieInt, TrieLong etc.)
to index the field. The user will need to use the localparam syntax explicitly for us to use
IntTrieRangeFilter e.g fq={trieint}tint:[10 TO 100]. I would like to avoid the use of such
syntax as far as possible. Creating the field type may be more work than this option, but
it can help us use the correct Filter and SortField automatically.
{quote}

Now I understand the problem Yonik had with the original TrieRange implementation and why he
wanted to change the API. Your problem is that you cannot simply map the numerical value to
*one* field and token: *one* numeric value has to be turned into more than one token before
it is indexed.

My idea was to create a FieldType subclass for indexing fields for TrieRangeFilter and to
override its getAnalyzer() and getQueryAnalyzer() methods. The analyzer would get the numerical
value and create tokens from it. Normally, there would be only *one* token for a numerical
value, converted using the toXXXX methods in FieldType. But now you have to create more than
one token (one for each precision). This could be done by the analyzer that FieldType returns.
This analyzer does almost nothing: it only returns a Tokenizer that does not really tokenize
but just returns Tokens containing the prefix-encoded values of the given String, converted
to the numeric value in different precisions (using TrieUtils.trieCodeLong()).
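
To illustrate the "one value, several tokens" idea, here is a rough, Lucene-free sketch of what
such a non-tokenizing Tokenizer might emit. Everything in it is an assumption made for
illustration: the class name TriePrecisionSketch, the method prefixTokens(), the precisionStep
parameter and the "shift:hex" token format are hypothetical and are NOT the encoding that
TrieUtils.trieCodeLong() actually produces. It only shows that each numeric value yields one
token per precision level, and that nearby values share their lower-precision tokens.

```java
import java.util.ArrayList;
import java.util.List;

public class TriePrecisionSketch {

    /**
     * Hypothetical stand-in for a trie encoder (not the real TrieUtils encoding).
     * Returns one token per precision level: the value with 0, precisionStep,
     * 2 * precisionStep, ... low-order bits dropped, tagged with the shift used.
     */
    static List<String> prefixTokens(long value, int precisionStep) {
        if (precisionStep < 1 || precisionStep > 64) {
            throw new IllegalArgumentException("precisionStep must be in 1..64");
        }
        // Flip the sign bit so that the unsigned bit order matches signed numeric order.
        long sortableBits = value ^ Long.MIN_VALUE;
        List<String> tokens = new ArrayList<>();
        for (int shift = 0; shift < 64; shift += precisionStep) {
            // Tag each token with its shift so tokens of different precisions never collide.
            tokens.add(String.format("%02d:%016x", shift, sortableBits >>> shift));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // All of these tokens would be indexed at the same position in the same field.
        for (String token : prefixTokens(42L, 8)) {
            System.out.println(token);
        }
    }
}
```

With a step of 8 bits, a long produces eight tokens. The values 42 and 43 differ only in their
full-precision token and share all the lower-precision ones, which is what would let a range
filter match many values through a few terms.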

> TrieRange support
> -----------------
>
>                 Key: SOLR-940
>                 URL: https://issues.apache.org/jira/browse/SOLR-940
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Yonik Seeley
>             Fix For: 1.4
>
>
> We need support in Solr for the new TrieRange Lucene functionality.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

