lucene-dev mailing list archives

From "Uwe Schindler (JIRA)" <j...@apache.org>
Subject [jira] Commented: (SOLR-2202) Money FieldType
Date Wed, 27 Oct 2010 19:05:20 GMT

    [ https://issues.apache.org/jira/browse/SOLR-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925511#action_12925511
] 

Uwe Schindler commented on SOLR-2202:
-------------------------------------

bq. I guess I should clarify my comment re: TrieField. I guess I'm wondering if it is more
expensive to perform a Trie-based query against a large portion of the value's range instead
of a direct fieldcache based range query. My assumption (which might be incorrect) is that
trie-based range queries across the entire span of values are more expensive than non-Trie
full-span range queries. If this isn't the case then it makes sense to do as you suggest and
use Trie ranges even though often they will span the entire range of values. 

That is exactly the trick behind the trie field. A query that spans all values is as fast as
a query that spans only a few values (OK, it still depends on the number of documents, but
the part that selects the terms to match is very efficient). The trick behind trie is to reduce
the number of terms by indexing multiple values in the same field and choosing only those
that match best. Please read the docs about Lucene's NumericRangeQuery. If the range matches
only some values on a sparse index, you lose lots of time iterating over the FieldCache.
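
To make the term-selection idea concrete, here is a rough sketch of how a range can be split into a handful of prefix terms. It is only an illustration for unsigned integers, loosely modeled on the splitting idea behind NumericRangeQuery; it is not Lucene's code, and the names split_range, term_count, and covered_values, as well as the default precision step of 4, are assumptions made for this example.

```python
def split_range(lo, hi, precision_step=4, val_size=32):
    """Split [lo, hi] into sub-ranges of prefix terms.

    Returns a list of (shift, min_prefix, max_prefix) tuples. A prefix at a
    given shift stands for 2**shift consecutive values, so coarse prefixes
    cover the middle of the range and fine ones cover only the edges.
    """
    assert 0 <= lo <= hi < (1 << val_size)
    ranges = []
    shift = 0
    while True:
        diff = 1 << (shift + precision_step)
        mask = ((1 << precision_step) - 1) << shift
        has_lower = (lo & mask) != 0      # lower edge not aligned to the next level
        has_upper = (hi & mask) != mask   # upper edge not aligned to the next level
        next_lo = ((lo + diff) if has_lower else lo) & ~mask
        next_hi = ((hi - diff) if has_upper else hi) & ~mask
        if shift + precision_step >= val_size or next_lo > next_hi:
            # Nothing left for a coarser level: emit the remainder here.
            ranges.append((shift, lo >> shift, hi >> shift))
            break
        if has_lower:
            ranges.append((shift, lo >> shift, (lo | mask) >> shift))
        if has_upper:
            ranges.append((shift, (hi & ~mask) >> shift, hi >> shift))
        lo, hi = next_lo, next_hi
        shift += precision_step
    return ranges


def term_count(ranges):
    """Number of indexed terms the query has to visit."""
    return sum(hi - lo + 1 for _, lo, hi in ranges)


def covered_values(ranges):
    """Number of distinct original values those terms stand for."""
    return sum((hi - lo + 1) << shift for shift, lo, hi in ranges)


full = split_range(0, (1 << 32) - 1)
print(term_count(full))   # 16 terms at the coarsest precision
part = split_range(5, 1000)
print(term_count(part), covered_values(part))
```

In this sketch the full 32-bit span needs only the 16 coarsest prefixes, which is why a range covering every value does not have to enumerate every term.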

And FieldCache (in 3.x) has a big disadvantage: It supports only one value per document and
it cannot detect NULL values.

> Money FieldType
> ---------------
>
>                 Key: SOLR-2202
>                 URL: https://issues.apache.org/jira/browse/SOLR-2202
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>    Affects Versions: 1.5
>            Reporter: Greg Fodor
>         Attachments: SOLR-2202-lucene-1.patch, SOLR-2202-solr-1.patch, SOLR-2202-solr-2.patch
>
>
> Attached please find patches to add support for monetary values to Solr/Lucene with query-time
currency conversion. The following features are supported:
> - Point queries (ex: "price:4.00USD")
> - Range queries (ex: "price:[$5.00 TO $10.00]")
> - Sorting.
> - Currency parsing by either currency code or symbol.
> - Symmetric & Asymmetric exchange rates. (Asymmetric exchange rates are useful if
there are fees associated with exchanging the currency.)
> At indexing time, money fields can be indexed in a native currency. For example, if a
product on an e-commerce site is listed in Euros, indexing the price field as "10.00EUR" will
index it appropriately. By altering the currency.xml file, the sorting and querying against
Solr can take into account fluctuations in currency exchange rates without having to re-index
the documents.
> The new "money" field type is a polyfield which indexes two fields, one which contains
the amount of the value and another which contains the currency code or symbol. The currency
metadata (names, symbols, codes, and exchange rates) are expected to be in an xml file which
is pointed to by the field type declaration in the schema.xml.
> The current patch is factored such that Money utility functions and configuration metadata
lie in Lucene (see MoneyUtil and CurrencyConfig), while the MoneyType and MoneyValueSource
lie in Solr. This was meant to mirror the work being done on the spatial field types.
> This patch has not yet been deployed to production but will be getting used to power
the international search capabilities of the search engine at Etsy.
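
The query-time conversion described in the issue can be pictured with a small sketch. It is not the patch's implementation: the RATES table, its made-up rate values, and the functions convert and matches_range are hypothetical names introduced only for illustration. Each document keeps its native amount and currency, and the conversion happens when the query runs, so changing the rate table changes the results without re-indexing.

```python
from decimal import Decimal

# Hypothetical, made-up rates keyed by (source, target). The table is
# asymmetric: converting USD -> EUR -> USD loses value, as with exchange fees.
RATES = {
    ("USD", "EUR"): Decimal("0.70"),
    ("EUR", "USD"): Decimal("1.40"),
}


def convert(amount, source, target, rates=RATES):
    """Convert an amount using the directed rate for (source, target)."""
    if source == target:
        return amount
    return amount * rates[(source, target)]


def matches_range(doc_amount, doc_currency, lo, hi, query_currency, rates=RATES):
    """Check a stored native-currency price against a range in the query currency."""
    value = convert(doc_amount, doc_currency, query_currency, rates)
    return lo <= value <= hi


# A document indexed as "10.00EUR", queried with a USD range of 5 to 15.
print(matches_range(Decimal("10.00"), "EUR", Decimal("5"), Decimal("15"), "USD"))

# Updating only the rate table changes the outcome for the same stored value.
new_rates = dict(RATES)
new_rates[("EUR", "USD")] = Decimal("1.60")
print(matches_range(Decimal("10.00"), "EUR", Decimal("5"), Decimal("15"), "USD", new_rates))
```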

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

