lucene-dev mailing list archives

From "Uwe Schindler" <...@thetaphi.de>
Subject RE: Payloads and TrieRangeQuery
Date Thu, 11 Jun 2009 13:20:06 GMT
From: Michael McCandless [mailto:lucene@mikemccandless.com]
> On Wed, Jun 10, 2009 at 6:07 PM, Yonik Seeley<yonik@lucidimagination.com>
> wrote:
> 
> > Really goes into Solr land... my pref for Lucene is to remain a core
> > expert-level full-text search library and keep out things that are
> > easy to do in an application or at another level.
> 
> I think this must be the crux of our disagreement.
> 
> I feel, instead, that Lucene should stand on its own, as a useful
> search library, with a consumable API, good defaults, etc.  Lucene is
> more than "the expert level search API that's embedded in
> Solr". Lucene is consumed directly by apps other than Solr.

This is my opinion, too.

> In fact, I think there are many things in Solr that naturally belong
> in Lucene (and over time we've been gradually slurping them down).
> The line/criteria has always been rather blurry...

There is currently also some overlap, such as function queries that are
implemented in both projects in different versions: the function queries were
moved to Lucene in the past, but they are still alive in Solr. There is also
some overlap in the Analyzers. I would really like to have this very generic,
configurable analyzer in Lucene, so I do not need to create a new subclass
for each analyzer.

For other projects (like my panFMP), this would be really great, to just
create an Analyzer instance and add some filters into a list with addFilter
or something like that.
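
A minimal sketch of what such a configurable analyzer might look like. It uses
plain Java only; ConfigurableAnalyzer, addFilter, lowerCase and stopWords are
hypothetical names for illustration and are not Lucene or Solr classes:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical sketch: none of these names exist in Lucene or Solr.
final class ConfigurableAnalyzer {
    // Filters are applied in the order in which they were added.
    private final List<Function<List<String>, List<String>>> filters = new ArrayList<>();

    // Appends a filter to the chain and returns this, so calls can be chained.
    ConfigurableAnalyzer addFilter(Function<List<String>, List<String>> filter) {
        filters.add(filter);
        return this;
    }

    // Splits the text on whitespace, then runs the tokens through every filter.
    List<String> analyze(String text) {
        String trimmed = text.trim();
        List<String> tokens = trimmed.isEmpty()
                ? new ArrayList<>()
                : new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
        for (Function<List<String>, List<String>> filter : filters) {
            tokens = filter.apply(tokens);
        }
        return tokens;
    }

    // Example filter: lower-cases every token.
    static List<String> lowerCase(List<String> in) {
        return in.stream()
                .map(t -> t.toLowerCase(Locale.ROOT))
                .collect(Collectors.toList());
    }

    // Example filter factory: drops every token contained in the given stop list.
    static Function<List<String>, List<String>> stopWords(String... words) {
        List<String> stop = Arrays.asList(words);
        return in -> in.stream()
                .filter(t -> !stop.contains(t))
                .collect(Collectors.toList());
    }
}
```

With this sketch, new ConfigurableAnalyzer().addFilter(ConfigurableAnalyzer::lowerCase)
.addFilter(ConfigurableAnalyzer.stopWords("the")).analyze("The quick Fox")
yields the tokens "quick" and "fox", without a new subclass per filter chain.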

In my opinion, Solr and Lucene should exchange technology much more. Solr
should concentrate on the "search server" and Lucene should provide the
technology. All additional implementations inside Solr, like faceting and so
on, could also live in Lucene. I would have nice uses for them (I do not use
Solr, but have my own Lucene framework that I cannot give up for various
reasons). For example, Solr's faceting, Solr's analyzers and so on would be
great in Lucene as "modules".

> In Lucene, we should be able to add a NumericField to a document,
> index it, and then create RangeFilter or Sort on that field and have
> things "just work".
> 
> Unfortunately, we are far from being able to do so today.  We can (and
> should, for 2.9) make baby steps towards this, by absorbing trie* into
> core, but you're still gonna have to "just know" to make the specific
> FieldCache parser and specific NumericRangeQuery to match how you
> indexed.  It's awkward and not consumable, but I think it'll just have to
> do for now...

I will have time at the weekend (currently I am working hard for PANGAEA on
things not related to Lucene) to create a core-trierange as defined in
LUCENE-1673. The usage pattern will be similar to the current one in contrib,
only with a revised API that makes it simpler to instantiate the TokenStreams
and RangeQueries with the different data types. (I get a lot of questions all
the time about how to do this and that, because people don't understand why
they must first map the float to an int. If they do it, the next question is:
"Why does this work? I do not want to lose precision", and so on.) I will do
it with TrieTokenStream.
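
To illustrate why mapping a float to an int loses no precision: the mapping
only reinterprets the 32 IEEE 754 bits and rearranges the negative range so
that signed int order matches float order. The following is a sketch of that
idea using only the JDK; SortableFloatBits and its method names are chosen
here for illustration and this is not the contrib code itself:

```java
// Illustrative sketch (JDK only); class and method names are assumptions.
final class SortableFloatBits {
    // Maps a float to an int whose signed order matches the float order.
    static int floatToSortableInt(float value) {
        int bits = Float.floatToIntBits(value);
        // For negative floats, a larger magnitude means a smaller value, so
        // flip the 31 non-sign bits to reverse their order. The sign bit is
        // kept, so negatives still sort before positives.
        return bits < 0 ? bits ^ 0x7fffffff : bits;
    }

    // Inverse mapping: the same XOR restores the original bit pattern.
    static float sortableIntToFloat(int sortable) {
        int bits = sortable < 0 ? sortable ^ 0x7fffffff : sortable;
        return Float.intBitsToFloat(bits);
    }
}
```

Because the conversion is a bijection on the bit patterns of ordinary float
values, sortableIntToFloat(floatToSortableInt(x)) returns exactly x, and
comparing the ints gives the same result as comparing the floats.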

In my opinion, the classes should stay Trie* and not Numeric*. Maybe we will
have different implementations for numeric range queries in the future. In
this case, I think Yonik is right: the class name should describe how it works
if there may be different and incompatible implementations in the future.

Uwe

