lucene-java-user mailing list archives

From "Erick Erickson" <erickerick...@gmail.com>
Subject I just don't get wildcards at all.
Date Fri, 07 Apr 2006 14:06:40 GMT
OK, I know I'm asking you to write my code for me (or at least point me to
an example), but I'm at my wits' end, so please rescue me...

This is a reprise of TooManyClauses. We have a large amount of text, and a
requirement to do a wildcard query. Of course, it's waaaay too big to use
WildcardQuery or the other "expanding" queries. They frighten me anyway...

Y'all pointed me at the ConstantScoreRangeQuery (CSRQ), but actually using
it is not making sense to me.

I just don't get how, for instance, CSRQ helps me that much. Say I want to
search for big*er. I can use a CSRQ to get all the docs that contain a term
starting with "big", just by using biga and bigz as my min/max terms. But
then I'm stuck. I could iterate through all the docs returned, but that seems
inefficient. Not to mention that the HitCollector (?) class warns against
this due to "an order of magnitude" slowdown.

What I *want* is a way, for each doc in the CSRQ, to answer whether it's a
match. Really, something on the order of a callback that receives the value
that matched the CSRQ and can return a yes/no or a ranking. Again, I can
iterate over all the docs matched, but this seems expensive.
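
A rough sketch of the kind of per-term callback described above, in plain
Java rather than Lucene calls. Everything here is an assumption made for
illustration: the TreeSet stands in for a sorted term dictionary, the names
WildcardTermScan and matchingTerms are hypothetical, and the bounds "big" and
"bih" are chosen so the range brackets every term starting with "big". The
point it tries to show is that the yes/no test runs once per candidate term,
not once per document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.function.Predicate;

// Plain-Java sketch, no Lucene calls: a TreeSet stands in for the sorted term dictionary.
class WildcardTermScan {

    // Walk only the terms in [lower, upper) and keep those the callback accepts.
    static List<String> matchingTerms(SortedSet<String> terms, String lower, String upper,
                                      Predicate<String> accept) {
        List<String> hits = new ArrayList<>();
        for (String term : terms.subSet(lower, upper)) {
            if (accept.test(term)) {
                hits.add(term);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        SortedSet<String> terms = new TreeSet<>(Arrays.asList(
            "bid", "big", "bigger", "biggest", "bigot", "bigwigger", "bih"));
        // The callback encodes big*er: prefix "big", suffix "er", no overlap between them.
        List<String> hits = matchingTerms(terms, "big", "bih",
            t -> t.startsWith("big") && t.endsWith("er") && t.length() >= 5);
        System.out.println(hits); // prints [bigger, bigwigger]
    }
}
```

In this toy dictionary only five terms fall inside the range, so the callback
runs five times no matter how many documents contain those terms.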

Using filters doesn't really seem to do the trick for me either. If I
understand them properly, they allow me to set up a bitset for all the
documents that should be searched. All 1,000,000 of them? Or am I thinking
about this completely backwards? I have LIA, but I'm also wondering if
there's something in 1.9 that I haven't found yet.
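
For a sense of scale, a bitset over 1,000,000 documents is one bit per
document, or about 125,000 bytes. The sketch below is plain Java, not the
Lucene Filter API, and shows one way such a bitset could be filled for
big*er. The TreeMap of term to document ids is a stand-in for the index, and
WildcardBitSetFilter and bitsFor are hypothetical names. It walks the terms
sharing the prefix and turns on the bits of every document that holds a
matching term.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch, no Lucene calls: a TreeMap of term -> doc ids stands in for the index.
class WildcardBitSetFilter {

    // Set one bit per document containing any term that matches prefix*suffix.
    static BitSet bitsFor(TreeMap<String, int[]> postings, int maxDoc,
                          String prefix, String suffix) {
        BitSet bits = new BitSet(maxDoc);
        // tailMap starts the walk at the first term >= prefix.
        for (Map.Entry<String, int[]> e : postings.tailMap(prefix).entrySet()) {
            String term = e.getKey();
            if (!term.startsWith(prefix)) {
                break; // sorted order: we are past the last term with this prefix
            }
            if (term.length() >= prefix.length() + suffix.length() && term.endsWith(suffix)) {
                for (int doc : e.getValue()) {
                    bits.set(doc);
                }
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        TreeMap<String, int[]> postings = new TreeMap<>();
        postings.put("big", new int[] {0, 1, 2});
        postings.put("bigger", new int[] {1, 4});
        postings.put("biggest", new int[] {2});
        postings.put("bigwigger", new int[] {4, 7});
        postings.put("bird", new int[] {3});
        BitSet bits = bitsFor(postings, 1_000_000, "big", "er");
        System.out.println(bits); // prints {1, 4, 7}
    }
}
```

The bitset is sized for every document, but the work done to fill it depends
on the number of terms under the prefix and the length of their document
lists, not on the total document count.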

Now, given how easy the rest of Lucene is to use, I assume that I'm
approaching this poorly, but I sure am stumped.

All that said, I'm quite Java-naive, so please bear with me if this
question demonstrates my ignorance painfully...

Thanks
Erick
