lucene-java-user mailing list archives

From "Chuck Williams" <ch...@manawiz.com>
Subject RE: Need help with filtering
Date Mon, 22 Nov 2004 18:49:45 GMT
It sounds like you need to pad your numbers with leading zeroes, i.e.,
use the same kind of encoding that RangeQuery requires.  If you query
with 000005 instead of 5, do you get what you expect?  If all your
document IDs are padded to the same fixed length, then string comparison
will order them exactly as integer comparison does.
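
To illustrate the padding idea, here is a minimal sketch in plain Java.
It does not touch Lucene at all: a TreeSet stands in for the sorted term
dictionary, and the names PaddedIds, WIDTH, pad and termsFrom are made up
for this example. The width of 6 is an assumption based on IDs in the
400,000 range; it has to cover the largest ID you will ever index.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.SortedSet;
import java.util.TreeSet;

final class PaddedIds {
    // Assumed fixed width; it must cover the largest ID ever indexed.
    static final int WIDTH = 6;

    // Left-pad a non-negative ID with zeroes to exactly WIDTH characters.
    static String pad(int id) {
        if (id < 0) {
            throw new IllegalArgumentException("negative IDs need a different encoding");
        }
        String s = String.format(Locale.ROOT, "%0" + WIDTH + "d", id);
        if (s.length() > WIDTH) {
            throw new IllegalArgumentException("ID does not fit in " + WIDTH + " digits: " + id);
        }
        return s;
    }

    // Stand-in for seeking in a sorted term dictionary:
    // returns every term >= start, compared as Strings.
    static SortedSet<String> termsFrom(SortedSet<String> terms, String start) {
        return terms.tailSet(start);
    }

    public static void main(String[] args) {
        SortedSet<String> terms =
            new TreeSet<>(Arrays.asList("400001", "400002", "400123"));

        // Unpadded lower bound: "5" sorts after every "4xxxxx" term.
        System.out.println(termsFrom(terms, "5").size());

        // Padded lower bound: "000005" sorts before every "4xxxxx" term.
        System.out.println(termsFrom(terms, pad(5)).size());
    }
}
```

With the unpadded bound the tail set is empty, which matches the "nothing
comes back" symptom; with the padded bound all three sample terms are
returned. The same pad() would have to be applied both when indexing the
ID field and when building the lower-limit term.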

Chuck

  > -----Original Message-----
  > From: Edwin Tang [mailto:emttang@yahoo.com]
  > Sent: Monday, November 22, 2004 10:34 AM
  > To: Lucene Users List
  > Subject: Re: Need help with filtering
  > 
  > Hello again,
  > 
  > I've modified DateFilter to filter out document IDs as suggested. All
  > seemed to be running well until I tried a specific test case. All my
  > documents have IDs in the 400,000 range. If I set my lower limit to 5,
  > nothing comes back. After examining the code, I found the issue to be
  > at the following line:
  > TermEnum enumerator = reader.terms(new Term(field, start));
  > 
  > Is there a way to retrieve a set of documents with IDs using an Integer
  > comparison versus a String comparison? If I set "start" to 0, I get
  > everything, but that's not very efficient.
  > 
  > Thanks in advance,
  > Ed
  > 
  > --- Paul Elschot <paul.elschot@xs4all.nl> wrote:
  > 
  > > On Wednesday 17 November 2004 01:20, Edwin Tang wrote:
  > > > Hello,
  > > >
  > > > I have been using DateFilter to limit my search results to a
  > > > certain date range. I am now asked to replace this filter with one
  > > > where my search results have document IDs greater than a given
  > > > document ID. This document ID is assigned during indexing and is a
  > > > Keyword field.
  > > >
  > > > I've browsed around the FAQs and archives and see that I can either
  > > > use QueryFilter or BooleanQuery. I've tried both approaches to
  > > > limit the document ID range, but am getting the
  > > > BooleanQuery.TooManyClauses exception in both cases. I've also
  > > > tried bumping max number of clauses via setMaxClauseCount(), but
  > > > that number has gotten pretty big.
  > > >
  > > > Is there another approach to this? ...
  > >
  > > Recoding DateFilter to a DocumentIdFilter should be straightforward.
  > >
  > > The trick is to use only one document enumerator at a time for all
  > > terms. Document enumerators take buffer space, and that is the
  > > reason why BooleanQuery has an exception for too many clauses.
  > >
  > > Regards,
  > > Paul
  > >
  > >
  > > ---------------------------------------------------------------------
  > > To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
  > > For additional commands, e-mail: lucene-user-help@jakarta.apache.org
  > >
  > >
  > 
  > 
  > 



