lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Joachim Rösener <m...@joachim-roesener.de>
Subject RangeQuery over many indexed documents seems to be buggy
Date Wed, 09 Nov 2005 13:43:07 GMT
Hello,

I am currently developing a singles dating service with
lucene-1.4.3 as search engine.

Due to the limitation that ages < 1970-01-01 cannot be indexed with
Field.Keyword(String name, Date value) (produces a RuntimeException
("time too early")), the age indexing is done via Field.Text(String
name, String value). The value for the name "birthday" is a long
timestamp formatted with SimpleDateFormat("yyyyMMdd").

So if you seek for women born between specific dates, you should be able
to find them by a simple RangeQuery. This works fine so far -
theoretically, but there seems to be a severe bug that occurs when
searching via RangeQuery over many indexed documents
(I do not know exactly the limit).

For example, women born between 1980-01-01 and 1981-01-01 can be found
with the following query:

"sex:female AND birthday:[19800101 TO 19810101]"

This gives the following results:
1980-1981: found 424 women.
1981-1982: found 329 women.
1982-1983: found 237 women.
1983-1984: found 232 women.
1984-1985: found 175 women.

To proof if it works, a search between 1980-01-01 and 1982-01-01 gives
a result of 752 women
(one less than 753 because of range limit inclusion @ 1981-01-01).

BUT: Though you should expect that if you seek for women between 1980
and 1985 should give you 1397 females, the result is: 0! - without any
Exception ...  :-/

This failure can be reproduced if you seek other entities via RangeQuery
(e.g. weights or heights), too.

Can you explain, maybe fix this?

Yours sincerely,
Joachim Roesener.



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message