lucene-java-user mailing list archives

From Morus Walter
Subject Re: Combining Lucene and database functionality
Date Wed, 22 Sep 2004 06:50:55 GMT
Marco Schmidt writes:
> I'm trying to find out whether Lucene is an option for a project of 
> mine. I have texts which also have a date and a list of numbers 
> associated with each of them. These numbers are ID values which connect 
> the article to certain categories. So a particular article X might 
> belong to categories 17, 49 and 112. A search for all articles 
> containing "foo bar" and belonging to categories 100 to 140 should 
> return X (because it also contains "foo bar"). Is it possible to do this 
> with Lucene and if it is, how? I've read about the concept of fields in 
> Lucene, but it seems to me that you can only store text in them, not 
> integers, let alone list of integers. None of the tutorials I've seen 
> deals with more complex queries like that. Basically what I want to 
> accomplish could be done nicely with databases with full text search 
> capability, if that full text search wasn't so awful.
Where's the problem?
100 is text as well as an integer. One has to keep in mind that treating
it as text changes the sort order, so the numbers may need leading zeros
to compensate. Lucene does not understand the "words" you index anyway.
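
To make the sort-order point concrete, here is a minimal sketch in plain
Java (no Lucene involved). The class and the pad() helper are made up for
illustration, and the fixed width of 3 is an assumption that only holds
if no ID exceeds 999.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class PaddedIds {
    // Hypothetical helper: left-pad a non-negative ID with zeros to a fixed width.
    static String pad(int id, int width) {
        return String.format(Locale.ROOT, "%0" + width + "d", id);
    }

    // Sort the terms the way a text index sees them: as strings, not as numbers.
    static List<String> sortedAsText(List<String> terms) {
        List<String> copy = new ArrayList<>(terms);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        // Unpadded: text order differs from numeric order.
        System.out.println(sortedAsText(Arrays.asList("17", "49", "112", "100")));
        // prints [100, 112, 17, 49]

        // Padded to width 3: text order and numeric order agree.
        System.out.println(sortedAsText(
                Arrays.asList(pad(17, 3), pad(49, 3), pad(112, 3), pad(100, 3))));
        // prints [017, 049, 100, 112]
    }
}
```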

So if a document has a field `category' with content '017 049 112' and
a `text' field with content 'bla fasel foo bar', and you run a range
query 100 - 140 on category (matching all documents that contain any term
sorting alphanumerically between 100 and 140) together with an appropriate
query on text, it will find what you want.
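
The matching logic can be simulated in plain Java to see why this works.
This is only a sketch of the idea, not Lucene code: the class and method
names are invented, the fields are split on whitespace, and the range is
inclusive on both ends (an assumption on my part).

```java
import java.util.Arrays;
import java.util.List;

class CategoryRangeSketch {
    // True if any whitespace-separated term sorts between lower and upper
    // (inclusive) when compared as text.
    static boolean anyTermInRange(String fieldContent, String lower, String upper) {
        for (String term : fieldContent.trim().split("\\s+")) {
            if (term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0) {
                return true;
            }
        }
        return false;
    }

    // True if the text contains every given word as a whole token.
    static boolean containsAllWords(String text, String... words) {
        List<String> tokens = Arrays.asList(text.trim().split("\\s+"));
        return tokens.containsAll(Arrays.asList(words));
    }

    // The combined query from the example: category in 100 - 140 AND text has foo, bar.
    static boolean matches(String category, String text) {
        return anyTermInRange(category, "100", "140")
                && containsAllWords(text, "foo", "bar");
    }

    public static void main(String[] args) {
        System.out.println(matches("017 049 112", "bla fasel foo bar")); // prints true
        System.out.println(matches("017 049", "bla fasel foo bar"));     // prints false

        // Without leading zeros, "12" sorts between "100" and "140" as text.
        System.out.println(anyTermInRange("12", "100", "140"));  // prints true
        System.out.println(anyTermInRange("012", "100", "140")); // prints false
    }
}
```

The last two lines show the false hit you get when category 12 is indexed
without padding.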

There are some caveats, like choosing an appropriate analyzer (one that
keeps the numeric tokens intact) or considering the maximum number of terms
the range query covers, but in principle there is no difference between a
text field containing words and a category field containing categories.
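
For the second caveat, a rough way to estimate how many terms a range
covers, again in plain Java. The names are invented for illustration, and
the limit of 1024 is what I believe to be the default maximum clause count
of BooleanQuery; please check it against your Lucene version.

```java
import java.util.Locale;
import java.util.TreeSet;

class RangeTermCount {
    // Assumed default limit on boolean clauses; verify for your Lucene version.
    static final int ASSUMED_MAX_CLAUSES = 1024;

    // Build the zero-padded terms 0 .. count-1, as they would appear in the index.
    static TreeSet<String> paddedTerms(int count, int width) {
        TreeSet<String> terms = new TreeSet<>();
        for (int i = 0; i < count; i++) {
            terms.add(String.format(Locale.ROOT, "%0" + width + "d", i));
        }
        return terms;
    }

    // Number of distinct indexed terms between lower and upper (inclusive).
    static int termsCovered(TreeSet<String> indexedTerms, String lower, String upper) {
        return indexedTerms.subSet(lower, true, upper, true).size();
    }

    public static void main(String[] args) {
        int small = termsCovered(paddedTerms(1000, 3), "100", "140");
        System.out.println(small + " terms, within limit: " + (small <= ASSUMED_MAX_CLAUSES));
        // prints 41 terms, within limit: true

        int large = termsCovered(paddedTerms(10000, 4), "0000", "2000");
        System.out.println(large + " terms, within limit: " + (large <= ASSUMED_MAX_CLAUSES));
        // prints 2001 terms, within limit: false
    }
}
```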

