lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From David Pratt <>
Subject Searching/sorting strategy for many properties for semantic web app
Date Wed, 22 Feb 2006 03:20:14 GMT
Hi there. I am new to Lucene and I have been developing a semantic 
application for a while and it appears to me Lucene could help me to get 
a much needed search with reasonable speed. I have some general question 
to start:

1) Since my app is virtually all metadata, what should I store in the 
indexes if anything?
2) Should I only index the most common properties that people will 
search and combine the rest (and index this combined text as a field)?
3) I would like to sort and filter results but am concerned this could 
be very memory intensive
4) Some general guidance on organizing indexes in an app would be 

My schema is fairly large but I generally expect people to search on 
about 6 to 8 properties for the most part. I have the data stored in an 
sql database but not in a conventional way. I am willing to accept a 
slower advanced search on less common properties (accomodating this with 
sql search) but I really want some speed for the main properties with 
full text search.

Pretty much everything in the app is metadata so I am most interested in 
  focussing on the 6-8 properties that people will use to search on for 
the most part. I am thinking of combining the text of the remaining 
properties (quite a number) into a single description type field so that 
essentially all information gets indexed and ranked. Is this a 
reasonable approach?

I see that there are advanced possibilities with the indexes to sort and 
filter. How advisable is using sort for large record sets. For example, 
say you have got 20000 records returned from your search. Because this 
will have a web interface I will only be showing first 20 likely so I 
will be batching results. Is the sorting filtering highly memory intensive?

Hopefully, someone can provide some initial advice. Many thanks.


To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message