lucene-java-user mailing list archives

From Anna Hunecke <>
Subject Indexing and Searching linked files
Date Tue, 19 Jan 2010 12:57:23 GMT
I have been working with Lucene for a while now and have found helpful tips on this list,
so I hope somebody can help me with my problem:

In our app, information is grouped into so-called cards. We now want to make it possible to
search the files linked to the cards as well. Any number of files can be linked to a card, and
the size of the files is not restricted.
As far as I can see, there are two ways to do this:

1. Add the content of the files to the search index of the card. At first I thought I could
just add one extra field to the index that contains the content of all the files. But if the
files are very big, I could hit the field size limit, and the content of some files might not
get indexed. So I would need one field per file. The problem then is that I don't know in
advance how many files (and therefore fields) a card has, or how large the index would get.
This is risky, because some customers have a lot of data.

2. Create a separate index for files. Each document in this index would contain exactly one
file, so the unknown number of fields would no longer be a problem. But then searching becomes
the problem:
I would need to search both the card index and the file index and somehow merge the results.
I always sort by score, but, as I understand it, the scores of results from two different
indexes are not comparable.
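
To make the merge step in option 2 concrete, here is a rough sketch in plain Java. It makes no Lucene calls. The `Hit` class, the `cardId` link from a file hit back to its card, and the divide-by-best-score normalization are all assumptions made for illustration. In particular, this normalization is only a crude heuristic and does not make scores from two indexes truly comparable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CardResultMerger {

    /** Hypothetical hit type: a score plus the id of the card it belongs to. */
    public static final class Hit {
        public final String cardId;
        public final float score;

        public Hit(String cardId, float score) {
            this.cardId = cardId;
            this.score = score;
        }
    }

    /** Scale one result list so that its best hit scores 1.0 (assumed heuristic). */
    static List<Hit> normalize(List<Hit> hits) {
        float max = 0f;
        for (Hit h : hits) {
            max = Math.max(max, h.score);
        }
        List<Hit> out = new ArrayList<Hit>();
        for (Hit h : hits) {
            out.add(new Hit(h.cardId, max > 0f ? h.score / max : 0f));
        }
        return out;
    }

    /**
     * Merge card hits and file hits into one list of cards.
     * Each card keeps the best normalized score seen in either list;
     * the result is sorted by score descending, ties broken by card id.
     */
    public static List<Hit> merge(List<Hit> cardHits, List<Hit> fileHits) {
        Map<String, Float> best = new HashMap<String, Float>();
        for (List<Hit> list : Arrays.asList(normalize(cardHits), normalize(fileHits))) {
            for (Hit h : list) {
                best.merge(h.cardId, h.score, Float::max);
            }
        }
        List<Hit> merged = new ArrayList<Hit>();
        for (Map.Entry<String, Float> e : best.entrySet()) {
            merged.add(new Hit(e.getKey(), e.getValue()));
        }
        merged.sort((a, b) -> {
            int byScore = Float.compare(b.score, a.score);
            return byScore != 0 ? byScore : a.cardId.compareTo(b.cardId);
        });
        return merged;
    }
}
```

For this to work, each document in the file index would have to store the id of the card it is linked to, so that several matching files collapse into a single card in the result list.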

So, which way do you think is better?

