lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Anna Hunecke <annahune...@yahoo.de>
Subject Indexing and Searching linked files
Date Tue, 19 Jan 2010 12:57:23 GMT
Hi!
I have been working with Lucene for a while now. So far, I found helpful tips on this list,
so I hope somebody can help me with my problem:

In our app information is grouped in so-called cards. Now, it should be made possible to also
search on files linked to the cards. You can link arbitrarily many files to a card and the
size of the files is also not restricted.
So, as far as I can see, there are two ways to do this:

1. Add the content of the files to the search index of the card. First, I thought that I could
just have an additional field in the index which contains the content of all the files. But
then, if the files are very big, I could hit the field size limit, and would possibly not
get the content of all files indexed. So, I would need one field per file. The problem I have
then is that I don't know how many files I have and how large the index would get. This is
risky, because some customers have a lot of data.

2. Create a separate index for files. The documents in this index would contain one file each,
so I would not have the problem that I don't know how many fields I have. But then, the searching
is a problem:
I would need to search on both the card and the document index, and somehow merge the results
together. I sort by score always, but, as I understand it, the scores of the results of two
different indexes are not comparable.

So, which way do you think is better?

Best,
Anna

__________________________________________________
Do You Yahoo!?
Sie sind Spam leid? Yahoo! Mail verfügt über einen herausragenden Schutz gegen Massenmails.

http://mail.yahoo.com 

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message