lucene-java-user mailing list archives

From Albert Vila <...@imente.com>
Subject Multiple fields vs one field
Date Mon, 06 Aug 2007 12:12:14 GMT
Hi all

  My data looks like:

	Document 1
		code, title, content, type, language, date, ...
	Document 2
		code, title, content, type, language, date, ...
	...
	Document n
		code, title, content, type, language, date, ...

  All document types currently share the same fields, but in the future we
need to add more document types with type-specific fields. I always sort
documents by date. I get 200,000 new documents each day and have 130
million documents in total. The January index size is 4.2 GB (the data size is
about 10 GB).

  I was wondering how to index the new document types.
	Option 1: One index for each document type. Each index will have its
own fields.
		Problem: I will have to perform a search against each index and merge
the results sorted by date.
	Option 2: One big index containing all fields. A field would be
empty if it is not applicable to that document type.
	Option 3: One big index containing all common fields, plus an
extra field named metadata. Inside this field I will put all the
type-specific fields (field1:value1 field2:value2).
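
To make the problem noted under Option 1 concrete, here is a minimal sketch in plain Java (no Lucene calls) of merging per-index result lists into one list ordered by date. The `Hit` class, the `yyyyMMdd` integer date encoding, and the index names are assumptions made for illustration. Each per-index list is assumed to arrive already sorted newest first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class MergeByDate {
    /** Hypothetical search hit: source index, document code, date as yyyyMMdd. */
    static final class Hit {
        final String index;
        final String code;
        final int date;
        Hit(String index, String code, int date) {
            this.index = index;
            this.code = code;
            this.date = date;
        }
    }

    /** Cursor over one index's results, assumed sorted newest first. */
    private static final class Cursor {
        final List<Hit> hits;
        int pos = 0;
        Cursor(List<Hit> hits) { this.hits = hits; }
        Hit current() { return hits.get(pos); }
    }

    /** K-way merge: returns up to `limit` hits across all indexes, newest first. */
    static List<Hit> mergeNewestFirst(List<List<Hit>> perIndex, int limit) {
        // The queue keeps the cursor whose current hit has the latest date on top.
        PriorityQueue<Cursor> queue = new PriorityQueue<>(
                Comparator.comparingInt((Cursor c) -> c.current().date).reversed());
        for (List<Hit> hits : perIndex) {
            if (!hits.isEmpty()) {
                queue.add(new Cursor(hits));
            }
        }
        List<Hit> merged = new ArrayList<>();
        while (!queue.isEmpty() && merged.size() < limit) {
            Cursor c = queue.poll();
            merged.add(c.current());
            c.pos++;
            if (c.pos < c.hits.size()) {
                queue.add(c); // re-insert so the cursor is ordered by its next hit
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Hit> news = Arrays.asList(
                new Hit("news", "n1", 20070806), new Hit("news", "n2", 20070801));
        List<Hit> blogs = Arrays.asList(
                new Hit("blogs", "b1", 20070805), new Hit("blogs", "b2", 20070730));
        for (Hit h : mergeNewestFirst(Arrays.asList(news, blogs), 3)) {
            System.out.println(h.date + " " + h.index + " " + h.code);
        }
    }
}
```

The merge itself is cheap; the cost of Option 1 is mostly that every query runs once per index, and each index has to return its top results before the merge can pick the overall newest ones.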

Comments, pros and cons will be appreciated. I don't know exactly
what the difference between option 2 and option 3 is.
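
One way to picture the difference between Option 2 and Option 3 is to model a document as a map from field name to value. This is a plain Java sketch, not Lucene code. The specific field names (`author`, `pages`) are hypothetical, and the metadata field is assumed to be split on whitespace; how a real index tokenizes it depends on the analyzer in use.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class FieldLayouts {
    /** Option 2: every type-specific field becomes its own field of the document. */
    static Map<String, String> option2(Map<String, String> common,
                                       Map<String, String> specific) {
        Map<String, String> doc = new LinkedHashMap<>(common);
        doc.putAll(specific);
        return doc;
    }

    /** Option 3: type-specific fields are folded into one "metadata" field. */
    static Map<String, String> option3(Map<String, String> common,
                                       Map<String, String> specific) {
        Map<String, String> doc = new LinkedHashMap<>(common);
        StringJoiner metadata = new StringJoiner(" ");
        for (Map.Entry<String, String> e : specific.entrySet()) {
            metadata.add(e.getKey() + ":" + e.getValue());
        }
        doc.put("metadata", metadata.toString());
        return doc;
    }

    /** Option 2 lookup: match the value directly in the named field. */
    static boolean matchesOption2(Map<String, String> doc, String field, String value) {
        return value.equals(doc.get(field));
    }

    /** Option 3 lookup: match the combined token "field:value" inside metadata. */
    static boolean matchesOption3(Map<String, String> doc, String field, String value) {
        String metadata = doc.getOrDefault("metadata", "");
        return Arrays.asList(metadata.split(" ")).contains(field + ":" + value);
    }

    public static void main(String[] args) {
        Map<String, String> common = new LinkedHashMap<>();
        common.put("code", "42");
        common.put("type", "book");
        Map<String, String> specific = new LinkedHashMap<>();
        specific.put("author", "smith"); // hypothetical type-specific field
        specific.put("pages", "12");     // hypothetical type-specific field

        System.out.println(option2(common, specific));
        System.out.println(option3(common, specific));
    }
}
```

In this model, Option 2 keeps each value addressable by its own field name, while Option 3 keeps the set of field names fixed but requires every query on a specific field to be expressed as a `field:value` token. Values containing spaces would break the Option 3 encoding as sketched here.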

Thanks

Albert

  
