lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: Performance Results on changing the way fields are stored
Date Thu, 07 Jan 2010 16:43:21 GMT
You could try Avro instead of JSON/XML/Java Serialization.  It's compact (and new).

http://hadoop.apache.org/avro/

 Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch



----- Original Message ----
> From: Paul Taylor <paul_t100@fastmail.fm>
> To: java-user@lucene.apache.org
> Sent: Tue, January 5, 2010 7:44:21 AM
> Subject: Performance Results on changing the way fields are stored
> 
> So currently in my index I index and store a number of small fields, I need both 
> so I can search on the fields, then I use the stored versions to generate the 
> output document (which is either an XML or JSON representation), because I read 
> stored and index fields are dealt with completely seperately I tried another 
> tact only storing one field which was a serialized version of the output 
> documentation. This solves a couple of issues I was having but I was 
> disappointed that both the size of the index increased and the index build  time 
> increased, I thought that if all the stored data was held in one field that the 
> resultant index would be smaller, and I didn't expect index time to increase by 
> as much as it did. I was also suprised that Java serilaization was slower and 
> used more space than both JSON and XML serialization.
> 
> Results as Follows
> 
> Type:                                                             Time : Index 
> Size
> Only indexed  no norms                                                          
>           105   : 38 MB
> Only indexed                                                                    
>                  111   : 43 MB
> Same fields written as Indexed and Stored  (current Situation)           115   : 
> 83 MB
> Fields Indexed, One JAXB classed Stored using JSON Marshalling 140   : 115 MB
> Fields Indexed, One JAXB classed Stored using XML Marshalling  189   : 198 MB
> Fields Indexed, One JAXB classed Stored using Java Serialization   305   : 485 
> MB
> 
> Are these results to be expected, could anybody suggest anything else I could do
> 
> 
> Paul
> 
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message