lucene-java-user mailing list archives

From Yuliya Palchaninava ...@solute.de>
Subject Re: Lucene 2.9 and 3.0: Optimized index is thrice as large as the not optimized index
Date Fri, 08 Jan 2010 12:55:55 GMT
Thanks Michael.

You are probably right.

The non-optimized index is 4.1G; the optimized index is about 15G.

Yes, our documents do have many different indexed fields and norms are enabled.
Number of fields: 559
Number of documents: 20845906
Number of terms: 25615389

Could you please give me a more detailed explanation of how the storage of norms affects
the size of an index?
What exactly do you mean by "norms are not stored sparsely"?

Thanks,
Yuliya
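
A rough estimate of what "not stored sparsely" can mean for the figures above. This is a minimal sketch that assumes one norm byte per document for every indexed field with norms enabled once everything is merged into a single segment, whether or not a given document actually contains that field. The function name dense_norms_bytes is made up for illustration and is not part of Lucene.

```python
# Figures reported in this message.
NUM_FIELDS = 559         # indexed fields with norms enabled
NUM_DOCS = 20_845_906    # documents in the index

# Assumption: one norm byte per document per field in a merged segment.
BYTES_PER_NORM = 1


def dense_norms_bytes(num_fields, num_docs, bytes_per_norm=BYTES_PER_NORM):
    """Size of a dense (non-sparse) norms table for a single segment."""
    return num_fields * num_docs * bytes_per_norm


norms_bytes = dense_norms_bytes(NUM_FIELDS, NUM_DOCS)
print(f"{norms_bytes:,} bytes = {norms_bytes / 2**30:.2f} GiB")
# prints "11,652,861,454 bytes = 10.85 GiB"
```

Under that assumption the norms alone would take about 11G in the optimized index, which is roughly in line with the reported growth from 4.1G to about 15G. In a non-optimized index a small segment only needs norms for the fields that occur in it, so the total can be much lower before the merge.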

> -----Original Message-----
> From: Michael McCandless [mailto:lucene@mikemccandless.com]
> Sent: Thursday, 7 January 2010 18:00
> To: java-user@lucene.apache.org
> Subject: Re: Lucene 2.9 and 3.0: Optimized index is thrice as
> large as the not optimized index
> 
> Do your documents have many different indexed fields?  If you 
> do, and norms are enabled, that could be the cause (norms are 
> not stored sparsely).
> 
> But: what actual sizes are we talking about?
> 
> Mike
> 
> On Thu, Jan 7, 2010 at 11:50 AM, Yuliya Palchaninava <yp@solute.de> wrote:
> > Otis,
> >
> > thanks for the answer.
> >
> > Unfortunately the index *directory* remains larger *after* the
> > optimization. In our case the optimization completed successfully
> > and, as you say, there is only one segment in the directory.
> >
> > Any other ideas?
> >
> > Thanks,
> > Yuliya
> >
> >> -----Original Message-----
> >> From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com]
> >> Sent: Thursday, 7 January 2010 17:35
> >> To: java-user@lucene.apache.org
> >> Subject: Re: Lucene 2.9 and 3.0: Optimized index is thrice as large
> >> as the not optimized index
> >>
> >> Yuliya,
> >>
> >> The index *directory* will be larger *while* you are optimizing.
> >> After the optimization is completed successfully, the index directory
> >> will be smaller.  It is possible that your index directory is
> >> large(r) because you have some left-over segments (e.g. from some
> >> earlier failed/interrupted optimizations) that are not really a part
> >> of the index.  After optimizing, you should have only 1 segment, so
> >> if you see more than 1 segment, look at the ones with older
> >> timestamps.  Those can be (re)moved.
> >>
> >>  Otis
> >> --
> >> Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
> >>
> >>
> >>
> >> ----- Original Message ----
> >> > From: Yuliya Palchaninava <yp@solute.de>
> >> > To: "java-user@lucene.apache.org" <java-user@lucene.apache.org>
> >> > Sent: Thu, January 7, 2010 11:23:08 AM
> >> > Subject: Lucene 2.9 and 3.0: Optimized index is thrice as large as the
> >> > not optimized index
> >> >
> >> > Hi,
> >> >
> >> > According to the API documentation: "In general, once the optimize
> >> > completes, the total size of the index will be less than the size of
> >> > the starting index. It could be quite a bit smaller (if there were
> >> > many pending deletes) or just slightly smaller". In our case the index
> >> > becomes not smaller but larger, namely thrice as large.
> >> >
> >> > The non-optimized index doesn't contain compressed fields, which could
> >> > otherwise have caused the index to grow during optimization. So we
> >> > cannot explain what is happening.
> >> >
> >> > Does someone have an explanation for the index growth caused by the
> >> > optimization?
> >> >
> >> > Thanks,
> >> > Yuliya
> >> >
> >> >
> >> >
> >> > ---------------------------------------------------------------------
> >> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> > For additional commands, e-mail: java-user-help@lucene.apache.org