lucene-java-user mailing list archives

From Anshum <ansh...@gmail.com>
Subject Re: Wanting batch update to avoid high disk usage
Date Tue, 24 Aug 2010 03:18:36 GMT
Don't bother calling expungeDeletes() so often; it makes no sense. If you
call it at all, call it once at the end. That said, you are already calling
optimize() at the end, which removes the deleted documents anyway, so an
extra call to expungeDeletes() should make no difference apart from
degrading performance.
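
To put rough numbers on that, here is a toy cost model in plain Java. It is
not Lucene code and does not reflect the real merge policy. It assumes a
single large segment of n documents, that each update marks one old copy as
deleted, and that every expunge copies all surviving documents of that
segment into a new one. The class and method names are made up for
illustration.

```java
// Toy cost model only; no Lucene classes are involved.
// Assumption: one large segment holding n documents, each updated exactly once.
final class ExpungeCostModel {

    /** Document copies written when deletes are expunged after every single update. */
    static long rewrittenExpungingEveryUpdate(int n) {
        long rewritten = 0;
        int liveInOldSegment = n;
        for (int i = 0; i < n; i++) {
            liveInOldSegment--;            // the update marks the old copy as deleted
            rewritten += liveInOldSegment; // the expunge copies every surviving document
        }
        return rewritten;
    }

    /** Document copies written when all updates are followed by one final merge. */
    static long rewrittenOptimizingOnce(int n) {
        return n; // each updated document is written once into the merged segment
    }
}
```

Under these assumptions, n = 1000 gives 499,500 document copies when
expunging after every update, against 1,000 for a single merge at the end.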

--
Anshum Gupta
http://ai-cafe.blogspot.com


On Tue, Aug 24, 2010 at 4:38 AM, Justin <crynax@yahoo.com> wrote:

> In an attempt to avoid doubling disk usage when adding new fields to all
> existing documents, I added a call to IndexWriter::expungeDeletes. Then my
> colleague pointed out that Lucene will rewrite the potentially large
> segment files each time that method is called.
>
>
>  reader = writer.getReader();
>  for (int i=0; i<n; i++) {
>    Term idTerm = new Term("id", Integer.toString(i)); // Term values are Strings
>    TermDocs termDocs = reader.termDocs(idTerm);
>    if (termDocs != null && termDocs.next()) {
>      Document doc = reader.document(termDocs.doc());
>      doc.add(myfield, value);
>      writer.updateDocument(idTerm, doc);
>      //writer.expungeDeletes(true); // BAD: rewrites segment files each time
>    }
>  }
>  reader.close();
>  writer.commit();
>  writer.optimize(true);
>  writer.close();
>
>
> The following Lucene FAQ response suggests that disk space from deleted
> documents will be reclaimed. Is this true, and are the savings large enough
> to justify updating an existing index (and then optimizing out the deleted
> documents) instead of simply creating a new index?
>
>
> http://wiki.apache.org/lucene-java/LuceneFAQ#If_I_decide_not_to_optimize_the_index.2C_when_will_the_deleted_documents_actually_get_deleted.3F
>
>
> Thanks for your help,
> Justin
