lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Tim Bogaert <boggu...@gmail.com>
Subject Re: lots of .cfs (compound files) in the index directory
Date Thu, 15 Mar 2012 14:30:10 GMT
Hi Mike,

Segment format: FORMAT_3_1 [Lucene 3.1+]
NumBadSegments: 0
NumSegments: 5111
Clean?: true
TotLoseDocCount: 0
Partial: false


I reindexed and let the feed come in again...The logs of the result are
below (I just copied a few) after 10 minutes :
Segment format: FORMAT_3_1 [Lucene 3.1+]
NumBadSegments: 0
NumSegments: 11
Clean?: true
TotLoseDocCount: 0
Partial: false

What I noticed in the log was that _1a.cfs is getting deleted, but it seems
to be still in the directory :

ls -al
-rw-r--r--   1 web      web      1444172 Mar 15 15:13 _17.fdt
-rw-r--r--   1 web      web        88804 Mar 15 15:13 _17.fdx
-rw-r--r--   1 web      web          142 Mar 15 15:13 _17.fnm
-rw-r--r--   1 web      web      4352933 Mar 15 15:13 _17.frq
-rw-r--r--   1 web      web       177604 Mar 15 15:13 _17.nrm
-rw-r--r--   1 web      web      5931105 Mar 15 15:13 _17.prx
-rw-r--r--   1 web      web        18159 Mar 15 15:13 _17.tii
-rw-r--r--   1 web      web      1421338 Mar 15 15:13 _17.tis
-rw-r--r--   1 web      web           36 Mar 15 15:14 _17_1.del
-rw-r--r--   1 web      web       379400 Mar 15 15:13 _18.cfs
-rw-r--r--   1 web      web       424376 Mar 15 15:13 _19.cfs
-rw-r--r--   1 web      web       463177 Mar 15 15:13 _1a.cfs
-rw-r--r--   1 web      web       108875 Mar 15 15:13 _1b.cfs
-rw-r--r--   1 web      web         4481 Mar 15 15:14 _1c.cfs
-rw-r--r--   1 web      web         4465 Mar 15 15:15 _1d.cfs
-rw-r--r--   1 web      web         4534 Mar 15 15:17 _1e.cfs
-rw-r--r--   1 web      web         3252 Mar 15 15:18 _1f.cfs
-rw-r--r--   1 web      web         3314 Mar 15 15:21 _1g.cfs
-rw-r--r--   1 web      web         2340 Mar 15 15:25 _1h.cfs
-rw-r--r--   1 web      web         3902 Mar 15 15:27 _1i.cfs
-rw-r--r--   1 web      web         1687 Mar 15 15:28 _1j.cfs
-rw-r--r--   1 web      web           20 Mar 15 15:28 segments.gen
-rw-r--r--   1 web      web         2735 Mar 15 15:28 segments_1h


Here is (part of) the log :

50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #2]: TMP:
seg=_h(3.5):c1 size=0.007 MB [merging] [floored]
IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #2]: TMP:
seg=_b(3.5):c1 size=0.007 MB [merging] [floored]
IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #2]: TMP:
seg=_g(3.5):c1 size=0.006 MB [merging] [floored]
IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #2]: TMP:
seg=_c(3.5):c1 size=0.006 MB [merging] [floored]
IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #2]: TMP:
seg=_11(3.5):c1 size=0.005 MB [merging] [floored]
IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #2]: TMP:
seg=_12(3.5):c1 size=0.005 MB [merging] [floored]
IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #2]: TMP:
seg=_6(3.5):c1 size=0.005 MB [merging] [floored]
IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #2]: TMP:
allowedSegmentCount=1 vs count=15 (eligible count=5) tooBigCount=0
IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #2]: merge time 46
msec for 10 docs
IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: now flush at close
waitForMerges=true
IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #2]: CMS:   merge
thread: done
IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: now trigger flush
reason=explicit flush
IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #2]: CMS: set
priority of merge thread Lucene Merge Thread #3 to 6
IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]:   start flush:
applyAllDeletes=true
IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]:   index before flush
_0(3.5):C73 _1(3.5):C70 _2(3.5):C23 _3(3.5):c7 _16(3.5):c10 _6(3.5):c1
_18(3.5):c10 _17(3.5):c10 _b(3.5):c1 _c(3.5):c1 _d(3.5):c1 _g(3.5):c1
_h(3.5):c1 _11(3.5):c1 _12(3.5):c1
IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: DW: flush: no docs;
skipping
IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: apply all deletes
during flush
BD 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: applyDeletes: no
deletes; skipping
BD 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: prune
sis=org.apache.lucene.index.SegmentInfos@1cad1e6 minGen=0 packetCount=0
IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: clearFlushPending
IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: CMS: now merge
IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: CMS:   index:
_0(3.5):C73 _1(3.5):C70 _2(3.5):C23 _3(3.5):c7 _16(3.5):c10 _6(3.5):c1
_18(3.5):c10 _17(3.5):c10 _b(3.5):c1 _c(3.5):c1 _d(3.5):c1 _g(3.5):c1
_h(3.5):c1 _11(3.5):c1 _12(3.5):c1
IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: CMS:   no more
merges pending; now return
IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: waitForMerges
IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #3]: merge store
matchedCount=10 vs 10
IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #3]: merged
segment size=0.076 MB vs estimate=0.103 MB
IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #3]: commitMerge:
_2(3.5):C23 _3(3.5):c7 _d(3.5):c1 _h(3.5):c1 _b(3.5):c1 _g(3.5):c1
_c(3.5):c1 _11(3.5):c1 _12(3.5):c1 _6(3.5):c1 into _19 index=_0(3.5):C73
_1(3.5):C70 _2(3.5):C23 _3(3.5):c7 _16(3.5):c10 _6(3.5):c1 _18(3.5):c10
_17(3.5):c10 _b(3.5):c1 _c(3.5):c1 _d(3.5):c1 _g(3.5):c1 _h(3.5):c1
_11(3.5):c1 _12(3.5):c1
IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #3]:
commitMergeDeletes _2(3.5):C23 _3(3.5):c7 _d(3.5):c1 _h(3.5):c1 _b(3.5):c1
_g(3.5):c1 _c(3.5):c1 _11(3.5):c1 _12(3.5):c1 _6(3.5):c1 into _19
IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #3]: after commit:
_0(3.5):C73 _1(3.5):C70 _19(3.5):C38 _16(3.5):c10 _18(3.5):c10 _17(3.5):c10
IFD [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #3]: now checkpoint
"segments_15" [6 segments ; isCommit = false]
IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #3]: merge time 61
msec for 38 docs
IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: waitForMerges done
IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #3]: CMS:   merge
thread: done
IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: now call final
commit()
IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: commit: start
IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: commit: enter lock
IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: commit: already
prepared
IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: commit:
pendingCommit != null
IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: commit: wrote
segments file "segments_16"
IFD [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: now checkpoint
"segments_16" [42 segments ; isCommit = true]
IFD [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: deleteCommits: now
decRef commit "segments_15"
IFD [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: delete "segments_15"
IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: commit: done
IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: at close:
_0(3.5):C73 _1(3.5):C70 _19(3.5):C38 _16(3.5):c10 _18(3.5):c10 _17(3.5):c10
IFD [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: delete "_19.tii"
IFD [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: delete "_19.tis"
IFD [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: delete "_18.cfs"
IFD [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: delete "_19.nrm"
IFD [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: delete "_19.fnm"
IFD [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: delete "_19.fdt"
IFD [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: delete "_17.cfs"
IFD [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: delete "_19.prx"
IFD [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: delete "_16.cfs"
IFD [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: delete "_19.fdx"
IFD [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: delete "_19.frq"
Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: applyDeletes: no deletes;
skipping
BD 52 [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: prune
sis=org.apache.lucene.index.SegmentInfos@59d9fe minGen=0 packetCount=0
IW 52 [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: clearFlushPending
IW 52 [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: CMS: now merge
IW 52 [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: CMS:   index:
_0(3.5):C73 _1(3.5):C70 _2(3.5):C23 _1a(3.5):C16 _17(3.5):c10 _19(3.5):c10
_18(3.5):c10
IW 52 [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: CMS:   no more
merges pending; now return
IW 52 [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: waitForMerges
IW 52 [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: waitForMerges done
IW 52 [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: now call final
commit()
IW 52 [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: commit: start
IW 52 [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: commit: enter lock
IW 52 [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: commit: already
prepared
IW 52 [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: commit:
pendingCommit != null
IW 52 [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: commit: wrote
segments file "segments_17"
IFD [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: now checkpoint
"segments_17" [43 segments ; isCommit = true]
IFD [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: deleteCommits: now
decRef commit "segments_16"
IFD [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: delete "segments_16"
IW 52 [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: commit: done
IW 52 [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: at close:
_0(3.5):C73 _1(3.5):C70 _2(3.5):C23 _1a(3.5):C16 _17(3.5):c10 _19(3.5):c10
_18(3.5):c10
IFD [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: delete "_1a.fdx"
IFD [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: delete "_1a.tis"
IFD [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: delete "_1a.frq"
IFD [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: delete "_1a.tii"
IFD [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: delete "_1a.fnm"
IFD [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: delete "_1a.fdt"
IFD [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: delete "_18.cfs"
IFD [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: delete "_1a.nrm"
IFD [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: delete "_19.cfs"
IFD [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: delete "_17.cfs"
IFD [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: delete "_1a.prx"

IW 53 [Thu Mar 15 15:25:38 MET 2012; Lucene Merge Thread #3]: CMS:   merge
thread: done
IW 53 [Thu Mar 15 15:25:38 MET 2012; Lucene Merge Thread #0]: CMS: set
priority of merge thread Lucene Merge Thread #3 to 6
IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: now flush at close
waitForMerges=true
IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: now trigger flush
reason=explicit flush
IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]:   start flush:
applyAllDeletes=true
IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]:   index before flush
_0(3.5):C73 _1(3.5):C70 _2(3.5):C23 _3(3.5):c7 _18(3.5):c10 _1b(3.5):c10
_1a(3.5):c10 _19(3.5):c10
IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: DW: flush: no docs;
skipping
IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: apply all deletes
during flush
BD 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: applyDeletes: no
deletes; skipping
BD 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: prune
sis=org.apache.lucene.index.SegmentInfos@e91b11 minGen=0 packetCount=0
IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: clearFlushPending
IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: CMS: now merge
IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: CMS:   index:
_0(3.5):C73 _1(3.5):C70 _2(3.5):C23 _3(3.5):c7 _18(3.5):c10 _1b(3.5):c10
_1a(3.5):c10 _19(3.5):c10
IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: CMS:   no more
merges pending; now return
IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: waitForMerges
IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: waitForMerges done
IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: now call final
commit()
IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: commit: start
IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: commit: enter lock
IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: commit: already
prepared
IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: commit:
pendingCommit != null
IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: commit: wrote
segments file "segments_18"
IFD [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: now checkpoint
"segments_18" [44 segments ; isCommit = true]
IFD [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: deleteCommits: now
decRef commit "segments_17"
IFD [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: delete "segments_17"
IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: commit: done
IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: at close:
_0(3.5):C73 _1(3.5):C70 _2(3.5):C23 _3(3.5):c7 _18(3.5):c10 _1b(3.5):c10
_1a(3.5):c10 _19(3.5):c10
IFD [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: delete "_1b.cfs"
IFD [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: delete "_18.cfs"
IFD [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: delete "_19.cfs"
IFD [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: delete "_1a.cfs"

Kind regards,
Bogguard

On Thu, Mar 15, 2012 at 1:21 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> Hmm, that's odd...
>
> Can you set IndexWriter's infoStream and then capture the output while
> doing the small writes every few seconds and post back?
>
> If you run CheckIndex on the index does it also report ~3000 segments?
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> On Thu, Mar 15, 2012 at 7:14 AM, Tim Bogaert <bogguard@gmail.com> wrote:
> > Hi,
> >
> > We migrated our project to Lucene 3.5 and used the TieredMergePolicy with
> > all its defaults.
> > Our project contains a relatively small index (+-90MB) and gets small
> > writes every few seconds.
> > The behaviour we see is that when we reindex everything works great and
> > performant,
> > but when we keep the index running for some days we see a lot (+-3000) of
> > very small .cfs files (between 1KB and 10KB) are being generated and the
> > performance drops significantly.
> > Reïndexing solves the problem, but this shouldn't be the solution.
> >
> > Is there a way to change the behaviour of the .cfs files. I noticed you
> can
> > fully disable it, but I read that this isn't adviced.
> >
> > Kind regards,
> > Bogguard
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message