lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-3218) Make CFS appendable
Date Mon, 20 Jun 2011 21:08:48 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13052222#comment-13052222 ]

Michael McCandless commented on LUCENE-3218:
--------------------------------------------

Patch looks cool!

So the CFW will take the first output opened against it and let it write
directly into the "actual" CFS file, and then if another file is
opened while that first one is still open, the 2nd file will write to
a separate file and then will copy in on close.  We may want to delegate
the separate files too?  So that on close they copy themselves into
the CFS and remove the original?  This way IW won't have to separately
create CFS in the end.
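
A minimal in-memory sketch of the scheme described above, under stated
assumptions: SketchCompoundWriter, Entry, createOutput and entry are
hypothetical names for illustration and are not the classes in the patch,
and ByteArrayOutputStream stands in for both the CFS file and the separate
files. It also assumes that a separate file closed while the direct output
is still open has to wait, queued, until the direct output closes, so the
two never interleave inside the CFS.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch only; not Lucene's CompoundFileWriter. */
class SketchCompoundWriter {
    /** Stands in for the "actual" CFS file. */
    private final ByteArrayOutputStream cfs = new ByteArrayOutputStream();
    /** Sub-file name -> {offset, length} inside the CFS. */
    private final Map<String, long[]> entries = new LinkedHashMap<>();
    /** Separate files that are closed but not yet copied into the CFS. */
    private final List<Entry> pending = new ArrayList<>();
    /** The single output writing straight into the CFS, or null if none. */
    private Entry currentOutput;
    private int openCount;
    private boolean closed;

    /** One sub-file being written. */
    final class Entry {
        final String name;
        final boolean direct;
        private final int start;                      // CFS offset at open time
        private final ByteArrayOutputStream separate; // null for the direct output
        private boolean open = true;

        Entry(String name, boolean direct) {
            this.name = name;
            this.direct = direct;
            this.start = cfs.size();
            this.separate = direct ? null : new ByteArrayOutputStream();
        }

        void write(byte[] data) {
            if (!open) throw new IllegalStateException("already closed: " + name);
            (direct ? cfs : separate).write(data, 0, data.length);
        }

        void close() {
            if (!open) return;
            open = false;
            openCount--;
            if (direct) {
                // Bytes are already in place; only record where they live.
                entries.put(name, new long[] {start, cfs.size() - start});
                currentOutput = null;
            } else {
                pending.add(this);
            }
            copyPending();
        }
    }

    /** The first output opened writes directly; later concurrent ones go separate. */
    Entry createOutput(String name) {
        if (closed) throw new IllegalStateException("write once: already closed");
        boolean direct = (currentOutput == null);
        Entry e = new Entry(name, direct);
        if (direct) currentOutput = e;
        openCount++;
        return e;
    }

    /** Copies queued separate files into the CFS once no direct output is open. */
    private void copyPending() {
        if (currentOutput != null) return;
        for (Entry e : pending) {
            byte[] data = e.separate.toByteArray();
            entries.put(e.name, new long[] {cfs.size(), data.length});
            cfs.write(data, 0, data.length);
        }
        pending.clear();
    }

    /** Returns {offset, length} of a finished sub-file, or null if unknown. */
    long[] entry(String name) {
        return entries.get(name);
    }

    /** Finishes the CFS; refuses if any sub-file output is still open. */
    byte[] close() {
        if (openCount != 0) throw new IllegalStateException("sub-files still open");
        closed = true;
        return cfs.toByteArray();
    }
}
```

In this sketch only the separate entries pay for a second copy; the direct
entry's bytes are written once. Whether the real separate outputs should
delete their temporary file after copying is left open here, as in the
comment above.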

Somehow we need IW to add the biggest sub-file first...
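
One way to picture the ordering, as a sketch only: if the sub-file sizes
were known (or estimated) up front, sorting them largest-first would hand
the direct-write slot to the file whose copy would cost the most. The
class and method names below are hypothetical, and how IW would actually
learn the sizes is not addressed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper; not part of the patch. */
class SubFileOrdering {
    /** Returns the file names ordered by size, largest first; ties by name. */
    static List<String> largestFirst(Map<String, Long> sizes) {
        List<String> names = new ArrayList<>(sizes.keySet());
        names.sort((a, b) -> {
            int bySize = Long.compare(sizes.get(b), sizes.get(a)); // descending size
            return bySize != 0 ? bySize : a.compareTo(b);
        });
        return names;
    }
}
```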

s/compund/compound

CFW.close should assert currentOutput != null (and, if we delegate sep
entries, that they are also all closed)?

You might need to sync the CompoundFileWriter.this.currentOutput test
/ setting to null?  Though... Lucene is always single threaded in
writing files for the same segment, today anyway.

Can we make a separate createCompoundOutput?  (Ie, instead of passing
OpenMode to openCompoundInput).  And: I'm assuming a given compound
output can only be opened once, appended to / separate files copied
into, closed and then never opened again for writing?  (Ie, still
"write once" at the file level).


> Make CFS appendable  
> ---------------------
>
>                 Key: LUCENE-3218
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3218
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/index
>    Affects Versions: 4.0
>            Reporter: Simon Willnauer
>            Assignee: Simon Willnauer
>             Fix For: 4.0
>
>         Attachments: LUCENE-3218.patch
>
>
> Currently CFS is created once all files are written during a flush / merge. Once on disk,
> the files are copied into the CFS format, which is basically unnecessary for some of the
> files. We can at any time write at least one file directly into the CFS, which can save a reasonable
> amount of IO. For instance, stored fields could be written directly during indexing, and during
> a Codec flush one of the written files can be appended directly. This optimization is a nice
> side effect for Lucene indexing itself but more important for DocValues and LUCENE-3216: we
> could transparently pack per-field files into a single file only for docvalues without changing
> any code once LUCENE-3216 is resolved.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
