lucene-dev mailing list archives

From "Yonik Seeley (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1043) Speedup merging of stored fields when field mapping "matches"
Date Fri, 02 Nov 2007 19:58:50 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12539688 ]

Yonik Seeley commented on LUCENE-1043:
--------------------------------------

re bulk copying: Ideally, read a group of docs into a buffer big enough that the IndexInput
reads directly into it, then write directly from that buffer.  The field index has to be
rewritten entry by entry, but that is just adding a constant to every pointer and probably
isn't worth optimizing (e.g. trying to avoid a full decode/encode)... just loop over the
entries ahead of time, fixing them up.  The total size of the stored fields to write is simply
the difference between the index entries at the two ends of the group (the end of the index
needs a slight special case, of course).
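
A minimal sketch of the bulk copy described above, for illustration only. It is not
Lucene's API and not the attached patch: the Segment and Merged classes and the copyDocs
method are hypothetical, and in-memory arrays stand in for the on-disk field index and
field data. It shows the two steps: shift every index pointer in the group by one constant,
then copy the group's bytes in a single write, with the special case for a group that ends
at the last document.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

public class BulkStoredFieldsCopy {

    /** Hypothetical in-memory stand-in for one segment's stored fields. */
    static final class Segment {
        final long[] index; // index[i] = byte offset of doc i inside data
        final byte[] data;  // concatenated, already-encoded documents

        Segment(long[] index, byte[] data) {
            this.index = index;
            this.data = data;
        }

        int numDocs() {
            return index.length;
        }
    }

    /** Hypothetical accumulator for the merged index and data. */
    static final class Merged {
        final List<Long> index = new ArrayList<>();
        final ByteArrayOutputStream data = new ByteArrayOutputStream();
    }

    /** Copies docs [start, end) from src to dest without decoding any field. */
    static void copyDocs(Segment src, int start, int end, Merged dest) {
        if (start >= end) {
            return;
        }
        long first = src.index[start];
        // Special case: the last doc has no following index entry,
        // so its bytes run to the end of the data.
        long limit = (end == src.numDocs()) ? src.data.length : src.index[end];

        // Every pointer in the group moves by the same constant.
        long shift = dest.data.size() - first;
        for (int doc = start; doc < end; doc++) {
            dest.index.add(src.index[doc] + shift);
        }

        // One bulk write covers the whole group of documents.
        dest.data.write(src.data, (int) first, (int) (limit - first));
    }
}
```

With real files the byte copy would go through a buffer read from the input rather than a
single array slice, but the pointer arithmetic stays the same.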

{quote}Another idea: we can almost just concatenate the posting lists
(frq/prx) for each term, because they are "delta coded" (we write the
delta between docIDs)
{quote}

Nice!  new JIRA issue?
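
A rough sketch of why the quoted idea is "almost" plain concatenation, under simplifying
assumptions that are mine and not from the thread: postings are plain int arrays of docID
gaps (the real frq/prx encoding carries more than that), there are no deleted docs, and
the second segment's docIDs are shifted by docBase, the doc count of the first segment.
The class and method names are hypothetical. Only the first gap of the appended list
needs fixing; every other gap is copied unchanged.

```java
import java.util.Arrays;

public class DeltaPostingsConcat {

    /** Encodes ascending docIDs as gaps; the first gap is measured from 0. */
    static int[] encode(int[] docIDs) {
        int[] deltas = new int[docIDs.length];
        int prev = 0;
        for (int i = 0; i < docIDs.length; i++) {
            deltas[i] = docIDs[i] - prev;
            prev = docIDs[i];
        }
        return deltas;
    }

    /** Decodes gaps back into absolute docIDs. */
    static int[] decode(int[] deltas) {
        int[] docIDs = new int[deltas.length];
        int doc = 0;
        for (int i = 0; i < deltas.length; i++) {
            doc += deltas[i];
            docIDs[i] = doc;
        }
        return docIDs;
    }

    /**
     * Appends the second delta-coded list to the first.
     * docBase is the number of docs in the first segment (assumes no deletions).
     */
    static int[] concat(int[] first, int[] second, int docBase) {
        if (second.length == 0) {
            return first.clone();
        }
        // Last docID of the first list. It is summed here for simplicity;
        // a merger would normally already be tracking this value.
        int lastDoc = 0;
        for (int d : first) {
            lastDoc += d;
        }
        int[] out = Arrays.copyOf(first, first.length + second.length);
        System.arraycopy(second, 0, out, first.length, second.length);
        // Only this one gap changes: it must now be relative to lastDoc.
        out[first.length] = second[0] + docBase - lastDoc;
        return out;
    }
}
```

For example, with docs {1, 4, 7} in a 10-doc first segment and docs {0, 2, 5} in the
second, decoding the concatenated list gives {1, 4, 7, 10, 12, 15}.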

> Speedup merging of stored fields when field mapping "matches"
> -------------------------------------------------------------
>
>                 Key: LUCENE-1043
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1043
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>    Affects Versions: 2.2
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.3
>
>         Attachments: LUCENE-1043.patch
>
>
> Robert Engels suggested the following idea, here:
>   http://www.gossamer-threads.com/lists/lucene/java-dev/54217
> When merging in the stored fields from a segment, if the field name ->
> number mapping is identical then we can simply bulk copy the entire
> entry for the document rather than re-interpreting and then re-writing
> the actual stored fields.
> I've pulled the code from the above thread and got it working on the
> current trunk.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

