lucene-dev mailing list archives

From "Jason Rutherglen (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-2575) Concurrent byte and int block implementations
Date Fri, 01 Oct 2010 15:33:34 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916914#action_12916914 ]

Jason Rutherglen commented on LUCENE-2575:
------------------------------------------

bq. I fear a copy-on-write check per-term is going to be a sizable perf hit.

For indexing?  The byte[] buffers also use a page-based system, so I think we'll need
to measure the performance difference.  We can always shift the cost to getReader by copying
from a writable (indexing-side) tf array into a per-reader tf array of paged ints.  While that'd
be a complete iteration over all the terms, I'd expect the CPU cache to make it extremely fast
(each page would be cached, and we'd be iterating sequentially over an array).
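
To make the copy-at-getReader idea concrete, here is a minimal sketch, not code from the
patch.  WritableTf, PagedInts, snapshot and PAGE_SIZE are hypothetical names invented for
illustration, and the sketch is single-threaded: it leaves out whatever synchronization
the real copy would need against concurrent indexing threads.

```java
import java.util.Arrays;

public class PagedTfSnapshot {
    // Hypothetical page size, kept tiny so the paging is easy to follow.
    static final int PAGE_SIZE = 4;

    /** Stand-in for the writable, indexing-side term-frequency array. */
    static final class WritableTf {
        int[] tf = new int[8];
        int size; // number of term ids in use

        void increment(int termId) {
            if (termId >= tf.length) {
                tf = Arrays.copyOf(tf, Math.max(termId + 1, tf.length * 2));
            }
            tf[termId]++;
            size = Math.max(size, termId + 1);
        }
    }

    /** Per-reader copy of the term frequencies, stored as fixed-size int pages. */
    static final class PagedInts {
        final int[][] pages;
        final int size;

        PagedInts(int[] src, int size) {
            this.size = size;
            int pageCount = (size + PAGE_SIZE - 1) / PAGE_SIZE;
            pages = new int[pageCount][];
            // One sequential pass over the source array, page by page.
            for (int p = 0; p < pageCount; p++) {
                int start = p * PAGE_SIZE;
                int len = Math.min(PAGE_SIZE, size - start);
                pages[p] = new int[PAGE_SIZE];
                System.arraycopy(src, start, pages[p], 0, len);
            }
        }

        int get(int termId) {
            return pages[termId / PAGE_SIZE][termId % PAGE_SIZE];
        }
    }

    /** Stand-in for the work done at getReader time: copy everything once. */
    static PagedInts snapshot(WritableTf w) {
        return new PagedInts(w.tf, w.size);
    }

    public static void main(String[] args) {
        WritableTf w = new WritableTf();
        w.increment(0);
        w.increment(5);
        w.increment(5);
        PagedInts reader = snapshot(w);
        w.increment(5); // later indexing does not touch the reader's copy
        System.out.println(reader.get(5) + " " + w.tf[5]); // prints "2 3"
    }
}
```

With this shape the per-term indexing path carries no copy-on-write check; the whole cost
is paid once per reader, in proportion to the number of terms.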

The other cost is the lookup of the upto when iterating the postings; however, that'd be a
one-time cost per term-docs instantiation, i.e., negligible.



> Concurrent byte and int block implementations
> ---------------------------------------------
>
>                 Key: LUCENE-2575
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2575
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>    Affects Versions: Realtime Branch
>            Reporter: Jason Rutherglen
>             Fix For: Realtime Branch
>
>         Attachments: LUCENE-2575.patch, LUCENE-2575.patch, LUCENE-2575.patch, LUCENE-2575.patch
>
>
> The current *BlockPool implementations aren't quite concurrent.
> We really need something that has a locking flush method, where
> flush is called at the end of adding a document. Once flushed,
> the newly written data would be available to all other reading
> threads (ie, postings etc). I'm not sure I understand the slices
> concept, it seems like it'd be easier to implement a seekable
> random access file like API. One'd seek to a given position,
> then read or write from there. The underlying management of byte
> arrays could then be hidden?
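
For illustration only, the seekable API asked for in the issue description above might look
roughly like the following sketch.  SeekableBlocks, BLOCK_SIZE and the method names are
hypothetical and are not taken from the attached patches.  The sketch is single-threaded
and omits the locking flush; it only shows seek/read/write hiding the underlying byte arrays.

```java
import java.util.Arrays;

public class SeekableBlocks {
    // Hypothetical block size, kept tiny so block boundaries are exercised.
    private static final int BLOCK_SIZE = 8;

    private byte[][] blocks = new byte[0][];
    private long position; // current read/write offset
    private long length;   // one past the highest offset written

    /** Move the cursor; later reads and writes start from pos. */
    public void seek(long pos) {
        if (pos < 0) {
            throw new IllegalArgumentException("negative position");
        }
        position = pos;
    }

    /** Write one byte at the cursor, allocating blocks on demand. */
    public void writeByte(byte b) {
        int block = (int) (position / BLOCK_SIZE);
        if (block >= blocks.length) {
            byte[][] grown = Arrays.copyOf(blocks, block + 1);
            for (int i = blocks.length; i < grown.length; i++) {
                grown[i] = new byte[BLOCK_SIZE];
            }
            blocks = grown;
        }
        blocks[block][(int) (position % BLOCK_SIZE)] = b;
        position++;
        length = Math.max(length, position);
    }

    /** Read one byte at the cursor and advance it. */
    public byte readByte() {
        if (position >= length) {
            throw new IllegalStateException("read past end");
        }
        byte b = blocks[(int) (position / BLOCK_SIZE)][(int) (position % BLOCK_SIZE)];
        position++;
        return b;
    }

    public static void main(String[] args) {
        SeekableBlocks f = new SeekableBlocks();
        for (int i = 0; i < 10; i++) {
            f.writeByte((byte) i); // spans two blocks
        }
        f.seek(8);
        System.out.println(f.readByte()); // prints 8
        f.seek(3);
        f.writeByte((byte) 42); // overwrite in place
        f.seek(3);
        System.out.println(f.readByte()); // prints 42
    }
}
```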

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

