lucene-dev mailing list archives

From "Simon Willnauer (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-3424) Return sequence ids from IW update/delete/add/commit to allow total ordering outside of IW
Date Wed, 26 Oct 2011 20:15:32 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13136302#comment-13136302 ]

Simon Willnauer commented on LUCENE-3424:
-----------------------------------------

thanks mike for taking the time, this stuff is hairy.

bq. The seqID should never be the same for any 2 ops, even across threads,
right? Will it ever have "holes" (ie, skip a given value), or must
all values be accounted for?

One seqID will never be assigned twice. The seq ID is always taken from the current tail of
the queue and is final once the tail's next pointer is assigned. Yet in the current patch
there is a possibility of holes, i.e. some seq IDs are not used at all. Currently, when I do
a full flush (NRT reopen or commit) I need to cut over to a new delete queue, which means
that two delete queues are active for a short amount of time. The old queue might still be
in use by some DWPT (currently in flight) while the new queue is used for incoming threads.
What I do to prevent double assignments is take the old queue's current max seq ID and
increment it by the number of active thread states (i.e. the max number of possible DWPTs in
flight). Deletes are no problem at that point since that is synced on DW just like flushAllThreads().
I need to think about how we could close those gaps, but I think we would need to block, i.e. the
non-blocking DWPT swap would not work then.
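
To make the cut-over arithmetic concrete, here is a minimal sketch. It is not the code from the patch: the class and method names are made up for illustration, and the queue is reduced to a plain atomic counter instead of a linked list whose tail carries the seq ID. It also assumes the new queue starts one past the reserved range.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the delete queue; only the seq ID bookkeeping is modelled.
final class SeqIdQueueSketch {
    private final AtomicLong nextSeqId;

    SeqIdQueueSketch(long firstSeqId) {
        this.nextSeqId = new AtomicLong(firstSeqId);
    }

    // Each operation takes the next ID, so no ID is handed out twice by one queue.
    long assign() {
        return nextSeqId.getAndIncrement();
    }

    // Highest ID handed out so far (firstSeqId - 1 if nothing was assigned yet).
    long maxAssigned() {
        return nextSeqId.get() - 1;
    }

    // Cut over for a full flush: reserve one ID per active thread state for
    // DWPTs still in flight on the old queue, and start the new queue after
    // that reserved range (assumption: start = old max + active states + 1).
    SeqIdQueueSketch cutOver(int activeThreadStates) {
        return new SeqIdQueueSketch(maxAssigned() + activeThreadStates + 1);
    }
}
```

With an old queue that has handed out IDs 0..2 and four active thread states, the new queue starts at 7. If only one in-flight DWPT then finishes on the old queue (taking ID 3), IDs 4..6 are never used, which is exactly the kind of hole described above.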

bq. Commit doesn't incr the seqID right? It just returns the max seqID
in the commit point, right? If you commit having made no "actual"
changes (eg say you just called optimize), what seqID comes back?

Right, it would return the same seq ID, or possibly a higher one due to the gaps I explained
above.
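
As an illustration of why the seq ID returned by commit is useful even with gaps, here is a hedged sketch of how an application-side transaction log might use it during recovery. Nothing here is part of IW or of the patch; `TlogReplaySketch`, `Entry` and `toReplay` are hypothetical names, and the sketch assumes every logged operation stored the seq ID that IW returned for it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical application-side transaction log helper (not a Lucene API).
final class TlogReplaySketch {
    static final class Entry {
        final long seqId;   // seq ID returned for the logged operation
        final String op;    // opaque description of the operation

        Entry(long seqId, String op) {
            this.seqId = seqId;
            this.op = op;
        }
    }

    private TlogReplaySketch() {}

    // Returns the operations not covered by the last commit, ordered by seq ID.
    // Holes in the ID space are harmless because only the relative order matters.
    static List<Entry> toReplay(List<Entry> log, long committedSeqId) {
        List<Entry> pending = new ArrayList<>();
        for (Entry e : log) {
            if (e.seqId > committedSeqId) {
                pending.add(e);
            }
        }
        pending.sort((a, b) -> Long.compare(a.seqId, b.seqId));
        return pending;
    }
}
```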

bq. When an exc occurs is a seqID allocated and then skipped? (Maybe only
for certain exceptions?).

It's allocated as basically the last op in DWPT#updateDocument, so yes: if an exc occurs after
that and breaks the DWPT (i.e. it is an aborting exc), the IDs are skipped. If an exc happens in the same
thread, i.e. during flush, the ID will stay assigned. This could be a problem, but if an exc
occurs we are in an invalid state anyway, right?

bq. if an aborting-exc is hit... will we "lose" a bunch of seqIDs right?
Like the next op against the IW will assign a previously used seqID?

No, a previously assigned seqID should never be assigned again. The del queue is global, so once
you have assigned an ID it's gone; once an item is in the queue it should not change.

bq. seqIDs have nothing to do with flushing? Ie, the app sees no change
in the returned seqIDs just because a flush occurred under the hood?

Right, except for the full flush I mentioned above.

bq. In general can you give a different name if the seqID was "coded" (<<
1) vs not? (maybe codedSeqID or something)? Just to reduce chance of
future errors...

Yeah, good point. I tried not to introduce a short-lived object here, so I figured piggy-backing
on the seq ID is fine, but yeah, we should name that differently.
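
For readers unfamiliar with the trick, a minimal sketch of what a "coded" seq ID could look like: the ID is shifted left by one bit and the freed low bit carries a flag, so one primitive `long` is returned instead of a small object. The class name and the meaning of the flag are assumptions made for illustration; the thread does not say what the patch stores in that bit.

```java
// Hypothetical helper; the patch may encode and name things differently.
final class CodedSeqId {
    private CodedSeqId() {}

    // Pack a non-negative seq ID and a one-bit flag into a single long.
    static long encode(long seqId, boolean flag) {
        return (seqId << 1) | (flag ? 1L : 0L);
    }

    // Recover the plain seq ID by dropping the flag bit.
    static long seqId(long coded) {
        return coded >>> 1;
    }

    // Read the piggy-backed flag from the low bit.
    static boolean flag(long coded) {
        return (coded & 1L) != 0;
    }
}
```

Giving the packed value its own name (for example `codedSeqID`, as suggested above) makes it harder to compare a coded value against a plain one by accident.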

bq. If the perf hit is negligible I don't think we need to add an IWC
option?

It's just like an update but we save the delete handling: some extra CPU cycles, but since
the other work is so much heavier I think it's OK.




                
> Return sequence ids from IW update/delete/add/commit to allow total ordering outside of IW
> ------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-3424
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3424
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/index
>    Affects Versions: 4.0
>            Reporter: Simon Willnauer
>            Assignee: Simon Willnauer
>             Fix For: 4.0
>
>         Attachments: LUCENE-3424.patch
>
>
> Based on the discussion on the [mailing list|http://mail-archives.apache.org/mod_mbox/lucene-dev/201109.mbox/%3CCAAHmpki-h7LUZGCUX_rfFx=q5-YkLJei+piRG=oic8D1pNRquQ@mail.gmail.com%3E]
> IW should return sequence ids from update/delete/add and commit to allow ordering of events
> for consistent transaction logs and recovery.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
