phoenix-dev mailing list archives

From "Jesse Yates (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-1954) Reserve chunks of numbers for a sequence
Date Mon, 11 May 2015 21:26:59 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538672#comment-14538672 ]

Jesse Yates commented on PHOENIX-1954:
--------------------------------------

In the original discussion we came up with the same syntax, but ran into the following
problem. Suppose the client first gets a value from the sequence (which batches by 100), so
it reserves {{0-99}} and gets the value 0. Then, to reserve a block, it uses {{NEXT 1000 VALUE
for seq}}, which bumps the external next id to {{1100}}. When it next does {{NEXT VALUE
FOR seq}}, what should the next value be?

There are a few possible solutions:
* They get value 1. Then if they call it 99 more times, they would get 2,3,...99, 1100, which
skips the reserved block. This is a bit odd, however, and is why Lars proposed the different
syntax, so the client is aware that the reserved block is unmanaged.
* They get value 1100. This would 'throw away' the client cache of {{0-99}} and just get the
next logical element of the sequence. This is simpler and still reserves the number space.
* They get 1, followed by 2,3,...99,100, 101,...1099. However, this would conflict with the
idea of a 'reserved' space which is allocated as needed from the client's perspective.
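
To make the three options concrete, here is a minimal Python sketch of a toy
client cache in front of a toy server counter. Nothing here is Phoenix code:
{{SequenceServer}}, {{CachingClient}} and the policy names {{keep_cache}},
{{drop_cache}} and {{consume_reserved}} are hypothetical names invented for this
illustration, and the batch size of 100 is taken from the example above.

```python
from collections import deque


class SequenceServer:
    """Toy stand-in for the server-side sequence state (hypothetical)."""

    def __init__(self, start=0):
        self.next_value = start

    def reserve(self, n):
        """Advance the sequence by n and return the value it had before."""
        first = self.next_value
        self.next_value += n
        return first


class CachingClient:
    """Toy client that caches batches of values (hypothetical).

    policy selects one of the three behaviours discussed above:
      "keep_cache"       - option 1: keep serving the old cache, skip the block
      "drop_cache"       - option 2: bulk reserve throws the cache away
      "consume_reserved" - option 3: NEXT VALUE walks into the reserved block
    """

    def __init__(self, server, cache_size=100, policy="keep_cache"):
        self.server = server
        self.cache_size = cache_size
        self.policy = policy
        self.pending = deque()  # non-empty ranges still served by next_value()

    def next_value(self):
        if not self.pending:
            # Cache exhausted: fetch a fresh batch from the server.
            first = self.server.reserve(self.cache_size)
            self.pending.append(range(first, first + self.cache_size))
        head = self.pending.popleft()
        rest = head[1:]
        if len(rest) > 0:
            self.pending.appendleft(rest)
        return head[0]

    def reserve_block(self, n):
        """Reserve n values in bulk and return the first value of the block."""
        first = self.server.reserve(n)
        if self.policy == "drop_cache":
            self.pending.clear()
        elif self.policy == "consume_reserved" and n > 0:
            self.pending.append(range(first, first + n))
        return first


def simulate(policy, calls):
    """Get one value, reserve 1000, then call next_value() `calls` times."""
    client = CachingClient(SequenceServer(), policy=policy)
    first = client.next_value()
    block_start = client.reserve_block(1000)
    return first, block_start, [client.next_value() for _ in range(calls)]
```

In all three policies the first value is 0 and the reserved block starts at 100.
With {{keep_cache}} the next 100 calls yield 1..99 and then 1100; with
{{drop_cache}} the very next call yields 1100; with {{consume_reserved}} the
client walks through 1..1099 before reaching 1100.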

The reserved ID space is somewhat separate from the client's standard sequence logic, but
in many cases the two need to interoperate in the same sequence. For instance, batch generating
UUIDs (reserving an appropriately sized block) may interleave with stream/on-demand generation
of UUIDs.

{{ALLOCATE}} differentiates the above cases since it somewhat decouples the client's two usages.

> Reserve chunks of numbers for a sequence
> ----------------------------------------
>
>                 Key: PHOENIX-1954
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1954
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>
> In order to be able to generate many ids in bulk (for example in map reduce jobs) we
> need a way to generate or reserve large sets of ids. We also need to mix ids reserved with
> incrementally generated ids from other clients.
> For this we need to atomically increment the sequence and return the value it had when
> the increment happened.
> If we're OK to throw the current cached set of values away we can do
> {{NEXT VALUE FOR <seq>(,<N>)}}, that needs to increment value and return
> the value it incremented from (i.e. it has to throw the current cache away, and return the
> next value it found at the server).
> Or we can invent a new syntax {{RESERVE VALUES FOR <seq>, <N>}} that does
> the same, but does not invalidate the cache.
> Note that in either case we won't retrieve the reserved set of values via {{NEXT VALUE
> FOR}} because we'd need to be idempotent in our case. All we need to guarantee is that after
> a call to {{RESERVE VALUES FOR <seq>, <N>}}, which returns a value <M>,
> the range [M, M+N) won't be used by any other user of the sequence. We might need to reserve
> 1bn ids this way ahead of a map reduce run.
> Any better ideas?
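
The guarantee asked for in the issue description (a reserve call returns M and
the range [M, M+N) is never handed to anyone else) amounts to an atomic
fetch-and-add. The Python sketch below illustrates that idea with an in-process
lock; it is only an analogy for the server-side increment, not how Phoenix
implements it, and {{AtomicSequence}}, {{reserve_values}} and
{{reserve_concurrently}} are hypothetical names.

```python
import threading


class AtomicSequence:
    """Toy sequence whose increment is atomic within one process (hypothetical)."""

    def __init__(self, start=0):
        self._next = start
        self._lock = threading.Lock()

    def reserve_values(self, n):
        """Atomically add n to the sequence and return the value M it had before.

        The caller then owns the half-open range [M, M + n).
        """
        if n <= 0:
            raise ValueError("n must be positive")
        with self._lock:
            m = self._next
            self._next += n
        return m


def reserve_concurrently(seq, sizes):
    """Reserve one block per entry in sizes, each from its own thread.

    Returns the reserved (start, end) ranges sorted by start.
    """
    results = []
    results_lock = threading.Lock()

    def worker(n):
        m = seq.reserve_values(n)
        with results_lock:
            results.append((m, m + n))

    threads = [threading.Thread(target=worker, args=(n,)) for n in sizes]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results)
```

Mixing block sizes of 1 with large blocks in {{sizes}} mimics incremental
clients interleaving with a bulk reservation: whichever order the threads run
in, the returned ranges tile the number space without overlapping.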



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
