flink-dev mailing list archives

From Timothy Farkas <timothytiborfar...@gmail.com>
Subject Re: In AbstractRocksDBState, why write a byte 42 between key and namespace?
Date Fri, 15 Jul 2016 17:20:16 GMT
I've faced a similar issue when serializing data to a key-value store. Not
sure how helpful it is for this case, but two possible solutions I've used
for persisting keys and values under different namespaces in the same
key-value store are:

- Make all namespaces the same number of bytes and prefix each key with
its namespace.
- Include the number of bytes in the namespace and in the key. The bytes
would then look like this:

[namespace num bytes] [namespace] [key num bytes] [key]
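
A minimal sketch of the second option, using the ambiguous case from Stephan's message quoted below. This is not Flink code: the class and method names are hypothetical, and the 4-byte big-endian length prefix is an assumption chosen for illustration.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class LengthPrefixedKey {

    // Hypothetical helper: writes [namespace length][namespace][key length][key].
    // Each length is a 4-byte big-endian int (an assumption for this sketch).
    static byte[] encode(byte[] namespace, byte[] key) {
        return ByteBuffer.allocate(4 + namespace.length + 4 + key.length)
                .putInt(namespace.length)
                .put(namespace)
                .putInt(key.length)
                .put(key)
                .array();
    }

    // For comparison: key, then a single separator byte 42, then namespace,
    // with raw byte arrays standing in for the serialized key and namespace.
    static byte[] withSeparator(byte[] key, byte[] namespace) {
        return ByteBuffer.allocate(key.length + 1 + namespace.length)
                .put(key)
                .put((byte) 42)
                .put(namespace)
                .array();
    }

    public static void main(String[] args) {
        byte[] two = {42, 42};
        byte[] three = {42, 42, 42};

        // Separator only: both combinations become six bytes of 42.
        System.out.println(Arrays.equals(
                withSeparator(two, three), withSeparator(three, two)));

        // Length prefixes: the first length field already differs.
        System.out.println(Arrays.equals(
                encode(three, two), encode(two, three)));
    }
}
```

With raw byte arrays like these, the separator-only layout produces identical bytes for both combinations, while the length-prefixed layout keeps them distinct.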

Thanks,
Tim

On Fri, Jul 15, 2016 at 9:45 AM, Stephan Ewen <sewen@apache.org> wrote:

> Every serializer should know how many bytes to consume. The key serializer
> should not need to look for 42 to know where to terminate.
>
> Otherwise this would be a problem case:
> key[42, 42] - 42 - namespace [42, 42, 42]
> key[42, 42, 42] - 42 - namespace [42, 42]
>
>
>
> On Fri, Jul 15, 2016 at 5:38 PM, Aljoscha Krettek <aljoscha@apache.org>
> wrote:
>
> > I left that in on purpose to protect against cases where the combination
> > of key and namespace can be ambiguous. For example, these two
> > combinations of key and namespace have the same written representation:
> > key [0 1 2] namespace [3 4 5] (values in brackets are byte arrays)
> > key [0 1] namespace [2 3 4 5]
> >
> > Having the "magic number" in there protects against such cases.
> >
> > On Fri, 15 Jul 2016 at 16:31 Stephan Ewen <sewen@apache.org> wrote:
> >
> >> My assumption is that this was a sanity check that actually just stuck
> >> in the code.
> >>
> >> It can probably be removed.
> >>
> >> PS: Moving this to the dev@flink.apache.org list...
> >>
> >>
> >>
> >> On Fri, Jul 15, 2016 at 11:05 AM, 刘彪 <mmyy1110@gmail.com> wrote:
> >>
> >> > In AbstractRocksDBState.writeKeyAndNamespace():
> >> >
> >> > protected void writeKeyAndNamespace(DataOutputView out) throws IOException {
> >> >     backend.keySerializer().serialize(backend.currentKey(), out);
> >> >     out.writeByte(42);
> >> >     namespaceSerializer.serialize(currentNamespace, out);
> >> > }
> >> >
> >> > Why write a byte 42 between key and namespace? The keySerializer and
> >> > namespaceSerializer know their lengths. It seems we don't need this
> >> > byte.
> >> >
> >> > Could anybody tell me what it is for? Is there any situation in which
> >> > we must have this separator?
> >> >
> >>
> >
>
