hadoop-hdfs-issues mailing list archives

From "Chris Douglas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11026) Convert BlockTokenIdentifier to use Protobuf
Date Thu, 09 Feb 2017 01:16:41 GMT

[ https://issues.apache.org/jira/browse/HDFS-11026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858821#comment-15858821 ]

Chris Douglas commented on HDFS-11026:
--------------------------------------

[~ehiggs], a comment in the .proto explaining the constraint on field numbers would also be
a good idea.
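
For example, something like this at the top of the message (illustrative wording only, not text from the patch):

{code}
// NOTE: keep every field number in this message below 16. The legacy
// (writable) BlockTokenIdentifier begins with a vlong expiryTime whose
// first byte has its high bit set for any realistic timestamp, while a
// protobuf tag byte for field numbers 1-15 is always < 0x80. Readers
// exploit that gap to tell the two encodings apart.
{code}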

bq. I'd prefer an equality check for a definitive magic byte that can't occur, or represents something so large it can't/won't occur. Perhaps something like -1; I haven't checked if that's impossible for the varint.

The current patch relies on the record including at least one field with an ID less than 16 ([required|https://developers.google.com/protocol-buffers/docs/encoding#order] to be written sequentially, ordered by field ID). We could instead require the first field and match on its ID and type explicitly, rather than allowing it to float.
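
Concretely, that check shrinks to matching one byte. A sketch, assuming we pinned the first field to ID 1 with varint wire type (not the patch's actual code):

{code:java}
// A protobuf tag byte is (fieldNumber << 3) | wireType; pinning the
// first field to ID 1 with varint wire type (0) means a protobuf-encoded
// token must begin with the byte 0x08.
private static final int EXPECTED_FIRST_TAG = (1 << 3) | 0;

// Peek at the first byte without consuming it; assumes the underlying
// stream supports mark/reset, otherwise a pushback stream is needed.
static boolean looksLikeProtobuf(java.io.DataInputStream in)
    throws java.io.IOException {
  in.mark(1);
  final int firstByte = in.readUnsignedByte();
  in.reset();
  return firstByte == EXPECTED_FIRST_TAG;
}
{code}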

An explicit check could be removed as part of deprecating the field, but... this is becoming harder to maintain, for a pretty marginal gain over the guarantees we get by exploiting the gap between encodings. If we move to an equality check, then we can't deprecate the first field while we support backwards compatibility with *3.x*. That's true even if we add the magic bytes as a constant, optional PB field. I agree that's morally equivalent to not deprecating 15 fields, in a way, but we're more likely to get the latter correct accidentally, after this JIRA is forgotten.

There's one more level of defensible paranoia: add code to {{writeLegacy}} that refuses to
write an expiryTime that would confuse the switch, and a unit test affirming the same. That
should at least cause noisy failures in the server attempting to issue confusing tokens, but
if we change the semantics of expiryTime then we've already broken backwards compatibility
with 2.x. Though someone may read this later and curse me, checking this for each block token
seems like unnecessary overhead.
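
The guard itself would be a few lines; a sketch only (with {{expiryDate}} standing in for whatever field holds the expiry, not the patch's code):

{code:java}
void writeLegacy(java.io.DataOutput out) throws java.io.IOException {
  // WritableUtils.writeVLong emits a single byte equal to the value for
  // longs in [-112, 127], so a small non-negative expiryTime would be
  // indistinguishable from a protobuf tag byte (< 0x80).
  if (expiryDate >= 0 && expiryDate < 128) {
    throw new java.io.IOException("Refusing to write expiryTime "
        + expiryDate + "; it is ambiguous with a protobuf tag byte");
  }
  org.apache.hadoop.io.WritableUtils.writeVLong(out, expiryDate);
  // ... remaining legacy fields unchanged ...
}
{code}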

Thoughts? We're down to debating the merits of approaches that are unlikely to be distinguishable in practice, so, all the above said, I'd be fine with any of the three approaches, in this order:

# *current patch*: rely on the gap between encodings (expiryTime > 128, PB field number \| type < 16); see the sketch after this list
# Fix the first field and match its number and type exactly
# Define a new PB field, {{magic_number}}, that has a constant encoding and match on it (really, adding a version field to the PB record)
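
For reference, a minimal sketch of the dispatch in option 1 (illustrative names, and it assumes the stream supports mark/reset):

{code:java}
public void readFields(java.io.DataInput in) throws java.io.IOException {
  final java.io.DataInputStream dis = (java.io.DataInputStream) in;
  dis.mark(1);
  final int firstByte = dis.readUnsignedByte();
  dis.reset();
  // Legacy tokens begin with a vlong expiryTime whose first byte has its
  // high bit set for any realistic timestamp; a protobuf tag for a field
  // ID below 16 is a single byte under 0x80.
  if (firstByte < 0x80) {
    readFieldsProtobuf(dis);  // hypothetical helper
  } else {
    readFieldsLegacy(dis);    // hypothetical helper
  }
}
{code}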

> Convert BlockTokenIdentifier to use Protobuf
> --------------------------------------------
>
>                 Key: HDFS-11026
>                 URL: https://issues.apache.org/jira/browse/HDFS-11026
>             Project: Hadoop HDFS
>          Issue Type: Task
>          Components: hdfs, hdfs-client
>    Affects Versions: 2.9.0, 3.0.0-alpha1
>            Reporter: Ewan Higgs
>            Assignee: Ewan Higgs
>             Fix For: 3.0.0-alpha3
>
>         Attachments: blocktokenidentifier-protobuf.patch, HDFS-11026.002.patch, HDFS-11026.003.patch,
HDFS-11026.004.patch, HDFS-11026.005.patch
>
>
> {{BlockTokenIdentifier}} currently uses a {{DataInput}}/{{DataOutput}} (basically a {{byte[]}})
and manual serialization to get data into and out of the encrypted buffer (in {{BlockKeyProto}}).
Other TokenIdentifiers (e.g. {{ContainerTokenIdentifier}}, {{AMRMTokenIdentifier}}) use Protobuf.
The {{BlockTokenIdentifier}} should use Protobuf as well so it can be expanded more easily
and will be consistent with the rest of the system.
> NB: Release of this will require a version update since 2.8.x won't be able to decipher
{{BlockKeyProto.keyBytes}} from 2.8.y.



