hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2045) Data persisted in NM should be versioned
Date Fri, 11 Jul 2014 16:24:09 GMT

    [ https://issues.apache.org/jira/browse/YARN-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14058952#comment-14058952 ]

Jason Lowe commented on YARN-2045:

Thanks for the patch, Junping!

Is the state store interface the appropriate place for the schema version?  To
me the schema seems specific to a particular state store implementation, and version 1.1 of
a leveldb implementation may mean something totally different from version 1.1 of a mysqldb or
PosixSharedMemSegmentStore implementation, etc.  One of those implementations may have decided
to change its layout (e.g., to fix a bug or make the store more efficient), which is a schema
change that shouldn't be exposed outside of the implementation.  IMHO it's the responsibility
of the state store implementation to marshal the data being conveyed via the state store
interface, and something sitting above the implementation layer (i.e., interacting with
NMStateStoreService) ideally shouldn't have to deal with schema layout changes.  Is there an
example scenario where code agnostic of the state store implementation needs to check the
schema version?  If there is a need to convey high-level interface changes via a schema
version, then arguably there need to be separate versions: one for the high-level interface
changes and possibly an implementation-specific schema version.  Ideally we'd just need the
latter.
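
To illustrate what I mean by keeping the schema version inside the implementation, here is
a rough sketch.  It is not code from the patch: StateStore is a hypothetical, simplified
stand-in for NMStateStoreService, MapBackedStateStore stands in for a real backend such as
the leveldb store, and the version key, the "data/" prefix and the v1-to-v2 migration are
all made up for the example.

```java
import java.util.ArrayList;
import java.util.Map;

// Hypothetical, simplified stand-in for NMStateStoreService.
// Note that nothing about schema versions appears in the interface.
interface StateStore {
    void init();
    void store(String key, String value);
    String load(String key);
}

// Hypothetical implementation; a Map stands in for the real database.
class MapBackedStateStore implements StateStore {
    // All of these are private details of this one implementation.
    private static final String VERSION_KEY = "__schema_version";
    private static final int CURRENT_SCHEMA = 2;
    private static final String PREFIX = "data/";

    private final Map<String, String> db;

    MapBackedStateStore(Map<String, String> db) {
        this.db = db;
    }

    @Override
    public void init() {
        String stored = db.get(VERSION_KEY);
        if (stored == null || stored.equals("1")) {
            if (stored != null) {
                migrateFromV1();
            }
            // Fresh or freshly migrated store: record the current schema.
            db.put(VERSION_KEY, Integer.toString(CURRENT_SCHEMA));
        } else if (!stored.equals(Integer.toString(CURRENT_SCHEMA))) {
            throw new IllegalStateException("Unsupported schema version: " + stored);
        }
    }

    // Made-up layout change: schema 1 used bare keys, schema 2 prefixes them.
    private void migrateFromV1() {
        for (String key : new ArrayList<>(db.keySet())) {
            if (!key.equals(VERSION_KEY)) {
                db.put(PREFIX + key, db.remove(key));
            }
        }
    }

    @Override
    public void store(String key, String value) {
        db.put(PREFIX + key, value);
    }

    @Override
    public String load(String key) {
        return db.get(PREFIX + key);
    }
}
```

Callers only ever see init/store/load, so a layout change like the one above never leaks
past the implementation.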

Other comments on the patch:

- I think we should have a fallback in loadVersion to check for the original string-based
schema.  Given there's only ever been '1.0', maybe we can check for an exact match of that
before trying to parse it as a protobuf.
- Nit: the PBImpl is unnecessary, as this will never be sent via RPC (especially if it's state
store specific).  The PBImpl is extra boilerplate that isn't buying us anything, and the code
would be simpler just using NMDBSchemaVersionProto directly.
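
A rough sketch of the loadVersion fallback I have in mind is below.  This is illustrative
only: SchemaVersionLoader and Version are hypothetical names, and the protoParser argument
is a stand-in for whatever parses the stored protobuf bytes (with the generated class that
would presumably be NMDBSchemaVersionProto.parseFrom, but I'm not assuming its fields here).

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Objects;
import java.util.function.Function;

// Hypothetical helper; not code from the patch.
class SchemaVersionLoader {

    // Hypothetical stand-in for the version record the store keeps.
    static final class Version {
        final int major;
        final int minor;

        Version(int major, int minor) {
            this.major = major;
            this.minor = minor;
        }
    }

    // The only value the original string-based schema ever held.
    private static final byte[] LEGACY_1_0 = "1.0".getBytes(StandardCharsets.UTF_8);

    static Version loadVersion(byte[] stored, Function<byte[], Version> protoParser) {
        Objects.requireNonNull(stored, "stored");
        // Exact match on the legacy string first, before any protobuf parsing.
        if (Arrays.equals(stored, LEGACY_1_0)) {
            return new Version(1, 0);
        }
        // Anything else is assumed to be the newer protobuf encoding.
        return protoParser.apply(stored);
    }
}
```

With something like this, a store written before the change is still read as 1.0 and the
parser is never handed the old string.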

> Data persisted in NM should be versioned
> ----------------------------------------
>                 Key: YARN-2045
>                 URL: https://issues.apache.org/jira/browse/YARN-2045
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>    Affects Versions: 2.4.1
>            Reporter: Junping Du
>            Assignee: Junping Du
>         Attachments: YARN-2045-v2.patch, YARN-2045.patch
> As a split task from YARN-667, we want to add version info to NM-related data, including:
> - NodeManager local LevelDB state
> - NodeManager directory structure

