hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2045) Data persisted in NM should be versioned
Date Mon, 14 Jul 2014 14:44:05 GMT

    [ https://issues.apache.org/jira/browse/YARN-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14060694#comment-14060694

Jason Lowe commented on YARN-2045:

bq. If any incompatible changes happen (not matter layout change, Proto changes) in future,
the version check here can detect and stop the further wrong behavior.

Ah I see, for tracking protobuf schema changes for any protobufs passed via the NMStateStoreService
interface directly.  That's not quite the same concept as the layout of the state store itself
(i.e.: implementation-specific schema).  For example, if leveldb decided to rework the way
it uses keys in the database that would be a schema change not reflected in this version,

I can see wanting to expose a common schema version if code independent of the state store
implementation wants to marshal the data from one schema to another.  I just found it odd
that the state store _implementation_ code is checking this value yet it's exposed in a state
store interface which is not implementation specific.  That's what felt wrong to me.  For
this kind of schema version, the code to marshal data between compatible versions should be
in a common place, not in each state store implementation, and either all state stores end
up supporting the compatible versions or they don't (because the code to handle any conversions
is common).  If a state store implementation can't handle the difference between two different
versions but others can that seems like a state store specific schema version conflict.  If
each state store implementation is going to separately check the schema version and separately
determine what versions are compatible then this is an implementation-specific artifact that
should not be exposed in the interface, IMHO.  Again, maybe a specific example where it would
be appropriate would help me see things clearly.  My apologies if I'm completely missing the

bq. I agree that PBImpl here is not too much use, but just make interface looks more clear,
i.e. I can call some handy method like: isCompatibleTo() rather than manipulate proto object

I'm not advocating for the removal of the NMDBSchemaVersion class, rather just the PBImpl
class.  All we really need from the PBImpl is getMajorVersion and getMinorVersion, and the
interface between PBImpl and the raw protobuf is the same in that regard.   The PBImpl isn't
adding any value here, since it's not the one providing the useful isCompatibleTo() method
or any other methods that are easier to use than the raw protobuf.  We could just have NMDBSchemaVersion
class wrap the raw protobuf and have the same easy-to-use interface without all the extra
PBImpl code.  And as a bonus, NMDBSchemaVersion's hashCode and equals methods can delegate
to the raw protobuf methods to remove even more code.

> Data persisted in NM should be versioned
> ----------------------------------------
>                 Key: YARN-2045
>                 URL: https://issues.apache.org/jira/browse/YARN-2045
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>    Affects Versions: 2.4.1
>            Reporter: Junping Du
>            Assignee: Junping Du
>         Attachments: YARN-2045-v2.patch, YARN-2045.patch
> As a split task from YARN-667, we want to add version info to NM related data, include:
> - NodeManager local LevelDB state
> - NodeManager directory structure

This message was sent by Atlassian JIRA

View raw message