hadoop-common-issues mailing list archives

From "Matthew Paduano (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-12563) Updated utility to create/modify token files
Date Tue, 12 Apr 2016 22:22:25 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-12563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238107#comment-15238107 ]

Matthew Paduano commented on HADOOP-12563:
------------------------------------------

re 1:   I added documentation to CommandsManual.md and Javadoc for DtFileOperations in the
          latest patch.

re 2:   TOKEN_STORAGE_VERSION and OLD_TOKEN_STORAGE_VERSION are of type byte.
          They need to be byte because they are passed directly to stream write() methods.
          They are private fields and there should never be many of them, so I think an
          enum is overkill.

          I changed this code to use those symbols directly instead of the bare literals
          0 and 1.
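
To illustrate the point about byte constants, here is a minimal sketch, not the patch itself. The class name, the helper methods and the mapping of 0 and 1 to the two symbols are assumptions made for the example; only the two constant names come from the discussion above.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical demo class; only the two constant names come from the comment above.
final class VersionByteDemo {
    // Assumed values: the comment mentions the literals 0 and 1 but not which is which.
    private static final byte OLD_TOKEN_STORAGE_VERSION = 0;
    private static final byte TOKEN_STORAGE_VERSION = 1;

    // Writes a one-byte header. A byte constant widens to int, so it can be
    // passed straight to OutputStream.write(int) without a cast or ordinal().
    static byte[] writeHeader(boolean oldFormat) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(oldFormat ? OLD_TOKEN_STORAGE_VERSION : TOKEN_STORAGE_VERSION);
        return out.toByteArray();
    }

    // Compares the stored byte against the named symbols, not bare literals.
    static String formatOf(byte[] header) {
        byte version = header[0];
        if (version == TOKEN_STORAGE_VERSION) {
            return "protobuf";
        } else if (version == OLD_TOKEN_STORAGE_VERSION) {
            return "legacy";
        }
        throw new IllegalArgumentException("Unknown version: " + version);
    }

    public static void main(String[] args) {
        System.out.println(formatOf(writeHeader(false)));
        System.out.println(formatOf(writeHeader(true)));
    }
}
```

An enum would need an extra mapping step to and from the on-disk byte, which is the overhead argued against above.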

re 3:   Java Strings are UTF-16, so the first example changes encoding twice: it copies
          from a UTF-8 buffer (io.Text) to a UTF-16 buffer (String) and then back to a
          UTF-8 buffer (ByteString).
          In the other case, the byte[] from the io.Text object is copied directly into
          the byte[] of the ByteString object (which is immutable, like Java Strings).
          So the copyFrom case makes just one copy and no encoding switch. This is the
          form to prefer, and it is what other proto code in Hadoop uses.
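
The difference between the two paths can be sketched with the JDK alone. This is an illustration under assumptions, not the patch code: plain byte[] stands in for both io.Text and ByteString, and the class and method names are made up for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; byte[] stands in for io.Text and ByteString.
final class EncodingCopyDemo {
    // Path 1: UTF-8 bytes -> String (UTF-16 internally) -> UTF-8 bytes.
    // Two encoding conversions and an intermediate String object.
    static byte[] copyViaString(byte[] utf8, int length) {
        String decoded = new String(utf8, 0, length, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    // Path 2: one plain byte copy, no decoding or re-encoding.
    // The length argument mirrors a backing array that may be longer
    // than the valid data it holds.
    static byte[] copyDirect(byte[] utf8, int length) {
        return Arrays.copyOf(utf8, length);
    }

    public static void main(String[] args) {
        byte[] source = "token-kind".getBytes(StandardCharsets.UTF_8);
        byte[] a = copyViaString(source, source.length);
        byte[] b = copyDirect(source, source.length);
        // For valid UTF-8 input both paths yield the same bytes.
        System.out.println(Arrays.equals(a, b));
    }
}
```

With the real classes, the two paths would presumably correspond to ByteString.copyFromUtf8(text.toString()) and ByteString.copyFrom(text.getBytes(), 0, text.getLength()).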

re 4:   I see examples of both forms around the code base, and I think they are equivalent.
          I changed to the shorter form here.

> Updated utility to create/modify token files
> --------------------------------------------
>
>                 Key: HADOOP-12563
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12563
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 3.0.0
>            Reporter: Allen Wittenauer
>            Assignee: Matthew Paduano
>         Attachments: HADOOP-12563.01.patch, HADOOP-12563.02.patch, HADOOP-12563.03.patch, HADOOP-12563.04.patch, HADOOP-12563.05.patch, HADOOP-12563.06.patch, HADOOP-12563.07.patch, HADOOP-12563.07.patch, HADOOP-12563.08.patch, HADOOP-12563.09.patch, dtutil-test-out, dtutil_diff_07_08, example_dtutil_commands_and_output.txt, generalized_token_case.pdf
>
>
> hdfs fetchdt is missing some critical features and is geared almost exclusively towards HDFS operations.  Additionally, the token files that are created use Java serialization, which is hard or impossible to deal with in other languages. It should be replaced with a better utility in common that can read/write protobuf-based token files, has enough flexibility to be used with other services, and offers key functionality such as append and rename. The old version file format should still be supported for backward compatibility, but will be effectively deprecated.
> A follow-on JIRA will deprecate fetchdt.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
