hadoop-hdfs-issues mailing list archives

From "Allen Wittenauer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7602) HDFS file utility
Date Mon, 12 Jan 2015 16:33:35 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14273743#comment-14273743 ]

Allen Wittenauer commented on HDFS-7602:
----------------------------------------

Luckily, magic predates Linux by more than a decade, so we should be able to build on the
BSD or Illumos versions in order to avoid any license difficulties.

> HDFS file utility
> -----------------
>
>                 Key: HDFS-7602
>                 URL: https://issues.apache.org/jira/browse/HDFS-7602
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client, tools
>    Affects Versions: 2.5.0
>            Reporter: James Kinley
>            Priority: Minor
>
> Provide a utility to determine HDFS file formats and compression types, akin to Linux's
> file utility.
> There is no easy way to do this today, short of downloading a file and running Linux's
> file utility on it for at least some intelligence. Even then, Linux's magic file does not
> contain any information to identify the leading bytes of Hadoop's common file formats, for
> example: 'S', 'E', 'Q' for SequenceFiles, or 'P', 'A', 'R', '1' for Parquet.
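
To illustrate the leading-byte check the description refers to, here is a minimal sketch in Python. It is not the utility proposed in HDFS-7602. The names `MAGIC_TABLE` and `sniff_format` are hypothetical, and the gzip and bzip2 entries are additions for illustration; only the SequenceFile and Parquet prefixes come from the issue text.

```python
from typing import Optional

# (magic prefix, label) pairs. The SEQ and PAR1 prefixes come from the
# issue description; the gzip and bzip2 entries are illustrative additions.
MAGIC_TABLE = [
    (b"PAR1", "Parquet"),
    (b"SEQ", "SequenceFile"),
    (b"\x1f\x8b", "gzip"),
    (b"BZh", "bzip2"),
]


def sniff_format(header: bytes) -> Optional[str]:
    """Return a label for the leading bytes of a file, or None if unknown."""
    for magic, label in MAGIC_TABLE:
        if header.startswith(magic):
            return label
    return None


if __name__ == "__main__":
    # A SequenceFile header is 'S', 'E', 'Q' followed by a version byte.
    print(sniff_format(b"SEQ\x06"))
    print(sniff_format(b"PAR1"))
    print(sniff_format(b"plain text"))
```

Only the first few bytes of a file are needed for this kind of check, so a real tool would presumably read just the header from HDFS instead of downloading the whole file.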



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
