hadoop-hdfs-issues mailing list archives

From "Lei (Eddy) Xu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5952) Create a tool to run data analysis on the PB format fsimage
Date Thu, 02 Oct 2014 22:08:34 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14157281#comment-14157281 ]

Lei (Eddy) Xu commented on HDFS-5952:

Hey, [~haoch] and [~wheat9]

Firstly, thank you for the work you have done.

I've been looking at writing a tool that uses an external database (e.g., LevelDB) to process the new-style
protobuf-based fsimage. Using LevelDB would remove the RAM limitation (i.e., having to load all inodes
into RAM first). This would be more convenient for people who don't want to lose the information
in the new image (such as xattrs), but who want delimited output. It would be great if I
could build on [~haoch]'s work, and of course I would love to help get this patch in.
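
To make the idea concrete, here is a rough sketch only, not the proposed patch. It assumes Python's standard sqlite3 module as a stand-in for LevelDB, and a hypothetical list of (inode id, parent id, name) tuples in place of the inode records parsed from the protobuf fsimage. The first pass writes inodes to disk one record at a time; the second pass resolves each full path by following parent ids through the store, so the full inode set never has to be held in RAM.

```python
import os
import sqlite3
import tempfile

# Hypothetical inode records: (inode id, parent inode id, name).
# In a real tool these would be streamed from the fsimage sections.
INODES = [
    (1, None, ""),        # root directory has no parent
    (2, 1, "user"),
    (3, 2, "alice"),
    (4, 3, "data.csv"),
]


def build_index(conn, inodes):
    """Pass 1: store every inode on disk, keyed by inode id."""
    conn.execute(
        "CREATE TABLE inode (id INTEGER PRIMARY KEY, parent INTEGER, name TEXT)"
    )
    conn.executemany("INSERT INTO inode VALUES (?, ?, ?)", inodes)
    conn.commit()


def full_path(conn, inode_id):
    """Rebuild a path by walking parent ids through the on-disk store."""
    parts = []
    current = inode_id
    while current is not None:
        parent, name = conn.execute(
            "SELECT parent, name FROM inode WHERE id = ?", (current,)
        ).fetchone()
        parts.append(name)
        current = parent
    return "/".join(reversed(parts)) or "/"


def delimited_lines(conn, inodes, sep="\t"):
    """Pass 2: emit one delimited line (inode id, full path) per inode."""
    for inode_id, _parent, _name in inodes:
        yield sep.join([str(inode_id), full_path(conn, inode_id)])


def run():
    with tempfile.TemporaryDirectory() as tmp:
        conn = sqlite3.connect(os.path.join(tmp, "inodes.db"))
        try:
            build_index(conn, INODES)
            return list(delimited_lines(conn, INODES))
        finally:
            conn.close()


if __name__ == "__main__":
    print("\n".join(run()))
```

Extra columns such as xattrs could be appended to each line in the same way; the table layout and function names above are illustrative assumptions.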

What do you think about this?

> Create a tool to run data analysis on the PB format fsimage
> -----------------------------------------------------------
>                 Key: HDFS-5952
>                 URL: https://issues.apache.org/jira/browse/HDFS-5952
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: tools
>    Affects Versions: 2.6.0
>            Reporter: Akira AJISAKA
>         Attachments: HDFS-5952.patch
> Delimited processor in OfflineImageViewer is not supported after HDFS-5698 was merged.
> The motivation of the delimited processor is to run data analysis on the fsimage; therefore,
> there might be more value in creating a tool for Hive or Pig that reads the PB format fsimage.

This message was sent by Atlassian JIRA
