hadoop-hdfs-issues mailing list archives

From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6293) Issues with OIV processing PB-based fsimages
Date Fri, 02 May 2014 23:19:16 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13988414#comment-13988414 ]

Andrew Wang commented on HDFS-6293:

Hey Suresh,

This plan sounds generally good to me; thanks for working this out. I talked to our internal
users and have a few questions/comments.

- PB would be preferable to JSON. I'd be interested to hear your reasoning for why JSON is significantly
easier; I figured that since we already have PB in the build and experience using it, it wouldn't
be that much work.
- Can we provide some kind of REST API for fetching this extra listing file? This is preferable
to manually finding the file and doing scp.
- What kinds of atomicity guarantees are there between the fsimage and this listing? We'd
like to be able to take the listing and replay the edit log on top. Including the txid in
the listing is also important for this work.
- Will this also be done by other saveNamespaces besides checkpointing (i.e. "-saveNamespace"
as well as at startup)?

I'd also appreciate it if you posted any further call-ins to this JIRA, since we'd like to be
included in the future. Thanks!

> Issues with OIV processing PB-based fsimages
> --------------------------------------------
>                 Key: HDFS-6293
>                 URL: https://issues.apache.org/jira/browse/HDFS-6293
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.4.0
>            Reporter: Kihwal Lee
>            Priority: Blocker
>         Attachments: Heap Histogram.html
> There are issues with OIV when processing fsimages in protobuf. 
> Due to the internal layout changes introduced by the protobuf-based fsimage, OIV consumes
an excessive amount of memory.  We have tested with an fsimage with about 140M files/directories.
The peak heap usage when processing this image in pre-protobuf (i.e. pre-2.4.0) format was
about 350MB.  After converting the image to the protobuf format on 2.4.0, OIV would OOM even
with 80GB of heap (max new size was 1GB).  It should be possible to process any image with
the default heap size of 1.5GB.
> Another issue is the complete change of format/content in OIV's XML output.  I also noticed
that the secret manager section has no tokens while there were unexpired tokens in the original
image (pre-2.4.0).  I did not check whether they were also missing in the new pb fsimage.

This message was sent by Atlassian JIRA
