hadoop-hdfs-issues mailing list archives

From "Yong Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8998) Small files storage supported inside HDFS
Date Sun, 06 Sep 2015 15:56:45 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14732434#comment-14732434 ]

Yong Zhang commented on HDFS-8998:

Hi [~andrew.wang], thanks for your comments, and sorry for the late reply.
I read the Ozone design document and looked at the existing code. Ozone is designed as a cloud
file system with multi-tenancy, which is a different goal, but it also stores some metadata in
LevelDB to reduce memory usage.
The design you proposed sounds like it needs compaction to be coordinated by the NN, rather
than offloading to the DNs. Level/RocksDB I think would also better handle concurrent writes
without the concept of "locked" and "unlocked" blocks.
Maybe 'small file zone' is not an exact term for an HDFS directory, but let's just call it a
'small file zone'. A client creates files under the small file zone, and each file written is
simply appended to an existing block. While a block is being written it is locked, and this
lock information is kept in the NN so that other clients do not use the block until the write
finishes.
Compaction only happens as a block rewrite, because one block belongs to more than one file.
File deletion only deletes the INode; a block is rewritten only if more than a threshold
amount of its data should be removed. This is controlled by the NN; in other words, the delete
operation is offline.
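
The bookkeeping described above can be sketched as a toy model. This is only an illustration under stated assumptions, not the design in the attached document and not HDFS code: the class names, the 1024-byte block capacity, and the 0.5 rewrite threshold are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """A shared block holding the data of several small files."""
    block_id: int
    capacity: int
    used: int = 0          # bytes appended so far
    dead: int = 0          # bytes that belonged to deleted files
    locked: bool = False   # True while one client is appending


@dataclass
class SmallFileZone:
    """Toy NameNode-side bookkeeping; every name here is hypothetical."""
    block_capacity: int = 1024
    rewrite_threshold: float = 0.5   # dead/used ratio that triggers a rewrite
    blocks: dict = field(default_factory=dict)   # block_id -> Block
    files: dict = field(default_factory=dict)    # path -> (block_id, offset, length)

    def begin_write(self, length):
        """Pick an unlocked block with enough room, lock it, and return it."""
        for block in self.blocks.values():
            if not block.locked and block.used + length <= block.capacity:
                block.locked = True
                return block
        # No usable block: allocate a new one, already locked for this writer.
        block = Block(len(self.blocks), self.block_capacity, locked=True)
        self.blocks[block.block_id] = block
        return block

    def finish_write(self, path, block, length):
        """Record the file's extent inside the block, then unlock the block."""
        self.files[path] = (block.block_id, block.used, length)
        block.used += length
        block.locked = False

    def delete(self, path):
        """Drop only the metadata; return ids of blocks that now need a rewrite."""
        block_id, _, length = self.files.pop(path)
        block = self.blocks[block_id]
        block.dead += length
        if block.used and block.dead / block.used > self.rewrite_threshold:
            return [block_id]
        return []
```

In this model a second writer that arrives while block 0 is locked gets a new block, and deleting a file touches no data until the dead fraction of its block crosses the threshold.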

Also, could you comment on the usecase where you see the issues with # of files affecting
DNs before NNs? IIUC this design does not address NN memory consumption, which is the issue
we see first in practice.
Yes, most of the work is on the DN, because we already have a JIRA for keeping metadata in LevelDB.

Goal # of files, expected size of a "small" file
Any bad behavior if a large file is accidentally written to the small file zone?
I have also faced some problems with this. A user may copy a file from local storage to HDFS,
or stream writes to HDFS, so it is hard to identify a large file in advance. So, as I
mentioned before, all data writes are appends to an existing block.

Support for rename into / out of small file zone?
Yes, but rename is only a metadata change, and we will add an additional xattr to identify a
small file that has been moved out of the small file zone.
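
A rough sketch of that rename behaviour, again as a toy model rather than HDFS code. The function, the `/zone` root, and the xattr name `user.smallfile.moved_out` are hypothetical; only the `user.` namespace prefix follows the real HDFS xattr naming convention.

```python
def rename(namespace, xattrs, src, dst, zone_root="/zone"):
    """Move a path entry and tag files that leave the zone; no block data moves."""

    def in_zone(path):
        return path.startswith(zone_root + "/")

    # Rename is a pure metadata operation: re-key the entry and its xattrs.
    namespace[dst] = namespace.pop(src)
    moved = xattrs.pop(src, {})
    if in_zone(src) and not in_zone(dst):
        # Hypothetical marker so readers know the data still lives in a shared block.
        moved["user.smallfile.moved_out"] = b"1"
    if moved:
        xattrs[dst] = moved
```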

Is there a way to convert a bunch of small files into a compacted file, like with HAR?
Files will be bunched at the block level.

How common is it for a user to know apriori that a bunch of small files will be written, and
is okay putting them in a zone? A lot of the time I see this happening by accident, either
a poorly written app or misconfiguration.
When a file write finishes, we close the output stream and the block append finishes. I will
try to run some tests on the existing append feature, and update the Reliability chapter of
the document.

I will propose an updated design soon.

> Small files storage supported inside HDFS
> -----------------------------------------
>                 Key: HDFS-8998
>                 URL: https://issues.apache.org/jira/browse/HDFS-8998
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Yong Zhang
>            Assignee: Yong Zhang
>         Attachments: HDFS-8998.design.001.pdf
> HDFS has problems storing small files, as this blog post describes (http://blog.cloudera.com/blog/2009/02/the-small-files-problem).
> The blog post also describes some ways to store small files in HDFS, but they are not good
> ways; HAR files and Sequence Files seem better suited to read-only files.
> Currently each HDFS block belongs to only one HDFS file. If there are too many small files,
> there will be many small blocks on the DataNode, which puts a heavy load on the DataNode.
> This JIRA will show how to merge small blocks into a big one online, how to delete small
> files, and so on.
> Currently we have many open JIRAs for improving HDFS scalability on the NameNode, such as
> HDFS-7836, HDFS-8286 and so on.
> So small file metadata (INode and BlocksMap) will also be in the NameNode.
> The design document will be uploaded soon.

This message was sent by Atlassian JIRA
