hadoop-hdfs-issues mailing list archives

From "Tony Reix (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6608) FsDatasetCache: hard-coded 4096 value in test is not appropriate for all HW
Date Mon, 30 Jun 2014 10:30:25 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047527#comment-14047527 ]

Tony Reix commented on HDFS-6608:

Looking at non-test HDFS code, there are several places where 4096 appears and is related
to x86_64 architecture, and should be changed:

  <description>The size of buffer to stream files.
  The size of this buffer should probably be a multiple of hardware
  page size (4096 on Intel x86), and it determines how much data is
  buffered during read and write operations.</description>


      The buffer size used by a read/write request when streaming data from/to HDFS.

      The buffer size used by a read/write request when streaming data from/to HDFS.

      int bufferSize = fs.getConf().getInt("httpfs.buffer.size", 4096);
      and more

> FsDatasetCache: hard-coded 4096 value in test is not appropriate for all HW
> ---------------------------------------------------------------------------
>                 Key: HDFS-6608
>                 URL: https://issues.apache.org/jira/browse/HDFS-6608
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 3.0.0
>         Environment: PPC64 (LE & BE, OpenJDK & IBM JVM, Ubuntu, RHEL 7 & RHEL 6.5)
>            Reporter: Tony Reix
> The value 4096 is hard-coded in HDFS code (product and tests).
> It appears 171 times, including 8 times in product (not tests) code:
> hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs : 163
> hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs : 4
> hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http : 3
> hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/lib/wsrs : 1
> This value deals with different subjects: files, block size, page size, etc.
> 4096 (as block size and page size) is appropriate for many systems, but not for PPC64, for which it is 65536.
> Looking at HDFS product (not test) code, it seems (not 100% sure) that the code is OK (not using a hard-coded page/block size). However, someone should check this in depth.
> this.maxBytes = dataset.datanode.getDnConf().getMaxLockedMemory();
> However, at test level, the value 4096 is used in many places, and it is very hard to tell whether each use depends on the HW architecture or not.
> About test TestFsDatasetCache#testPageRounder, the HW value is sometimes obtained from the system:
>  private static final long PAGE_SIZE = NativeIO.POSIX.getCacheManipulator().getOperatingSystemPageSize();
> private static final long BLOCK_SIZE = PAGE_SIZE;
> but there are several places where 4096 is used whereas it should depend on the HW value.
>  With:
> // Most Linux installs allow a default of 64KB locked memory
> private static final long CACHE_CAPACITY = 64 * 1024;
> However, for PPC64, this value should be much bigger.
> This TestFsDatasetCache#testPageRounder test aims to cache 5 pages of size 512. However, the page size is 65536 on PPC64 and 4096 on x86_64. Thus, the method in charge of reserving blocks in the HDFS cache proceeds in 4096-byte steps on x86_64 and in 65536-byte steps on PPC64, with a hard-coded limit: maxBytes = 65536 bytes
> 5 * 4096 = 20480 : OK
> 5 * 65536 = 327680 : KO : the test ends with a timeout, since the limit is exceeded at the very beginning and the test keeps waiting.
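
The arithmetic above can be reproduced with a small standalone sketch. It does not use any Hadoop class: the class and method names (PageRoundingSketch, roundUp, requiredBytes) are hypothetical and chosen only for illustration, and the page sizes are passed in by hand rather than read from the OS.

```java
public class PageRoundingSketch {
    // Round count up to the next multiple of pageSize (pageSize must be positive).
    public static long roundUp(long count, long pageSize) {
        return ((count + pageSize - 1) / pageSize) * pageSize;
    }

    // Locked memory needed to cache numBlocks blocks of blockSize bytes each,
    // assuming every block is rounded up to a whole number of pages.
    public static long requiredBytes(int numBlocks, long blockSize, long pageSize) {
        return numBlocks * roundUp(blockSize, pageSize);
    }

    public static void main(String[] args) {
        long maxBytes = 64 * 1024; // the hard-coded 64 KB limit quoted above
        for (long pageSize : new long[] {4096, 65536}) {
            long needed = requiredBytes(5, 512, pageSize);
            System.out.println("pageSize=" + pageSize
                + " needed=" + needed
                + " fits=" + (needed <= maxBytes));
        }
    }
}
```

With a 4096-byte page the five blocks need 20480 bytes and fit under the limit; with a 65536-byte page they need 327680 bytes and do not.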
> In conclusion, there are several issues to fix:
>  - instead of using many hard-coded 4096 values, the code (mainly test code) should use Java constants built from HW values (like: NativeIO.POSIX.getCacheManipulator().getOperatingSystemPageSize())
>  - several constants must be used, since 4096 deals with different subjects, including some that do not depend on the HW
>  - the test must be improved to handle cases where the limit is exceeded at the very beginning
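
One possible shape for the first and third points is sketched below. This is not the fix applied in HDFS: CacheTestSizing and checkFits are hypothetical names, the page size is a constructor argument standing in for the value a real test would read from the OS, and expressing the cache capacity as a number of pages is an assumption made for illustration.

```java
public class CacheTestSizing {
    public final long pageSize;      // in a real test, read from the OS
    public final long blockSize;     // one block per page, as in the quoted constants
    public final long cacheCapacity; // expressed in pages instead of a fixed 64 KB

    public CacheTestSizing(long pageSize, int capacityInPages) {
        if (pageSize <= 0 || capacityInPages <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        this.pageSize = pageSize;
        this.blockSize = pageSize;
        this.cacheCapacity = Math.multiplyExact(pageSize, (long) capacityInPages);
    }

    // Fail fast when the request can never fit, instead of waiting for a timeout.
    public void checkFits(int numBlocks, long bytesPerBlock) {
        long perBlock = ((bytesPerBlock + pageSize - 1) / pageSize) * pageSize;
        long needed = Math.multiplyExact(perBlock, (long) numBlocks);
        if (needed > cacheCapacity) {
            throw new IllegalStateException("need " + needed
                + " bytes but capacity is " + cacheCapacity);
        }
    }
}
```

For example, new CacheTestSizing(65536, 16).checkFits(5, 512) passes, while the same request against a capacity of a single 65536-byte page throws immediately.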
