Mailing-List: contact hadoop-dev-help@lucene.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hadoop-dev@lucene.apache.org
Message-ID: <18219431.1189023813926.JavaMail.jira@brutus>
Date: Wed, 5 Sep 2007 13:23:33 -0700 (PDT)
From: "Owen O'Malley (JIRA)" <jira@apache.org>
To: hadoop-dev@lucene.apache.org
Subject: [jira] Commented: (HADOOP-1838) Files created with an pre-0.15 gets
 blocksize as zero, causing performance degradation
In-Reply-To: <4315729.1188974793507.JavaMail.root@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HADOOP-1838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12525207 ] 

Owen O'Malley commented on HADOOP-1838:
---------------------------------------

I'd much rather have the upgrade set the blocksize to the default block size for single-block files than leave 0 as a special value. The problem with special values is that they need to be tested for at every single use of the field, and are thus much harder to maintain.
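
A minimal sketch of that idea, not the attached patch: the class, the method name, and the 64 MB default are hypothetical, and inferring the block size of a multi-block file from the length of its first block is an assumption about how the upgrade could recover it. The point is that the value is normalized once, during the upgrade, so that no later reader of the field has to check for 0.

```java
// Hypothetical helper, for illustration only; not part of the Hadoop code base.
final class BlockSizeUpgrade {
    // Assumed default for the example; a real cluster would read its configured value.
    static final long DEFAULT_BLOCK_SIZE = 64L * 1024 * 1024;

    /**
     * Chooses the block size to persist for a file upgraded from the old layout.
     *
     * @param blockLengths     lengths in bytes of the file's blocks, in order
     * @param defaultBlockSize value to use when the real block size cannot be inferred
     */
    static long blockSizeForUpgrade(long[] blockLengths, long defaultBlockSize) {
        if (blockLengths.length > 1) {
            // Assumption: every block except the last is full, so the first
            // block's length is the block size requested at creation time.
            return blockLengths[0];
        }
        // Empty or single-block file: the requested size is unknown, so store
        // the default instead of a special value such as 0.
        return defaultBlockSize;
    }
}
```

With this approach a single-block file of 100 bytes is stored with the default block size, and code that later reads the field never sees 0.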

> Files created with an pre-0.15 gets blocksize as zero, causing performance degradation
> --------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1838
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1838
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.15.0
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.15.0
>
>         Attachments: blockSizeZero.patch
>
>
> HADOOP-1656 introduced support for storing the block size persistently as inode metadata. Previously, if a file had only one block, it was not possible to accurately determine the blocksize that the application had requested at file-creation time.
> The upgrade from the older layout to the new layout kept the blocksize as zero for single-block files. This was done to indicate that the DFS does not know the "true" blocksize of such a file. As a result, map-reduce determined that a split is 1 byte in length!
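
To illustrate how a zero blocksize can turn into 1-byte splits, here is a sketch that assumes the split size is computed as max(minSize, min(goalSize, blockSize)) with a minimum split size of 1 byte. The class name and the sample numbers are made up for the example; the formula is an assumption about the map-reduce input format, not something stated in this issue.

```java
// Illustrative only: a stand-alone model of the assumed split-size calculation.
final class SplitSizeSketch {
    /**
     * @param goalSize  desired bytes per split (total input size / requested number of maps)
     * @param minSize   smallest split size allowed
     * @param blockSize block size reported by the file system for the file
     */
    static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        // A block size of 0 wins the inner min(), so only minSize is left.
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }
}
```

Under these assumptions, computeSplitSize(1000000, 1, 0) yields 1, so a 1,000,000-byte single-block file would be cut into a million splits, while the same call with a 64 MB block size yields 1000000.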
