Mailing-List: contact user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@hadoop.apache.org
Received-SPF: pass (nike.apache.org: domain of harsh@cloudera.com designates
 209.85.210.176 as permitted sender)
MIME-Version: 1.0
In-Reply-To: 
 <CAHxBGfKAG7c0swQ3wJy8g_hdGE+B3i+0hkMrMVbbYofTOOMeyA@mail.gmail.com>
References: 
 <CAHxBGfKAG7c0swQ3wJy8g_hdGE+B3i+0hkMrMVbbYofTOOMeyA@mail.gmail.com>
From: Harsh J <harsh@cloudera.com>
Date: Wed, 7 Nov 2012 20:34:22 +0530
Message-ID: 
 <CAOcnVr1uM8ezRTAKuEJPu_nhYws+9wyp4JuDBJbRhqiy+VSz3Q@mail.gmail.com>
Subject: Re: Regarding loading Image file into HDFS
To: user@hadoop.apache.org
Content-Type: text/plain; charset=ISO-8859-1

Hi,

Files are split into blocks purely at block-size boundaries, which are
arbitrary with respect to the file's content. Readers can read the whole
file by reading all of its blocks in sequence (this is handled
transparently by the underlying DFS reader classes themselves; a
developer does not have to care about it).
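
The block handling described above can be modelled roughly as follows.
This is only a toy sketch in plain Java and uses no Hadoop classes; the
names BlockSplitSketch, splitIntoBlocks and readAll are made up for
illustration, and the 4-byte block size is just an assumption to keep
the example small. It shows that cutting bytes at fixed offsets and
reading the pieces back in order recovers the original data, whatever
the bytes represent.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: this does not use any Hadoop classes.
final class BlockSplitSketch {

    // Cut the data into consecutive chunks of at most blockSize bytes,
    // without looking at what the bytes mean.
    static List<byte[]> splitIntoBlocks(byte[] data, int blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        List<byte[]> blocks = new ArrayList<>();
        for (int offset = 0; offset < data.length; offset += blockSize) {
            int end = Math.min(offset + blockSize, data.length);
            blocks.add(Arrays.copyOfRange(data, offset, end));
        }
        return blocks;
    }

    // Read the blocks back in order to recover the original bytes.
    static byte[] readAll(List<byte[]> blocks) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] block : blocks) {
            out.write(block, 0, block.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] image = new byte[10]; // stand-in for the bytes of an image file
        for (int i = 0; i < image.length; i++) {
            image[i] = (byte) i;
        }
        List<byte[]> blocks = splitIntoBlocks(image, 4);
        // Prints: 3 blocks, intact = true
        System.out.println(blocks.size() + " blocks, intact = "
                + Arrays.equals(image, readAll(blocks)));
    }
}
```

In real HDFS the client-side stream does the equivalent of readAll for
you, so application code just sees one continuous file.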

HDFS does not care about what _type_ of file you store; it's agnostic
and just splits it based on the block size. It's up to the apps to not
split the reading of a file across blocks if its processing can't be
parallelized.
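
As a rough illustration of that application-side choice, here is a toy
model in plain Java (no Hadoop classes; InputSplitSketch and planSplits
are hypothetical names). It plans one reader range per block for a
splittable file, and a single range covering the whole file otherwise,
which is what you would want for an image. In MapReduce this decision
normally sits in the input format, for example by overriding
FileInputFormat's isSplitable to return false; the sketch only mimics
the idea and is not how Hadoop itself computes splits.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: no Hadoop classes are used here.
final class InputSplitSketch {

    // Returns {offset, length} ranges, one per reader.
    static List<long[]> planSplits(long fileLength, long blockSize, boolean splittable) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        List<long[]> splits = new ArrayList<>();
        if (fileLength <= 0) {
            return splits; // nothing to read
        }
        if (!splittable) {
            // One reader gets the whole file, even though it spans many blocks.
            splits.add(new long[] {0L, fileLength});
            return splits;
        }
        // Otherwise hand each block-sized range to its own reader.
        for (long offset = 0; offset < fileLength; offset += blockSize) {
            splits.add(new long[] {offset, Math.min(blockSize, fileLength - offset)});
        }
        return splits;
    }

    public static void main(String[] args) {
        long fileLength = 300L; // assumed sizes, chosen only for the example
        long blockSize = 128L;
        // Prints: splittable: 3 range(s)
        System.out.println("splittable: "
                + planSplits(fileLength, blockSize, true).size() + " range(s)");
        // Prints: not splittable: 1 range(s)
        System.out.println("not splittable: "
                + planSplits(fileLength, blockSize, false).size() + " range(s)");
    }
}
```

Either way the file is stored as the same blocks in HDFS; only the way
readers are assigned to it changes.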

On Wed, Nov 7, 2012 at 8:22 PM, Ramasubramanian Narayanan
<ramasubramanian.narayanan@gmail.com> wrote:
> Hi,
>
>  I have basic doubt... How Hadoop splits an Image file into blocks and puts
> in HDFS? Usually Image file cannot be splitted right how it is happening in
> Hadoop?
>
> regards,
> Rams


-- 
Harsh J