hadoop-common-user mailing list archives

From: Steve Loughran <ste...@apache.org>
Subject: Re: Unsplittable files on HDFS
Date: Wed, 27 Apr 2011 11:09:35 GMT
On 27/04/11 10:48, Niels Basjes wrote:
> Hi,
>
> I did the following with a 1.6GB file
>     hadoop fs -Ddfs.block.size=2147483648 -put
> /home/nbasjes/access-2010-11-29.log.gz /user/nbasjes
> and I got
>
> Total number of blocks: 1
> 4189183682512190568:	 	10.10.138.61:50010	 	10.10.138.62:50010
>
> Yes, that does the trick. Thank you.
>
> Niels
>
> 2011/4/27 Harsh J<harsh@cloudera.com>:
>> Hey Niels,
>>
>> The block size is a per-file property. Would putting/creating these
>> gzip files on the DFS with a very high block size (such that it
>> doesn't split across for such files) be a valid solution to your
>> problem here?
>>

Don't set a block size >2GB; not all the bits of the code that use
signed 32-bit integers have been eliminated yet.
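
For reference, the same per-file block size can be set from the Java API by
passing it to FileSystem.create(). A rough, untested sketch along those lines,
with example paths and a block size kept safely under the 2GB limit:

// Untested sketch: stream a local gzip file into HDFS with a per-file
// block size big enough to hold the whole file in one block.
// Paths and the block size value are examples only.
import java.io.FileInputStream;
import java.io.InputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class PutWithBigBlock {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // Stay under 2GB (signed 32-bit limit); the value must also be a
    // multiple of the checksum chunk size (512 bytes by default).
    long blockSize = 1792L * 1024 * 1024;  // 1.75GB, > the 1.6GB file

    InputStream in =
        new FileInputStream("/home/nbasjes/access-2010-11-29.log.gz");
    Path dst = new Path("/user/nbasjes/access-2010-11-29.log.gz");
    FSDataOutputStream out = fs.create(dst, true,
        conf.getInt("io.file.buffer.size", 4096),
        fs.getDefaultReplication(), blockSize);

    IOUtils.copyBytes(in, out, conf, true);  // closes both streams when done
  }
}
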
