hadoop-hdfs-user mailing list archives

From maisnam ns <maisnam...@gmail.com>
Subject Re: Hadoop noob question
Date Sat, 11 May 2013 11:08:18 GMT
@Nitin Pawar, thanks for clearing my doubts.

But I have one more question. Say I have 10 TB of data in the pipeline.

Is it perfectly OK to use the hadoop fs put command to upload these files,
10 TB in total, and is there any limit on file size when using the hadoop
command line? Can the hadoop put command work with data this large?

Thanks in advance

On Sat, May 11, 2013 at 4:24 PM, Nitin Pawar <nitinpawar432@gmail.com> wrote:

> First of all, most companies do not get 100 PB of data in one go.
> It's an accumulating process, and most companies have a data pipeline
> in place where the data is written to HDFS at regular intervals, retained
> on HDFS for as long as needed, and from there sent to archivers or
> deleted.
> For data management products, you can look at Falcon, which was open-sourced
> by InMobi along with Hortonworks.
> In any case, if you want to write files to HDFS, there are a few options
> available to you:
> 1) Write your own DFS client that writes to HDFS
> 2) Use HDFS proxy
> 3) Use WebHDFS
> 4) Use the HDFS command line
> 5) Use data collection tools that come with support for writing to HDFS,
>    such as Flume
> On Sat, May 11, 2013 at 4:19 PM, Thoihen Maibam <thoihen123@gmail.com> wrote:
>> Hi All,
>> Can anyone help me understand how companies like Facebook, Yahoo, etc.
>> upload bulk files, say to the tune of 100 petabytes, to a Hadoop HDFS
>> cluster for processing,
>> and how, after processing, they download those files from HDFS to the
>> local file system?
>> I don't think they would be using the command line hadoop fs put to
>> upload files, as it would take too long. Or do they divide the data into,
>> say, 10 parts of 10 petabytes each, compress them, and use the command
>> line hadoop fs put?
>> Or do they use some tool to upload huge files?
>> Please help me.
>> Thanks
>> thoihen
> --
> Nitin Pawar
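
Of the options listed in the quoted reply, WebHDFS (option 3) is probably the easiest to illustrate without a Hadoop client installed. Below is a minimal sketch, not a full uploader: it only builds the URL for the first step of a WebHDFS upload (op=CREATE). The NameNode normally answers that request with a 307 redirect to a DataNode, and the file bytes are then sent in a second PUT to the redirect location. The function name webhdfs_create_url, the host namenode.example.com, the user hdfs, and the example path are all hypothetical; port 50070 is assumed here because it was the usual NameNode HTTP port on Hadoop 1.x/2.x, so adjust it for your cluster.

```python
from urllib.parse import quote, urlencode


def webhdfs_create_url(host, hdfs_path, user, port=50070, overwrite=False):
    """Build the URL for step 1 of a WebHDFS upload (op=CREATE).

    Assumptions: plain HTTP, pseudo authentication via user.name, and a
    NameNode HTTP port of 50070 unless the caller passes another one.
    """
    if not hdfs_path.startswith("/"):
        raise ValueError("hdfs_path must be an absolute HDFS path")
    query = urlencode({
        "op": "CREATE",
        "user.name": user,
        "overwrite": "true" if overwrite else "false",
    })
    # quote() percent-encodes spaces and similar characters but keeps "/".
    return f"http://{host}:{port}/webhdfs/v1{quote(hdfs_path)}?{query}"


if __name__ == "__main__":
    # Hypothetical host, path, and user, for illustration only.
    print(webhdfs_create_url("namenode.example.com",
                             "/data/logs/part-0001.gz", "hdfs"))
```

An HTTP client would send a PUT with no body to this URL, read the Location header from the redirect, and then PUT the file contents to that location. For a large data set, this would typically be done per file rather than as one huge transfer.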
