Subject: Re: hadoop/yarn and task parallelization on non-hdfs filesystems
From: Calvin
To: user@hadoop.apache.org
Date: Fri, 15 Aug 2014 12:30:09 -0600

Thanks for the responses!

To clarify, I'm not using any special FileSystem implementation. An example input parameter to a MapReduce job would be something like "-input file:///scratch/data". Thus I think (any clarification would be helpful) Hadoop is then utilizing LocalFileSystem (org.apache.hadoop.fs.LocalFileSystem).

The input data is large enough and splittable (1300 .bz2 files, 274 MB each, 350 GB total). Even if the input data weren't splittable, Hadoop should be able to parallelize up to 1300 map tasks if capacity is available. In my case, I find that the Hadoop cluster is not fully utilized when not using HDFS (i.e., ~25-35 containers running when it can scale up to ~80 containers), while achieving maximum use when using HDFS.

I'm wondering whether Hadoop is "holding back" or throttling the I/O when LocalFileSystem is used, and what changes I can make to have the Hadoop tasks scale. In the meantime, I'll take a look at the API calls that Harsh mentioned.

On Fri, Aug 15, 2014 at 10:15 AM, Harsh J wrote:
> The split configurations in FIF mentioned earlier would work for local files
> as well. They aren't deemed unsplittable, just considered as one single
> block.
>
> If the FS in use has its advantages, it's better to implement a proper
> interface to it making use of them than to rely on the LFS by mounting it.
> This is what we do with HDFS.
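The split math Harsh refers to can be illustrated without a cluster. Below is a minimal plain-Java sketch of the formula FileInputFormat uses, splitSize = max(minSize, min(maxSize, blockSize)); the 32 MB figure is an assumption based on LocalFileSystem's default fs.local.block.size in Hadoop 2.x, so verify it against your own configuration:

```java
// Sketch of FileInputFormat's split-size computation (Hadoop 2.x):
//   splitSize = max(minSize, min(maxSize, blockSize))
// The 32 MB local block size below is an assumed default
// (fs.local.block.size); check your cluster's actual value.
public class SplitSizeSketch {
    static final long MB = 1024L * 1024L;

    // Mirrors FileInputFormat.computeSplitSize(blockSize, minSize, maxSize).
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Rough splits per file (ignores FileInputFormat's 10% slop factor).
    static long splitsPerFile(long fileSize, long splitSize) {
        return (fileSize + splitSize - 1) / splitSize;
    }

    public static void main(String[] args) {
        long fileSize = 274 * MB;   // one of the 1300 .bz2 inputs above
        long localBlock = 32 * MB;  // assumed fs.local.block.size default

        // Defaults (minSize = 1, maxSize = Long.MAX_VALUE): split = block size.
        long dflt = computeSplitSize(localBlock, 1L, Long.MAX_VALUE);
        System.out.println("default: " + (dflt / MB) + " MB/split, "
                + splitsPerFile(fileSize, dflt) + " splits/file");

        // Raising mapreduce.input.fileinputformat.split.minsize coarsens splits.
        long coarse = computeSplitSize(localBlock, 256 * MB, Long.MAX_VALUE);
        System.out.println("minsize=256MB: " + (coarse / MB) + " MB/split, "
                + splitsPerFile(fileSize, coarse) + " splits/file");
    }
}
```

With these assumed defaults a 274 MB file yields several splits, so split count alone should not cap the job at ~30 containers; that points the investigation toward scheduler/locality settings rather than the input format.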
> On Aug 15, 2014 8:52 PM, "java8964" wrote:
>> I believe that Calvin mentioned before that this parallel file system is
>> mounted into the local file system.
>>
>> In this case, will Hadoop just use java.io.File as the local file system,
>> treat them as local files, and not split the files?
>>
>> Just want to know the logic in Hadoop for handling local files.
>>
>> One suggestion I can think of is to split the files manually outside of
>> Hadoop. For example, generate lots of small files of 128 MB or 256 MB.
>>
>> In this case, each mapper will process one small file, so you can get good
>> utilization of your cluster, assuming you have a lot of small files.
>>
>> Yong
>>
>> > From: harsh@cloudera.com
>> > Date: Fri, 15 Aug 2014 16:45:02 +0530
>> > Subject: Re: hadoop/yarn and task parallelization on non-hdfs filesystems
>> > To: user@hadoop.apache.org
>> >
>> > Does your non-HDFS filesystem implement a getBlockLocations API, which
>> > MR relies on to know how to split files?
>> >
>> > The API is at
>> > http://hadoop.apache.org/docs/stable2/api/org/apache/hadoop/fs/FileSystem.html#getFileBlockLocations(org.apache.hadoop.fs.FileStatus, long, long),
>> > and MR calls it at
>> > https://github.com/apache/hadoop-common/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.java#L392
>> >
>> > If not, perhaps you can enforce manual chunking by asking MR to use
>> > custom min/max split size values via config properties:
>> > https://github.com/apache/hadoop-common/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.java#L66
>> >
>> > On Fri, Aug 15, 2014 at 10:16 AM, Calvin wrote:
>> > > I've looked a bit into this problem some more, and from what another
>> > > person has written, HDFS is tuned to scale appropriately [1] given the
>> > > number of input splits, etc.
>> > >
>> > > In the case of utilizing the local filesystem (which is really a
>> > > network share on a parallel filesystem), the settings might be set
>> > > conservatively in order not to thrash the local disks or present a
>> > > bottleneck in processing.
>> > >
>> > > Since this isn't a big concern, I'd rather tune the settings to
>> > > efficiently utilize the local filesystem.
>> > >
>> > > Are there any pointers to where in the source code I could look in
>> > > order to tweak such parameters?
>> > >
>> > > Thanks,
>> > > Calvin
>> > >
>> > > [1] https://stackoverflow.com/questions/25269964/hadoop-yarn-and-task-parallelization-on-non-hdfs-filesystems
>> > >
>> > > On Tue, Aug 12, 2014 at 12:29 PM, Calvin wrote:
>> > >> Hi all,
>> > >>
>> > >> I've instantiated a Hadoop 2.4.1 cluster and I've found that running
>> > >> MapReduce applications will parallelize differently depending on what
>> > >> kind of filesystem the input data is on.
>> > >>
>> > >> Using HDFS, a MapReduce job will spawn enough containers to maximize
>> > >> use of all available memory. For example, on a 3-node cluster with 172 GB
>> > >> of memory and each map task allocating 2 GB, about 86 application
>> > >> containers will be created.
>> > >>
>> > >> On a filesystem that isn't HDFS (like NFS or, in my use case, a
>> > >> parallel filesystem), a MapReduce job will only allocate a subset of
>> > >> the available tasks (e.g., with the same 3-node cluster, about 25-40
>> > >> containers are created). Since I'm using a parallel filesystem, I'm
>> > >> not as concerned with the bottlenecks one would find if one were to
>> > >> use NFS.
>> > >>
>> > >> Is there a YARN (yarn-site.xml) or MapReduce (mapred-site.xml)
>> > >> configuration that will allow me to effectively maximize resource
>> > >> utilization?
>> > >>
>> > >> Thanks,
>> > >> Calvin
>> >
>> > --
>> > Harsh J
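For readers landing on this thread with the same under-utilization symptom, the knobs discussed above map to the following Hadoop 2.x properties. This is a sketch, not a recommendation: the values are illustrative, and the yarn-site.xml figure assumes the 172 GB in the original post is spread evenly across the 3 nodes (~57 GB each); size it to your own hardware.

```xml
<!-- mapred-site.xml (or per-job -D flags): the FileInputFormat min/max
     split-size properties Harsh points to; values here are illustrative. -->
<property>
  <name>mapreduce.input.fileinputformat.split.minsize</name>
  <value>1</value>
</property>
<property>
  <name>mapreduce.input.fileinputformat.split.maxsize</name>
  <value>134217728</value> <!-- 128 MB -->
</property>

<!-- yarn-site.xml: memory NodeManager offers to containers on each node.
     At 2 GB per map task, ~57 GB/node across 3 nodes gives the ~86-container
     ceiling described above; if this is set low, jobs plateau well below it. -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>57344</value>
</property>
```

Note that none of these properties are filesystem-specific: if splits are already plentiful on the LocalFileSystem inputs, the shortfall is more likely on the YARN side (per-node container memory, scheduler settings) than in the input format.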