Date: Sat, 17 Mar 2012 09:50:08 -0500
Subject: Re: random seeks during write in HDFS
From: Brock Noland
To: hdfs-user@hadoop.apache.org

Hi,

This question belongs on hdfs-user rather than mapreduce-user, so I have removed mapreduce-user from the thread.

Yes, that is right: HDFS does not allow random writes. I suggest you read this doc:

http://hadoop.apache.org/common/docs/current/hdfs_design.html

Specifically the "Assumptions and Goals" section. Here are two ways to get around this design assumption:

1) Write updated copies of the record with a new timestamp and then dedup based on a unique key and timestamp.

2) Use HBase

Cheers,
Brock

On Sat, Mar 17, 2012 at 9:09 AM, Hassen Riahi wrote:
> Hi,
>
> We are trying to execute a mapper making a random access during writing
> files. It seems that HDFS supports only random seek during read and not
> during write (neither the file modification). Is it right? we are using
> hadoop-0.20. If it is the case, is there any plan to support it in the
> future?
>
> The limitation described above makes the mapper failing to write files. Is
> there any suggestions to bypass this limitation? such as write files in a
> temp area and copying them then to HDFS?
>
> Thanks for the help,
> Hassen
>

--
Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
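
Option 1 in the reply above (append a new timestamped copy of a record instead of modifying it in place, then dedup by key) could look roughly like the sketch below. It is plain in-memory Python rather than Hadoop code, and the `Record` fields and the `dedup_latest` helper are hypothetical names chosen for illustration, not anything from the thread.

```python
from typing import Dict, Iterable, NamedTuple


class Record(NamedTuple):
    key: str        # unique record key (hypothetical field name)
    timestamp: int  # set by the writer, e.g. milliseconds since epoch
    value: str


def dedup_latest(records: Iterable[Record]) -> Dict[str, Record]:
    """Keep only the newest copy of each key from an append-only log."""
    latest: Dict[str, Record] = {}
    for rec in records:
        current = latest.get(rec.key)
        # A record replaces the stored one only if it is strictly newer.
        if current is None or rec.timestamp > current.timestamp:
            latest[rec.key] = rec
    return latest


# Append-only "file": an update is a new copy, never an in-place write.
log = [
    Record("a", 1, "first"),
    Record("b", 2, "only"),
    Record("a", 3, "updated"),
]
result = dedup_latest(log)
```

In a MapReduce job the same idea would presumably be split across phases: the mapper emits each record under its key, and the reducer keeps the copy with the largest timestamp.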