Date: Sat, 23 Feb 2013 10:45:51 -0300
Subject: Re: map reduce and sync
From: Lucas Bernardi
To: user@hadoop.apache.org

Hello Hemanth, thanks for answering.

The file is held open by a separate process that is not related to map
reduce at all. You can think of it as a servlet that receives requests and
writes them to this file: every time a request is received, it is written
and org.apache.hadoop.fs.FSDataOutputStream.sync() is invoked.

At the same time, I want to run a map reduce job over this file. Simply
running the word count example doesn't seem to work; it is as if the file
were empty.

hadoop fs -tail works just fine, and reading the file using
org.apache.hadoop.fs.FSDataInputStream also works OK.

One last thing: the web interface doesn't show the contents, and
hadoop fs -ls says the file is empty.

What am I doing wrong?

Thanks!

Lucas

On Sat, Feb 23, 2013 at 4:37 AM, Hemanth Yamijala wrote:

> Could you please clarify: are you opening the file in your mapper code and
> reading from there?
>
> Thanks
> Hemanth
>
> On Friday, February 22, 2013, Lucas Bernardi wrote:
>
>> Hello there, I'm trying to use Hadoop map reduce to process an open file.
>> The writing process writes a line to the file and syncs the file to
>> readers (org.apache.hadoop.fs.FSDataOutputStream.sync()).
>>
>> If I try to read the file from another process, it works fine, at least
>> using org.apache.hadoop.fs.FSDataInputStream.
>>
>> hadoop fs -tail also works just fine.
>>
>> But it looks like map reduce doesn't read any data. I tried the word
>> count example and got the same result: it is as if the file were empty
>> for the map reduce framework.
>>
>> I'm using Hadoop 1.0.3 and Pig 0.10.0.
>>
>> I need some help with this.
>>
>> Thanks!
>>
>> Lucas