From: "Wu, Jiang2"
To: user@hadoop.apache.org
Subject: read a changing hdfs file
Date: Tue, 20 Aug 2013 21:36:40 +0000
Message-ID: <6678A26479D62E40927B180E42350D500335E741@EXGTMB19.nam.nsroot.net>

Hi,

I ran some experiments reading an HDFS file while it was being changed. The reader seems to take a snapshot of the file at the moment it is opened and never sees data appended afterwards. This differs from what happens when reading a local file that is being changed.

My code is as follows:

    // Imports needed by this fragment
    import java.io.IOException;
    import java.io.InputStream;
    import java.net.URI;
    import java.util.Scanner;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IOUtils;

    Configuration conf = new Configuration();
    InputStream in = null;
    try {
        FileSystem fs = FileSystem.get(URI.create("hdfs://MyCluster/"), conf);
        // Open the file once and read it line by line until the stream ends
        in = fs.open(new Path("/tmp/test.txt"));
        Scanner scanner = new Scanner(in);
        while (scanner.hasNextLine()) {
            System.out.println("+++++++++++++++++++++++++++++++ read " + scanner.nextLine());
        }
        System.out.println("+++++++++++++++++++++++++++++++ reader finished ");
    } catch (IOException e) {
        // Report the failure; the stream is closed in the finally block
        e.printStackTrace();
    } finally {
        IOUtils.closeStream(in);
    }

Is this the designed HDFS read behavior, or can it be changed by using a different API or configuration? What I expect is the same behavior as reading a local file: when a reader reads a file while another writer is writing to it, the reader receives all data written by the writer.

Thanks,
Jiang
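For comparison, the local-file behavior described in the message can be reproduced with a minimal sketch that uses only the Java standard library. The class name `LocalTailDemo`, the helper `drain`, and the temporary file are hypothetical names chosen for illustration; nothing here touches HDFS. The sketch reads from a raw `InputStream` rather than a `Scanner`, on the assumption that a `Scanner` stops polling its source once it has seen end-of-file.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class LocalTailDemo {

    // Reads everything currently available until the stream reports end-of-file.
    static String drain(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            sb.append(new String(buf, 0, n, StandardCharsets.US_ASCII));
        }
        return sb.toString();
    }

    // Returns {data read before the append, data read after the append},
    // both taken from the same stream, which is opened only once.
    static String[] readAcrossAppend() {
        try {
            Path file = Files.createTempFile("tail-demo", ".txt");
            try {
                Files.write(file, "first line\n".getBytes(StandardCharsets.US_ASCII));
                try (InputStream in = new FileInputStream(file.toFile())) {
                    String before = drain(in);
                    // A "writer" appends while the reader's stream is still open.
                    Files.write(file,
                            "second line\n".getBytes(StandardCharsets.US_ASCII),
                            StandardOpenOption.APPEND);
                    // Reading the same stream again picks up the appended bytes.
                    String after = drain(in);
                    return new String[] {before, after};
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] parts = readAcrossAppend();
        System.out.print("before append: " + parts[0]);
        System.out.print("after append:  " + parts[1]);
    }
}
```

The reader here keeps a single open stream and simply calls `read` again after hitting end-of-file, which is the behavior the message expects from a local file and did not observe with the HDFS stream.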