From: Daryn Sharp
To: user@hadoop.apache.org
Date: Tue, 7 Aug 2012 08:25:29 -0700
Subject: Re: fs cache giving me headaches

There is no UGI caching, so each request will receive a unique UGI even for the same user. Thus you can safely call FileSystem.closeAllForUGI(ugi) when the request is complete. If however you spin off threads that continue to use the UGI even after the request is completed, then you'll have to determine for yourself when it's safe to close the filesystems.
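
To illustrate why per-request UGIs keep concurrent requests from the same user apart, here is a minimal toy model in plain Java. It is a sketch, not Hadoop code: ToyUgi, ToyFs and ToyUgiCache are hypothetical stand-ins, and it assumes (as the reply above implies) that the cache tells UGI objects apart rather than user names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy stand-ins, NOT Hadoop classes: ToyUgi plays the role of a per-request
// UGI and ToyFs the role of a cached FileSystem.
final class ToyUgi {
    final String user;
    ToyUgi(String user) { this.user = user; }
    // equals/hashCode are deliberately not overridden: two ToyUgi objects for
    // the same user name are distinct cache owners (identity comparison).
}

final class ToyFs {
    private volatile boolean open = true;
    boolean isOpen() { return open; }
    void close() { open = false; }
}

final class ToyUgiCache {
    // One inner map of URI -> filesystem per UGI object.
    private final Map<ToyUgi, Map<String, ToyFs>> byUgi = new ConcurrentHashMap<>();

    ToyFs get(ToyUgi ugi, String uri) {
        return byUgi.computeIfAbsent(ugi, u -> new ConcurrentHashMap<>())
                    .computeIfAbsent(uri, k -> new ToyFs());
    }

    // Toy analogue of closing every cached filesystem owned by one UGI.
    // Returns the number of filesystems that were closed.
    int closeAllFor(ToyUgi ugi) {
        Map<String, ToyFs> owned = byUgi.remove(ugi);
        if (owned == null) return 0;
        owned.values().forEach(ToyFs::close);
        return owned.size();
    }
}

final class UgiCacheDemo {
    public static void main(String[] args) {
        ToyUgiCache cache = new ToyUgiCache();
        ToyUgi requestA = new ToyUgi("koert");   // two concurrent requests,
        ToyUgi requestB = new ToyUgi("koert");   // same user, separate UGI objects

        ToyFs fsA = cache.get(requestA, "hdfs://nn");
        ToyFs fsB = cache.get(requestB, "hdfs://nn");

        cache.closeAllFor(requestA);             // request A finishes first
        System.out.println(fsA.isOpen());        // false: A's filesystem is closed
        System.out.println(fsB.isOpen());        // true: B's filesystem is untouched
    }
}
```

In this model the risk described in the reply only appears when a thread keeps using requestA's filesystem after closeAllFor(requestA) has run.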

I've been kicking around a few ways to transparently close cached filesystems for a ugi when that ugi goes out of scope. I should probably file a jira (if it stops going down) for discussion.

Daryn


On Aug 7, 2012, at 10:15 AM, Koert Kuipers wrote:

Daryn,
The problem with FileSystem.closeAllForUGI(ugi) for me is that a server can be multi-threaded, and a user could be doing multiple requests at the same time, so if i used closeAllForUGI, isn't there a risk of shutting down the other requests for the same user?

On Mon, Aug 6, 2012 at 2:52 PM, Daryn Sharp <daryn@yahoo-inc.com> wrote:
Yes, the implementation of fs.close() leaves something to be desired. There's actually been debate in the past about close being a no-op for a cached fs, but the idea was rejected by the majority of people.

In the server case, you can use FileSystem.closeAllForUGI(ugi) at the end of a request to flush all the fs cache entries for the ugi. You'll get the benefit of the cache during execution of the request, and be able to close the cached fs instances to prevent memory leaks. I hope this helps!

Daryn


On Aug 6, 2012, at 12:32 PM, Koert Kuipers wrote:

---------- Forwarded message ----------
From: "Koert Kuipers" <koert@tresata.com>
Date: Aug 4, 2012 1:54 PM
Subject: fs cache giving me headaches
To: <common-user@hadoop.apache.org>

nothing has confused me as much in hadoop as FileSystem.close().
any decent java programmer that sees that an object implements Closeable writes code like this:
final FileSystem fs = FileSystem.get(conf);
try {
    // do something with fs
} finally {
    fs.close();
}

so i started out using hadoop FileSystem like this, and i ran into all sorts of weird errors where FileSystems in unrelated code (sometimes not even my code) started misbehaving and streams were unexpectedly shut. Then i realized that FileSystem uses a cache and close() closes it for everyone! Not pretty in my opinion, but i can live with it. So i checked other code and found that basically nobody closes FileSystems. Apparently the expected way of using FileSystems is to simply never close them. So i adopted this approach (which i think is really contrary to java conventions for a Closeable).
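
The failure mode described above can be reproduced with a small toy model in plain Java. This is only a sketch under stated assumptions: SharedFs and SharedFsCache are hypothetical stand-ins for a cached FileSystem and its cache, not Hadoop classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model only: SharedFs stands in for a cached filesystem object.
final class SharedFs implements AutoCloseable {
    private boolean open = true;

    String read(String path) {
        if (!open) throw new IllegalStateException("Filesystem closed");
        return "contents of " + path;
    }

    @Override
    public void close() { open = false; }
}

final class SharedFsCache {
    private final Map<String, SharedFs> cache = new ConcurrentHashMap<>();

    // Like a cached lookup: the same URI always yields the same object.
    SharedFs get(String uri) {
        return cache.computeIfAbsent(uri, k -> new SharedFs());
    }
}

final class SharedCloseDemo {
    // Returns true if unrelated code was broken by our well-meant close().
    static boolean closeBreaksOtherCaller() {
        SharedFsCache cache = new SharedFsCache();
        SharedFs theirs = cache.get("hdfs://nn");      // held by unrelated code

        try (SharedFs mine = cache.get("hdfs://nn")) { // same object as 'theirs'
            mine.read("/tmp/a");
        }                                              // closes the shared instance

        try {
            theirs.read("/tmp/b");
            return false;
        } catch (IllegalStateException e) {
            return true;                               // the other caller now fails
        }
    }

    public static void main(String[] args) {
        System.out.println(closeBreaksOtherCaller());
    }
}
```

Because both callers receive the same object, the textbook try/finally pattern closes a resource that the caller does not exclusively own.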

Lately i started working on some code for a daemon/server where many FileSystem objects are created for different users (UGIs) that use the service. As it turns out other projects have run into trouble with the FileSystem cache in situations like this (for example, Scribe and Hoop). I imagine the cache can get very large and cause problems (i have not tested this myself).

Looking at the code for Hoop i noticed they simply turned off the FileSystem cache and made sure to close every FileSystem. So here the suggested approach to deal with FileSystems seems to be:
final FileSystem fs = FileSystem.newInstance(conf); // or FileSystem.get(conf) but with caching turned off in the conf
try {
    // do something with fs
} finally {
    fs.close();
}

This code bypasses the cache if i understand it correctly, avoiding any cache size limitations. However if i adopt this approach i basically cannot re-use any existing code or libraries that do not close FileSystems, splitting the codebase into two which is pretty ugly. And this code is not efficient in situations where there are very few used FileSystem objects and a cache would improve performance, so the split works both ways.

In short, there is no single way to code with FileSystem that works in both situations! Ideally i would have liked fs.close() to do the right thing depending on the settings: if cache is turned off it closes the FileSystem, and if it is turned on it's a no-op. That way i could always use FileSystem.get(conf) and always close my filesystems, and the code would be usable irrespective of whether the cache is turned on or off.
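
The behaviour wished for above could look roughly like the following toy model in plain Java. This is a hypothetical sketch, not how Hadoop behaves: PolicyFs and PolicyFsFactory are invented names, and the boolean flag stands in for a cache on/off setting.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: none of these types exist in Hadoop.
final class PolicyFs implements AutoCloseable {
    private final boolean cached;
    private boolean open = true;

    PolicyFs(boolean cached) { this.cached = cached; }

    boolean isOpen() { return open; }

    // Caller-facing close: ignored for cached instances, real for private ones.
    @Override
    public void close() { if (!cached) open = false; }

    // Only the cache owner calls this to release a cached instance.
    void reallyClose() { open = false; }
}

final class PolicyFsFactory {
    private final boolean cacheEnabled;
    private final Map<String, PolicyFs> cache = new ConcurrentHashMap<>();

    PolicyFsFactory(boolean cacheEnabled) { this.cacheEnabled = cacheEnabled; }

    PolicyFs get(String uri) {
        if (!cacheEnabled) return new PolicyFs(false);   // fresh, caller-owned
        return cache.computeIfAbsent(uri, k -> new PolicyFs(true));
    }

    // Releases everything the cache still holds, e.g. at process shutdown.
    void shutdown() {
        cache.values().forEach(PolicyFs::reallyClose);
        cache.clear();
    }
}

final class ClosePolicyDemo {
    // The same calling code, whatever the cache setting.
    static PolicyFs useAndClose(PolicyFsFactory factory) {
        PolicyFs fs = factory.get("hdfs://nn");
        try {
            // do something with fs
        } finally {
            fs.close();
        }
        return fs;
    }

    public static void main(String[] args) {
        System.out.println(useAndClose(new PolicyFsFactory(true)).isOpen());   // true
        System.out.println(useAndClose(new PolicyFsFactory(false)).isOpen());  // false
    }
}
```

The trade-off in this sketch is that cached instances live until the cache owner releases them, so something still has to decide when shutdown() (or a per-user equivalent) runs.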

Any insights or suggestions? Thanks!


