hadoop-common-user mailing list archives

From "Taeho Kang" <tka...@gmail.com>
Subject Re: Question on opening file info from namenode in DFSClient
Date Fri, 07 Nov 2008 08:53:48 GMT
Hi, thanks for your reply Dhruba,

One of my co-workers is writing a BigTable-like application that could be
used for online, near-real-time services. Since the application could be
hooked into online services, there would be times when a large number of users
(e.g. 1000 users) request access to a few files in a very short time.

Of course, in a batch process job, this is a rare case, but for online
services, it's quite a common case.
I think HBase developers would have run into similar issues as well.

Is this enough explanation?

Thanks in advance,

Taeho



On Tue, Nov 4, 2008 at 3:12 AM, Dhruba Borthakur <dhruba@gmail.com> wrote:

> In the current code, details about block locations of a file are
> cached on the client when the file is opened. This cache remains with
> the client until the file is closed. If the same file is re-opened by
> the same DFSClient, it re-contacts the namenode and refetches the
> block locations. This works ok for most map-reduce apps because it is
> rare that the same DFSClient re-opens the same file again.
>
> Can you please explain your use-case?
>
> thanks,
> dhruba
>
>
> On Sun, Nov 2, 2008 at 10:57 PM, Taeho Kang <tkang1@gmail.com> wrote:
> > Dear Hadoop Users and Developers,
> >
> > I was wondering if there's a plan to add a "file info cache" to DFSClient?
> >
> > It could eliminate the network round-trip cost of contacting the Namenode,
> > and I think it would greatly improve the DFSClient's performance.
> > The code I was looking at was this:
> >
> > -----------------------
> > DFSClient.java
> >
> >    /**
> >     * Grab the open-file info from namenode
> >     */
> >    synchronized void openInfo() throws IOException {
> >      /* Maybe, we could add a file info cache here! */
> >      LocatedBlocks newInfo = callGetBlockLocations(src, 0, prefetchSize);
> >      if (newInfo == null) {
> >        throw new IOException("Cannot open filename " + src);
> >      }
> >      if (locatedBlocks != null) {
> >        Iterator<LocatedBlock> oldIter =
> >            locatedBlocks.getLocatedBlocks().iterator();
> >        Iterator<LocatedBlock> newIter =
> >            newInfo.getLocatedBlocks().iterator();
> >        while (oldIter.hasNext() && newIter.hasNext()) {
> >          if (!oldIter.next().getBlock().equals(newIter.next().getBlock())) {
> >            throw new IOException("Blocklist for " + src + " has changed!");
> >          }
> >        }
> >        }
> >      }
> >      this.locatedBlocks = newInfo;
> >      this.currentNode = null;
> >    }
> > -----------------------
> >
> > Does anybody have an opinion on this matter?
> >
> > Thank you in advance,
> >
> > Taeho
> >
>
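
The "file info cache" idea discussed in this thread could be sketched roughly as
below: a small client-side cache keyed by file path, so that repeated opens of
the same file within a short window reuse the earlier result instead of
contacting the namenode again. This is only an illustration under stated
assumptions, not DFSClient code. The class name FileInfoCache, the TTL
parameter, the loader function (which stands in for the namenode call), and the
injectable clock are all hypothetical, and a generic value type stands in for
LocatedBlocks. Because a file's block list can change, cached entries expire
after a TTL and can be dropped explicitly.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical sketch: a TTL cache for per-file info, keyed by path. */
class FileInfoCache<V> {

    /** A cached value plus the time it was loaded. */
    private static final class Entry<V> {
        final V value;
        final long loadedAtMillis;

        Entry(V value, long loadedAtMillis) {
            this.value = value;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final Map<String, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final Function<String, V> loader; // assumption: wraps the namenode lookup
    private final LongSupplier clock;         // injectable so expiry can be tested

    FileInfoCache(long ttlMillis, Function<String, V> loader, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.loader = loader;
        this.clock = clock;
    }

    /** Returns the cached info for src, reloading it if missing or older than the TTL. */
    V get(String src) {
        long now = clock.getAsLong();
        // compute() is atomic per key, so concurrent callers asking for the
        // same path trigger at most one load at a time.
        Entry<V> entry = entries.compute(src, (key, old) ->
            (old != null && now - old.loadedAtMillis < ttlMillis)
                ? old
                : new Entry<>(loader.apply(key), now));
        return entry.value;
    }

    /** Drops the entry for src, e.g. after a read fails because the blocks changed. */
    void invalidate(String src) {
        entries.remove(src);
    }
}
```

With something like this, 1000 near-simultaneous opens of the same file through
one client would result in a single lookup per TTL window rather than 1000. The
trade-off is staleness: a reader may see outdated block locations until the
entry expires or is invalidated, so the TTL would need to be short for files
that are still being written.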
