hadoop-common-issues mailing list archives

From "Hairong Kuang (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-6313) Expose flush APIs to application users
Date Thu, 29 Oct 2009 00:26:59 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-6313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Hairong Kuang updated HADOOP-6313:

    Hadoop Flags: [Reviewed]
          Status: Patch Available  (was: Open)

> Expose flush APIs to application users
> --------------------------------------
>                 Key: HADOOP-6313
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6313
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.21.0
>         Attachments: hflushCommon.patch, hflushCommon1.patch
> Earlier this year, Yahoo, Facebook, and HBase developers held a roundtable discussion where we agreed to support three types of flush in HDFS (API1, API2, and API3); the append project aims to implement API2. Here is a proposal to expose these APIs to application users.
> 1. Three flush APIs
> * API1: flushes data out of the client's address space into the socket to the data nodes. When the call returns, there is no guarantee that the data has left the underlying (client) node and no guarantee that it has reached a DN. New readers will eventually see this data if there are no failures.
> * API2: flushes out to all replicas of the block. The data is in the buffers of the DNs but not in the DNs' OS buffers. New readers will see the data after the call has returned.
> * API3: flushes out to all replicas, and all replicas have done the equivalent of a POSIX fsync, i.e. the OS has flushed the data to the disk device (although the disk may still hold it in its cache).
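The three levels above can be summarized as a small lookup. This is only an illustrative sketch: the names `FlushLevels`, `FlushLevel`, and `weakestSatisfying` are hypothetical and are not part of the proposal or of Hadoop. The flags encode only what the descriptions above guarantee when the call returns.

```java
class FlushLevels {
    /** Hypothetical summary of the guarantees each proposed API gives on return. */
    enum FlushLevel {
        API1(false, false), // data pushed toward the DNs; no guarantee it arrived
        API2(true, false),  // data in the buffers of all replicas; new readers see it
        API3(true, true);   // all replicas have also done an fsync equivalent

        final boolean visibleToNewReadersOnReturn;
        final boolean fsyncedOnAllReplicas;

        FlushLevel(boolean visibleToNewReadersOnReturn, boolean fsyncedOnAllReplicas) {
            this.visibleToNewReadersOnReturn = visibleToNewReadersOnReturn;
            this.fsyncedOnAllReplicas = fsyncedOnAllReplicas;
        }
    }

    /** Returns the weakest level that meets the caller's requirements. */
    static FlushLevel weakestSatisfying(boolean needVisible, boolean needFsync) {
        for (FlushLevel level : FlushLevel.values()) {
            boolean visibleOk = !needVisible || level.visibleToNewReadersOnReturn;
            boolean fsyncOk = !needFsync || level.fsyncedOnAllReplicas;
            if (visibleOk && fsyncOk) {
                return level;
            }
        }
        throw new AssertionError("API3 satisfies every combination");
    }

    public static void main(String[] args) {
        System.out.println(weakestSatisfying(false, false));
        System.out.println(weakestSatisfying(true, false));
        System.out.println(weakestSatisfying(true, true));
    }
}
```

The levels are ordered from weakest to strongest, so an application that only needs new readers to see its data would pick API2 rather than pay for API3.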
> 2. Support flush APIs in FS
> * FSDataOutputStream#flush supports API1
> * FSDataOutputStream implements the Syncable interface defined below. If its wrapped output stream (i.e. each file system's stream) is Syncable, FSDataOutputStream#hflush() and hsync() call the wrapped output stream's hflush() and hsync().
> {noformat}
>   public interface Syncable {
>     public void hflush() throws IOException;  // support API2
>     public void hsync() throws IOException;   // support API3
>   }
> {noformat}
> * In each file system, if only hflush() is implemented, hsync() by default calls hflush(). If only hsync() is implemented, hflush() by default calls flush().
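The delegation and default rules in section 2 could look roughly like the sketch below. It is not the attached patch: `WrapperStream` stands in for FSDataOutputStream, and `RecordingStream`, `HflushOnlyStream`, `HsyncOnlyStream`, and `trace` are hypothetical helpers that only record which calls reach the wrapped stream. The fallback to flush() when the wrapped stream is not Syncable is an assumption; the proposal does not specify that case.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class SyncableSketch {
    /** The interface as given in the proposal. */
    interface Syncable {
        void hflush() throws IOException; // API2
        void hsync() throws IOException;  // API3
    }

    /** Hypothetical file-system stream that records the calls it receives. */
    static abstract class RecordingStream extends OutputStream implements Syncable {
        final List<String> calls = new ArrayList<>();
        @Override public void write(int b) { /* data is discarded; only calls matter */ }
        @Override public void flush() { calls.add("flush"); }
    }

    /** Only hflush() is implemented: hsync() defaults to hflush(). */
    static class HflushOnlyStream extends RecordingStream {
        @Override public void hflush() { calls.add("hflush"); }
        @Override public void hsync() { hflush(); }
    }

    /** Only hsync() is implemented: hflush() defaults to flush(). */
    static class HsyncOnlyStream extends RecordingStream {
        @Override public void hflush() { flush(); }
        @Override public void hsync() { calls.add("hsync"); }
    }

    /** Stand-in for FSDataOutputStream: delegates if the wrapped stream is Syncable. */
    static class WrapperStream extends FilterOutputStream implements Syncable {
        WrapperStream(OutputStream out) { super(out); }

        @Override public void hflush() throws IOException {
            if (out instanceof Syncable) {
                ((Syncable) out).hflush();
            } else {
                out.flush(); // assumption: fall back to API1 for non-Syncable streams
            }
        }

        @Override public void hsync() throws IOException {
            if (out instanceof Syncable) {
                ((Syncable) out).hsync();
            } else {
                out.flush(); // assumption: same fallback as hflush()
            }
        }
    }

    /** Calls hflush() then hsync() through the wrapper and returns what the inner stream saw. */
    static List<String> trace(RecordingStream inner) {
        try {
            WrapperStream wrapper = new WrapperStream(inner);
            wrapper.hflush();
            wrapper.hsync();
            return inner.calls;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(trace(new HflushOnlyStream())); // [hflush, hflush]
        System.out.println(trace(new HsyncOnlyStream()));  // [flush, hsync]
    }
}
```

Under these defaults, a file system that implements only hsync() gives hflush() callers no more than API1 semantics, which matches the last bullet above.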

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
