hadoop-common-dev mailing list archives

From "Doug Judd (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4379) In HDFS, sync() not yet guarantees data available to the new readers
Date Sun, 18 Jan 2009 17:32:59 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664986#action_12664986 ]

Doug Judd commented on HADOOP-4379:
-----------------------------------

Hi Dhruba,

I've been working with Luke a little on this.  Here are more details.  The log that gets written
in the test is very small.  The first thing the software does when it creates the log is write
a 7-byte header.  Then, as the test proceeds, the system appends a small entry and then does a
sync.  We use the FSDataOutputStream class.  The sequence of operations looks something like this:

out_stream.write(data, 0, amount);
out_stream.sync();
[...]
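
For reference, here is a slightly fuller sketch of that pattern against the 0.19-era API.  The
class name, path, and payload below are illustrative assumptions, not our actual code:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RangeTxnLogSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // illustrative path; our logs live under /hypertable/servers/<host>_<port>/log/range_txn
    Path log = new Path("/hypertable/servers/example_38060/log/range_txn/0.log");

    FSDataOutputStream out = fs.create(log);
    out.write(new byte[7]);               // the 7-byte header written at creation
    out.sync();

    // later, as the test proceeds: append a small entry and sync again
    byte[] entry = "some small entry".getBytes();
    out.write(entry, 0, entry.length);
    out.sync();                           // the entry should now be visible to new readers
  }
}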

When the test completes, all of the logs are exactly 7 bytes long.  They remain this way even
if I wait 10 minutes, or kill the Hypertable Java process and wait several minutes.  Here is
the listing:

[doug@motherlode000 aol-basic]$ hadoop fs -ls /hypertable/servers/10.0.30.1*_38060/log/range_txn
-rw-r--r--   3 doug supergroup          7 2009-01-17 19:52 /hypertable/servers/10.0.30.102_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup          7 2009-01-17 19:52 /hypertable/servers/10.0.30.104_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup          7 2009-01-17 19:52 /hypertable/servers/10.0.30.106_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup          7 2009-01-17 19:52 /hypertable/servers/10.0.30.108_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup          7 2009-01-17 19:52 /hypertable/servers/10.0.30.110_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup          7 2009-01-17 19:52 /hypertable/servers/10.0.30.112_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup          7 2009-01-17 19:52 /hypertable/servers/10.0.30.114_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup          7 2009-01-17 19:52 /hypertable/servers/10.0.30.116_38060/log/range_txn/0.log

After shutting down HDFS and restarting it, the listing looks like this:

[doug@motherlode000 aol-basic]$ hadoop fs -ls /hypertable/servers/10.0.30.1*_38060/log/range_txn
-rw-r--r--   3 doug supergroup        564 2009-01-17 19:52 /hypertable/servers/10.0.30.102_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup         84 2009-01-17 19:52 /hypertable/servers/10.0.30.104_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup       1063 2009-01-17 19:52 /hypertable/servers/10.0.30.106_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup        634 2009-01-17 19:52 /hypertable/servers/10.0.30.108_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup        217 2009-01-17 19:52 /hypertable/servers/10.0.30.110_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup       1943 2009-01-17 19:52 /hypertable/servers/10.0.30.112_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup       1072 2009-01-17 19:52 /hypertable/servers/10.0.30.114_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup        525 2009-01-17 19:52 /hypertable/servers/10.0.30.116_38060/log/range_txn/0.log
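
The same discrepancy shows up programmatically, by the way; a fresh client that stats one of
these files before the restart sees the 7-byte length.  A minimal sketch (assuming a default
client Configuration; the path is one of the files listed above):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LengthCheck {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path log = new Path("/hypertable/servers/10.0.30.102_38060/log/range_txn/0.log");
    FileStatus status = fs.getFileStatus(log);
    // prints 7 before the HDFS restart, 564 after it
    System.out.println(log + " length=" + status.getLen());
  }
}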

The last time I ran this test I encountered a problem where it appeared that some of our commits
were lost.  Here's what I did (a rough shell sketch of the restart sequence follows the list):

1. Ran the tests (which create a table with 75,274,825 cells)
2. Killed Hypertable
3. Shut down HDFS
4. Restarted HDFS
5. Restarted Hypertable (which replays the commit logs)
6. Dumped the table
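
In shell terms, steps 2 through 5 look roughly like this.  The Hypertable commands are
placeholders for our actual scripts; stop-dfs.sh and start-dfs.sh are the stock Hadoop scripts:

# step 2: kill the Hypertable Java process (placeholder)
kill -9 <hypertable-pid>
# steps 3 and 4: bounce HDFS with the standard scripts
$HADOOP_HOME/bin/stop-dfs.sh
$HADOOP_HOME/bin/start-dfs.sh
# step 5: restart Hypertable, which replays the commit logs (placeholder)
<restart-hypertable-command>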

The table dump in step 6 came up short (roughly 72M entries instead of the expected 75,274,825).
It appears that some of the commit logs (a different log than the range_txn log) came back
incomplete.

Let us know if you want us to run an instrumented version or anything.  We can send you the
Hadoop log files if that helps.  Thanks!

- Doug


> In HDFS, sync() not yet guarantees data available to the new readers
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4379
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4379
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: dhruba borthakur
>             Fix For: 0.19.1
>
>         Attachments: 4379_20081010TC3.java, fsyncConcurrentReaders.txt
>
>
> In the append design doc (https://issues.apache.org/jira/secure/attachment/12370562/Appends.doc), it says
> * A reader is guaranteed to be able to read data that was 'flushed' before the reader opened the file
> However, this feature is not yet implemented.  Note that the operation 'flushed' is now called "sync".

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

