hadoop-hdfs-issues mailing list archives

From "Luca Telloli (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HDFS-397) Incorporate storage directories into EditLogFileInput/Output streams
Date Wed, 15 Jul 2009 14:30:15 GMT

     [ https://issues.apache.org/jira/browse/HDFS-397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Luca Telloli updated HDFS-397:
------------------------------

    Attachment: HDFS-397.patch

Raghu, I acknowledge that there might be a problem with the current code, even though the unit tests
for BackupNode pass correctly, and I have implemented a few changes that should fix this issue.
The problematic lines, as you correctly pointed out, are: 

{noformat}
@@ -295,7 +295,7 @@
-    editLog.divertFileStreams(STORAGE_JSPOOL_DIR + "/" + STORAGE_JSPOOL_FILE);
+    editLog.divertFileStreams();

@@ -363,7 +363,7 @@
-    editLog.revertFileStreams(STORAGE_JSPOOL_DIR + "/" + STORAGE_JSPOOL_FILE);
+    editLog.purgeEditLog();
{noformat}

From what I understand, BackupStorage wants to write into a subdirectory of a storage
directory that is not STORAGE_DIR_CURRENT, but STORAGE_JSPOOL_DIR.

Given the original code, and reading current trunk, what happens in divertFileStreams
is the following (a rough sketch follows the list):
- the code iterates over all current storage directories of type EDITS and over the open streams of type FILE
- it checks whether each stream matches its storage directory
- for each matching stream it then:
  - closes the stream
  - creates a new stream, which in this case points to STORAGE_JSPOOL_DIR/STORAGE_JSPOOL_FILE
  - replaces the old stream with the new one in the list of streams
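
For clarity, here is a minimal sketch of how I read that divert logic; the FileEditStream class, the
method signature and the field names are simplified stand-ins, not the actual FSEditLog or
EditLogFileOutputStream code:
{noformat}
// Sketch only: stand-in names, not the real trunk classes.
import java.io.File;
import java.io.IOException;
import java.util.List;

class DivertSketch {

  /** Stand-in for an open, file-backed edits output stream. */
  static class FileEditStream {
    final File file;
    FileEditStream(File file) { this.file = file; }
    void close() throws IOException { /* flush and close the underlying file */ }
  }

  /**
   * For each open FILE stream that belongs to one of the EDITS storage
   * directories, close it and replace it in the stream list with a stream
   * pointing at dir/STORAGE_JSPOOL_DIR/STORAGE_JSPOOL_FILE.
   */
  static void divertFileStreams(List<FileEditStream> streams,
                                List<File> editsDirs,
                                String spoolRelativePath) throws IOException {
    for (int i = 0; i < streams.size(); i++) {
      FileEditStream s = streams.get(i);
      for (File dir : editsDirs) {
        // the stream matches the directory if its file lives under dir/current
        if (!s.file.getParentFile().getParentFile().equals(dir)) {
          continue;
        }
        s.close();                                      // close the old stream
        File spool = new File(dir, spoolRelativePath);  // new target under the spool dir
        streams.set(i, new FileEditStream(spool));      // swap it into the stream list
        break;
      }
    }
  }
}
{noformat}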

In revertFileStreams, what happens is the following (again, a sketch follows the list):
- the code iterates over all current storage directories of type EDITS and over the open streams of type FILE
- it checks whether each stream matches its storage directory
- for each matching stream it then:
  - closes STORAGE_JSPOOL_DIR/edits.new
  - renames that file (the source parameter) to CURRENT/edits
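
And the revert side, reusing the FileEditStream stand-in from the sketch above (again, illustrative
names only, not the actual trunk code):
{noformat}
// Sketch only, reusing the FileEditStream stand-in from the previous snippet.
import java.io.File;
import java.io.IOException;
import java.util.List;

class RevertSketch {

  /**
   * For each stream that was diverted to dir/STORAGE_JSPOOL_DIR/STORAGE_JSPOOL_FILE,
   * close it, rename the spool file onto dir/current/edits, and resume writing
   * to the regular edits file.
   */
  static void revertFileStreams(List<DivertSketch.FileEditStream> streams,
                                List<File> editsDirs,
                                String spoolRelativePath) throws IOException {
    for (int i = 0; i < streams.size(); i++) {
      DivertSketch.FileEditStream s = streams.get(i);
      for (File dir : editsDirs) {
        File spool = new File(dir, spoolRelativePath);
        if (!s.file.equals(spool)) {
          continue;                                // only streams diverted to the spool
        }
        s.close();                                 // close STORAGE_JSPOOL_DIR/edits.new
        File edits = new File(new File(dir, "current"), "edits");
        if (!spool.renameTo(edits)) {              // rename the spool file onto CURRENT/edits
          throw new IOException("rename failed: " + spool + " -> " + edits);
        }
        streams.set(i, new DivertSketch.FileEditStream(edits)); // back to the regular stream
        break;
      }
    }
  }
}
{noformat}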

I'm not sure why the file in STORAGE_JSPOOL_DIR gets renamed to the edits file in STORAGE_DIR_CURRENT,
but I have made a couple of modifications to my code that implement that same mechanism.

I understand, though, that the mechanism of writing into a subdirectory exists because, even if
the BackupNode and the NameNode run on different machines, they might be writing to a shared
storage directory, for instance over NFS. I guess that is a concrete example of a situation where
the test could fail, but it is not something that can be reproduced directly in the test itself.

My modifications include adding a JournalType parameter to the divertFileStreams() and purgeEditLog()
methods, as follows: 
{noformat}
-    editLog.divertFileStreams(STORAGE_JSPOOL_DIR + "/" + STORAGE_JSPOOL_FILE);
+    editLog.divertFileStreams(JournalType.BACKUP);
{noformat}
and
{noformat}
-    editLog.revertFileStreams(STORAGE_JSPOOL_DIR + "/" + STORAGE_JSPOOL_FILE);
+    editLog.purgeEditLog(JournalType.BACKUP);
{noformat}

The JournalType parameter is then used in the EditLogFileOutputStream class to make the correct
decisions. The final strings in BackupStorage have been made protected so that I can read them from
the EditLogFileOutputStream class.
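
To illustrate the idea (this is not the patch itself), the JournalType could drive the choice of
the edits file roughly like this; the constant values and the helper name below are illustrative
assumptions, with the real values being the protected finals in BackupStorage:
{noformat}
// Sketch of the idea in the attached patch, not the patch itself.
import java.io.File;

class JournalPathSketch {

  enum JournalType { FILE, BACKUP }

  // Illustrative values; the real ones are the (now protected) finals in BackupStorage.
  static final String STORAGE_DIR_CURRENT = "current";
  static final String STORAGE_JSPOOL_DIR  = "current/jspool";
  static final String STORAGE_JSPOOL_FILE = "edits.new";

  /** Pick the edits file for a storage directory based on the journal type. */
  static File editsFileFor(File storageDir, JournalType type) {
    if (type == JournalType.BACKUP) {
      // BackupNode case: write into the journal spool under the storage dir
      return new File(new File(storageDir, STORAGE_JSPOOL_DIR), STORAGE_JSPOOL_FILE);
    }
    // regular case: write to CURRENT/edits
    return new File(new File(storageDir, STORAGE_DIR_CURRENT), "edits");
  }
}
{noformat}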

So I'd say that this is still a reorganization of the code (whether a minor one, I'll let you
decide), with no change in semantics implied or expected. Unit tests pass fine on Linux.

{quote}
It is very unrealistic to expect tests to catch every random bug and typo in a patch (for
anything I have ever worked on or known.
{quote}
I agree with you, but that's not what I said. 


> Incorporate storage directories into EditLogFileInput/Output streams
> --------------------------------------------------------------------
>
>                 Key: HDFS-397
>                 URL: https://issues.apache.org/jira/browse/HDFS-397
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Luca Telloli
>            Assignee: Luca Telloli
>         Attachments: HADOOP-6001.patch, HADOOP-6001.patch, HADOOP-6001.patch, HDFS-397.patch,
HDFS-397.patch
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

