hadoop-hdfs-issues mailing list archives

From "Virajith Jalaparti (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-12665) [AliasMap] Create a version of the AliasMap that runs in memory in the Namenode (leveldb)
Date Fri, 03 Nov 2017 23:24:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16238568#comment-16238568 ]

Virajith Jalaparti commented on HDFS-12665:
-------------------------------------------

Hi [~ehiggs], thanks for posting the new patch. A couple of comments:
# This patch includes changes that are part of HDFS-11902. Can you post a patch that does not include these?
# The datanode should check the block pool id associated with a FileRegion before loading it. The current patch eliminates this check (in {{ProvidedBlockPoolSlice}}); it should be retained, as it ensures that the Datanode doesn't load blocks that shouldn't be associated with a Namenode. For example, consider the case where a DN reports to two Namenodes, NN1 and NN2, in federation, and only NN1 is configured with PROVIDED storage. Both NN1 and NN2 might have a block with the same id, but NN1's refers to a PROVIDED block while NN2's refers to a local block, so the DN needs to distinguish these two blocks with the same id.
One way for the DN to do this is if the {{FileRegion}} or {{AliasMap}} has a block pool id associated with it. This ensures that the blocks can be distinguished in the {{ReplicaMap}} of the {{FsDatasetImpl}} and the two blocks aren't mixed up.

My proposal is to have the following as part of the API of {{BlockAliasMap}}, so that we can get the block pool id from the alias map:

{code}
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/blockaliasmap/BlockAliasMap.java b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/blockaliasmap/BlockAliasMap.java
index d276fb52036..e564097fd2e 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/blockaliasmap/BlockAliasMap.java
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/blockaliasmap/BlockAliasMap.java
@@ -47,6 +47,7 @@
      */
     public abstract U resolve(Block ident) throws IOException;

+    public abstract String getBlockPoolID() throws IOException;
   }

   /**
@@ -74,10 +75,12 @@
   /**
    * Returns the writer for the alias map.
    * @param opts writer options.
+   * @param blockPoolID block pool id to use
    * @return {@link Writer} to the alias map.
    * @throws IOException
    */
-  public abstract Writer<T> getWriter(Writer.Options opts) throws IOException;
+  public abstract Writer<T> getWriter(Writer.Options opts, String blockPoolID)
+      throws IOException;

   /**
    * Refresh the alias map.
{code}

I think this change, along with the change that adds {{ProvidedStorageLocation}}, should be done as part of HDFS-12713.
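
To make the intent concrete, here is a rough, illustrative sketch (not part of any posted patch) of how a DN-side consumer such as {{ProvidedBlockPoolSlice}} could use the proposed {{getBlockPoolID()}} to avoid loading FileRegions from a different block pool. The helper class and its method names below are hypothetical; only {{resolve()}} and the proposed {{getBlockPoolID()}} come from the diff above.

{code}
import java.io.IOException;

import org.apache.hadoop.hdfs.protocol.Block;
import org.apache.hadoop.hdfs.server.common.FileRegion;
import org.apache.hadoop.hdfs.server.common.blockaliasmap.BlockAliasMap;

class ProvidedRegionResolver {

  private final BlockAliasMap.Reader<FileRegion> reader;
  private final String bpid; // block pool id of the Namenode this slice serves

  ProvidedRegionResolver(BlockAliasMap.Reader<FileRegion> reader, String bpid) {
    this.reader = reader;
    this.bpid = bpid;
  }

  /**
   * Resolve a block only if the alias map serves this block pool. A block id
   * can collide across pools (e.g. PROVIDED on NN1, local on NN2), so regions
   * from other pools must not be loaded into the ReplicaMap.
   */
  FileRegion resolveIfSamePool(Block block) throws IOException {
    if (!bpid.equals(reader.getBlockPoolID())) {
      return null;
    }
    return reader.resolve(block);
  }
}
{code}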

> [AliasMap] Create a version of the AliasMap that runs in memory in the Namenode (leveldb)
> -----------------------------------------------------------------------------------------
>
>                 Key: HDFS-12665
>                 URL: https://issues.apache.org/jira/browse/HDFS-12665
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Ewan Higgs
>            Assignee: Ewan Higgs
>            Priority: Major
>         Attachments: HDFS-12665-HDFS-9806.001.patch, HDFS-12665-HDFS-9806.002.patch, HDFS-12665-HDFS-9806.003.patch
>
>
> The design of Provided Storage requires the use of an AliasMap to manage the mapping between blocks of files on the local HDFS and ranges of files on a remote storage system. To reduce load on the Namenode, this can be done using a pluggable external service (e.g. AzureTable, Cassandra, Ratis). However, to aid adoption and ease of deployment, we propose an in-memory version.
> This AliasMap will be a wrapper around LevelDB (already a dependency from the Timeline Service) and use protobuf for the key (blockpool, blockid, and genstamp) and the value (url, offset, length, nonce). The in-memory service will also have a configurable port on which it will listen for updates from Storage Policy Satisfier (SPS) Coordinating Datanodes (C-DN).


--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
