hadoop-hdfs-issues mailing list archives

From "Anu Engineer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11470) Ozone: SCM: Add SCM CLI
Date Fri, 31 Mar 2017 03:09:42 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15950247#comment-15950247 ]

Anu Engineer commented on HDFS-11470:
-------------------------------------

[~cheersyang] Thank you for the comments and questions. Specific questions are addressed below.

bq. Permission. Who can run these commands? Most of them seem to require admin privilege,
e.g. create/delete container, pipeline, node pools; some other commands might be applicable
for normal users: info container, pipeline, node pools. Usually commands under -admin require
admin permission and others don't; is it the same for SCM commands?
In the case of SCM, all commands -- at least for the time being -- will need "admin" privilege,
since users are never expected to work against raw SCM.

bq. Command output. What is the output of each command, json format data?
Good question; I haven't thought about it. What do you suggest? I am good with JSON, or
it could be simple text by default with a --json argument that switches the output to JSON.
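To make the --json idea concrete, here is a minimal sketch of how a single formatter could serve both output modes. The ContainerInfoFormatter class, its method, and its fields are purely illustrative assumptions, not part of any actual SCM CLI code:

```java
// Illustrative sketch only: one way a --json flag could switch output
// formats. Names here are invented for the example, not real SCM CLI code.
public class ContainerInfoFormatter {

  // Render container info either as human-readable text or as JSON,
  // depending on whether --json was passed on the command line.
  static String format(String containerId, String pipelineId, boolean json) {
    if (json) {
      return String.format(
          "{\"containerId\":\"%s\",\"pipelineId\":\"%s\"}",
          containerId, pipelineId);
    }
    return String.format("Container: %s%nPipeline:  %s",
        containerId, pipelineId);
  }

  public static void main(String[] args) {
    System.out.println(format("c-001", "p-abc", false));
    System.out.println(format("c-001", "p-abc", true));
  }
}
```

Keeping one formatting seam like this means every subcommand can honor the flag uniformly instead of each command growing its own output code.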

bq. Put key allows an argument -o to write a file into a container with the key; does this
only support local files? Does it make sense to support DFS-compatible files? Same for the
-o options of other commands, getContainer, getKey, etc.
Again, good catch. I was thinking mostly of local files, since from my perspective no real user
should ever put a block directly into SCM; they should use the ozone APIs to do that. This is mostly
for our internal testing purposes. But I see why supporting DFS would be a good idea.

bq. List key seems required; otherwise how would a user know which key names they can possibly
operate on? I implemented a listKey API in HDFS-11569; we can open a similar API for the command line.
That way the user can specify a prefix, prevKey name, and count to get a better-looking result. This
can seamlessly support listing keys range by range if the user wants.
The list you have implemented operates against a single container. Now imagine a billion
such containers; that is the SCM block address space. While the command is useful for us,
maybe we can expose exactly what you have done, that is, list the keys in a container.
What do you say? So list keys will take a containerID as an argument.
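A per-container listing along the lines discussed above could look roughly like this. The KeyLister class and its method names are hypothetical stand-ins for illustration, not the actual HDFS-11569 API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Illustrative sketch only: prefix + prevKey + count pagination over the
// keys of a single container. In the real CLI, the containerID argument
// would select which container's key set this operates on.
public class KeyLister {
  // Sorted key set standing in for one container's keys.
  private final TreeSet<String> keys = new TreeSet<>();

  void putKey(String key) {
    keys.add(key);
  }

  // Return up to 'count' keys strictly after 'prevKey' that match 'prefix'.
  // Passing the last key of one page as prevKey of the next page walks the
  // whole container range by range.
  List<String> listKeys(String prefix, String prevKey, int count) {
    List<String> result = new ArrayList<>();
    for (String k : keys.tailSet(prevKey, false)) {
      if (result.size() >= count) {
        break;
      }
      if (k.startsWith(prefix)) {
        result.add(k);
      }
    }
    return result;
  }
}
```

The point of the prevKey cursor is that no single call ever has to materialize the full key set, which matters once containers hold many keys.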

bq. Create container. Container ID is more an internal thing; would it make sense to let SCM
return a container ID to the user? Split this command into a two-phase call: first hdfs scm -container
create -p <pipelineID>, which returns a container ID in the output, then hdfs scm -container
create <containerID> -p <pipelineID> to actually create it. If this is just for
dev purposes, it might not be necessary.

Thanks for flagging it; it is a mistake on my part. I will fix it in the next update of the doc.

bq. List container. Would it be useful to support -end option as well? 
Are you talking about a case where the user wants to list all keys from A to K and does not know
the count? It is easy for us to support that if needed.

bq. Remove a node from pool. What happens at the backend, is it similar to decommission this
node? Do we move the containers on this node to another node in the same pool?
Nope. A node that is removed from a pool remains healthy and available, but it will not be used
by SCM to place new objects. SCM will discard all information from that node other than heartbeats
and replicate its closed containers to other machines. This is an internal step used by SCM to
move a machine from one pool to another.

Just to make sure that we are clear about this: when we decommission a machine, it is removed
from cluster membership, and even heartbeats from that node will be rejected once the decommission
is complete.

bq. This might not be relevant, but I still want to ask: will it be necessary to support balancing
a pool? That is, moving containers across nodes in the same pool to get better-balanced disk
usage.
Hopefully not. Please look at the patch in HDFS-11564, specifically the class comments in
{{SCMContainerPlacementCapacity.java}}; they explain why we hopefully do not need balance operations.
That said, it is not a bad idea to have the capability to balance.
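To illustrate why capacity-aware placement can make a separate balancer largely unnecessary, here is a sketch of the generic "two random choices" technique: sample two candidate nodes and place on the one with more free space. The Node class and all names are invented for the example; this is not the actual code in SCMContainerPlacementCapacity.java:

```java
import java.util.List;
import java.util.Random;

// Illustrative sketch only: capacity-aware placement via two random
// choices. Names are invented for the example.
public class TwoChoicePlacement {

  static class Node {
    final String name;
    final long freeBytes;

    Node(String name, long freeBytes) {
      this.name = name;
      this.freeBytes = freeBytes;
    }
  }

  // Pick two nodes uniformly at random and prefer the one with more free
  // space. Biasing placement toward emptier nodes at write time keeps
  // utilization close to even without a separate balancer pass.
  static Node choose(List<Node> nodes, Random rnd) {
    Node a = nodes.get(rnd.nextInt(nodes.size()));
    Node b = nodes.get(rnd.nextInt(nodes.size()));
    return a.freeBytes >= b.freeBytes ? a : b;
  }
}
```

With two uniform samples, a nearly full node is only chosen when both samples land on it, so skew is corrected continuously as new containers are placed.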

I will update the document on Monday; that will give the rest of the contributors time to read
the doc, and I can update it in a single go.

Once more, thank you very much for your time and thoughtful comments.

> Ozone: SCM: Add SCM CLI
> -----------------------
>
>                 Key: HDFS-11470
>                 URL: https://issues.apache.org/jira/browse/HDFS-11470
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Xiaoyu Yao
>            Assignee: Xiaoyu Yao
>         Attachments: storage-container-manager-cli-v001.pdf
>
>
> This jira describes the SCM CLI. Since the CLI will have lots of commands, we will file
> other JIRAs for specific commands.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

