hadoop-hdfs-issues mailing list archives

From "Anu Engineer (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDDS-1085) Create an OM API to serve snapshots to Recon server
Date Sat, 16 Feb 2019 01:13:00 GMT

    [ https://issues.apache.org/jira/browse/HDDS-1085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16769933#comment-16769933 ]

Anu Engineer edited comment on HDDS-1085 at 2/16/19 1:12 AM:
-------------------------------------------------------------

[~avijayan] It is a very good patch, well written and very easy to understand. I have some
very minor comments.
 # *DBCheckPointSnapShot#getCheckpointLocation* – Should this return a path?
 # *OMDbSnapshotServlet.java#doGet* - I understand that doing this inline is perhaps simpler
than anything else. But we seem to be doing two things, checkpointing and then tarring, before
we start the transfer. For DB sizes in the GBs this might be OK, but in the long run I am
worried that we might start seeing client timeouts.
 ## To understand what is happening, it might be interesting to have 3 counters – or a map
of counters:
 ### How much time we take for each checkpoint
 ### How much time we take for each tar operation, along with sizes
 ### How much time we take for the transfer
 ## You don't have to do this in this patch; feel free to add it in a different patch. In
the long run, if we have issues like client timeouts, these numbers will help us tune the
client params. Also, at some point, we will have to do this in a background thread and just
return when the snapshot is ready, instead of doing it synchronously like this. But this is a
great start. So let us go ahead and see what we can get out of this.
 # *OMDbSnapshotServlet.java#doGet* - Since we are using the TransferFsImage class, are we going
to carry the hadoop-hdfs jar too? Should we even consider moving this to hadoop-common? [~xyao],
[~elek], [~bharatviswa]
 # *OmUtils.java* - Check whether we already have this tar-file code in Ozone. I think we have
something like this already, [~elek]?
 # *OmUtils.java#addFilesToArchive* – In the recursive call we seem to pass _cFile.getAbsolutePath_.
Is that expected, or should the archive contain relative paths?
 # *RDBCheckpointManager#createCheckpointSnapshot* - I see we are reading the temp directory
from the JVM environment. But doesn't a RocksDB checkpoint need to be on the same disk as the
DB, or at least run faster there, since it can then hard link the SST and WAL files? Just
wanted to make sure that my understanding is not busted.
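
To make the counter suggestion above concrete, here is a minimal sketch using only the JDK. The class name SnapshotPhaseTimers, the phase names, and the placeholder bodies in main are assumptions for illustration; none of this is code from the patch or from Ozone's metrics classes.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical helper: accumulates elapsed time and call counts per named phase. */
public class SnapshotPhaseTimers {
  private final Map<String, LongAdder> totalNanos = new ConcurrentHashMap<>();
  private final Map<String, LongAdder> calls = new ConcurrentHashMap<>();

  /** Runs the task and records its duration under the given phase name. */
  public <T> T time(String phase, Callable<T> task) throws Exception {
    long start = System.nanoTime();
    try {
      return task.call();
    } finally {
      // Recorded even when the task throws, so failed attempts are visible too.
      long elapsed = System.nanoTime() - start;
      totalNanos.computeIfAbsent(phase, k -> new LongAdder()).add(elapsed);
      calls.computeIfAbsent(phase, k -> new LongAdder()).increment();
    }
  }

  /** Total nanoseconds spent in the phase so far, or 0 if it never ran. */
  public long totalNanos(String phase) {
    LongAdder adder = totalNanos.get(phase);
    return adder == null ? 0L : adder.sum();
  }

  /** Number of times the phase ran, or 0 if it never ran. */
  public long calls(String phase) {
    LongAdder adder = calls.get(phase);
    return adder == null ? 0L : adder.sum();
  }

  public static void main(String[] args) throws Exception {
    SnapshotPhaseTimers timers = new SnapshotPhaseTimers();
    // The three phases named in the review; the sleeps stand in for real work.
    timers.time("checkpoint", () -> { Thread.sleep(5); return null; });
    timers.time("tar", () -> { Thread.sleep(5); return null; });
    timers.time("transfer", () -> { Thread.sleep(5); return null; });
    for (String phase : new String[] {"checkpoint", "tar", "transfer"}) {
      System.out.println(phase + ": calls=" + timers.calls(phase)
          + " totalMs=" + timers.totalNanos(phase) / 1_000_000L);
    }
  }
}
```

A byte counter per phase could be added the same way to cover the "along with sizes" part. Dividing totalNanos by calls gives the average per request, which is the number to compare against the client timeout.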

 

 


> Create an OM API to serve snapshots to Recon server
> ---------------------------------------------------
>
>                 Key: HDDS-1085
>                 URL: https://issues.apache.org/jira/browse/HDDS-1085
>             Project: Hadoop Distributed Data Store
>          Issue Type: Sub-task
>            Reporter: Siddharth Wagle
>            Assignee: Aravindan Vijayan
>            Priority: Major
>         Attachments: HDDS-1085-000.patch, HDDS-1085-001.patch, HDDS-1085-002.patch
>
>
> We need to add an API to OM so that we can serve snapshots from the OM server.
>  - The snapshot should be streamed to fsck server with the ability to throttle network
> utilization (like TransferFsImage)
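
As a rough illustration of what throttling network utilization could look like, here is a plain-JDK sketch of a rate-limited stream copy. It is a stand-in only and does not reproduce how TransferFsImage throttles; the class name ThrottledCopy, the 8 KB buffer, and the bytesPerSecond parameter are assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper: copies a stream while capping the average transfer rate. */
public class ThrottledCopy {

  /** Copies in to out, sleeping as needed so the average rate stays at or below bytesPerSecond. */
  public static long copy(InputStream in, OutputStream out, long bytesPerSecond)
      throws IOException, InterruptedException {
    if (bytesPerSecond <= 0) {
      throw new IllegalArgumentException("bytesPerSecond must be positive");
    }
    byte[] buf = new byte[8192];
    long total = 0;
    long start = System.nanoTime();
    int n;
    while ((n = in.read(buf)) != -1) {
      out.write(buf, 0, n);
      total += n;
      // Time that should have elapsed by now at the target rate
      // (computed in double to avoid long overflow on multi-GB transfers).
      long expectedNanos = (long) ((double) total / bytesPerSecond * 1e9);
      long aheadNanos = expectedNanos - (System.nanoTime() - start);
      if (aheadNanos > 0) {
        // We are ahead of schedule, so wait until the schedule catches up.
        Thread.sleep(aheadNanos / 1_000_000L, (int) (aheadNanos % 1_000_000L));
      }
    }
    out.flush();
    return total;
  }

  public static void main(String[] args) throws Exception {
    byte[] data = new byte[20_000];
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    long begin = System.nanoTime();
    long copied = copy(new ByteArrayInputStream(data), sink, 100_000L);
    long millis = (System.nanoTime() - begin) / 1_000_000L;
    System.out.println("copied " + copied + " bytes in about " + millis + " ms");
  }
}
```

In a servlet, the output stream would be the response stream and the rate would come from configuration, so the snapshot transfer cannot saturate the OM's network link.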



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

