Date: Fri, 10 Jul 2015 22:15:05 +0000 (UTC)
From: "Arpit Agarwal (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Comment Edited] (HDFS-8677) Ozone: Introduce KeyValueContainerDatasetSpi

    [ https://issues.apache.org/jira/browse/HDFS-8677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14622960#comment-14622960 ]

Arpit Agarwal edited comment on HDFS-8677 at 7/10/15 10:14 PM:
---------------------------------------------------------------

Thanks for the detailed reviews!

----

[~kanaka], I addressed #1 and #3 from your feedback.

bq. 2) The interface methods can have public access as similar to old FsDatasetSpi

Interface methods are public by default, so the keyword is unnecessary.

----

[~anu], addressed your suggestions. A few comments below:

bq.
On KeyValueContainer: just some thoughts (needs no action from you right now). Does it make sense to support a getGenerationStamp/setGenerationStamp on Keys too? I was just thinking about how the interface might change when we support versions too. Another way to do it might be to support get(key, version).

The generation stamp as we are using it here is per-container, and it is updated only when setting up a new write pipeline. But I agree that in the future we may need a per-key version.

bq. At the protocol level the semantics we offer is "fail if a bucket is not empty" - void KeyValueContainer#destroy(). This function seems to offer a recursive delete. I can see why this is good to have (makes testing etc. easier), but it could lead to violation of the above semantics in the long run. Does it make sense to say destroy fails if the container is not empty? It might be that we have real use cases for this; if so, just document it.

I think this makes sense. I added a {{#delete}} method that only works on empty containers and removed {{#destroy}}.

bq. keyvaluecontainerdataset, in the comments: "The length of individual keys and values depends on the specific implementation but it is expected to be limited to a few KB." Our design document says 1024 bytes for the key length, and MBs to GBs for the value. You might want to update the comment.

Key-value containers will not store user values if they are over a few KB, since it would not be very performant to do so. Instead we'll store the user values in a file and store a reference to that file in the KV container. This is why we have the blob interface, which accepts streaming writes via a FileChannel. This is for the pipeline code to use and does not match what we expose to clients.

bq. nit: KeyValueContainerDatasetSpi#createContainer throws IOException, ContainerAlreadyExistsException; since ContainerAlreadyExistsException is derived from IOException this might be redundant.
On the other hand, the verbosity makes it clear what the most probable error is. I am just flagging it for your attention.

Yes, this was deliberate, to explain the contract better.

bq. Do we need a throws clause, or is this function guaranteed to never fail? KeyValueContainerDatasetSpi#getContainerSet; same comment for KeyValueContainerDatasetSpi#getBlobs.

Correct, the expectation is that these will not fail since they are in-memory operations.

bq. Since we support KeyValueContainerIterator#seek(byte[] key), does it make sense to support tell, as in "where am I right now"? Or do you suppose nextKey is adequate?

If I understand you correctly, {{nextKey}} is the equivalent of tell since it does not advance the iterator. Let me know if I misunderstood.

bq. keyvaluecontainerdataset, the comment "Blobs support streaming writes but not atomic puts". Can you explain this a little bit more? What exactly are the expected semantics here?

So the idea is that the pipeline code will create a file, stream user content into it, close the channel, and then insert a reference to the file into the corresponding KV container to make it visible in the namespace. Let me know if you think the interface documentation can be improved.
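For illustration, the "delete only works on empty containers" contract discussed above could look like the following minimal in-memory sketch. The class and exception names ({{InMemoryKeyValueContainer}}, {{ContainerNotEmptyException}}) are hypothetical stand-ins, not the actual HDFS-8677 API:

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical exception; the real patch may name this differently. */
class ContainerNotEmptyException extends IOException {
  ContainerNotEmptyException(String message) { super(message); }
}

/** Minimal in-memory stand-in for a key-value container. */
class InMemoryKeyValueContainer {
  private final Map<String, String> entries = new HashMap<>();
  private boolean deleted = false;

  void put(String key, String value) { entries.put(key, value); }

  void remove(String key) { entries.remove(key); }

  /** Fails unless the container is empty, matching the #delete contract. */
  void delete() throws ContainerNotEmptyException {
    if (!entries.isEmpty()) {
      throw new ContainerNotEmptyException(
          "container still holds " + entries.size() + " key(s)");
    }
    deleted = true;
  }

  boolean isDeleted() { return deleted; }
}

public class DeleteContractDemo {
  public static void main(String[] args) {
    InMemoryKeyValueContainer c = new InMemoryKeyValueContainer();
    c.put("k1", "v1");
    boolean rejected = false;
    try {
      c.delete();               // must fail: the container is not empty
    } catch (ContainerNotEmptyException e) {
      rejected = true;
    }
    System.out.println("non-empty delete rejected: " + rejected);
    c.remove("k1");
    try {
      c.delete();               // succeeds once the container is empty
    } catch (ContainerNotEmptyException e) {
      throw new AssertionError("empty delete should succeed", e);
    }
    System.out.println("empty delete succeeded: " + c.isDeleted());
  }
}
```

Unlike a recursive {{#destroy}}, this keeps the "fail if not empty" protocol semantics observable at the SPI level.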
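The blob write flow described above (create a file, stream user content through a FileChannel, close the channel, then publish a reference in the container) can be sketched as follows. The map standing in for the KV container and all the names here are illustrative, not the patch's real types:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashMap;
import java.util.Map;

/**
 * Sketch of a streaming (non-atomic-put) blob write: stage the data in a
 * file via a FileChannel, then make it visible by inserting a reference
 * into the key-value container only after the channel is closed.
 */
public class BlobWriteDemo {
  public static void main(String[] args) throws IOException {
    Map<String, String> kvContainer = new HashMap<>(); // stand-in container

    // 1. Stream the user data into a staging file through a FileChannel.
    Path blobFile = Files.createTempFile("blob-", ".dat");
    try (FileChannel ch = FileChannel.open(blobFile, StandardOpenOption.WRITE)) {
      ch.write(ByteBuffer.wrap("user data".getBytes(StandardCharsets.UTF_8)));
    }

    // 2. Only now publish the blob in the namespace by storing a reference
    //    (here, the file path) under the key. Readers never see a partial blob.
    kvContainer.put("bucket1/key1", blobFile.toString());

    System.out.println("blob visible: " + kvContainer.containsKey("bucket1/key1"));
    System.out.println("blob bytes: " + Files.size(blobFile));
  }
}
```

This separation is why large values never sit in the KV store itself: only the small reference does.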
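The two smaller points (the implicit public modifier on interface methods, and the technically redundant but self-documenting throws clause) can be seen in one small sketch. The interface and exception below are simplified stand-ins for the real declarations:

```java
import java.io.IOException;

/** Illustrative subclass mirroring ContainerAlreadyExistsException. */
class ContainerAlreadyExistsException extends IOException {
  ContainerAlreadyExistsException(String message) { super(message); }
}

interface ContainerDataset {
  // Interface methods are implicitly public and abstract, so no 'public'
  // keyword is needed. Listing the subclass alongside IOException is
  // redundant for the compiler, but documents the most probable error.
  void createContainer(String name)
      throws IOException, ContainerAlreadyExistsException;
}

public class ThrowsDemo {
  public static void main(String[] args) {
    ContainerDataset ds = name -> {
      throw new ContainerAlreadyExistsException(name + " already exists");
    };
    try {
      ds.createContainer("c1");
    } catch (IOException e) { // one catch suffices: the subclass IS-A IOException
      System.out.println("caught as IOException: "
          + (e instanceof ContainerAlreadyExistsException));
    }
  }
}
```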
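The nextKey-as-tell semantics can be illustrated with a toy iterator over a sorted map; {{SketchKeyIterator}} is hypothetical and only mirrors the seek/nextKey behavior discussed, not the actual KeyValueContainerIterator:

```java
import java.util.NavigableMap;
import java.util.TreeMap;

/**
 * Toy iterator where nextKey() peeks at the cursor without advancing it
 * (equivalent to a "tell"), while next() returns the key and advances.
 */
class SketchKeyIterator {
  private final NavigableMap<String, String> map;
  private String cursor; // the next key to be returned

  SketchKeyIterator(NavigableMap<String, String> map) {
    this.map = map;
    this.cursor = map.isEmpty() ? null : map.firstKey();
  }

  /** Position the cursor at the first key >= target. */
  void seek(String target) { cursor = map.ceilingKey(target); }

  /** "Tell": report the cursor position without advancing it. */
  String nextKey() { return cursor; }

  /** Return the current key, then advance the cursor. */
  String next() {
    String key = cursor;
    if (key != null) {
      cursor = map.higherKey(key);
    }
    return key;
  }
}

public class IteratorDemo {
  public static void main(String[] args) {
    NavigableMap<String, String> m = new TreeMap<>();
    m.put("a", "1");
    m.put("b", "2");
    m.put("c", "3");
    SketchKeyIterator it = new SketchKeyIterator(m);
    it.seek("b");
    System.out.println(it.nextKey()); // "b": a peek...
    System.out.println(it.nextKey()); // ...still "b": nextKey does not advance
    System.out.println(it.next());    // "b", and the cursor advances
    System.out.println(it.nextKey()); // "c"
  }
}
```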
> Ozone: Introduce KeyValueContainerDatasetSpi
> --------------------------------------------
>
>                 Key: HDFS-8677
>                 URL: https://issues.apache.org/jira/browse/HDFS-8677
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>            Reporter: Arpit Agarwal
>            Assignee: Arpit Agarwal
>         Attachments: HDFS-8677-HDFS-7240.01.patch, HDFS-8677-HDFS-7240.02.patch, HDFS-8677-HDFS-7240.03.patch
>
>
> KeyValueContainerDatasetSpi will be a new interface for Ozone containers, just as FsDatasetSpi is an interface for manipulating HDFS block files.
> The interface will have support for both key-value containers for storing Ozone metadata and blobs for storing user data.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)