hadoop-hdfs-issues mailing list archives

From "Arpit Agarwal (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8677) Ozone: Introduce KeyValueContainerDatasetSpi
Date Fri, 10 Jul 2015 22:05:04 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14622960#comment-14622960 ]

Arpit Agarwal commented on HDFS-8677:

Thanks for the detailed reviews!

[~kanaka], I addressed #1 and #3 from your feedback.

bq. 2) The interface methods can have public access as similar to old FsDatasetSpi
Interface methods are implicitly public, so the keyword is unnecessary.
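To illustrate the point, here is a minimal sketch (the interface and method names below are hypothetical, not from the patch) showing that interface methods carry the public modifier whether or not it is written out:

```java
import java.lang.reflect.Modifier;

// Hypothetical interface: both methods end up public and abstract; the
// explicit "public" on the second one is redundant.
interface ContainerDataset {
    void createContainer(String name);          // implicitly public abstract
    public void deleteContainer(String name);   // "public" here is redundant
}

public class InterfaceModifierDemo {
    public static void main(String[] args) throws Exception {
        // Reflection reports identical modifiers for both methods.
        int create = ContainerDataset.class
                .getMethod("createContainer", String.class).getModifiers();
        int delete = ContainerDataset.class
                .getMethod("deleteContainer", String.class).getModifiers();
        System.out.println(Modifier.isPublic(create) && Modifier.isPublic(delete));
    }
}
```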

[~anu], I addressed your suggestions. A few comments below:
bq. On KeyValueContainer: Just some thoughts (needs no action from you right now). Does it
make sense to support a getGenerationStamp/setGenerationStamp on Keys too? I was just thinking
about how the interface might change when we support versions too. Another way to do it might
be to support get(key, version).
The generation stamp as we use it here is per-container and is updated only when setting up
a new write pipeline. But I agree that in the future we may need a per-key version.
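As a sketch only (none of these class or method names are from the patch), a per-key version could coexist with the per-container generation stamp along these lines, with the get(key, version) shape Anu suggested:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: per-container generation stamp plus per-key versions.
class VersionedKeyValueContainer {
    private long generationStamp;   // per-container, set per write pipeline
    private final Map<String, TreeMap<Long, byte[]>> store = new HashMap<>();

    void setGenerationStamp(long gs) { this.generationStamp = gs; }
    long getGenerationStamp() { return generationStamp; }

    void put(String key, long version, byte[] value) {
        store.computeIfAbsent(key, k -> new TreeMap<>()).put(version, value);
    }

    // get(key, version): null if that exact version is absent.
    byte[] get(String key, long version) {
        TreeMap<Long, byte[]> versions = store.get(key);
        return versions == null ? null : versions.get(version);
    }

    // Convenience overload: latest version of the key, or null if absent.
    byte[] get(String key) {
        TreeMap<Long, byte[]> versions = store.get(key);
        return (versions == null || versions.isEmpty())
                ? null : versions.lastEntry().getValue();
    }
}
```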
bq. At the protocol level the semantics we offer is "fail if a bucket is not empty" - void
KeyValueContainer#destroy() This function seems to offer a recursive delete. I can see why
this is good to have (makes testing etc easier), but could lead to violation of the above
semantics in the long run. Does it make sense to say destroy fails if container is not empty.
It might be that we have real use cases for this, if so just document it.
I think this makes sense. I added a {{#delete}} method that only works on empty containers and
removed {{#destroy}}.
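The adopted contract can be sketched as follows (class and method names here are illustrative, not the actual patch): delete refuses when any keys remain, mirroring the protocol-level "fail if not empty" semantics, and there is no recursive destroy.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the "#delete only on empty containers" contract.
class KeyValueContainerSketch {
    private final Map<String, byte[]> entries = new HashMap<>();
    private boolean deleted = false;

    void put(String key, byte[] value) { entries.put(key, value); }
    void removeKey(String key) { entries.remove(key); }

    // Delete the container itself; refuse if any keys remain.
    void delete() throws IOException {
        if (!entries.isEmpty()) {
            throw new IOException(
                "Container not empty: " + entries.size() + " key(s) remain");
        }
        deleted = true;
    }

    boolean isDeleted() { return deleted; }
}
```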

bq. keyvaluecontainerdataset in the comments "The length of individual keys and values depends
on the specific implementation but it is expected to be limited to a few KB." Our design document
says - 1024 bytes for Key Length, and MBs to GBs for the value. You might want to update the
comment.
Key-value containers will not store user values larger than a few KB, since doing so would
perform poorly. Instead we'll store the user value in a file and keep a reference to the file
in the KV container. This is why we have the blob interface, which accepts streaming writes
via a FileChannel.
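A hedged sketch of this spill-to-file idea (the class name, threshold, and reference format are all illustrative assumptions, not the actual interface): values over a small threshold are streamed to a side file through a FileChannel, and the KV entry would hold only a reference.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: spill large values to a file via FileChannel.
class BlobSpillSketch {
    static final int INLINE_LIMIT = 4 * 1024;   // assumed "few KB" threshold

    // Returns either an inline marker or a "file:" reference by size.
    static String store(Path dir, String key, byte[] value) throws IOException {
        if (value.length <= INLINE_LIMIT) {
            return "inline:" + value.length;     // small value stays in the KV container
        }
        Path blob = dir.resolve(key + ".blob");
        try (FileChannel ch = FileChannel.open(blob,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            ByteBuffer buf = ByteBuffer.wrap(value);
            while (buf.hasRemaining()) {
                ch.write(buf);                   // streaming write via FileChannel
            }
        }
        return "file:" + blob;                   // KV container keeps only the reference
    }
}
```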

bq. nit: KeyValueContainerDatasetSpi#createContainer throws IOException, ContainerAlreadyExistsException;
since ContainerAlreadyExistsException is derived from IOException this might be redundant.
On the other hand, the verbosity makes it clear what the most probable error is. I am just
flagging it for your attention.
Yes, this was deliberate, to make the contract clearer.
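The redundancy can be shown in miniature (the dataset class below is a sketch, not the real SPI; only the exception name comes from the discussion): since ContainerAlreadyExistsException extends IOException, the second throws clause is subsumed by the first, but listing it documents the most likely failure mode.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

// The exception derives from IOException, as noted in the review.
class ContainerAlreadyExistsException extends IOException {
    ContainerAlreadyExistsException(String name) {
        super("Container already exists: " + name);
    }
}

// Hypothetical sketch of the signature under discussion.
class DatasetSketch {
    private final Set<String> containers = new HashSet<>();

    // The second throws clause is covered by the first; it is kept
    // deliberately to spell out the expected failure in the contract.
    void createContainer(String name)
            throws IOException, ContainerAlreadyExistsException {
        if (!containers.add(name)) {
            throw new ContainerAlreadyExistsException(name);
        }
    }
}
```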

> Ozone: Introduce KeyValueContainerDatasetSpi
> --------------------------------------------
>                 Key: HDFS-8677
>                 URL: https://issues.apache.org/jira/browse/HDFS-8677
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>            Reporter: Arpit Agarwal
>            Assignee: Arpit Agarwal
>         Attachments: HDFS-8677-HDFS-7240.01.patch, HDFS-8677-HDFS-7240.02.patch
> KeyValueContainerDatasetSpi will be a new interface for Ozone containers, just as FsDatasetSpi
> is an interface for manipulating HDFS block files.
> The interface will have support for both key-value containers for storing Ozone metadata
> and blobs for storing user data.

This message was sent by Atlassian JIRA
