hadoop-hdfs-issues mailing list archives

From "Anu Engineer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11926) Ozone: Implement a common helper to return a range of KVs in levelDB
Date Wed, 07 Jun 2017 04:47:18 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16040159#comment-16040159 ]

Anu Engineer commented on HDFS-11926:

[~cheersyang] [~xyao] and I spent quite some time today trying to pinpoint whether the semantics
of RocksDB and LevelDB are the same. Unfortunately, the LevelDB documentation seems to indicate
that we cannot rely on what RocksDB guarantees. The relevant passage reads:

A database may only be opened by one process at a time. The leveldb implementation acquires
a lock from the operating system to prevent misuse. Within a single process, the same leveldb::DB
object may be safely shared by multiple concurrent threads. I.e., different threads may write
into or fetch iterators or call Get on the same database without any external synchronization
(the leveldb implementation will automatically do the required synchronization). However other
objects (like Iterator and WriteBatch) may require external synchronization. If two threads
share such an object, they must protect access to it using their own locking protocol. More
details are available in the public header files.

However, we found a discussion on Google Groups that seemed to confirm what [~xyao] said
in his comments. So, for all we know, it *might* work without snapshots. The downside
is that we would have to reproduce the race condition to test it, and your comments seem to
indicate that the cost of a snapshot is very low. Our strict reading of the documentation above
is that if two threads use the same object, iteration is thread safe between them.
However, we could not work out how this behaves when concurrent readers and writers use
different objects. That is, the documentation does not tell us whether 2 calls in a thread
would see the same state (based on a version number) or not.
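
To make the question concrete, here is a minimal sketch of the guarantee a snapshot is meant to give: two reads through the same snapshot agree even if a write lands in between. The class `SnapshotReadSketch` and its methods are hypothetical, and the store is a plain in-memory `TreeMap`, not the LevelDB or RocksDB API, so it shows the intended semantics only and says nothing about how either library actually behaves.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a key-value store (not the LevelDB or RocksDB API).
class SnapshotReadSketch {
    private final TreeMap<String, String> live = new TreeMap<>();

    synchronized void put(String key, String value) {
        live.put(key, value);
    }

    // Point-in-time, read-only copy taken under the store's lock.
    // The O(n) copy only illustrates the semantics; it says nothing about real snapshot cost.
    synchronized NavigableMap<String, String> snapshot() {
        return Collections.unmodifiableNavigableMap(new TreeMap<>(live));
    }

    // Keys currently visible in the live store.
    synchronized List<String> liveKeys() {
        return new ArrayList<>(live.keySet());
    }

    // Keys visible through a given snapshot.
    static List<String> keys(NavigableMap<String, String> view) {
        return new ArrayList<>(view.keySet());
    }

    public static void main(String[] args) {
        SnapshotReadSketch db = new SnapshotReadSketch();
        db.put("a", "1");
        db.put("b", "2");

        NavigableMap<String, String> snap = db.snapshot();
        List<String> firstRead = keys(snap);
        db.put("c", "3"); // a write lands between the two reads
        List<String> secondRead = keys(snap);

        System.out.println("reads agree: " + firstRead.equals(secondRead));
        System.out.println("live keys:   " + db.liveKeys());
    }
}
```

Without the snapshot, the second read would go to the live store and could observe the intervening write, which is exactly the case the documentation leaves open.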

There are 2 options for us: read the LevelDB source to understand how this code behaves,
or switch over to RocksDB, which explicitly documents this behavior.

> Ozone: Implement a common helper to return a range of KVs in levelDB
> --------------------------------------------------------------------
>                 Key: HDFS-11926
>                 URL: https://issues.apache.org/jira/browse/HDFS-11926
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>            Reporter: Weiwei Yang
>            Assignee: Weiwei Yang
>            Priority: Blocker
>         Attachments: HDFS-11926-HDFS-7240.001.patch, HDFS-11926-HDFS-7240.002.patch,
>                      HDFS-11926-HDFS-7240.003.patch, HDFS-11926-HDFS-7240.004.patch
> There are quite a few *LIST* operations that need to get a range of keys or values from levelDB
> and filter entries by key prefix.
> # HDFS-11782 listKeys
> # HDFS-11779 listBuckets
> # HDFS-11773 listVolumes
> # HDFS-11679 listContainers
> We need to implement a common utility for them.
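
For illustration, the kind of helper the issue describes might look like the sketch below: seek to a start key, collect entries while the key still carries the prefix, and stop at a maximum count. Everything here is an assumption made for the example: the class `KvRangeSketch`, the method `rangeKVs`, its parameters, and the key layout (`/vol1/bucketA`) are hypothetical, and a sorted in-memory `NavigableMap` stands in for the store. It is not the patch attached to this issue and does not use the LevelDB API.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical range helper over a sorted in-memory map (not the LevelDB API).
class KvRangeSketch {

    // Returns up to maxCount entries whose keys start with prefix,
    // beginning at startKey (inclusive) when it is given.
    static List<Map.Entry<String, String>> rangeKVs(
            NavigableMap<String, String> store, String startKey, String prefix, int maxCount) {
        List<Map.Entry<String, String>> result = new ArrayList<>();
        if (maxCount <= 0) {
            return result;
        }
        String p = (prefix == null) ? "" : prefix;
        // Keys sharing a prefix are contiguous in sorted order and begin at the prefix itself,
        // so seek to whichever of startKey and prefix sorts later.
        String seek = (startKey != null && startKey.compareTo(p) > 0) ? startKey : p;
        for (Map.Entry<String, String> e : store.tailMap(seek, true).entrySet()) {
            if (!e.getKey().startsWith(p)) {
                break; // past the end of the prefix range
            }
            result.add(new AbstractMap.SimpleImmutableEntry<>(e.getKey(), e.getValue()));
            if (result.size() == maxCount) {
                break;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> store = new TreeMap<>();
        store.put("/vol1/bucketA", "a");
        store.put("/vol1/bucketB", "b");
        store.put("/vol1/bucketC", "c");
        store.put("/vol2/bucketA", "d");

        // List at most 2 entries under /vol1/, resuming from bucketB.
        for (Map.Entry<String, String> e : rangeKVs(store, "/vol1/bucketB", "/vol1/", 2)) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

Passing the last key of one page as `startKey` of the next gives simple pagination, which is roughly what the listKeys, listBuckets, listVolumes, and listContainers calls would share.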
