hadoop-hdfs-issues mailing list archives

From "Anu Engineer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-12071) Ozone: Corona: Implementation of Corona
Date Thu, 20 Jul 2017 22:32:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16095482#comment-16095482 ]

Anu Engineer commented on HDFS-12071:
-------------------------------------

+1, I ran corona successfully on a test cluster. We need more stats, the ability to control
threads, the ability to dump JMX from the server etc. etc. But this is an amazing start. For
the first time, I just wrote *78 thousand keys into Ozone*. I am happy to report ozone worked
perfectly. So I am going to commit this.


> Ozone: Corona: Implementation of Corona
> ---------------------------------------
>
>                 Key: HDFS-12071
>                 URL: https://issues.apache.org/jira/browse/HDFS-12071
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>            Reporter: Nandakumar
>            Assignee: Nandakumar
>         Attachments: HDFS-12071-HDFS-7240.000.patch, HDFS-12071-HDFS-7240.001.patch, HDFS-12071-HDFS-7240.002.patch
>
>
> Tool to populate ozone with data for testing.
> This is not a map-reduce program and this is not for benchmarking Ozone write throughput.
> It supports both online and offline modes. The default mode is offline; {{-mode}} can be
> used to change the mode.
>  
> In online mode, an active internet connection is required, and Common Crawl data from AWS
> will be used. The default source is [CC-MAIN-2017-17/warc.paths.gz | https://commoncrawl.s3.amazonaws.com/crawl-data/CC-MAIN-2017-17/warc.paths.gz]
> (it contains the path to the actual data segment); the user can override this using {{-source}}.
> The following values are derived from the URL of the Common Crawl data:
> * Domain will be used as Volume
> * URL will be used as Bucket
> * FileName will be used as Key
>  
> In offline mode, the data will be random bytes and the size of the data will be 10 KB.
> * Default number of Volumes 10, {{-numOfVolumes}} can be used to override
> * Default number of Buckets per Volume 1000, {{-numOfBuckets}} can be used to override
> * Default number of Keys per Bucket 500000, {{-numOfKeys}} can be used to override
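
For a sense of scale, the offline defaults quoted above can be multiplied out. The sketch below is not part of Corona; the function name offline_totals is made up for illustration, and it assumes that "10 KB" applies to each key and means 10 * 1024 bytes, neither of which the description states explicitly.

```python
# Back-of-the-envelope totals for the offline-mode defaults in the description.
# Assumptions (not stated in the issue): 10 KB is the size of each key,
# and 1 KB = 1024 bytes.

NUM_VOLUMES = 10             # default for -numOfVolumes
BUCKETS_PER_VOLUME = 1000    # default for -numOfBuckets
KEYS_PER_BUCKET = 500_000    # default for -numOfKeys
KEY_SIZE_BYTES = 10 * 1024   # assumed per-key payload size


def offline_totals(volumes=NUM_VOLUMES,
                   buckets_per_volume=BUCKETS_PER_VOLUME,
                   keys_per_bucket=KEYS_PER_BUCKET,
                   key_size_bytes=KEY_SIZE_BYTES):
    """Return (total_buckets, total_keys, total_bytes) for a full run."""
    total_buckets = volumes * buckets_per_volume
    total_keys = total_buckets * keys_per_bucket
    total_bytes = total_keys * key_size_bytes
    return total_buckets, total_keys, total_bytes


if __name__ == "__main__":
    buckets, keys, size = offline_totals()
    print(f"buckets: {buckets:,}")
    print(f"keys:    {keys:,}")
    print(f"bytes:   {size:,} (~{size / 2**40:.1f} TiB)")
```

Under those assumptions a run with every default comes to 10,000 buckets, 5,000,000,000 keys and 51,200,000,000,000 bytes (about 46.6 TiB), so on a small test cluster the three override flags will usually be needed.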



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)