hadoop-hdfs-issues mailing list archives

From "Nandakumar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-12537) Ozone: Reduce key creation overhead in Corona
Date Wed, 04 Oct 2017 16:49:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16191549#comment-16191549
] 

Nandakumar commented on HDFS-12537:
-----------------------------------

Thanks [~ljain] for updating the patch.

In *Corona.java*

If {{threadThroughput}} only holds each thread's throughput, a simple list made thread-safe
with {{Collections.synchronizedList()}} would suffice instead of a {{BlockingQueue<Double>}}.
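
A minimal sketch of that suggestion, assuming worker threads only append to {{threadThroughput}} and the values are read after the workers finish. The class {{ThroughputCollector}} and its methods are hypothetical and not taken from the patch:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the part of Corona that collects per-thread throughput.
class ThroughputCollector {

  // A plain ArrayList wrapped so that individual add/size calls are synchronized.
  private final List<Double> threadThroughput =
      Collections.synchronizedList(new ArrayList<Double>());

  void addSample(double value) {
    threadThroughput.add(value);
  }

  int count() {
    return threadThroughput.size();
  }

  double total() {
    double sum = 0;
    // Iteration over a synchronized list must be guarded manually.
    synchronized (threadThroughput) {
      for (double v : threadThroughput) {
        sum += v;
      }
    }
    return sum;
  }

  // Starts numThreads workers; worker i reports the made-up throughput value i.
  static ThroughputCollector runWorkers(int numThreads) {
    ThroughputCollector collector = new ThroughputCollector();
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    for (int i = 1; i <= numThreads; i++) {
      final double throughput = i;
      pool.execute(() -> collector.addSample(throughput));
    }
    pool.shutdown();
    try {
      if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
        throw new IllegalStateException("workers did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return collector;
  }

  public static void main(String[] args) {
    ThroughputCollector collector = runWorkers(8);
    System.out.println(collector.count() + " samples, total " + collector.total());
  }
}
```

The wrapper is enough here because nothing blocks waiting for an element, which is the main thing a {{BlockingQueue}} would add.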

Line 325: This
{code}
    jsonDir = cmdLine.hasOption(JSON_WRITE_DIRECTORY) ?
        cmdLine.getOptionValue(JSON_WRITE_DIRECTORY) : null;
{code}
can be replaced with
{code}
jsonDir = cmdLine.getOptionValue(JSON_WRITE_DIRECTORY);
{code}
{{CommandLine#getOptionValue}} returns the option's argument value if the option is set and
has an argument; otherwise it returns null.

Line 412: There is no need to assign {{keyValue}} to the local variable {{value}}; {{keyValue}}
can be used directly in {{os.write(keyValue)}}.

Line 428: An incomplete value is added for validation, which will cause the validation of writes
to fail. Since the value can be huge, we can use a checksum to optimize data validation;
this can be done in a follow-up JIRA. For now, you can add the complete value for validation.
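
As a rough illustration of the checksum idea for the follow-up, here is a sketch that keeps only a CRC32 of the written value and compares it with the CRC32 of the bytes read back. CRC32 is one possible choice, not something decided in this JIRA, and the class {{ChecksumValidation}} and its methods are hypothetical:

```java
import java.util.Arrays;
import java.util.zip.CRC32;

// Hypothetical helper: validate a written value by checksum instead of keeping a full copy.
class ChecksumValidation {

  // Computes the CRC32 of the whole byte array.
  static long checksum(byte[] data) {
    CRC32 crc = new CRC32();
    crc.update(data, 0, data.length);
    return crc.getValue();
  }

  // True when the bytes read back hash to the checksum recorded at write time.
  static boolean matches(long expected, byte[] readBack) {
    return expected == checksum(readBack);
  }

  public static void main(String[] args) {
    // Stand-in for a large key value written by the tool.
    byte[] keyValue = new byte[1024 * 1024];
    Arrays.fill(keyValue, (byte) 'a');

    // Only this long needs to be queued for validation, not the 1 MB value.
    long expected = checksum(keyValue);

    byte[] readBack = keyValue.clone();
    System.out.println("intact value matches: " + matches(expected, readBack));

    readBack[readBack.length - 1] ^= 1; // simulate a corrupted read
    System.out.println("corrupted value matches: " + matches(expected, readBack));
  }
}
```

With this approach the validation queue holds one {{long}} per key rather than the full value, at the cost of a small chance of missing a corruption that happens to produce the same checksum.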



> Ozone: Reduce key creation overhead in Corona
> ---------------------------------------------
>
>                 Key: HDFS-12537
>                 URL: https://issues.apache.org/jira/browse/HDFS-12537
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Lokesh Jain
>            Assignee: Lokesh Jain
>         Attachments: HDFS-12537-HDFS-7240.001.patch, HDFS-12537-HDFS-7240.002.patch,
HDFS-12537-HDFS-7240.003.patch
>
>
> Currently Corona creates random key values for each key. This creates a lot of overhead.
An option should be provided to use a single key value.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

