Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 9BB3F200D08 for ; Thu, 7 Sep 2017 08:05:13 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 9A39A16130A; Thu, 7 Sep 2017 06:05:13 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id BB697161300 for ; Thu, 7 Sep 2017 08:05:12 +0200 (CEST) Received: (qmail 26834 invoked by uid 500); 7 Sep 2017 06:05:11 -0000 Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Delivered-To: mailing list hdfs-issues@hadoop.apache.org Received: (qmail 26823 invoked by uid 99); 7 Sep 2017 06:05:11 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd1-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 07 Sep 2017 06:05:11 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd1-us-west.apache.org (ASF Mail Server at spamd1-us-west.apache.org) with ESMTP id 2BE6CD7A9F for ; Thu, 7 Sep 2017 06:05:11 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd1-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: -99.201 X-Spam-Level: X-Spam-Status: No, score=-99.201 tagged_above=-999 required=6.31 tests=[KAM_ASCII_DIVIDERS=0.8, RP_MATCHES_RCVD=-0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001, USER_IN_WHITELIST=-100] autolearn=disabled Received: from mx1-lw-us.apache.org ([10.40.0.8]) by localhost (spamd1-us-west.apache.org [10.40.0.7]) (amavisd-new, port 10024) with ESMTP id QfZXAQXL5qp4 for ; Thu, 7 Sep 2017 06:05:06 +0000 (UTC) Received: from mailrelay1-us-west.apache.org (mailrelay1-us-west.apache.org [209.188.14.139]) by mx1-lw-us.apache.org (ASF Mail Server at mx1-lw-us.apache.org) with ESMTP id 26C4C5F257 for ; Thu, 7 Sep 2017 06:05:06 +0000 (UTC) Received: from jira-lw-us.apache.org (unknown [207.244.88.139]) by mailrelay1-us-west.apache.org (ASF Mail Server at mailrelay1-us-west.apache.org) with ESMTP id A0172E0059 for ; Thu, 7 Sep 2017 06:05:05 +0000 (UTC) Received: from jira-lw-us.apache.org (localhost [127.0.0.1]) by jira-lw-us.apache.org (ASF Mail Server at jira-lw-us.apache.org) with ESMTP id 5714524153 for ; Thu, 7 Sep 2017 06:05:00 +0000 (UTC) Date: Thu, 7 Sep 2017 06:05:00 +0000 (UTC) From: "Anu Engineer (JIRA)" To: hdfs-issues@hadoop.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Comment Edited] (HDFS-12340) Ozone: C/C++ implementation of ozone client using curl MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 archived-at: Thu, 07 Sep 2017 06:05:13 -0000 [ https://issues.apache.org/jira/browse/HDFS-12340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156526#comment-16156526 ] Anu Engineer edited comment on HDFS-12340 at 9/7/17 6:04 AM: ------------------------------------------------------------- Thank you for updating the patch. Some more minor comments. 1. /hadoop-common/src/CMakeLists.txt: Can we please handle the case where there is no curl installed on the build machine? if not found case. 2.nit: Can we please format the curl finding part consistent with the code around it? 3. src/main/native/libozoneclient/CMakeLists.txt: nit: I find it unusable to name the extension with a capital letter as in ".C", if there is such a convention can you please share? 4. LIBOZONECLIENT_VERSION "0.0.0", can we please set to this to 0.0.1 ? 5. Same comment for other CMake files where we use a ".C" file convention. 6. In main.c -- Can we use a proper command parser instead of case 1: 2: etc. I am looking for something like {noformat} > createVolume volume1 > help {noformat} etc. and correspondingly change the printUsage ? 7. In main.c -- since we have a bunch of isValidStr calls in the code, does it make sense to put them in the same place and have one printUsage call ? also would you like to init rc to -1 or 1 as the case may be and just return the rc always? 8. In main.c: main function -- can you please init all the variables. {noformat} char user[64] = {0}; char serverIp[64] = {0}; char port[64] = {0}; char version[16] = {0}; char vol[256] = {0}; char bucket[256] = {0}; char infile[256] = {0}; char key[256] = {0}; {noformat} 9. ozone_client.C: nit: can we please rename this as ozone_client.c -- small "c" as the name of the file. 10. we have isValidStr twice, I agree the scope is restricted, but I think we can define it once and can be used with 0 to cover the first use case. 11. nit: int executeCurlDownloadOpearation(ozoneInfo* info, char *url, FILE *fd); can we please have consistent placement for the * operator ? always with the variable and away from the type ? 12: Again a nit: you don't have to fix this. When we write a library, it is very rare that we will print to streams, even error stream. Generally, we will return different error codes so that calling application can decide how to print it, or we can provide a function that returns error string if needed. It makes it easier to use the library under c/c++ code. Please see how curl itself does this. 13: nit : you don't need these lines. {noformat} /* initialize the ozone Info structure */ info->curl = NULL; info->chunk = NULL; info->url = NULL; info->user = NULL; info->version = NULL; {noformat} calloc already takes care of this. 14: in the error path of initOzoneClient, don't we have to call the opposite function of curl_easy_init, that is curl_easy_cleanup ? 15: In function initOzoneClient, the variable rc is not needed. You can jump to exit only if you get a failure and return info just before that. That way you can eliminate the if / else condition too. 16: In function resetOzoneInfo: nit: line 148, double semi-colon at the end. 17: Before we dereference info, check it is not null. Line 152 18: resetOzoneInfo: The error handling of this function seems wrong. Here is a case where there could be a resource leak. Let us say the function fails at line 158, where is trying to do curl_easy_init(). The function fails with returning \-1. Now, what do I do with the ozoneInfo object? I cannot call the resetOzoneInfo function safely, since curl_easy_cleanup will fail if I try to call this function again, and I have no idea how to release this object. I would suggest that we re-write it, possibly with better error handling strategy. I cannot call destroyOzoneClient since it would suffer from the same issue that curl_easy_cleanup would happen on the info->curl, which has not been allocated. 19: In resetOzoneInfo: Is it possible that chunk will just accumulate the same tags again and again ? 20: In createBucket: {noformat} strcpy(url, info->url); strcat(url, volumeName); strcat(url, "/"); strcat(url, bucketName); {noformat} You can replace this with a single call to sprintf, same comment every other function. 21: In putKey: if we fail the if test in line 341/346/351/356 the program will crash since we are closing a NULL file in the exit path. was (Author: anu): Thank you for updating the patch. Some more minor comments. 1. /hadoop-common/src/CMakeLists.txt: Can we please handle the case where there is not curl installed on the build machine? if not found case. 2.nit: Can we please format the curl finding part consistent with the code around it? 3. src/main/native/libozoneclient/CMakeLists.txt: nit: I find it unusable to name the extension with a capital letter as in ".C", if there is such a convention can you please share? 4. LIBOZONECLIENT_VERSION "0.0.0", can we please set to this to 0.0.1 ? 5. Same comment for other CMake files where we use a ".C" file convention. 6. In main.c -- Can we use a proper command parser instead of case 1: 2: etc. I am looking for something like {noformat} > createVolume volume1 > help {noformat} etc. and correspondingly change the printUsage ? 7. In main.c -- since we have a bunch of isValidStr calls in the code, does it make sense to put them in the same place and have one printUsage call ? also would you like to init rc to -1 or 1 as the case may be and just return the rc always? 8. In main.c: main function -- can you please init all the variables. {noformat} char user[64] = {0}; char serverIp[64] = {0}; char port[64] = {0}; char version[16] = {0}; char vol[256] = {0}; char bucket[256] = {0}; char infile[256] = {0}; char key[256] = {0}; {noformat} 9. ozone_client.C: nit: can we please rename this as ozone_client.c -- small "c" as the name of the file. 10. we have isValidStr twice, I agree the scope is restricted, but I think we can define it once and can be used with 0 to cover the first use case. 11. nit: int executeCurlDownloadOpearation(ozoneInfo* info, char *url, FILE *fd); can we please have consistent placement for the * operator ? always with the variable and away from the type ? 12: Again a nit: you don't have to fix this. When we write a library, it is very rare that we will print to streams, even error stream. Generally, we will return different error codes so that calling application can decide how to print it, or we can provide a function that returns error string if needed. It makes it easier to use the library under c/c++ code. Please see how curl itself does this. 13: nit : you don't need these lines. {noformat} /* initialize the ozone Info structure */ info->curl = NULL; info->chunk = NULL; info->url = NULL; info->user = NULL; info->version = NULL; {noformat} calloc already takes care of this. 14: in the error path of initOzoneClient, don't we have to call the opposite function of curl_easy_init, that is curl_easy_cleanup ? 15: In function initOzoneClient, the variable rc is not needed. You can jump to exit only if you get a failure and return info just before that. That way you can eliminate the if / else condition too. 16: In function resetOzoneInfo: nit: line 148, double semi-colon at the end. 17: Before we dereference info, check it is not null. Line 152 18: resetOzoneInfo: The error handling of this function seems wrong. Here is a case where there could be a resource leak. Let us say the function fails at line 158, where is trying to do curl_easy_init(). The function fails with returning \-1. Now, what do I do with the ozoneInfo object? I cannot call the resetOzoneInfo function safely, since curl_easy_cleanup will fail if I try to call this function again, and I have no idea how to release this object. I would suggest that we re-write it, possibly with better error handling strategy. I cannot call destroyOzoneClient since it would suffer from the same issue that curl_easy_cleanup would happen on the info->curl, which has not been allocated. 19: In resetOzoneInfo: Is it possible that chunk will just accumulate the same tags again and again ? 20: In createBucket: {noformat} strcpy(url, info->url); strcat(url, volumeName); strcat(url, "/"); strcat(url, bucketName); {noformat} You can replace this with a single call to sprintf, same comment every other function. 21: In putKey: if we fail the if test in line 341/346/351/356 the program will crash since we are closing a NULL file in the exit path. > Ozone: C/C++ implementation of ozone client using curl > ------------------------------------------------------ > > Key: HDFS-12340 > URL: https://issues.apache.org/jira/browse/HDFS-12340 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone > Affects Versions: HDFS-7240 > Reporter: Shashikant Banerjee > Assignee: Shashikant Banerjee > Labels: OzonePostMerge > Fix For: HDFS-7240 > > Attachments: HDFS-12340-HDFS-7240.001.patch, HDFS-12340-HDFS-7240.002.patch, main.C, ozoneClient.C, ozoneClient.h > > > This Jira is introduced for implementation of ozone client in C/C++ using curl library. > All these calls will make use of HTTP protocol and would require libcurl. The libcurl API are referenced from here: > https://curl.haxx.se/libcurl/ > Additional details would be posted along with the patches. -- This message was sent by Atlassian JIRA (v6.4.14#64029) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org