Date: Fri, 3 May 2019 13:01:00 +0000 (UTC)
From: "Hadoop QA (JIRA)" <jira@apache.org>
To: issues@phoenix.apache.org
Message-ID: <JIRA.13229606.1556024946000.192829.1556888460121@Atlassian.JIRA>
In-Reply-To: <JIRA.13229606.1556024946000@Atlassian.JIRA>
References: <JIRA.13229606.1556024946000@Atlassian.JIRA> <JIRA.13229606.1556024946294@jira-lw-us.apache.org>
Subject: [jira] [Commented] (PHOENIX-5258) Add support to parse header from
 the input CSV file as input columns for CsvBulkLoadTool


    [ https://issues.apache.org/jira/browse/PHOENIX-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832480#comment-16832480 ]

Hadoop QA commented on PHOENIX-5258:
------------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12967764/PHOENIX-5258-4.x-HBase-1.4.patch
  against 4.x-HBase-1.4 branch at commit 4eec41f3f2b04865b6d59ebd3fbd3aa1e0a0fd80.
  ATTACHMENT ID: 12967764

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:red}-1 tests included{color}.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:red}-1 release audit{color}.  The applied patch generated 6 release audit warnings (more than the master's current 0 warnings).

    {color:red}-1 lineLengths{color}.  The patch introduces the following lines longer than 100:
    +            stmt.execute("CREATE TABLE S.TABLE14 (ID INTEGER NOT NULL PRIMARY KEY, NAME VARCHAR, TYPE VARCHAR, CATEGORY VARCHAR)");
+                        "Headers in provided input files are different. Headers must be unique for all input files"
+    static final Option SKIP_HEADER_OPT = new Option("k", "skip-header", false, "Skip the first line of CSV files (the header)");
+    static final Option HEADER_OPT = new Option("r", "header", false, "Parses the first line of CSV as the header");
+    private List<String> parseCsvHeaders(CommandLine cmdLine, Configuration conf) throws IOException {
+                "Headers in provided input files are different. Headers must be unique for all input files"
+    private List<String> fetchAllHeaders(Iterable<String> paths, Configuration conf) throws IOException {

     {color:red}-1 core tests{color}.  The patch failed these unit tests:
     ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.UpgradeIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.index.MutableIndexIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.IndexRebuildTaskIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.join.HashJoinMoreIT

Test results: https://builds.apache.org/job/PreCommit-PHOENIX-Build/2551//testReport/
Release audit warnings: https://builds.apache.org/job/PreCommit-PHOENIX-Build/2551//artifact/patchprocess/patchReleaseAuditWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-PHOENIX-Build/2551//console

This message is automatically generated.

> Add support to parse header from the input CSV file as input columns for CsvBulkLoadTool
> ----------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-5258
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5258
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Prashant Vithani
>            Priority: Minor
>             Fix For: 4.15.0, 5.1.0
>
>         Attachments: PHOENIX-5258-4.x-HBase-1.4.patch, PHOENIX-5258-master.patch
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently, CsvBulkLoadTool does not support reading a header from the input CSV and expects the content of the CSV to match the table schema. Support for the header can be added to dynamically map the schema to the header.
> The proposed solution is to introduce another option for the tool, `--header`. If this option is passed, the input column list is constructed by reading the first line of the input CSV file.
>  * If there is only one file, read the header from the first line and generate the `ColumnInfo` list.
>  * If there are multiple files, read the header from all the files, and throw an error if the headers across files do not match.


--
This message was sent by Atlassian JIRA
(v7.6.3#76005)