Date: Mon, 19 Feb 2018 06:38:00 +0000 (UTC)
From: "Paul Rogers (JIRA)"
To: issues@drill.apache.org
Subject: [jira] [Created] (DRILL-6168) Table functions do not "inherit" default configuration

Paul Rogers created DRILL-6168:
----------------------------------

             Summary: Table functions do not "inherit" default configuration
                 Key: DRILL-6168
                 URL: https://issues.apache.org/jira/browse/DRILL-6168
             Project: Apache Drill
          Issue Type: Bug
    Affects Versions: 1.12.0
            Reporter: Paul Rogers

See DRILL-6167 that describes an attempt to use a table function with a regex format plugin.

Consider the plugin configuration:

{code}
RegexFormatConfig sampleConfig = new RegexFormatConfig();
sampleConfig.extension = "log1";
sampleConfig.regex = DATE_ONLY_PATTERN;
sampleConfig.fields = Lists.newArrayList("year", "month", "day");
{code}

(This plugin is defined in code in a test rather than the usual JSON in the Web console.)

Run a test with the above. Things work fine.

Now, try the plugin config with a table function as described in DRILL-6167:

{code}
String sql = "SELECT * FROM table(cp.`regex/simple.log2`\n" +
    "(type => 'regex', regex => '(\\\\d\\\\d\\\\d\\\\d)-(\\\\d\\\\d)-(\\\\d\\\\d) .*'))";
client.queryBuilder().sql(sql).printCsv();
{code}

Because we are using a file with suffix "log2", the query will match the format plugin config defined above. A query without the table function does, in fact, work using the defined config.

But, with a table function, we get this warning from our regex code:

{noformat}
13307 WARN [257590e1-e846-9d82-61d4-e246a4925ac3:frag:0:0] [org.apache.drill.exec.store.easy.regex.RegexRecordReader] - Column list has fewer names than the pattern has groups, filling extras with Column$n.
{noformat}

(The warning is in the custom plugin, not Drill.)

This is the plugin saying, "hey! you didn't provide column names!". But, in the format definition, we did provide names. If we run the query without a table function, we do see those names used.

Result:

{noformat}
3 row(s):
Column$0,Column$1,Column$2
2017,12,17
2017,12,18
2017,12,19
Total rows returned : 3. Returned in 9072ms.
{noformat}

Yes, indeed, the table function discarded the defined format config values and filled in blanks, including for the column names.

The expected behavior is that all properties defined in the config should remain unchanged _except_ for those in the table function. Why? In order to know which format plugin to use, the code has to map from the suffix (".log2" here) to a format plugin _config_. (The config is the only thing that specifies a suffix.) Since we mapped to a config (not the unconfigured plugin), we'd expect the config properties to be used.

It is highly surprising that only the suffix is used while all other attributes of the config are ignored. This seems very much in the "bug" category and not at all in the "feature" category.
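The expected "inherit" behavior can be illustrated with a small sketch. This is only an illustration of the merge semantics described above, not Drill's actual code: the class name, the `merge` method, and the map-based representation of a format config are all hypothetical, and the `regex` values are placeholders rather than the patterns used in the test.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: none of these names are Drill APIs.
final class TableFunctionConfigSketch {

  // Start from the config that the file suffix resolved to, then apply only
  // the properties named in the table function. Every property the table
  // function does not mention is inherited unchanged.
  static Map<String, Object> merge(Map<String, Object> definedConfig,
                                   Map<String, Object> tableFunctionArgs) {
    Map<String, Object> effective = new LinkedHashMap<>(definedConfig);
    effective.putAll(tableFunctionArgs);
    return effective;
  }

  public static void main(String[] args) {
    // Stand-in for the config defined in the test (placeholder regex value).
    Map<String, Object> definedConfig = new LinkedHashMap<>();
    definedConfig.put("regex", "date-only pattern");
    definedConfig.put("fields", Arrays.asList("year", "month", "day"));

    // Stand-in for the table function, which names only the regex property.
    Map<String, Object> tableFunctionArgs = new LinkedHashMap<>();
    tableFunctionArgs.put("regex", "pattern from the query");

    // The regex is overridden; the field names survive.
    System.out.println(merge(definedConfig, tableFunctionArgs));
  }
}
```

Under these semantics the query above would have reported the columns as year, month and day rather than Column$0, Column$1 and Column$2, because the table function names only `type` and `regex`.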