From user-return-565-archive-asf-public=cust-asf.ponee.io@madlib.apache.org Thu Jul 26 20:47:51 2018 Return-Path: X-Original-To: archive-asf-public@cust-asf.ponee.io Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by mx-eu-01.ponee.io (Postfix) with SMTP id 2AB8B180621 for ; Thu, 26 Jul 2018 20:47:49 +0200 (CEST) Received: (qmail 43343 invoked by uid 500); 26 Jul 2018 18:47:49 -0000 Mailing-List: contact user-help@madlib.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@madlib.apache.org Delivered-To: mailing list user@madlib.apache.org Received: (qmail 43333 invoked by uid 99); 26 Jul 2018 18:47:49 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd2-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 26 Jul 2018 18:47:49 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd2-us-west.apache.org (ASF Mail Server at spamd2-us-west.apache.org) with ESMTP id CB1DB1A13E1 for ; Thu, 26 Jul 2018 18:47:48 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd2-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 3.075 X-Spam-Level: *** X-Spam-Status: No, score=3.075 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FORGED_HOTMAIL_RCVD2=1.187, HTML_MESSAGE=2, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001, T_DKIMWL_WL_MED=-0.01] autolearn=disabled Authentication-Results: spamd2-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=hotmail.com Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd2-us-west.apache.org [10.40.0.9]) (amavisd-new, port 10024) with ESMTP id Ou4bla3YY873 for ; Thu, 26 Jul 2018 18:47:46 +0000 (UTC) Received: from NAM02-SN1-obe.outbound.protection.outlook.com (mail-oln040092005105.outbound.protection.outlook.com [40.92.5.105]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with ESMTPS id 0530C5F107 for ; Thu, 26 Jul 2018 18:47:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hotmail.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=S9h+1AOquBuWIeyQDmRAkVzpPEwlVoa16WercQ9gBQk=; b=NxJn/17J3bIGqlG2ZhLlYRrNLPDccmlfsuU0Q8R53G39ZkChDpJ4ScWjnWtGhgPvu5WvPAo9UgF0EGqYuNa7TG/P6yS8COkutLsVfIx2cWPgtzpQyP0FvUf76qd0f8MLvFVoLXq4yUkF3/ZoKkF6Mb1OUVsGLqGIiPfaT/SijFm9VDTRSz6odcuFNPXMJzZrIp2TYIkl+51rOy2Z0rda6avWMIfIvZk69uyUUJwxuzjKgBAg574TsYp4FGzmsPcpaaG7hacFTRuuZiBufuRk1XkUlOdtm1tS+FU3iE5TIEuqNAFDvp+v5ZUgeqplEH8dkPIx6TTy5YKsp3DI+Zhljg== Received: from CY1NAM02FT011.eop-nam02.prod.protection.outlook.com (10.152.74.57) by CY1NAM02HT156.eop-nam02.prod.protection.outlook.com (10.152.75.10) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.20.995.12; Thu, 26 Jul 2018 18:47:38 +0000 Received: from DM5PR12MB1801.namprd12.prod.outlook.com (10.152.74.52) by CY1NAM02FT011.mail.protection.outlook.com (10.152.75.156) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P256) id 15.20.995.12 via Frontend Transport; Thu, 26 Jul 2018 18:47:38 +0000 Received: from DM5PR12MB1801.namprd12.prod.outlook.com ([fe80::874:480f:2189:8199]) by DM5PR12MB1801.namprd12.prod.outlook.com ([fe80::874:480f:2189:8199%9]) with mapi id 15.20.0995.014; Thu, 26 Jul 2018 18:47:38 +0000 From: LUYAO CHEN To: "user@madlib.apache.org" Subject: A problem of madlib.cross_validation_general Thread-Topic: A problem of madlib.cross_validation_general Thread-Index: AQHUJRCjGFhjZwbBREOyoRizJ2nnMQ== Date: Thu, 26 Jul 2018 18:47:38 +0000 Message-ID: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-incomingtopheadermarker: OriginalChecksum:FC9AC2A5DF9ED04D0592103A71007CB0B735FFF8D07F4CF650D7001F06146D35;UpperCasedChecksum:2CFDB4B9D47D28E25C98D2569056074543DA1411F450430A26341BD2B8A5D6D3;SizeAsReceived:6891;Count:44 x-ms-exchange-messagesentrepresentingtype: 1 x-tmn: [5mBo99Xujwik0oa2ivj7dGrXiYMHwwZsLFfCn1cQvcU=] x-ms-publictraffictype: Email x-microsoft-exchange-diagnostics: 1;CY1NAM02HT156;6:Gsw7K+xBeRv1lHF456FyAEnFucwWxydSDMF8H38pyAsttho8njdorL48BA4mSfaW6EiVqTrPCbagRWzISwdDQNtHGZ3dnqniLgKp6J/glo+0iDYX+x+9/yceM4b4OyBxZgVquMHh8SGl3pvEWBHR4gzSFQSk03BHdiWqHKe2RIEol79d7wY3EBcnpJ6LsAmzyg+3hhkZX57pokO59a5/TvjMey1cn4ANq01/gtdL+4B6C7b/BEcfHCAdq4oGzS04gVSrlu3HzAc/0L0lsFJew6dDLpZ9arfiohhenxe2N4AcScnGU08mJyC+TCSd9mRCamNNgv7NTV50k4jds/eG4KBK/fD3pi71OqWwT9pN4XOBcMZcX5srKReChy71xoq4jLWmTipos7Wyn4+NHeZxylYHgdho+a0dXhSo/shwtLW/IHRYH/HAG2v+hST/o998NnpekC3NL6u/dgnUJYDr6Q==;5:Zi3QCZbcAO5E3OOOenXnPJYJ9Rif1wLALxWjI0SOg4OP+JHuG0eLuCPj2VASERolSGyY50RRjgeUWzFjx2xSa3krwHQagO6SnWLpX+1LzhaSmgsEIcqkVvR/Q1mkidpxCSB7CEQhw3I46IsqWKLH1nPOX88xs1f7QLT2wml4lAU=;7:0MGunImv0NjDrUxyv0XmRRwk6bsSXXnlO3KT1iLf/j9bVzi7pKobDubINNqNjGna2s6xmiCHF83m7X6RQCB6ai0nNPCDrEgw/Vq70zAmu+Pk6KyjS/mumGaYV0LlDeVSOahf+GGbiVo22Pxf0ioXz0f38hxdKc/8rjGQ4UAitKoQ4PhLZV9uV5SkAvVwF0K1qa5kzuZKKcjK9wbAQwUQ3TLyoHLITCM1Shy7ELbAEWBeTszL4Jg4FJ6GKh5ffOLS x-incomingheadercount: 44 x-eopattributedmessage: 0 x-microsoft-antispam: BCL:0;PCL:0;RULEID:(7020095)(201702061078)(5061506573)(5061507331)(1603103135)(2017031320274)(2017031324274)(2017031323274)(2017031322404)(1603101448)(1601125500)(1701031045);SRVR:CY1NAM02HT156; x-ms-traffictypediagnostic: CY1NAM02HT156: x-exchange-antispam-report-cfa-test: BCL:0;PCL:0;RULEID:(82015058);SRVR:CY1NAM02HT156;BCL:0;PCL:0;RULEID:;SRVR:CY1NAM02HT156; x-forefront-prvs: 07459438AA x-forefront-antispam-report: SFV:NSPM;SFS:(7070007)(189003)(199004)(74316002)(413944005)(26005)(2501003)(14454004)(83332001)(7696005)(8936002)(102836004)(256004)(1730700003)(81156014)(8676002)(6506007)(53336002)(25786009)(6346003)(53346004)(68736007)(82202002)(486006)(97736004)(20460500001)(104016004)(476003)(73972006)(5640700003)(6436002)(56003)(54896002)(9686003)(5250100002)(33656002)(86362001)(99286004)(105586002)(55016002)(4743002)(5660300001)(106356001)(2900100001)(19627405001)(2351001)(87572001)(400700002)(6916009)(6606003)(15852004)(27310500001);DIR:OUT;SFP:1901;SCL:1;SRVR:CY1NAM02HT156;H:DM5PR12MB1801.namprd12.prod.outlook.com;FPR:;SPF:None;PTR:InfoNoRecords;MX:1;A:1; received-spf: None (protection.outlook.com: hotmail.com does not designate permitted sender hosts) authentication-results: spf=none (sender IP is ) smtp.mailfrom=luyao_chen@hotmail.com; x-microsoft-antispam-message-info: svBSvtdCERrdBqsi9Yh21405KdeclNnOtRnDtFR1G6uJQjlXKvS4zUZ7PUuCy9Y8xJDwlqzVoKoow80fRuR0VwfBkFO8vbkq0xRrFrQPVJA4IVZWgHTkkvIDpqWBeZnwMz4TPnekPRjYAdtXLritDN61PJKVPQx13/Fq0rqiRCKfY2O6d+OgfEx8VamTSlJSirRz1jmkCgEW+gv8+k4uFyiPu562awvv01WhNY70He4= Content-Type: multipart/alternative; boundary="_000_DM5PR12MB1801CA31C3D5D79F6F185534FE2B0DM5PR12MB1801namp_" MIME-Version: 1.0 X-OriginatorOrg: hotmail.com X-MS-Exchange-CrossTenant-RMS-PersistedConsumerOrg: 9a4e3081-9524-43cf-bfc3-dcaef82d5da1 X-MS-Exchange-CrossTenant-Network-Message-Id: 9088c1a6-e992-46e3-3fbf-08d5f3284282 X-MS-Exchange-CrossTenant-rms-persistedconsumerorg: 9a4e3081-9524-43cf-bfc3-dcaef82d5da1 X-MS-Exchange-CrossTenant-originalarrivaltime: 26 Jul 2018 18:47:38.2554 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Internet X-MS-Exchange-CrossTenant-id: 84df9e7f-e9f6-40af-b435-aaaaaaaaaaaa X-MS-Exchange-Transport-CrossTenantHeadersStamped: CY1NAM02HT156 --_000_DM5PR12MB1801CA31C3D5D79F6F185534FE2B0DM5PR12MB1801namp_ Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Dear user community, I found a problem that could be a bug. Could you explore? It happened when the explore value is varchar, in the below case, I want to= check the performance of different optimizer, which in characters. DROP TABLE IF EXISTS patients; CREATE TABLE patients( id INTEGER NOT NULL, second_attack boolean, -- second_attack integer, treatment INTEGER, trait_anxiety INTEGER); INSERT INTO patients VALUES (1, True, 1, 70), (2, True, 1, 80), (3, True, 1, 50), (4, True, 0, 60), (5, True, 0, 40), (6, True, 0, 65), (7, True, 0, 75), (8, True, 0, 80), (9, True, 0, 70), (10, True, 0, 60), (11, FALSE, 1, 65), (12, FALSE, 1, 50), (13, FALSE, 1, 45), (14, FALSE, 1, 35), (15, FALSE, 1, 40), (16, FALSE, 1, 50), (17, FALSE, 0, 55), (18, FALSE, 0, 45), (19, FALSE, 0, 50), (20, FALSE, 0, 60); drop table if exists cv_result cascade; SELECT madlib.cross_validation_general ( 'madlib.logregr_train', -- modelling_func '{%data%, %model%, second_attack, "ARRAY[1, treatment, trait_anxiet= y]", null, 10,Optimizer,0.001,TRUE }'::varchar[], -- modelling_params '{varchar, varchar, varchar, varchar, varchar,integer,varchar,doubl= e precision,boolean}'::varchar[], --modelling_params_= type 'Optimizer', --param_explored '{irls,cg,igd}'::varchar[], --expl= ore_values 'madlib.cv_logregr_predict', = --predict_func '{%model%, %data%,"ARRAY[1, treatment, trait_anxiety]",%id%, %predi= ction%}'::varchar[], --predict_params '{text, text, text,text, text}'::varchar[], = --predict_params_type 'madlib.cv_logregr_accuracy', = --metric_func '{%prediction%, %data%, %id%, second_attack, %error%}'::varchar[], = --metric_params '{varchar, varchar, varchar,varchar, varchar}'::varchar[], = --metric_params_type 'patients', = --data_tbl 'id', --data_id TRUE, --id_is_ran= dom 'cv_result', --validation_= result '{id,second_attack, treatment, trait_anxiety}'::varchar[], = --data_cols 10 --fold_num ); ERROR: spiexceptions.UndefinedColumn: column "irls" does not exist LINE 5: (irls)::varchar as Optimizer, ^ QUERY: DROP TABLE IF EXISTS __madlib_temp_accum_error84716681_15326305= 57_66836697__; CREATE TEMP TABLE __madlib_temp_accum_error84716681_1532630557_= 66836697__ as SELECT (irls)::varchar as Optimizer, pg_temp.__madlib_temp_output_error1973940_1532630557_11= 061763__.* FROM pg_temp.__madlib_temp_output_error1973940_1532630557_1= 1061763__ CONTEXT: Traceback (most recent call last): PL/Python function "cross_validation_general", line 23, in return cross_validation.cross_validation_general(**globals()) PL/Python function "cross_validation_general", line 366, in cross_validat= ion_general PL/Python function "cross_validation_general", line 283, in _one_step_cv PL/Python function "cross_validation_general" --_000_DM5PR12MB1801CA31C3D5D79F6F185534FE2B0DM5PR12MB1801namp_ Content-Type: text/html; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable

Dear user community,


I found a problem that could be a= bug.  Could you explore?


It happened when the explore valu= e is varchar, in the below case, I want to check the performance of di= fferent optimizer, which in characters.


DROP TABLE IF EXISTS patients;
CREATE TABLE patients( id INTEGER NOT NULL,
                    =    second_attack boolean, -- second_attack integer,
                    =    treatment INTEGER,
                    =    trait_anxiety INTEGER);          &nbs= p;             
INSERT INTO patients VALUES 
(1,  True, 1, 70),
(2,  True, 1, 80),
(3,  True, 1, 50),
(4,  True, 0, 60),
(5,  True, 0, 40),
(6,  True, 0, 65),
(7,  True, 0, 75),
(8,  True, 0, 80),
(9,  True, 0, 70),
(10, True, 0, 60),
(11, FALSE, 1, 65),
(12, FALSE, 1, 50),
(13, FALSE, 1, 45),
(14, FALSE, 1, 35),
(15, FALSE, 1, 40),
(16, FALSE, 1, 50),
(17, FALSE, 0, 55),
(18, FALSE, 0, 45),
(19, FALSE, 0, 50),
(20, FALSE, 0, 60);


drop table  if exists  cv_result cascade;
SELECT madlib.cross_validation_general
    ( 'madlib.logregr_train',        -- = modelling_func                 = ;                     &nb= sp;    
        '{%data%, %model%, second_attack, "AR= RAY[1, treatment, trait_anxiety]", null, 10,Optimizer,0.001,TRUE = }'::varchar[],   -- modelling_params
        '{varchar, varchar, varchar, varchar, varc= har,integer,varchar,double precision,boolean}'::varchar[],    &nb= sp;                  --modelli= ng_params_type
        'Optimizer',        &n= bsp;                     =         --param_explored
        '{irls,cg,igd}'::varchar[],    &= nbsp;                    =         --explore_values
        'madlib.cv_logregr_predict',    =                      = ;                    --pr= edict_func
        '{%model%, %data%,"ARRAY[1, treatment= , trait_anxiety]",%id%, %prediction%}'::varchar[],     =                  --predict_pa= rams
        '{text, text, text,text, text}'::varchar[]= ,                    &nbs= p;             --predict_params_type
        'madlib.cv_logregr_accuracy',   =                     &nbs= p;                     &n= bsp;        --metric_func        &n= bsp; 
        '{%prediction%, %data%, %id%, second_attac= k, %error%}'::varchar[],              &n= bsp; --metric_params
        '{varchar, varchar, varchar,varchar, varch= ar}'::varchar[],                &nb= sp;--metric_params_type
        'patients',        &nb= sp;                     &= nbsp;                    =       --data_tbl            &= nbsp;  
        'id',          &n= bsp;                     =         --data_id
         TRUE,        &nb= sp;                     &= nbsp;                   --id_i= s_random
         'cv_result',      &nb= sp;                     &= nbsp;            --validation_result
         '{id,second_attack, treatment, trait= _anxiety}'::varchar[],              &nbs= p;               --data_cols 
         10         =                     &nbs= p;                     &n= bsp; --fold_num
);

ERROR:  spiexceptions.UndefinedColumn: column "irls" do= es not exist
LINE 5:                  =    (irls)::varchar as Optimizer,
                    =          ^
QUERY:
            DROP TABLE IF EXISTS __madli= b_temp_accum_error84716681_1532630557_66836697__;
            CREATE TEMP TABLE __madlib_t= emp_accum_error84716681_1532630557_66836697__ as
                SELECT
                    = (irls)::varchar as Optimizer,
                    = pg_temp.__madlib_temp_output_error1973940_1532630557_11061763__.*
                FROM pg_temp._= _madlib_temp_output_error1973940_1532630557_11061763__


CONTEXT:  Traceback (most recent call last):
  PL/Python function "cross_validation_general", line 2= 3, in <module>
    return cross_validation.cross_validation_general(**globa= ls())
  PL/Python function "cross_validation_general", line 3= 66, in cross_validation_general
  PL/Python function "cross_validation_general", line 2= 83, in _one_step_cv
PL/Python function "cross_validation_general"


--_000_DM5PR12MB1801CA31C3D5D79F6F185534FE2B0DM5PR12MB1801namp_--