From: Anthony Thomas
Date: Wed, 3 Jan 2018 14:22:49 -0800
Subject: Multiplying a large sparse matrix by a vector
To: user@madlib.apache.org

Hi MADlib folks,

I have a large, tall-and-skinny sparse matrix which I'm trying to multiply by a dense vector. The matrix is 1.25e8 by 100 with approximately 1% nonzero values. This operation always triggers an error from Greenplum:

plpy.SPIError: invalid memory alloc request size 1073741824 (context 'accumArrayResult') (mcxt.c:1254) (plpython.c:4957)
CONTEXT:  Traceback (most recent call last):
  PL/Python function "matrix_vec_mult", line 24, in <module>
    matrix_in, in_args, vector)
  PL/Python function "matrix_vec_mult", line 2044, in matrix_vec_mult
  PL/Python function "matrix_vec_mult", line 2001, in _matrix_vec_mult_dense
PL/Python function "matrix_vec_mult"

Some Googling suggests this error is caused by a hard limit in Postgres which restricts the maximum size of an array to 1GB. If this is indeed the cause of the error I'm seeing, does anyone have any suggestions about how to circumvent it? The same error comes up in other cases as well, such as transposing a tall-and-skinny matrix. Matrix-vector multiplication (MVM) with smaller matrices works fine.

Here is the relevant version information:

SELECT VERSION();
PostgreSQL 8.3.23 (Greenplum Database 5.1.0 build dev) on x86_64-pc-linux-gnu, compiled by GCC gcc (Ubuntu 5.4.0-6ubuntu1~16.04.5) 5.4.0 20160609 compiled on Dec 21 2017 09:09:46

SELECT madlib.version();
MADlib version: 1.12, git revision: unknown, cmake configuration time: Thu Dec 21 18:04:47 UTC 2017, build type: RelWithDebInfo, build system: Linux-4.4.0-103-generic, C compiler: gcc 4.9.3, C++ compiler: g++ 4.9.3

MADlib install-check reported one error in the "convex" module related to "loss too high", which seems unrelated to the issue described above. I know Ubuntu isn't officially supported by Greenplum, so I'd like to be confident this issue isn't just the result of using an unsupported OS. Please let me know if any other information would be helpful.
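
As a rough sanity check of the 1GB hypothesis, the sketch below tries to reproduce the request size from the error message. It rests on assumptions that have not been checked against the Greenplum 5.1.0 or MADlib 1.12 source: that the dense result is accumulated one 8-byte value per matrix row, in a buffer that starts at 64 slots and doubles when full, and that Postgres rejects any single allocation larger than 0x3fffffff bytes. The helper name `doubling_buffer_bytes` is made up for illustration.

```python
# Back-of-the-envelope check. All constants below are assumptions, not
# values read from the Greenplum 5.1.0 or MADlib 1.12 source.
MAX_ALLOC_SIZE = 0x3FFFFFFF   # assumed Postgres per-allocation cap (1 GB - 1 byte)
BYTES_PER_ELEMENT = 8         # assumed 8-byte slot per accumulated value
INITIAL_SLOTS = 64            # assumed starting capacity of the buffer


def doubling_buffer_bytes(n_elements, initial_slots=INITIAL_SLOTS,
                          bytes_per_element=BYTES_PER_ELEMENT):
    """Bytes requested by a doubling buffer once it can hold n_elements."""
    slots = initial_slots
    while slots < n_elements:
        slots *= 2
    return slots * bytes_per_element


n_rows = 125_000_000  # 1.25e8 rows, so the product vector has 1.25e8 entries
request = doubling_buffer_bytes(n_rows)
print(request, request > MAX_ALLOC_SIZE)  # prints: 1073741824 True
```

Under these assumptions the computed request matches the 1073741824 bytes in the error, and any result vector longer than 2^26 (about 6.7e7) entries would fail the same way, regardless of how sparse the input matrix is.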

Thanks,
Anthony