systemml-issues mailing list archives

From "Matthias Boehm (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SYSTEMML-2031) Perftest: Unnecessary compression of incompressible blocks
Date Thu, 30 Nov 2017 03:34:00 GMT
Matthias Boehm created SYSTEMML-2031:
----------------------------------------

             Summary: Perftest: Unnecessary compression of incompressible blocks
                 Key: SYSTEMML-2031
                 URL: https://issues.apache.org/jira/browse/SYSTEMML-2031
             Project: SystemML
          Issue Type: Bug
            Reporter: Matthias Boehm


By default, we apply compression for data sets that are known to exceed aggregate memory if
all operations of the given script that touch the respective input are supported over compressed
matrices.

On the perftest 800GB dense scenario, this leads to a slight slowdown and increase in the
matrix size due to incompressible data, where each block is represented as follows:
{code}
--col groups sizes (OLE,RLE,DDC1,DDC2,UC): 0,0,0,0,1000
--compression ratio: 0.999475777837746
{code}

We should investigate the set of incompressible columns as well as the final representation and
simply return the uncompressed block in such scenarios.
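
A minimal sketch of the proposed fallback, written in Python for brevity rather than as SystemML code. All names here (CompressionStats, should_keep_compressed, finalize_block, MIN_COMPRESSION_RATIO) are hypothetical, the byte sizes in the example are illustrative, and the ratio is assumed to be uncompressed size divided by compressed size, so a value below 1 means the compressed block is larger:

```python
from dataclasses import dataclass
from typing import Dict

# Hypothetical threshold: keep the compressed form only if it is strictly smaller.
MIN_COMPRESSION_RATIO = 1.0


@dataclass
class CompressionStats:
    """Per-block statistics, mirroring the fields printed in the log output."""
    group_sizes: Dict[str, int]  # column-group type -> number of columns
    uncompressed_bytes: int
    compressed_bytes: int

    @property
    def ratio(self) -> float:
        # Assumed definition: uncompressed size / compressed size.
        if self.compressed_bytes <= 0:
            raise ValueError("compressed_bytes must be positive")
        return self.uncompressed_bytes / self.compressed_bytes


def should_keep_compressed(stats: CompressionStats,
                           min_ratio: float = MIN_COMPRESSION_RATIO) -> bool:
    """Return False when compression brings no benefit for this block."""
    total_cols = sum(stats.group_sizes.values())
    # Every column ended up in the uncompressed (UC) group: nothing was gained.
    if stats.group_sizes.get("UC", 0) == total_cols:
        return False
    # Otherwise require the compressed form to actually be smaller.
    return stats.ratio > min_ratio


def finalize_block(original, compressed, stats: CompressionStats):
    """Hand back the original block whenever compression did not pay off."""
    return compressed if should_keep_compressed(stats) else original


# A block shaped like the one reported: all 1000 columns in the UC group.
incompressible = CompressionStats(
    group_sizes={"OLE": 0, "RLE": 0, "DDC1": 0, "DDC2": 0, "UC": 1000},
    uncompressed_bytes=1_000_000,  # illustrative size, not a measured value
    compressed_bytes=1_000_500,    # illustrative size, not a measured value
)
block = finalize_block("uncompressed block", "compressed block", incompressible)
```

With these inputs, block is the uncompressed representation, because every column sits in the UC group. The ratio check additionally covers blocks where some columns compress but the overall size still does not shrink.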



