systemml-dev mailing list archives

From "Matthias Boehm" <mbo...@us.ibm.com>
Subject Re: Logical indexing?
Date Thu, 31 Mar 2016 17:22:09 GMT

Just a quick correction to option 2:

Ind = (X[,1]>10);
Y = removeEmpty(target=X, select=Ind);
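
As a concrete check, here is a minimal sketch on a hypothetical 4 x 2 matrix (the values are made up for illustration). It assumes removeEmpty accepts margin="rows" together with select, and that select is an n x 1 vector of 0/1 values. Under those assumptions, option 1 and the corrected option 2 should select the same rows.

```dml
# hypothetical toy input, filled row by row: (5,1), (12,2), (20,3), (7,4)
X = matrix("5 1 12 2 20 3 7 4", rows=4, cols=2);

# 0/1 indicator column: 1 where the first column exceeds 10
Ind = (X[,1] > 10);

# option 1: selection matrix with one row per selected row of X
P = removeEmpty(target=diag(Ind), margin="rows");
Y1 = P %*% X;

# option 2 (corrected): removeEmpty drops the rows of X where Ind is 0
Y2 = removeEmpty(target=X, margin="rows", select=Ind);

# compare the two results
print("rows selected: " + nrow(Y1) + " and " + nrow(Y2));
print("max abs difference: " + max(abs(Y1 - Y2)));
```

Option 1 materializes an n x n diagonal matrix (sparse, but still a large intermediate for n ~ 35 million), so option 2 is likely the lighter choice when only row filtering is needed.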

Regards,
Matthias



From:	Matthias Boehm/Almaden/IBM@IBMUS
To:	dev@systemml.incubator.apache.org
Date:	03/31/2016 10:14 AM
Subject:	Re: Logical indexing?



That's a good question. No, SystemML does not support set indexing yet, but
you can emulate it via permutation matrices or similar transformations.
Here are some examples:

# option 1: via permutation (aka selection) matrices
P = removeEmpty(target=diag(X[,1]>10), margin="rows");
Y = P %*% X;

# option 2: via removeEmpty
Ind = diag(X[,1]>10);
Y = removeEmpty(target=X, select=Ind);


Regards,
Matthias


From: Ethan Xu <ethan.yifanxu@gmail.com>
To: dev@systemml.incubator.apache.org
Date: 03/31/2016 08:47 AM
Subject: Logical indexing?



Does SystemML support logical indexing?

For example, suppose X is a numerical matrix with 2 columns and n rows (in my
case n ~ 35 million). I'd like to split the matrix row-wise according to
the values of the first column. This is useful when I need to find the
distributions of subgroups of a population. In R I can do

Y = X[ X[ ,1] > 10, ]

OR

ind = which(X[ ,1] > 10)
Y = X[ind, ]

It seems neither syntax works in SystemML.

I noticed there's an aggregate() function for SystemML, but it supports
only coded categorical variables.

Perhaps one way to do that is creating an indicator n by 1 matrix Z that
takes values 1 and 2 where 1 corresponds to X[, 1] <= 10 and 2 corresponds
to X[,1] > 10. Then aggregate() X[,2] with respect to Z.
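
A minimal sketch of that idea, on a hypothetical toy matrix, might look like the following. The choice of fn="mean" is only an example, and the parameter names (target, groups, fn) are taken from the DML aggregate() builtin as I understand it, so they should be checked against the language reference for the version in use.

```dml
# hypothetical toy input, filled row by row: (5,1), (12,2), (20,3), (7,4)
X = matrix("5 1 12 2 20 3 7 4", rows=4, cols=2);

# group codes: 1 where X[,1] <= 10, 2 where X[,1] > 10
Z = (X[,1] > 10) + 1;

# per-group mean of the second column; row g of M corresponds to group code g
M = aggregate(target=X[,2], groups=Z, fn="mean");

print("number of groups: " + nrow(M));
```

This yields per-group summaries only; it does not return the rows of each subgroup, so it fits when an aggregate statistic is all that is needed.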

It seems transform() with the 'bin' option is one obvious way to create such a
Z; however, the 'bin' method currently supports only 'equi-width'.

Is looping through X[,1] the best option? Maybe I missed some other
convenient functions.

Any suggestions are greatly appreciated!

Best,

Ethan


