madlib-dev mailing list archives

From Frank McQuillan <fmcquil...@pivotal.io>
Subject Re: Proposed improvement to association rules (Apriori) algorithm
Date Fri, 28 Oct 2016 00:26:18 GMT
I created a JIRA for this:
https://issues.apache.org/jira/browse/MADLIB-1031



On Thu, Oct 27, 2016 at 3:00 PM, Frank McQuillan <fmcquillan@pivotal.io>
wrote:

> Here is a comment from a MADlib user that I recently heard:
>
> “No apparent way to set an upper bound for itemset size in assoc_rules
> function. This results in it running forever with larger data sets. In the
> R "arules" package, you can set a max itemset size so that it doesn't look
> for unnecessarily large associations.”
> https://cran.r-project.org/web/packages/arules/arules.pdf
>
> Does it make sense to add a single optional parameter to
> http://madlib.incubator.apache.org/docs/latest/group__grp__assoc__rules.html
> similar to the maxlen parameter in “arules”?
>
> Any other considerations here, or improvements to make to this algorithm
> at the same time? minlen?
>
> Thanks,
> Frank
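
To illustrate the proposal quoted above, here is a minimal Python sketch of how an upper bound on itemset size cuts off the level-wise Apriori search. It is not MADlib's implementation and not the "arules" code; the function name frequent_itemsets and its max_len argument are hypothetical, with max_len chosen only to mirror the maxlen idea.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_len=None):
    """Level-wise Apriori search over itemsets.

    max_len is a hypothetical parameter mirroring the proposed upper bound:
    when set, the search stops after itemsets of that size.
    Returns a dict mapping each frequent itemset (frozenset) to its support.
    """
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)
    # Level 1 candidates: every distinct item as a 1-itemset.
    candidates = {frozenset([item]) for b in baskets for item in b}
    frequent = {}
    k = 1
    while candidates and (max_len is None or k <= max_len):
        # Count how many baskets contain each candidate.
        counts = {c: sum(1 for b in baskets if c <= b) for c in candidates}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-item candidates, then prune
        # any candidate that has an infrequent k-item subset.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent
```

With max_len=None the loop runs until no candidates remain, which is the behavior the user describes as expensive on larger data sets. A minlen-style lower bound could be applied as a filter on the returned itemsets, since the smaller levels are still needed to build the larger ones.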
