spark-issues mailing list archives

From "Xiangrui Meng (JIRA)" <>
Subject [jira] [Commented] (SPARK-6487) Add sequential pattern mining algorithm to Spark MLlib
Date Tue, 24 Mar 2015 03:59:53 GMT


Xiangrui Meng commented on SPARK-6487:

[~Zhang JiaJin] I'm not very familiar with pattern mining, but I don't see many citations of
the paper you mentioned. So I need more information to understand the importance/popularity
of sequential pattern mining and whether there exist really scalable algorithms. If there
are not many requests for this feature or there are no scalable algorithms, you can certainly
register your implementation as a third-party package on and maintain it
outside Spark for users.

> Add sequential pattern mining algorithm to Spark MLlib
> ------------------------------------------------------
>                 Key: SPARK-6487
>                 URL:
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>            Reporter: Zhang JiaJin
> [~mengxr] [~zhangyouhua]
> Sequential pattern mining is an important branch of pattern mining. In our past
work, we used sequence mining (mainly the PrefixSpan algorithm) to find telecommunication
signaling sequence patterns and achieved good results. But once the data grows too large, the
running time becomes too long and can no longer meet the service requirements. We plan to implement
the PrefixSpan algorithm in Spark and apply it to our subsequent work.
> The related paper: "Distributed PrefixSpan algorithm based on MapReduce".

This message was sent by Atlassian JIRA
