mahout-user mailing list archives

From Vipul Pandey <vipan...@gmail.com>
Subject PFPGrowth - weird output?
Date Fri, 04 Feb 2011 01:21:17 GMT
Hi all!

I'm trying to run PFPGrowth on my data and this is the output I get. (Please
note that I parse the output in the frequentpatterns folder and print each
pattern as the support followed by the itemset.)

support : Itemset
*234     1518311    1476937  *
235     55843184
238     1238079
244     34541
247     4516454
252     106478
252     670864
*254     1476937   1518311  *

You can see that the same pair of items (*1518311    1476937*) is reported
twice, in different orders and with different supports.

And below are all the occurrences of these two items together. Notice that
the output contains every permutation of the three items (*1476937* *720020*
*1518311*):

22 *1476937* 720020 *1518311*
30 *1518311* *1476937* 720020
30 720020 *1518311* *1476937*
34 720020 *1476937* *1518311*
38 *1518311* 720020 *1476937*
42 *1476937* *1518311* 720020
234 *1518311* *1476937*
254 *1476937* *1518311*

Does this mean that to get the support of just the pair (*1476937*
*1518311*) I will have to add all of them up!?
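
A minimal sketch of that addition, assuming each line is a support followed by
item IDs and that item order is ignored (whether adding is the right way to read
PFPGrowth's output is exactly the open question; the function name is only
illustrative):

```python
from collections import defaultdict

# The eight lines listed above, with the emphasis markers removed.
lines = """\
22 1476937 720020 1518311
30 1518311 1476937 720020
30 720020 1518311 1476937
34 720020 1476937 1518311
38 1518311 720020 1476937
42 1476937 1518311 720020
234 1518311 1476937
254 1476937 1518311
""".splitlines()


def total_support_by_itemset(lines):
    """Sum the reported supports per itemset, treating items as unordered."""
    totals = defaultdict(int)
    for line in lines:
        support, *items = line.split()
        totals[frozenset(items)] += int(support)
    return totals


totals = total_support_by_itemset(lines)
pair = frozenset({"1476937", "1518311"})
triple = pair | {"720020"}

print(totals[pair])          # 488 (234 + 254)
print(totals[triple])        # 196 (22 + 30 + 30 + 34 + 38 + 42)
print(sum(totals.values()))  # 684
```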

Even in that case, the total comes out to *684*, whereas if I count the
number of co-occurrences of these two items in the original baskets the
support is *766*. Why is there a difference? Any idea?
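
For comparison, a direct co-occurrence count over the input could look like the
sketch below. It assumes each basket is a list of item IDs and counts a basket
once if it contains both items. The baskets shown are made up for illustration
and are not the real data, and `pair_support` is a hypothetical helper name.

```python
def pair_support(baskets, a, b):
    """Count the baskets that contain both item a and item b."""
    return sum(1 for basket in baskets if a in basket and b in basket)


# Hypothetical baskets, only to show the counting rule.
baskets = [
    ["1476937", "1518311", "720020"],
    ["1518311", "1476937"],
    ["1476937", "34541"],
    ["720020", "1518311"],
]

print(pair_support(baskets, "1476937", "1518311"))  # 2
```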


Thanks!
Vipul
