mahout-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAHOUT-1853) Improvements to CCO (Correlated Cross-Occurrence)
Date Mon, 22 Aug 2016 16:14:20 GMT

    [ https://issues.apache.org/jira/browse/MAHOUT-1853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15431095#comment-15431095 ]

ASF GitHub Bot commented on MAHOUT-1853:
----------------------------------------

Github user dlyubimov commented on a diff in the pull request:

    https://github.com/apache/mahout/pull/251#discussion_r75707999
  
    --- Diff: math-scala/src/main/scala/org/apache/mahout/math/cf/SimilarityAnalysis.scala ---
    @@ -211,9 +314,17 @@ object SimilarityAnalysis extends Serializable {
     
       }
     
    -  def computeSimilarities(drm: DrmLike[Int], numUsers: Int, maxInterestingItemsPerThing: Int,
    -                        bcastNumInteractionsB: BCast[Vector], bcastNumInteractionsA: BCast[Vector],
    -                        crossCooccurrence: Boolean = true) = {
    +  def computeSimilarities(
    +    drm: DrmLike[Int],
    +    numUsers: Int,
    +    maxInterestingItemsPerThing: Int,
    +    bcastNumInteractionsB: BCast[Vector],
    +    bcastNumInteractionsA: BCast[Vector],
    +    crossCooccurrence: Boolean = true,
    +    minLLROpt: Option[Double] = None) = {
    +
    +    val minLLR = minLLROpt.getOrElse(0.0d) // accept all values if not specified
    --- End diff --
    
    I think the style convention was to use 0.0 (with a minority in favor of 0d), but never 0.0d.
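
For reference, the three spellings mentioned in the comment all compile to the same Double in Scala, so the point is purely stylistic. The sketch below is only an illustration: the object name `DoubleLiteralStyle` and the helper `resolveMinLLR` are hypothetical and do not appear in the pull request; only the `Option[Double]` parameter with a `getOrElse` default is taken from the diff.

```scala
object DoubleLiteralStyle {
  // All three literals denote the same Double value; only the spelling differs.
  val plain: Double = 0.0      // the form the review comment prefers
  val suffixOnly: Double = 0d  // the minority-preferred form
  val both: Double = 0.0d      // legal Scala, but the form the comment says to avoid

  // Hypothetical helper mirroring the defaulting line in the diff,
  // written with the preferred literal.
  def resolveMinLLR(minLLROpt: Option[Double]): Double =
    minLLROpt.getOrElse(0.0)
}
```

With this helper, `DoubleLiteralStyle.resolveMinLLR(None)` yields the default and `DoubleLiteralStyle.resolveMinLLR(Some(2.5))` yields the supplied threshold.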


> Improvements to CCO (Correlated Cross-Occurrence)
> -------------------------------------------------
>
>                 Key: MAHOUT-1853
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1853
>             Project: Mahout
>          Issue Type: New Feature
>    Affects Versions: 0.12.0
>            Reporter: Andrew Palumbo
>            Assignee: Pat Ferrel
>             Fix For: 0.13.0
>
>
> Improvements to CCO (Correlated Cross-Occurrence) to include auto-threshold calculation
> for LLR downsampling, and possible multiple fixed thresholds for A’A, A’B etc. This is
> to account for the vast difference in dimensionality between indicator types.
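
As a rough illustration of what a fixed LLR threshold means, the sketch below computes a log-likelihood ratio score from a 2x2 contingency table, using the common entropy-based formulation, and keeps a score only if it meets an optional minimum. It is not the code from the pull request: the object name `LlrThresholdSketch`, the functions `llr` and `keep`, and the choice of `>=` for the comparison are assumptions made for the example. The issue text does not say how the automatic threshold would be calculated, so that part is not sketched.

```scala
object LlrThresholdSketch {
  // x * ln(x), with the usual convention that 0 * ln(0) = 0.
  private def xLogX(x: Long): Double =
    if (x == 0L) 0.0 else x * math.log(x.toDouble)

  // Unnormalized Shannon entropy of a set of counts.
  private def entropy(counts: Long*): Double =
    xLogX(counts.sum) - counts.map(xLogX).sum

  // LLR score for the contingency table [[k11, k12], [k21, k22]].
  def llr(k11: Long, k12: Long, k21: Long, k22: Long): Double = {
    val rowEntropy = entropy(k11 + k12, k21 + k22)
    val colEntropy = entropy(k11 + k21, k12 + k22)
    val matEntropy = entropy(k11, k12, k21, k22)
    // Clamp at zero to absorb small negative values from round-off.
    math.max(0.0, 2.0 * (rowEntropy + colEntropy - matEntropy))
  }

  // Assumed filter: with no threshold given, every non-negative score passes.
  def keep(score: Double, minLLROpt: Option[Double] = None): Boolean =
    score >= minLLROpt.getOrElse(0.0)
}
```

Under these assumptions, separate thresholds for A’A and A’B would simply be different `minLLROpt` values passed for each indicator type.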



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
