mahout-user mailing list archives

From Mustafa Elbehery <elbeherymust...@gmail.com>
Subject TelephoneCall Logistic Regression Example
Date Tue, 19 May 2015 10:34:20 GMT
Hi Folks,

I have a question regarding *TelephoneCall* in the examples package. When we
load the training data from the CSV into the training matrix, we add a
weight for each feature field in the feature vector.

In the TelephoneCall code, we add the weight as *log(v)*, the logarithm of
the value rather than the raw value. I cannot understand why. Please find
the code snippet below:

case "age": {
  double v = Double.parseDouble(fieldValue);
  featureEncoder.addToVector(name, Math.log(v), vector);
  break;
}

However, for the balance field, we clamp the value to -2000 if it is below
that threshold, then shift it before taking the log, like this:


case "balance": {
  double v;
  v = Double.parseDouble(fieldValue);
  if (v < -2000) {
    v = -2000;
  }
  featureEncoder.addToVector(name, Math.log(v + 2001) - 8, vector);
  break;
}
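
To make the question concrete, here is a minimal standalone sketch of what
the two cases above compute, written without Mahout. The class and method
names (FeatureTransforms, encodeAge, encodeBalance) are hypothetical; only
the arithmetic is taken from the snippets.

```java
// Standalone sketch; FeatureTransforms and its method names are hypothetical.
// Only the arithmetic is copied from the TelephoneCall snippets.
class FeatureTransforms {

  // "age" case: the weight is the natural log of the raw value.
  static double encodeAge(double age) {
    return Math.log(age);
  }

  // "balance" case: clamp at -2000, shift by 2001 so the log argument
  // is always >= 1, then subtract 8.
  static double encodeBalance(double balance) {
    double v = Math.max(balance, -2000);
    return Math.log(v + 2001) - 8;
  }

  public static void main(String[] args) {
    double[] balances = {-5000, -2000, 0, 980, 100000};
    for (double b : balances) {
      System.out.printf("balance=%.0f -> weight=%.3f%n", b, encodeBalance(b));
    }
    System.out.printf("age=30 -> weight=%.3f%n", encodeAge(30));
  }
}
```

With the clamp, the smallest possible weight is exactly -8 (log(1) - 8), and
a balance near 980 gives a weight close to 0, since e^8 is roughly 2981. So
the shift keeps the log argument positive and the "- 8" moves the result
toward zero, but I do not know whether that is the intended reason.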



Can anyone explain the logic? I am trying to run it on my own dataset,
and I am using this example as a reference.



Also, I would like to know why we use a hashed vector. I do not understand
the idea behind it.
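
For context, the general feature-hashing idea, as far as I can tell, is that
each feature name is hashed to an index in a fixed-size vector, so no
dictionary from names to indices is needed. The sketch below illustrates only
that general idea. HashedVectorSketch and its addToVector are hypothetical
and are not Mahout's encoder implementation, which may differ in its details.

```java
// Hypothetical illustration of feature hashing; NOT Mahout's encoder code.
class HashedVectorSketch {

  // Hash the feature name to a slot in a fixed-size vector and add the
  // weight there. Different names may collide in the same slot.
  static void addToVector(String name, double weight, double[] vector) {
    int index = Math.floorMod(name.hashCode(), vector.length);
    vector[index] += weight;
  }

  public static void main(String[] args) {
    double[] vector = new double[16]; // size fixed up front, chosen arbitrarily
    addToVector("age", Math.log(30), vector);
    addToVector("balance", Math.log(0 + 2001) - 8, vector);
    System.out.println(java.util.Arrays.toString(vector));
  }
}
```

In this sketch the vector size stays the same no matter how many distinct
feature names appear, at the cost of possible collisions between names.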

Cheers.

-- 
Mustafa Elbehery
EIT ICT Labs Master School <http://www.masterschool.eitictlabs.eu/home/>
+49(0)15750363097
skype: mustafaelbehery87
