spark-dev mailing list archives

From Christopher Nguyen
Subject Re: LogisticRegression: Predicting continuous outcomes
Date Thu, 29 May 2014 02:46:27 GMT
Bharath, (apologies if you're already familiar with the theory): the
proposed approach may or may not be appropriate depending on the overall
transfer function in your data. In general, a single logistic regressor
cannot approximate arbitrary non-linear functions (of linear combinations
of the inputs). You can review the work of, e.g., Hornik and Cybenko from
the late 1980s to see if you need something more, such as a simple
one-hidden-layer neural network.
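
To make the point above concrete, here is a minimal Python sketch (not MLlib code). All weights and inputs are hypothetical, picked by hand for illustration. A single logistic unit is a sigmoid of one linear combination of the inputs, so its output is bounded in (0, 1) and monotone in that linear score. It therefore cannot represent a non-monotone target such as XOR, whereas a hand-wired one-hidden-layer network can.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logistic_unit(weights, bias, x):
    """Single logistic regressor: sigmoid of ONE linear combination of x."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Hypothetical weights and inputs, chosen only for illustration.
weights, bias = [2.0, -1.0], 0.5
inputs = [[-3.0, 0.0], [-1.0, 0.0], [0.0, 0.0], [1.0, 0.0], [3.0, 0.0]]
# The linear score increases along `inputs`, so the outputs increase too,
# and every output stays strictly between 0 and 1.
outputs = [logistic_unit(weights, bias, x) for x in inputs]


def one_hidden_layer_xor(x):
    """Hand-wired network with two hidden logistic units (assumed weights)."""
    h_or = logistic_unit([20.0, 20.0], -10.0, x)     # approx. OR(x1, x2)
    h_nand = logistic_unit([-20.0, -20.0], 30.0, x)  # approx. NAND(x1, x2)
    return logistic_unit([20.0, 20.0], -30.0, [h_or, h_nand])  # approx. AND


xor_outputs = {
    (a, b): one_hidden_layer_xor([float(a), float(b)])
    for a in (0, 1)
    for b in (0, 1)
}
```

The hidden layer is what lets the second function rise and then fall as the inputs grow. No choice of `weights` and `bias` in `logistic_unit` alone can do that.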

This is a good summary:

Christopher T. Nguyen
Co-founder & CEO, Adatao

On Wed, May 28, 2014 at 11:18 AM, Bharath Ravi Kumar wrote:

> I'm looking to reuse the LogisticRegression model (with SGD) to predict a
> real-valued outcome variable. (I understand that logistic regression is
> generally applied to predict a binary outcome, but for various reasons, this
> model suits our needs better than LinearRegression). Related to that I have
> the following questions:
> 1) Can the current LogisticRegression model be used as is to train based on
> binary input (i.e. explanatory) features, or is there an assumption that
> the explanatory features must be continuous?
> 2) I intend to reuse the current class to train a model on LabeledPoints
> where the label is a real value (and not 0 / 1). I'd like to know if
> invoking setValidateData(false) would suffice or if one must override the
> validator to achieve this.
> 3) I recall seeing an experimental method on the class (
> )
> that clears the threshold separating positive & negative predictions. Once
> the model is trained on real valued labels, would clearing this flag
> suffice to predict an outcome that is continuous in nature?
> Thanks,
> Bharath
> P.S: I'm writing to dev@ and not user@ assuming that lib changes might be
> necessary. Apologies if the mailing list is incorrect.
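
The idea behind questions 1) to 3) can be sketched outside MLlib in plain Python. This is not the MLlib implementation, and every name and value below is hypothetical. It assumes the real-valued labels have been rescaled into [0, 1], since the sigmoid output can never leave that range. Under that assumption, the usual cross-entropy SGD update works unchanged with fractional labels and binary features, and skipping the threshold step returns the raw score as the continuous prediction.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_sgd(points, step=0.5, epochs=300, seed=0):
    """Plain SGD on the cross-entropy loss.

    points: list of (label, features) pairs, with label assumed in [0, 1].
    """
    rng = random.Random(seed)
    w = [0.0] * len(points[0][1])
    b = 0.0
    data = list(points)
    for _ in range(epochs):
        rng.shuffle(data)
        for y, x in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y  # gradient of the cross-entropy w.r.t. the linear score
            w = [wi - step * err * xi for wi, xi in zip(w, x)]
            b -= step * err
    return w, b


def predict_raw(w, b, x):
    """Raw score in (0, 1), with no 0/1 threshold applied."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Hypothetical training set: binary features, fractional labels.
points = [
    (0.10, [0.0, 0.0]),
    (0.30, [0.0, 1.0]),
    (0.60, [1.0, 0.0]),
    (0.85, [1.0, 1.0]),
]
w, b = train_logistic_sgd(points)
predictions = [predict_raw(w, b, x) for _, x in points]
```

Labels outside [0, 1] would need rescaling first and mapping back afterwards. Whether the resulting fit is adequate still depends on the shape of the underlying function, as the reply above notes.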
