spark-user mailing list archives

From John King <usedforprinting...@gmail.com>
Subject Re: Spark mllib throwing error
Date Thu, 24 Apr 2014 23:05:55 GMT
In the other thread I had an issue with Python. This time I tried
switching to Scala. The code is:

import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.linalg.SparseVector
import org.apache.spark.mllib.classification.NaiveBayes
import scala.collection.mutable.ArrayBuffer



// Despite its name, this returns true for lines that are non-null and
// not blank, so it can be passed to filter() to keep the usable lines.
def isEmpty(a: String): Boolean =
  a != null && !a.replaceAll("""(?m)\s+$""", "").isEmpty()

// Parses one line of the form "label<TAB>index:value index:value ...".
def parsePoint(a: String): LabeledPoint = {
  val values = a.split('\t')
  val feat = values(1).split(' ')
  val indices = ArrayBuffer.empty[Int]
  val featValues = ArrayBuffer.empty[Double]
  for (f <- feat) {
    val q = f.split(':')
    // Skip tokens that are not exactly "index:value".
    if (q.length == 2) {
      indices += q(0).toInt
      featValues += q(1).toDouble
    }
  }
  val vector = new SparseVector(2357815, indices.toArray, featValues.toArray)
  LabeledPoint(values(0).toDouble, vector)
}


val data = sc.textFile("data.txt")
// Keeps the non-blank lines (isEmpty returns true for those).
val empty = data.filter(isEmpty)
val points = empty.map(parsePoint)
points.cache()
val model = new NaiveBayes().run(points)
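
To check the input format outside Spark, the parsing step above can be tried on its own. This is only a minimal sketch and not code from the thread: `ParseCheck`, `ParsedLine`, and `parseLine` are hypothetical names, the sample line is made up, and the input is assumed to be "label<TAB>index:value index:value ..." as in parsePoint.

```scala
// Hypothetical stand-alone check of the line format; no Spark dependency.
object ParseCheck {
  // Plain container standing in for LabeledPoint plus its sparse vector.
  final case class ParsedLine(label: Double, indices: Array[Int], values: Array[Double])

  def parseLine(line: String): ParsedLine = {
    val fields = line.split('\t')
    // Keep only tokens that split into exactly "index:value".
    val pairs = fields(1).split(' ').map(_.split(':')).collect {
      case Array(i, v) => (i.toInt, v.toDouble)
    }
    ParsedLine(fields(0).toDouble, pairs.map(_._1), pairs.map(_._2))
  }

  def main(args: Array[String]): Unit = {
    // Made-up sample line; the token "bad" has no colon and is skipped.
    val sample = parseLine("1.0\t3:0.5 10:2.0 bad")
    println(sample.label)                 // prints 1.0
    println(sample.indices.mkString(",")) // prints 3,10
    println(sample.values.mkString(","))  // prints 0.5,2.0
  }
}
```

Like parsePoint, this sketch throws on a line with no tab character, so running it over a few lines of data.txt is a quick way to spot malformed rows before building the RDD.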


On Thu, Apr 24, 2014 at 6:57 PM, Xiangrui Meng <mengxr@gmail.com> wrote:

> Do you mind sharing more code and error messages? The information you
> provided is too little to identify the problem. -Xiangrui
>
> On Thu, Apr 24, 2014 at 1:55 PM, John King <usedforprinting123@gmail.com>
> wrote:
> > Last command was:
> >
> > val model = new NaiveBayes().run(points)
> >
> >
> >
> > On Thu, Apr 24, 2014 at 4:27 PM, Xiangrui Meng <mengxr@gmail.com> wrote:
> >>
> >> Could you share the command you used and more of the error message?
> >> Also, is it an MLlib specific problem? -Xiangrui
> >>
> >> On Thu, Apr 24, 2014 at 11:49 AM, John King
> >> <usedforprinting123@gmail.com> wrote:
> >> > ./spark-shell: line 153: 17654 Killed
> >> > $FWDIR/bin/spark-class org.apache.spark.repl.Main "$@"
> >> >
> >> >
> >> > Any ideas?
> >
> >
>