Subject: Re: Spark Sql - Missing Jar ? json_tuple NoClassDefFoundError
From: Todd Nist
To: Akhil Das
Cc: user@spark.apache.org
Date: Thu, 2 Apr 2015 13:40:52 -0400

Hi Akhil,

Tried your suggestion to no avail. I actually do not see any "jackson" or "json serde" jars in the $HIVE/lib directory. This is Hive 0.13.1 and Spark 1.2.1.

Here is what I did:

I have added the lib folder to the --jars option when starting the spark-shell, but the job fails. The hive-site.xml is in the $SPARK_HOME/conf directory.
I start the spark-shell as follows:

./bin/spark-shell --master spark://radtech.io:7077 --total-executor-cores 2 \
  --driver-class-path /usr/local/spark/lib/mysql-connector-java-5.1.34-bin.jar

and like this:

./bin/spark-shell --master spark://radtech.io:7077 --total-executor-cores 2 \
  --driver-class-path /usr/local/spark/lib/mysql-connector-java-5.1.34-bin.jar \
  --jars /opt/hive/0.13.1/lib/*
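Since --jars expects a comma-separated list of jar paths rather than a shell glob, I am not sure the /opt/hive/0.13.1/lib/* form actually passes every Hive jar through. A rough equivalent that builds the list explicitly (assuming bash and no spaces in the jar paths) would be:

# Build a comma-separated list of the Hive jars; --jars does not expand a glob itself.
HIVE_JARS=$(echo /opt/hive/0.13.1/lib/*.jar | tr ' ' ',')

# Same invocation as above, but with the explicit jar list.
./bin/spark-shell --master spark://radtech.io:7077 --total-executor-cores 2 \
  --driver-class-path /usr/local/spark/lib/mysql-connector-java-5.1.34-bin.jar \
  --jars "$HIVE_JARS"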
I'm just doing this in the spark-shell now:

import org.apache.spark.sql.hive._

val sqlContext = new HiveContext(sc)
import sqlContext._

case class MetricTable(path: String, pathElements: String, name: String, value: String)

val mt = new MetricTable("""path": "/DC1/HOST1/""",
    """pathElements": [{"node": "DataCenter","value": "DC1"},{"node": "host","value": "HOST1"}]""",
    """name": "Memory Usage (%)""",
    """value": 29.590943279257175""")

val rdd1 = sc.makeRDD(List(mt))
rdd1.printSchema()
rdd1.registerTempTable("metric_table")

sql("""SELECT path, name, value, v1.peValue, v1.peName
       FROM metric_table
       lateral view json_tuple(pathElements, 'name', 'value') v1
         as peName, peValue
    """)
  .collect.foreach(println(_))

It results in the same error:

15/04/02 12:33:59 INFO ParseDriver: Parsing command: SELECT path, name, value, v1.peValue, v1.peName
         FROM metric_table
           lateral view json_tuple(pathElements, 'name', 'value') v1
             as peName, peValue
15/04/02 12:34:00 INFO ParseDriver: Parse Completed
res2: org.apache.spark.sql.SchemaRDD =
SchemaRDD[5] at RDD at SchemaRDD.scala:108
== Query Plan ==
== Physical Plan ==
java.lang.ClassNotFoundException: json_tuple

Any other suggestions, or am I doing something else wrong here?
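For what it is worth, one fallback I am considering is to skip json_tuple entirely and flatten the path elements myself. A rough sketch (assuming I parse the pathElements JSON into a case class up front instead of keeping it as a raw string):

// Hypothetical structured form of the data, so no JSON parsing is needed at query time.
case class PathElement(node: String, value: String)
case class Metric(path: String, pathElements: Seq[PathElement], name: String, value: Double)

val metrics = sc.makeRDD(Seq(
  Metric("/DC1/HOST1/",
         Seq(PathElement("DataCenter", "DC1"), PathElement("host", "HOST1")),
         "Memory Usage (%)", 29.590943279257175)))

// flatMap plays the role of the lateral view: one output row per path element.
val flattened = metrics.flatMap(m =>
  m.pathElements.map(pe => (m.path, m.name, m.value, pe.node, pe.value)))

flattened.collect.foreach(println)

That sidesteps the Hive UDTF, but I would still like to understand why json_tuple cannot be resolved.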
-Todd

On Thu, Apr 2, 2015 at 2:00 AM, Akhil Das wrote:

> Try adding all the jars in your $HIVE/lib directory. If you want the
> specific jar, you could look for jackson or json serde in it.
>
> Thanks
> Best Regards
>
> On Thu, Apr 2, 2015 at 12:49 AM, Todd Nist wrote:
>
>> I have a feeling I'm missing a Jar that provides the support, or this
>> may be related to https://issues.apache.org/jira/browse/SPARK-5792.
>> If it is a Jar, where would I find it? I would have thought in the
>> $HIVE/lib folder, but I am not sure which jar contains it.
>>
>> Error:
>>
>> Create Metric Temporary Table for querying
>> 15/04/01 14:41:44 INFO HiveMetaStore: 0: Opening raw store with implemenation class: org.apache.hadoop.hive.metastore.ObjectStore
>> 15/04/01 14:41:44 INFO ObjectStore: ObjectStore, initialize called
>> 15/04/01 14:41:45 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
>> 15/04/01 14:41:45 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
>> 15/04/01 14:41:45 INFO BlockManager: Removing broadcast 0
>> 15/04/01 14:41:45 INFO BlockManager: Removing block broadcast_0
>> 15/04/01 14:41:45 INFO MemoryStore: Block broadcast_0 of size 1272 dropped from memory (free 278018571)
>> 15/04/01 14:41:45 INFO BlockManager: Removing block broadcast_0_piece0
>> 15/04/01 14:41:45 INFO MemoryStore: Block broadcast_0_piece0 of size 869 dropped from memory (free 278019440)
>> 15/04/01 14:41:45 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 192.168.1.5:63230 in memory (size: 869.0 B, free: 265.1 MB)
>> 15/04/01 14:41:45 INFO BlockManagerMaster: Updated info of block broadcast_0_piece0
>> 15/04/01 14:41:45 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 192.168.1.5:63278 in memory (size: 869.0 B, free: 530.0 MB)
>> 15/04/01 14:41:45 INFO ContextCleaner: Cleaned broadcast 0
>> 15/04/01 14:41:46 INFO ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
>> 15/04/01 14:41:46 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
>> 15/04/01 14:41:46 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
>> 15/04/01 14:41:47 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
>> 15/04/01 14:41:47 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
>> 15/04/01 14:41:47 INFO Query: Reading in results for query "org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection used is closing
>> 15/04/01 14:41:47 INFO ObjectStore: Initialized ObjectStore
>> 15/04/01 14:41:47 INFO HiveMetaStore: Added admin role in metastore
>> 15/04/01 14:41:47 INFO HiveMetaStore: Added public role in metastore
>> 15/04/01 14:41:48 INFO HiveMetaStore: No user is added in admin role, since config is empty
>> 15/04/01 14:41:48 INFO SessionState: No Tez session required at this point. hive.execution.engine=mr.
>> 15/04/01 14:41:49 INFO ParseDriver: Parsing command: SELECT path, name, value, v1.peValue, v1.peName
>>              FROM metric
>>              lateral view json_tuple(pathElements, 'name', 'value') v1
>>                as peName, peValue
>> 15/04/01 14:41:49 INFO ParseDriver: Parse Completed
>> Exception in thread "main" java.lang.ClassNotFoundException: json_tuple
>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>     at org.apache.spark.sql.hive.HiveFunctionWrapper.createFunction(Shim13.scala:141)
>>     at org.apache.spark.sql.hive.HiveGenericUdtf.function$lzycompute(hiveUdfs.scala:261)
>>     at org.apache.spark.sql.hive.HiveGenericUdtf.function(hiveUdfs.scala:261)
>>     at org.apache.spark.sql.hive.HiveGenericUdtf.outputInspector$lzycompute(hiveUdfs.scala:267)
>>     at org.apache.spark.sql.hive.HiveGenericUdtf.outputInspector(hiveUdfs.scala:267)
>>     at org.apache.spark.sql.hive.HiveGenericUdtf.outputDataTypes$lzycompute(hiveUdfs.scala:272)
>>     at org.apache.spark.sql.hive.HiveGenericUdtf.outputDataTypes(hiveUdfs.scala:272)
>>     at org.apache.spark.sql.hive.HiveGenericUdtf.makeOutput(hiveUdfs.scala:278)
>>     at org.apache.spark.sql.catalyst.expressions.Generator.output(generators.scala:60)
>>     at org.apache.spark.sql.catalyst.plans.logical.Generate$$anonfun$1.apply(basicOperators.scala:50)
>>     at org.apache.spark.sql.catalyst.plans.logical.Generate$$anonfun$1.apply(basicOperators.scala:50)
>>     at scala.Option.map(Option.scala:145)
>>     at org.apache.spark.sql.catalyst.plans.logical.Generate.generatorOutput(basicOperators.scala:50)
>>     at org.apache.spark.sql.catalyst.plans.logical.Generate.output(basicOperators.scala:60)
>>     at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolveChildren$1.apply(LogicalPlan.scala:118)
>>     at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolveChildren$1.apply(LogicalPlan.scala:118)
>>     at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
>>     at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
>>     at scala.collection.immutable.List.foreach(List.scala:318)
>>     at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:251)
>>     at scala.collection.AbstractTraversable.flatMap(Traversable.scala:105)
>>     at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveChildren(LogicalPlan.scala:118)
>>     at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$$anonfun$apply$6$$anonfun$applyOrElse$1.applyOrElse(Analyzer.scala:159)
>>     at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$$anonfun$apply$6$$anonfun$applyOrElse$1.applyOrElse(Analyzer.scala:156)
>>     at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:144)
>>     at org.apache.spark.sql.catalyst.plans.QueryPlan.org$apache$spark$sql$catalyst$plans$QueryPlan$$transformExpressionDown$1(QueryPlan.scala:71)
>>     at org.apache.spark.sql.catalyst.plans.QueryPlan$$anonfun$1$$anonfun$apply$1.apply(QueryPlan.scala:85)
>>     at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>>     at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>>     at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>>     at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
>>     at scala.collection.AbstractTraversable.map(Traversable.scala:105)
>>     at org.apache.spark.sql.catalyst.plans.QueryPlan$$anonfun$1.apply(QueryPlan.scala:84)
>>     at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>>     at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>>     at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>>     at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
>>     at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
>>     at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
>>     at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
>>     at scala.collection.AbstractIterator.to(Iterator.scala:1157)
>>     at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
>>     at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
>>     at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
>>     at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
>>     at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionsDown(QueryPlan.scala:89)
>>     at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressions(QueryPlan.scala:60)
>>     at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$$anonfun$apply$6.applyOrElse(Analyzer.scala:156)
>>     at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$$anonfun$apply$6.applyOrElse(Analyzer.scala:153)
>>     at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:206)
>>     at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$.apply(Analyzer.scala:153)
>>     at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$.apply(Analyzer.scala:152)
>>     at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:61)
>>     at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:59)
>>     at scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:111)
>>     at scala.collection.immutable.List.foldLeft(List.scala:84)
>>     at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:59)
>>     at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:51)
>>     at scala.collection.immutable.List.foreach(List.scala:318)
>>     at org.apache.spark.sql.catalyst.rules.RuleExecutor.apply(RuleExecutor.scala:51)
>>     at org.apache.spark.sql.SQLContext$QueryExecution.analyzed$lzycompute(SQLContext.scala:411)
>>     at org.apache.spark.sql.SQLContext$QueryExecution.analyzed(SQLContext.scala:411)
>>     at org.apache.spark.sql.SQLContext$QueryExecution.withCachedData$lzycompute(SQLContext.scala:412)
>>     at org.apache.spark.sql.SQLContext$QueryExecution.withCachedData(SQLContext.scala:412)
>>     at org.apache.spark.sql.SQLContext$QueryExecution.optimizedPlan$lzycompute(SQLContext.scala:413)
>>     at org.apache.spark.sql.SQLContext$QueryExecution.optimizedPlan(SQLContext.scala:413)
>>     at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan$lzycompute(SQLContext.scala:418)
>>     at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan(SQLContext.scala:416)
>>     at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan$lzycompute(SQLContext.scala:422)
>>     at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan(SQLContext.scala:422)
>>     at org.apache.spark.sql.SchemaRDD.collect(SchemaRDD.scala:444)
>>     at com.opsdatastore.elasticsearch.spark.ElasticSearchReadWrite$.main(ElasticSearchReadWrite.scala:119)
>>     at com.opsdatastore.elasticsearch.spark.ElasticSearchReadWrite.main(ElasticSearchReadWrite.scala)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>     at java.lang.reflect.Method.invoke(Method.java:483)
>>     at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
>>     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
>>     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>
>> Json:
>>
>> "metric": {
>>     "path": "/PA/Pittsburgh/12345 Westbrook Drive/main/theromostat-1",
>>     "pathElements": [
>>       {
>>         "node": "State",
>>         "value": "PA"
>>       },
>>       {
>>         "node": "City",
>>         "value": "Pittsburgh"
>>       },
>>       {
>>         "node": "Street",
>>         "value": "12345 Westbrook Drive"
>>       },
>>       {
>>         "node": "level",
>>         "value": "main"
>>       },
>>       {
>>         "node": "device",
>>         "value": "thermostat"
>>       }
>>     ],
>>     "name": "Current Temperature",
>>     "value": 29.590943279257175,
>>     "timestamp": "2015-03-27T14:53:46+0000"
>> }
>>
>> Here is the code that produces the error:
>>
>> // Spark imports
>> import org.apache.spark.{SparkConf, SparkContext}
>> import org.apache.spark.SparkContext._
>> import org.apache.spark.rdd.RDD
>> import org.apache.spark.sql.{SchemaRDD, SQLContext}
>> import org.apache.spark.sql.hive._
>>
>> // ES imports
>> import org.elasticsearch.spark._
>> import org.elasticsearch.spark.sql._
>>
>> object ElasticSearchReadWrite {
>>
>>   def main(args: Array[String]) {
>>     val sc = sparkInit
>>
>>     @transient
>>     val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
>>
>>     import hiveContext._
>>
>>     val start = System.currentTimeMillis()
>>
>>     /*
>>      * Read from ES and provide some insights with SparkSQL
>>      */
>>     val esData = sc.esRDD(s"${ElasticSearch.Index}/${ElasticSearch.Type}")
>>
>>     esData.collect.foreach(println(_))
>>
>>     val end = System.currentTimeMillis()
>>     println(s"Total time: ${end-start} ms")
>>
>>     println("Create Metric Temporary Table for querying")
>>
>>     val schemaRDD = hiveContext.sql(
>>       "CREATE TEMPORARY TABLE metric " +
>>       "USING org.elasticsearch.spark.sql " +
>>       "OPTIONS (resource 'device/metric')" )
>>
>>     hiveContext.sql(
>>       """SELECT path, name, value, v1.peValue, v1.peName
>>          FROM metric
>>          lateral view json_tuple(pathElements, 'name', 'value') v1
>>            as peName, peValue
>>       """)
>>       .collect.foreach(println(_))
>>   }
>> }
>>
>> More than likely I'm missing a jar, but I am not sure which one that would be.
>>
>> -Todd
>>
>
>