systemml-dev mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [systemml] niketanpansare commented on issue #857: [SYSTEMML-2523] Update SystemML to Support Spark 2.3.0
Date Thu, 21 Mar 2019 20:43:37 GMT
URL: https://github.com/apache/systemml/pull/857#issuecomment-475394794
 
 
   @romeokienzler You are getting the error because the setup contains two SystemML jars (possibly conflicting dependencies) on the classpath. There are two possible solutions to your problem:
   1. *Recommended:* Remove the older incubating jar and do not include the corresponding 1.2.0 or 1.3.0-SNAPSHOT jars (i.e. no need for the `ln -s` trick).
   2. Use the Python package compiled by this PR.
   
   Since there is some weird behavior, I am including the logs. I apologize in advance for the long traces, but I felt they shed some light on the error. Please ignore the logs below if you agree with the above statements:
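   The root cause visible in the traces below is that an older SystemML jar bundles its own ANTLR runtime, which shadows Spark 2.3.0's ANTLR 4.7 classes and breaks Spark SQL's parser. One way to confirm which jar is responsible is to list the ANTLR classes packaged inside each jar. A minimal sketch (the helper name is mine, and the jar paths in the commented example are just the file names from the logs, not guaranteed to exist on your machine):
   ```python
   import zipfile

   def bundled_antlr_classes(jar_path):
       """Return the ANTLR runtime class entries bundled inside a jar (zip) file."""
       with zipfile.ZipFile(jar_path) as jar:
           return [name for name in jar.namelist()
                   if name.startswith("org/antlr/v4/runtime/")]

   # Any jar on the driver classpath that reports ANTLR runtime classes here can
   # shadow Spark's own ANTLR 4.7 and trigger the InvalidClassException below.
   # for jar in ["systemml-0.14.0-incubating.jar", "systemml-1.2.0.jar"]:
   #     print(jar, len(bundled_antlr_classes(jar)), "bundled ANTLR classes")
   ```
   If a jar shows bundled ANTLR classes, dropping it from `--driver-class-path` (as in the setups that succeed below) avoids the conflict.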
   
   Setup 1: With only the older incubating jar (FAILS !!)
   ```
   $ ~/spark-2.3.0-bin-hadoop2.7/bin/pyspark --driver-memory 20g --master local[*] --driver-class-path
systemml-0.14.0-incubating.jar
   Python 3.6.3 (default, Mar 20 2018, 13:50:41) 
   [GCC 4.8.5 20150623 (Red Hat 4.8.5-16)] on linux
   Type "help", "copyright", "credits" or "license" for more information.
   2019-03-21 13:07:11 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for
your platform... using builtin-java classes where applicable
   Setting default log level to "WARN".
   To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
   Welcome to
         ____              __
        / __/__  ___ _____/ /__
       _\ \/ _ \/ _ `/ __/  '_/
      /__ / .__/\_,_/_/ /_/\_\   version 2.3.0
         /_/
   
   Using Python version 3.6.3 (default, Mar 20 2018 13:50:41)
   SparkSession available as 'spark'.
   >>> from systemml import MLContext
   >>> ml = MLContext(spark)
   2019-03-21 13:07:20 WARN  ObjectStore:568 - Failed to get database global_temp, returning
NoSuchObjectException
   
   Welcome to Apache SystemML!
   
   >>> ml.version()
   '0.14.0-incubating'
   >>> df=spark.read.parquet('shake.parquet')
   >>> df.show()
   +-----+---------+-----+-----+-----+
   |CLASS| SENSORID|    X|    Y|    Z|
   +-----+---------+-----+-----+-----+
   |    2| qqqqqqqq| 0.12| 0.12| 0.12|
   |    2|aUniqueID| 0.03| 0.03| 0.03|
   |    2| qqqqqqqq|-3.84|-3.84|-3.84|
   |    2| 12345678| -0.1| -0.1| -0.1|
   |    2| 12345678|-0.15|-0.15|-0.15|
   |    2| 12345678| 0.47| 0.47| 0.47|
   |    2| 12345678|-0.06|-0.06|-0.06|
   |    2| 12345678|-0.09|-0.09|-0.09|
   |    2| 12345678| 0.21| 0.21| 0.21|
   |    2| 12345678|-0.08|-0.08|-0.08|
   |    2| 12345678| 0.44| 0.44| 0.44|
   |    2|    gholi| 0.76| 0.76| 0.76|
   |    2|    gholi| 1.62| 1.62| 1.62|
   |    2|    gholi| 5.81| 5.81| 5.81|
   |    2| bcbcbcbc| 0.58| 0.58| 0.58|
   |    2| bcbcbcbc|-8.24|-8.24|-8.24|
   |    2| bcbcbcbc|-0.45|-0.45|-0.45|
   |    2| bcbcbcbc| 1.03| 1.03| 1.03|
   |    2|aUniqueID|-0.05|-0.05|-0.05|
   |    2| qqqqqqqq|-0.44|-0.44|-0.44|
   +-----+---------+-----+-----+-----+
   only showing top 20 rows
   
   >>> df.createOrReplaceTempView("df")
   ANTLR Tool version 4.7 used for code generation does not match the current runtime version
4.5.3ANTLR Runtime version 4.7 used for parser compilation does not match the current runtime
version 4.5.3Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File "/home/npansar/spark-2.3.0-bin-hadoop2.7/python/pyspark/sql/dataframe.py", line
176, in createOrReplaceTempView
       self._jdf.createOrReplaceTempView(name)
     File "/home/npansar/spark-2.3.0-bin-hadoop2.7/python/lib/py4j-0.10.6-src.zip/py4j/java_gateway.py",
line 1160, in __call__
     File "/home/npansar/spark-2.3.0-bin-hadoop2.7/python/pyspark/sql/utils.py", line 63,
in deco
       return f(*a, **kw)
     File "/home/npansar/spark-2.3.0-bin-hadoop2.7/python/lib/py4j-0.10.6-src.zip/py4j/protocol.py",
line 320, in get_return_value
   py4j.protocol.Py4JJavaError: An error occurred while calling o52.createOrReplaceTempView.
   : java.lang.ExceptionInInitializerError
   	at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:84)
   	at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:48)
   	at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parseTableIdentifier(ParseDriver.scala:49)
   	at org.apache.spark.sql.Dataset.createTempViewCommand(Dataset.scala:3079)
   	at org.apache.spark.sql.Dataset.createOrReplaceTempView(Dataset.scala:3034)
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
   	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
   	at py4j.Gateway.invoke(Gateway.java:282)
   	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
   	at py4j.commands.CallCommand.execute(CallCommand.java:79)
   	at py4j.GatewayConnection.run(GatewayConnection.java:214)
   	at java.lang.Thread.run(Thread.java:748)
   Caused by: java.lang.UnsupportedOperationException: java.io.InvalidClassException: org.antlr.v4.runtime.atn.ATN;
Could not deserialize ATN with UUID 59627784-3be5-417a-b9eb-8131a7286089 (expected aadb8d7e-aeef-4415-ad2b-8204d6cf042e
or a legacy UUID).
   	at org.antlr.v4.runtime.atn.ATNDeserializer.deserialize(ATNDeserializer.java:153)
   	at org.apache.spark.sql.catalyst.parser.SqlBaseLexer.<clinit>(SqlBaseLexer.java:1153)
   	... 16 more
   Caused by: java.io.InvalidClassException: org.antlr.v4.runtime.atn.ATN; Could not deserialize
ATN with UUID 59627784-3be5-417a-b9eb-8131a7286089 (expected aadb8d7e-aeef-4415-ad2b-8204d6cf042e
or a legacy UUID).
   	... 18 more
   
   >>>
   ```
   
   Setup 2: Put the older incubating jar before the current SystemML 1.2.0 jars (FAILS !!)
   ```
   $ ~/spark-2.3.0-bin-hadoop2.7/bin/pyspark --driver-memory 20g --master local[*] --driver-class-path
systemml-0.14.0-incubating.jar:systemml-1.2.0-extra.jar:systemml-1.2.0.jar
   Python 3.6.3 (default, Mar 20 2018, 13:50:41) 
   [GCC 4.8.5 20150623 (Red Hat 4.8.5-16)] on linux
   Type "help", "copyright", "credits" or "license" for more information.
   2019-03-21 13:12:11 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for
your platform... using builtin-java classes where applicable
   Setting default log level to "WARN".
   To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
   Welcome to
         ____              __
        / __/__  ___ _____/ /__
       _\ \/ _ \/ _ `/ __/  '_/
      /__ / .__/\_,_/_/ /_/\_\   version 2.3.0
         /_/
   
   Using Python version 3.6.3 (default, Mar 20 2018 13:50:41)
   SparkSession available as 'spark'.
   >>> from systemml import MLContext
   >>> ml = MLContext(spark)
   2019-03-21 13:12:21 WARN  ObjectStore:568 - Failed to get database global_temp, returning
NoSuchObjectException
   
   Welcome to Apache SystemML!
   
   >>> ml.version()
   '0.14.0-incubating'
   >>> df=spark.read.parquet('shake.parquet')
   >>> df.show()
   +-----+---------+-----+-----+-----+
   |CLASS| SENSORID|    X|    Y|    Z|
   +-----+---------+-----+-----+-----+
   |    2| qqqqqqqq| 0.12| 0.12| 0.12|
   |    2|aUniqueID| 0.03| 0.03| 0.03|
   |    2| qqqqqqqq|-3.84|-3.84|-3.84|
   |    2| 12345678| -0.1| -0.1| -0.1|
   |    2| 12345678|-0.15|-0.15|-0.15|
   |    2| 12345678| 0.47| 0.47| 0.47|
   |    2| 12345678|-0.06|-0.06|-0.06|
   |    2| 12345678|-0.09|-0.09|-0.09|
   |    2| 12345678| 0.21| 0.21| 0.21|
   |    2| 12345678|-0.08|-0.08|-0.08|
   |    2| 12345678| 0.44| 0.44| 0.44|
   |    2|    gholi| 0.76| 0.76| 0.76|
   |    2|    gholi| 1.62| 1.62| 1.62|
   |    2|    gholi| 5.81| 5.81| 5.81|
   |    2| bcbcbcbc| 0.58| 0.58| 0.58|
   |    2| bcbcbcbc|-8.24|-8.24|-8.24|
   |    2| bcbcbcbc|-0.45|-0.45|-0.45|
   |    2| bcbcbcbc| 1.03| 1.03| 1.03|
   |    2|aUniqueID|-0.05|-0.05|-0.05|
   |    2| qqqqqqqq|-0.44|-0.44|-0.44|
   +-----+---------+-----+-----+-----+
   only showing top 20 rows
   
   >>> df.createOrReplaceTempView("df")
   ANTLR Tool version 4.7 used for code generation does not match the current runtime version
4.5.3ANTLR Runtime version 4.7 used for parser compilation does not match the current runtime
version 4.5.3Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File "/home/npansar/spark-2.3.0-bin-hadoop2.7/python/pyspark/sql/dataframe.py", line
176, in createOrReplaceTempView
       self._jdf.createOrReplaceTempView(name)
     File "/home/npansar/spark-2.3.0-bin-hadoop2.7/python/lib/py4j-0.10.6-src.zip/py4j/java_gateway.py",
line 1160, in __call__
     File "/home/npansar/spark-2.3.0-bin-hadoop2.7/python/pyspark/sql/utils.py", line 63,
in deco
       return f(*a, **kw)
     File "/home/npansar/spark-2.3.0-bin-hadoop2.7/python/lib/py4j-0.10.6-src.zip/py4j/protocol.py",
line 320, in get_return_value
   py4j.protocol.Py4JJavaError: An error occurred while calling o52.createOrReplaceTempView.
   : java.lang.ExceptionInInitializerError
   	at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:84)
   	at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:48)
   	at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parseTableIdentifier(ParseDriver.scala:49)
   	at org.apache.spark.sql.Dataset.createTempViewCommand(Dataset.scala:3079)
   	at org.apache.spark.sql.Dataset.createOrReplaceTempView(Dataset.scala:3034)
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
   	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
   	at py4j.Gateway.invoke(Gateway.java:282)
   	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
   	at py4j.commands.CallCommand.execute(CallCommand.java:79)
   	at py4j.GatewayConnection.run(GatewayConnection.java:214)
   	at java.lang.Thread.run(Thread.java:748)
   Caused by: java.lang.UnsupportedOperationException: java.io.InvalidClassException: org.antlr.v4.runtime.atn.ATN;
Could not deserialize ATN with UUID 59627784-3be5-417a-b9eb-8131a7286089 (expected aadb8d7e-aeef-4415-ad2b-8204d6cf042e
or a legacy UUID).
   	at org.antlr.v4.runtime.atn.ATNDeserializer.deserialize(ATNDeserializer.java:153)
   	at org.apache.spark.sql.catalyst.parser.SqlBaseLexer.<clinit>(SqlBaseLexer.java:1153)
   	... 16 more
   Caused by: java.io.InvalidClassException: org.antlr.v4.runtime.atn.ATN; Could not deserialize
ATN with UUID 59627784-3be5-417a-b9eb-8131a7286089 (expected aadb8d7e-aeef-4415-ad2b-8204d6cf042e
or a legacy UUID).
   	... 18 more
   
   >>>
   ```
   
   Setup 3: Put the current SystemML 1.2.0 jars before the older incubating jar (FAILS !!)
   ```
   $ ~/spark-2.3.0-bin-hadoop2.7/bin/pyspark --driver-memory 20g --master local[*] --driver-class-path
systemml-1.2.0-extra.jar:systemml-1.2.0.jar:systemml-0.14.0-incubating.jar
   Python 3.6.3 (default, Mar 20 2018, 13:50:41) 
   [GCC 4.8.5 20150623 (Red Hat 4.8.5-16)] on linux
   Type "help", "copyright", "credits" or "license" for more information.
   2019-03-21 13:14:49 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for
your platform... using builtin-java classes where applicable
   Setting default log level to "WARN".
   To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
   Welcome to
         ____              __
        / __/__  ___ _____/ /__
       _\ \/ _ \/ _ `/ __/  '_/
      /__ / .__/\_,_/_/ /_/\_\   version 2.3.0
         /_/
   
   Using Python version 3.6.3 (default, Mar 20 2018 13:50:41)
   SparkSession available as 'spark'.
   >>> from systemml import MLContext
   >>> ml = MLContext(spark)
   2019-03-21 13:15:11 WARN  ObjectStore:568 - Failed to get database global_temp, returning
NoSuchObjectException
   
   Welcome to Apache SystemML!
   Version 1.2.0
   >>> ml.version()
   '1.2.0'
   >>> df=spark.read.parquet('shake.parquet')
   >>> df.show()
   +-----+---------+-----+-----+-----+
   |CLASS| SENSORID|    X|    Y|    Z|
   +-----+---------+-----+-----+-----+
   |    2| qqqqqqqq| 0.12| 0.12| 0.12|
   |    2|aUniqueID| 0.03| 0.03| 0.03|
   |    2| qqqqqqqq|-3.84|-3.84|-3.84|
   |    2| 12345678| -0.1| -0.1| -0.1|
   |    2| 12345678|-0.15|-0.15|-0.15|
   |    2| 12345678| 0.47| 0.47| 0.47|
   |    2| 12345678|-0.06|-0.06|-0.06|
   |    2| 12345678|-0.09|-0.09|-0.09|
   |    2| 12345678| 0.21| 0.21| 0.21|
   |    2| 12345678|-0.08|-0.08|-0.08|
   |    2| 12345678| 0.44| 0.44| 0.44|
   |    2|    gholi| 0.76| 0.76| 0.76|
   |    2|    gholi| 1.62| 1.62| 1.62|
   |    2|    gholi| 5.81| 5.81| 5.81|
   |    2| bcbcbcbc| 0.58| 0.58| 0.58|
   |    2| bcbcbcbc|-8.24|-8.24|-8.24|
   |    2| bcbcbcbc|-0.45|-0.45|-0.45|
   |    2| bcbcbcbc| 1.03| 1.03| 1.03|
   |    2|aUniqueID|-0.05|-0.05|-0.05|
   |    2| qqqqqqqq|-0.44|-0.44|-0.44|
   +-----+---------+-----+-----+-----+
   only showing top 20 rows
   
   >>> df.createOrReplaceTempView("df")
   ANTLR Tool version 4.7 used for code generation does not match the current runtime version
4.5.3ANTLR Runtime version 4.7 used for parser compilation does not match the current runtime
version 4.5.3Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File "/home/npansar/spark-2.3.0-bin-hadoop2.7/python/pyspark/sql/dataframe.py", line
176, in createOrReplaceTempView
       self._jdf.createOrReplaceTempView(name)
     File "/home/npansar/spark-2.3.0-bin-hadoop2.7/python/lib/py4j-0.10.6-src.zip/py4j/java_gateway.py",
line 1160, in __call__
     File "/home/npansar/spark-2.3.0-bin-hadoop2.7/python/pyspark/sql/utils.py", line 63,
in deco
       return f(*a, **kw)
     File "/home/npansar/spark-2.3.0-bin-hadoop2.7/python/lib/py4j-0.10.6-src.zip/py4j/protocol.py",
line 320, in get_return_value
   py4j.protocol.Py4JJavaError: An error occurred while calling o52.createOrReplaceTempView.
   : java.lang.ExceptionInInitializerError
   	at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:84)
   	at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:48)
   	at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parseTableIdentifier(ParseDriver.scala:49)
   	at org.apache.spark.sql.Dataset.createTempViewCommand(Dataset.scala:3079)
   	at org.apache.spark.sql.Dataset.createOrReplaceTempView(Dataset.scala:3034)
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
   	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
   	at py4j.Gateway.invoke(Gateway.java:282)
   	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
   	at py4j.commands.CallCommand.execute(CallCommand.java:79)
   	at py4j.GatewayConnection.run(GatewayConnection.java:214)
   	at java.lang.Thread.run(Thread.java:748)
   Caused by: java.lang.UnsupportedOperationException: java.io.InvalidClassException: org.antlr.v4.runtime.atn.ATN;
Could not deserialize ATN with UUID 59627784-3be5-417a-b9eb-8131a7286089 (expected aadb8d7e-aeef-4415-ad2b-8204d6cf042e
or a legacy UUID).
   	at org.antlr.v4.runtime.atn.ATNDeserializer.deserialize(ATNDeserializer.java:153)
   	at org.apache.spark.sql.catalyst.parser.SqlBaseLexer.<clinit>(SqlBaseLexer.java:1153)
   	... 16 more
   Caused by: java.io.InvalidClassException: org.antlr.v4.runtime.atn.ATN; Could not deserialize
ATN with UUID 59627784-3be5-417a-b9eb-8131a7286089 (expected aadb8d7e-aeef-4415-ad2b-8204d6cf042e
or a legacy UUID).
   	... 18 more
   
   >>>
   ```
   
   Setup 4: Put the jar from the PR before the older incubating jar (SUCCEEDS !!)
   ```
   $ ~/spark-2.3.0-bin-hadoop2.7/bin/pyspark --driver-memory 20g --master local[*] --driver-class-path
systemml-1.3.0-SNAPSHOT-extra-pr.jar:systemml-1.3.0-SNAPSHOT-pr.jar:systemml-0.14.0-incubating.jar
   Python 3.6.3 (default, Mar 20 2018, 13:50:41) 
   [GCC 4.8.5 20150623 (Red Hat 4.8.5-16)] on linux
   Type "help", "copyright", "credits" or "license" for more information.
   2019-03-21 13:19:59 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for
your platform... using builtin-java classes where applicable
   Setting default log level to "WARN".
   To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
   Welcome to
         ____              __
        / __/__  ___ _____/ /__
       _\ \/ _ \/ _ `/ __/  '_/
      /__ / .__/\_,_/_/ /_/\_\   version 2.3.0
         /_/
   
   Using Python version 3.6.3 (default, Mar 20 2018 13:50:41)
   SparkSession available as 'spark'.
   >>> from systemml import MLContext
   >>> ml = MLContext(spark)
   2019-03-21 13:20:22 WARN  ObjectStore:568 - Failed to get database global_temp, returning
NoSuchObjectException
   
   Welcome to Apache SystemML!
   Version 1.3.0-SNAPSHOT
   >>> ml.version()
   '1.3.0-SNAPSHOT'
   >>> df=spark.read.parquet('shake.parquet')
   >>> df.show()
   +-----+---------+-----+-----+-----+
   |CLASS| SENSORID|    X|    Y|    Z|
   +-----+---------+-----+-----+-----+
   |    2| qqqqqqqq| 0.12| 0.12| 0.12|
   |    2|aUniqueID| 0.03| 0.03| 0.03|
   |    2| qqqqqqqq|-3.84|-3.84|-3.84|
   |    2| 12345678| -0.1| -0.1| -0.1|
   |    2| 12345678|-0.15|-0.15|-0.15|
   |    2| 12345678| 0.47| 0.47| 0.47|
   |    2| 12345678|-0.06|-0.06|-0.06|
   |    2| 12345678|-0.09|-0.09|-0.09|
   |    2| 12345678| 0.21| 0.21| 0.21|
   |    2| 12345678|-0.08|-0.08|-0.08|
   |    2| 12345678| 0.44| 0.44| 0.44|
   |    2|    gholi| 0.76| 0.76| 0.76|
   |    2|    gholi| 1.62| 1.62| 1.62|
   |    2|    gholi| 5.81| 5.81| 5.81|
   |    2| bcbcbcbc| 0.58| 0.58| 0.58|
   |    2| bcbcbcbc|-8.24|-8.24|-8.24|
   |    2| bcbcbcbc|-0.45|-0.45|-0.45|
   |    2| bcbcbcbc| 1.03| 1.03| 1.03|
   |    2|aUniqueID|-0.05|-0.05|-0.05|
   |    2| qqqqqqqq|-0.44|-0.44|-0.44|
   +-----+---------+-----+-----+-----+
   only showing top 20 rows
   
   >>> df.createOrReplaceTempView("df")
   >>>
   ```
   
   Setup 5: No jar provided (SUCCEEDS !!)
   
   ```
   $ ~/spark-2.3.0-bin-hadoop2.7/bin/pyspark --driver-memory 20g --master local[*]
   Python 3.6.3 (default, Mar 20 2018, 13:50:41) 
   [GCC 4.8.5 20150623 (Red Hat 4.8.5-16)] on linux
   Type "help", "copyright", "credits" or "license" for more information.
   2019-03-21 13:23:26 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for
your platform... using builtin-java classes where applicable
   Setting default log level to "WARN".
   To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
   Welcome to
         ____              __
        / __/__  ___ _____/ /__
       _\ \/ _ \/ _ `/ __/  '_/
      /__ / .__/\_,_/_/ /_/\_\   version 2.3.0
         /_/
   
   Using Python version 3.6.3 (default, Mar 20 2018 13:50:41)
   SparkSession available as 'spark'.
   >>> from systemml import MLContext
   >>> ml = MLContext(spark)
   2019-03-21 13:23:46 WARN  ObjectStore:568 - Failed to get database global_temp, returning
NoSuchObjectException
   
   Welcome to Apache SystemML!
   Version 1.2.0
   >>> ml.version()
   '1.2.0'
   >>> df=spark.read.parquet('shake.parquet')
   >>> df.show()
   +-----+---------+-----+-----+-----+
   |CLASS| SENSORID|    X|    Y|    Z|
   +-----+---------+-----+-----+-----+
   |    2| qqqqqqqq| 0.12| 0.12| 0.12|
   |    2|aUniqueID| 0.03| 0.03| 0.03|
   |    2| qqqqqqqq|-3.84|-3.84|-3.84|
   |    2| 12345678| -0.1| -0.1| -0.1|
   |    2| 12345678|-0.15|-0.15|-0.15|
   |    2| 12345678| 0.47| 0.47| 0.47|
   |    2| 12345678|-0.06|-0.06|-0.06|
   |    2| 12345678|-0.09|-0.09|-0.09|
   |    2| 12345678| 0.21| 0.21| 0.21|
   |    2| 12345678|-0.08|-0.08|-0.08|
   |    2| 12345678| 0.44| 0.44| 0.44|
   |    2|    gholi| 0.76| 0.76| 0.76|
   |    2|    gholi| 1.62| 1.62| 1.62|
   |    2|    gholi| 5.81| 5.81| 5.81|
   |    2| bcbcbcbc| 0.58| 0.58| 0.58|
   |    2| bcbcbcbc|-8.24|-8.24|-8.24|
   |    2| bcbcbcbc|-0.45|-0.45|-0.45|
   |    2| bcbcbcbc| 1.03| 1.03| 1.03|
   |    2|aUniqueID|-0.05|-0.05|-0.05|
   |    2| qqqqqqqq|-0.44|-0.44|-0.44|
   +-----+---------+-----+-----+-----+
   only showing top 20 rows
   
   >>> df.createOrReplaceTempView("df")
   >>>
   ```
   
   Setup 6: Provide just the `1.2.0` jars (FAILS !!)
   ```
   $ ~/spark-2.3.0-bin-hadoop2.7/bin/pyspark --driver-memory 20g --master local[*] --driver-class-path systemml-1.2.0.jar:systemml-1.2.0-extra.jar
   Python 3.6.3 (default, Mar 20 2018, 13:50:41) 
   [GCC 4.8.5 20150623 (Red Hat 4.8.5-16)] on linux
   Type "help", "copyright", "credits" or "license" for more information.
   2019-03-21 13:32:09 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for
your platform... using builtin-java classes where applicable
   Setting default log level to "WARN".
   To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
   Welcome to
         ____              __
        / __/__  ___ _____/ /__
       _\ \/ _ \/ _ `/ __/  '_/
      /__ / .__/\_,_/_/ /_/\_\   version 2.3.0
         /_/
   
   Using Python version 3.6.3 (default, Mar 20 2018 13:50:41)
   SparkSession available as 'spark'.
   >>> from systemml import MLContext
   >>> ml = MLContext(spark)
   2019-03-21 13:32:25 WARN  ObjectStore:568 - Failed to get database global_temp, returning
NoSuchObjectException
   
   Welcome to Apache SystemML!
   Version 1.2.0
   >>> ml.version()
   '1.2.0'
   >>> df=spark.read.parquet('shake.parquet')
   >>> df.show()
   +-----+---------+-----+-----+-----+
   |CLASS| SENSORID|    X|    Y|    Z|
   +-----+---------+-----+-----+-----+
   |    2| qqqqqqqq| 0.12| 0.12| 0.12|
   |    2|aUniqueID| 0.03| 0.03| 0.03|
   |    2| qqqqqqqq|-3.84|-3.84|-3.84|
   |    2| 12345678| -0.1| -0.1| -0.1|
   |    2| 12345678|-0.15|-0.15|-0.15|
   |    2| 12345678| 0.47| 0.47| 0.47|
   |    2| 12345678|-0.06|-0.06|-0.06|
   |    2| 12345678|-0.09|-0.09|-0.09|
   |    2| 12345678| 0.21| 0.21| 0.21|
   |    2| 12345678|-0.08|-0.08|-0.08|
   |    2| 12345678| 0.44| 0.44| 0.44|
   |    2|    gholi| 0.76| 0.76| 0.76|
   |    2|    gholi| 1.62| 1.62| 1.62|
   |    2|    gholi| 5.81| 5.81| 5.81|
   |    2| bcbcbcbc| 0.58| 0.58| 0.58|
   |    2| bcbcbcbc|-8.24|-8.24|-8.24|
   |    2| bcbcbcbc|-0.45|-0.45|-0.45|
   |    2| bcbcbcbc| 1.03| 1.03| 1.03|
   |    2|aUniqueID|-0.05|-0.05|-0.05|
   |    2| qqqqqqqq|-0.44|-0.44|-0.44|
   +-----+---------+-----+-----+-----+
   only showing top 20 rows
   
   >>> df.createOrReplaceTempView("df")
   ANTLR Tool version 4.7 used for code generation does not match the current runtime version
4.5.3ANTLR Runtime version 4.7 used for parser compilation does not match the current runtime
version 4.5.3Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File "/home/npansar/spark-2.3.0-bin-hadoop2.7/python/pyspark/sql/dataframe.py", line
176, in createOrReplaceTempView
       self._jdf.createOrReplaceTempView(name)
     File "/home/npansar/spark-2.3.0-bin-hadoop2.7/python/lib/py4j-0.10.6-src.zip/py4j/java_gateway.py",
line 1160, in __call__
     File "/home/npansar/spark-2.3.0-bin-hadoop2.7/python/pyspark/sql/utils.py", line 63,
in deco
       return f(*a, **kw)
     File "/home/npansar/spark-2.3.0-bin-hadoop2.7/python/lib/py4j-0.10.6-src.zip/py4j/protocol.py",
line 320, in get_return_value
   py4j.protocol.Py4JJavaError: An error occurred while calling o52.createOrReplaceTempView.
   : java.lang.ExceptionInInitializerError
   	at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:84)
   	at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:48)
   	at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parseTableIdentifier(ParseDriver.scala:49)
   	at org.apache.spark.sql.Dataset.createTempViewCommand(Dataset.scala:3079)
   	at org.apache.spark.sql.Dataset.createOrReplaceTempView(Dataset.scala:3034)
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
   	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
   	at py4j.Gateway.invoke(Gateway.java:282)
   	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
   	at py4j.commands.CallCommand.execute(CallCommand.java:79)
   	at py4j.GatewayConnection.run(GatewayConnection.java:214)
   	at java.lang.Thread.run(Thread.java:748)
   Caused by: java.lang.UnsupportedOperationException: java.io.InvalidClassException: org.antlr.v4.runtime.atn.ATN;
Could not deserialize ATN with UUID 59627784-3be5-417a-b9eb-8131a7286089 (expected aadb8d7e-aeef-4415-ad2b-8204d6cf042e
or a legacy UUID).
   	at org.antlr.v4.runtime.atn.ATNDeserializer.deserialize(ATNDeserializer.java:153)
   	at org.apache.spark.sql.catalyst.parser.SqlBaseLexer.<clinit>(SqlBaseLexer.java:1153)
   	... 16 more
   Caused by: java.io.InvalidClassException: org.antlr.v4.runtime.atn.ATN; Could not deserialize
ATN with UUID 59627784-3be5-417a-b9eb-8131a7286089 (expected aadb8d7e-aeef-4415-ad2b-8204d6cf042e
or a legacy UUID).
   	... 18 more
   
   >>> 
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
