spark-user mailing list archives

From Patrick Alwell <palw...@hortonworks.com>
Subject Re: Spark Dataframe and HIVE
Date Sun, 11 Feb 2018 17:51:01 GMT
It's not a bug. You're missing a Parquet dependency for Hive. Try saving the sample table as a CSV and seeing if that works. If it does, you now know it's a matter of format. Let's try to keep this simple and see why Hive isn't reading the files you created.
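Something like this from the spark-shell is enough to test that (a rough sketch; the mine_csv table name is just an illustration):

// Re-save the same DataFrame as CSV and query it from Hive.
// If Hive can read mine_csv but not mine, it's a format/codec issue.
passion_df.write.format("csv").saveAsTable("default.mine_csv")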

+1 Deepak. Let's also make sure we are using the Hive DDL to point to the table's files and definition.
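For example, an external table defined explicitly over the files Spark wrote (a hedged sketch; the column list is abbreviated and the HDFS path is the one used later in this thread):

// Hypothetical DDL; run it via spark.sql or paste the statement into the Hive CLI.
spark.sql("""
  CREATE EXTERNAL TABLE mine_ext (policyid INT, statecode STRING)
  STORED AS PARQUET
  LOCATION 'hdfs://localhost:8020/user/hive/warehouse/mine'
""")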

Does this have to be in Parquet? Hive gets better performance from ORC, and there is a new Spark API for this.
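In Spark that is a one-line change (a sketch, reusing the DataFrame name from this thread):

// Write ORC instead of the default Parquet; ORC is Hive's native columnar format.
passion_df.write.format("orc").saveAsTable("default.mine_orc")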

Pat Alwell
Solutions Engineer
Hortonworks
M:(540)226-7365
palwell@hortonworks.com




On Feb 11, 2018, at 01:02, ☼ R Nair (रविशंकर नायर) <ravishankar.nair@gmail.com> wrote:

Hi,
So, is this a bug, or something I need to fix? If it's our issue, how can we fix it? Please help.

Best,

On Sun, Feb 11, 2018 at 3:49 AM, Shmuel Blitz <shmuel.blitz@similarweb.com> wrote:
Your table is missing a "PARTITIONED BY" section.

Spark 2.x saves the partition information in the TBLPROPERTIES section.
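If you do want Hive-visible partitions, one option is to partition explicitly at write time (a sketch; statecode is just an example column from your schema):

// Partition by a column so the table gets a PARTITIONED BY layout on disk,
// then run msck repair table in Hive to register the partitions.
passion_df.write.partitionBy("statecode").format("parquet").saveAsTable("default.mine_part")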


On Sun, Feb 11, 2018 at 10:41 AM, Deepak Sharma <deepakmca05@gmail.com> wrote:
I can see it's trying to read the Parquet and failing while decompressing using Snappy:
parquet.hadoop.ParquetRecordReader.nextKeyValue(ParquetRecordReader.java:201)

So the table looks good, but this needs to be fixed before you can query the data in Hive.

Thanks
Deepak
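One way to confirm the files themselves are intact (a sketch; the path is the warehouse location from this thread) is to read them back directly in Spark, which bundles snappy-java:

// If this succeeds, the Parquet/Snappy files are fine and the missing
// native Snappy library is purely a Hive classpath problem.
spark.read.parquet("hdfs://localhost:8020/user/hive/warehouse/mine").show(5)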

On Sun, Feb 11, 2018 at 1:45 PM, ☼ R Nair (रविशंकर नायर) <ravishankar.nair@gmail.com> wrote:
When I do that and then do a select, I get a screen full of errors. I think Hive is failing to read the table.

select * from mine;
OK
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:317)
at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:219)
at org.xerial.snappy.Snappy.<clinit>(Snappy.java:44)
at parquet.hadoop.codec.SnappyDecompressor.decompress(SnappyDecompressor.java:62)
at parquet.hadoop.codec.NonBlockedDecompressorStream.read(NonBlockedDecompressorStream.java:51)
at java.io.DataInputStream.readFully(DataInputStream.java:195)
at java.io.DataInputStream.readFully(DataInputStream.java:169)
at parquet.bytes.BytesInput$StreamBytesInput.toByteArray(BytesInput.java:204)
at parquet.column.impl.ColumnReaderImpl.readPageV1(ColumnReaderImpl.java:557)
at parquet.column.impl.ColumnReaderImpl.access$300(ColumnReaderImpl.java:57)
at parquet.column.impl.ColumnReaderImpl$3.visit(ColumnReaderImpl.java:516)
at parquet.column.impl.ColumnReaderImpl$3.visit(ColumnReaderImpl.java:513)
at parquet.column.page.DataPageV1.accept(DataPageV1.java:96)
at parquet.column.impl.ColumnReaderImpl.readPage(ColumnReaderImpl.java:513)
at parquet.column.impl.ColumnReaderImpl.checkRead(ColumnReaderImpl.java:505)
at parquet.column.impl.ColumnReaderImpl.consume(ColumnReaderImpl.java:607)
at parquet.column.impl.ColumnReaderImpl.<init>(ColumnReaderImpl.java:351)
at parquet.column.impl.ColumnReadStoreImpl.newMemColumnReader(ColumnReadStoreImpl.java:66)
at parquet.column.impl.ColumnReadStoreImpl.getColumnReader(ColumnReadStoreImpl.java:61)
at parquet.io.RecordReaderImplementation.<init>(RecordReaderImplementation.java:270)
at parquet.io.MessageColumnIO$1.visit(MessageColumnIO.java:134)
at parquet.io.MessageColumnIO$1.visit(MessageColumnIO.java:99)
at parquet.filter2.compat.FilterCompat$NoOpFilter.accept(FilterCompat.java:154)
at parquet.io.MessageColumnIO.getRecordReader(MessageColumnIO.java:99)
at parquet.hadoop.InternalParquetRecordReader.checkRead(InternalParquetRecordReader.java:137)
at parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:208)
at parquet.hadoop.ParquetRecordReader.nextKeyValue(ParquetRecordReader.java:201)
at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:122)
at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:85)
at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:72)
at org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit.getRecordReader(FetchOperator.java:673)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:323)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:445)
at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:140)
at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1670)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.UnsatisfiedLinkError: no snappyjava in java.library.path
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
at java.lang.Runtime.loadLibrary0(Runtime.java:870)
at java.lang.System.loadLibrary(System.java:1122)
at org.xerial.snappy.SnappyNativeLoader.loadLibrary(SnappyNativeLoader.java:52)
... 52 more
Exception in thread "main" org.xerial.snappy.SnappyError: [FAILED_TO_LOAD_NATIVE_LIBRARY]
null
at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:229)
at org.xerial.snappy.Snappy.<clinit>(Snappy.java:44)
at parquet.hadoop.codec.SnappyDecompressor.decompress(SnappyDecompressor.java:62)
at parquet.hadoop.codec.NonBlockedDecompressorStream.read(NonBlockedDecompressorStream.java:51)
at java.io.DataInputStream.readFully(DataInputStream.java:195)
at java.io.DataInputStream.readFully(DataInputStream.java:169)
at parquet.bytes.BytesInput$StreamBytesInput.toByteArray(BytesInput.java:204)
at parquet.column.impl.ColumnReaderImpl.readPageV1(ColumnReaderImpl.java:557)
at parquet.column.impl.ColumnReaderImpl.access$300(ColumnReaderImpl.java:57)
at parquet.column.impl.ColumnReaderImpl$3.visit(ColumnReaderImpl.java:516)
at parquet.column.impl.ColumnReaderImpl$3.visit(ColumnReaderImpl.java:513)
at parquet.column.page.DataPageV1.accept(DataPageV1.java:96)
at parquet.column.impl.ColumnReaderImpl.readPage(ColumnReaderImpl.java:513)
at parquet.column.impl.ColumnReaderImpl.checkRead(ColumnReaderImpl.java:505)
at parquet.column.impl.ColumnReaderImpl.consume(ColumnReaderImpl.java:607)
at parquet.column.impl.ColumnReaderImpl.<init>(ColumnReaderImpl.java:351)
at parquet.column.impl.ColumnReadStoreImpl.newMemColumnReader(ColumnReadStoreImpl.java:66)
at parquet.column.impl.ColumnReadStoreImpl.getColumnReader(ColumnReadStoreImpl.java:61)
at parquet.io.RecordReaderImplementation.<init>(RecordReaderImplementation.java:270)
at parquet.io.MessageColumnIO$1.visit(MessageColumnIO.java:134)
at parquet.io.MessageColumnIO$1.visit(MessageColumnIO.java:99)
at parquet.filter2.compat.FilterCompat$NoOpFilter.accept(FilterCompat.java:154)
at parquet.io.MessageColumnIO.getRecordReader(MessageColumnIO.java:99)
at parquet.hadoop.InternalParquetRecordReader.checkRead(InternalParquetRecordReader.java:137)
at parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:208)
at parquet.hadoop.ParquetRecordReader.nextKeyValue(ParquetRecordReader.java:201)
at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:122)
at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:85)
at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:72)
at org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit.getRecordReader(FetchOperator.java:673)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:323)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:445)
at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:140)
at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1670)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Feb 11, 2018 3:14:06 AM WARNING: parquet.hadoop.ParquetRecordReader: Can not initialize counter
due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
Feb 11, 2018 3:14:06 AM INFO: parquet.hadoop.InternalParquetRecordReader: RecordReader initialized
will read a total of 36635 records.
Feb 11, 2018 3:14:06 AM INFO: parquet.hadoop.InternalParquetRecordReader: at row 0. reading
next block
Feb 11, 2018 3:14:06 AM INFO: parquet.hadoop.InternalParquetRecordReader: block read in memory
in 27 ms. row count = 36635


On Sun, Feb 11, 2018 at 3:10 AM, Deepak Sharma <deepakmca05@gmail.com> wrote:
There was a typo:
Instead of :
alter table mine set locations "hdfs://localhost:8020/user/hive/warehouse/mine";

Use :
alter table mine set location "hdfs://localhost:8020/user/hive/warehouse/mine";

On Sun, Feb 11, 2018 at 1:38 PM, Deepak Sharma <deepakmca05@gmail.com> wrote:
Try this in hive:
alter table mine set locations "hdfs://localhost:8020/user/hive/warehouse/mine";

Thanks
Deepak

On Sun, Feb 11, 2018 at 1:24 PM, ☼ R Nair (रविशंकर नायर) <ravishankar.nair@gmail.com> wrote:
Hi,
Here you go:

hive> show create table mine;
OK
CREATE TABLE `mine`(
  `policyid` int,
  `statecode` string,
  `socialid` string,
  `county` string,
  `eq_site_limit` decimal(10,2),
  `hu_site_limit` decimal(10,2),
  `fl_site_limit` decimal(10,2),
  `fr_site_limit` decimal(10,2),
  `tiv_2014` decimal(10,2),
  `tiv_2015` decimal(10,2),
  `eq_site_deductible` int,
  `hu_site_deductible` int,
  `fl_site_deductible` int,
  `fr_site_deductible` int,
  `latitude` decimal(6,6),
  `longitude` decimal(6,6),
  `line` string,
  `construction` string,
  `point_granularity` int)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
WITH SERDEPROPERTIES (
  'path'='hdfs://localhost:8020/user/hive/warehouse/mine')
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
  'file:/Users/ravishankarnair/spark-warehouse/mine'
TBLPROPERTIES (
  'spark.sql.sources.provider'='parquet',
  'spark.sql.sources.schema.numParts'='1',
  'spark.sql.sources.schema.part.0'='{\"type\":\"struct\",\"fields\":[{\"name\":\"policyid\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"policyid\",\"scale\":0}},{\"name\":\"statecode\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"name\":\"statecode\",\"scale\":0}},{\"name\":\"Socialid\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"name\":\"Socialid\",\"scale\":0}},{\"name\":\"county\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"name\":\"county\",\"scale\":0}},{\"name\":\"eq_site_limit\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"eq_site_limit\",\"scale\":2}},{\"name\":\"hu_site_limit\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"hu_site_limit\",\"scale\":2}},{\"name\":\"fl_site_limit\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"fl_site_limit\",\"scale\":2}},{\"name\":\"fr_site_limit\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"fr_site_limit\",\"scale\":2}},{\"name\":\"tiv_2014\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"tiv_2014\",\"scale\":2}},{\"name\":\"tiv_2015\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"tiv_2015\",\"scale\":2}},{\"name\":\"eq_site_deductible\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"eq_site_deductible\",\"scale\":0}},{\"name\":\"hu_site_deductible\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"hu_site_deductible\",\"scale\":0}},{\"name\":\"fl_site_deductible\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"fl_site_deductible\",\"scale\":0}},{\"name\":\"fr_site_deductible\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"fr_site_deductible\",\"scale\":0}},{\"name\":\"latitude\",\"type\":\"decimal(6,6)\",\"nullable\":true,\"metadata\":{\"name\":\"latitude\",\"scale\":6}},{\"name\":\"longitude\",\"type\":\"decimal(6,6)\",\"nullable\":true,\"metadata\":{\"name\":\"longitude\",\"scale\":6}},{\"name\":\"line\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"name\":\"line\",\"scale\":0}},{\"name\":\"construction\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"name\":\"construction\",\"scale\":0}},{\"name\":\"point_granularity\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"point_granularity\",\"scale\":0}}]}',
  'transient_lastDdlTime'='1518335598')
Time taken: 0.13 seconds, Fetched: 35 row(s)

On Sun, Feb 11, 2018 at 2:36 AM, Shmuel Blitz <shmuel.blitz@similarweb.com> wrote:
Please run the following command, and paste the result:
SHOW CREATE TABLE <<TABLE-NAME>>

On Sun, Feb 11, 2018 at 7:56 AM, ☼ R Nair (रविशंकर नायर) <ravishankar.nair@gmail.com> wrote:
No, no luck.

Thanks

On Sun, Feb 11, 2018 at 12:48 AM, Deepak Sharma <deepakmca05@gmail.com> wrote:
In hive cli:
msck repair table <table_name>;

Thanks
Deepak

On Feb 11, 2018 11:14, "☼ R Nair (रविशंकर नायर)" <ravishankar.nair@gmail.com> wrote:
No, can you please explain the command? Let me try now.

Best,

On Sun, Feb 11, 2018 at 12:40 AM, Deepak Sharma <deepakmca05@gmail.com> wrote:
I am not sure about the exact issue, but I see you are partitioning while writing from Spark. Did you try msck repair on the table before reading it in Hive?

Thanks
Deepak

On Feb 11, 2018 11:06, "☼ R Nair (रविशंकर नायर)" <ravishankar.nair@gmail.com> wrote:
All,

Thanks for the inputs. Again, I am not successful. I think we need to resolve this, as it is a very common requirement.

Please go through my complete code:

STEP 1:  Started Spark shell as spark-shell --master yarn

STEP 2: The following code is given as input to the Spark shell

import org.apache.spark.sql.Row
import org.apache.spark.sql.SparkSession
val warehouseLocation ="/user/hive/warehouse"

val spark = SparkSession.builder().appName("Spark Hive Example").config("spark.sql.warehouse.dir",
warehouseLocation).enableHiveSupport().getOrCreate()

import org.apache.spark.sql._
var passion_df = spark.read.
format("jdbc").
option("url", "jdbc:mysql://localhost:3307/policies").
option("driver" ,"com.mysql.jdbc.Driver").
option("user", "root").
option("password", "root").
option("dbtable", "insurancedetails").
option("partitionColumn", "policyid").
option("lowerBound", "1").
option("upperBound", "100000").
option("numPartitions", "4").
load()
//Made sure that passion_df is created, as passion_df.show(5) shows me correct data.
passion_df.write.saveAsTable("default.mine") //Default parquet
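//Note (a hedged aside): in spark-shell a SparkSession named spark already exists,
//so builder().config(...).getOrCreate() may return that existing session and
//silently ignore the warehouse setting. A quick check:

// Print the warehouse directory the running session actually uses. If it
// still reports file:/.../spark-warehouse, the .config(...) above was
// ignored; pass --conf spark.sql.warehouse.dir=... when launching instead.
println(spark.conf.get("spark.sql.warehouse.dir"))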

STEP 3: Went to HIVE. Started HIVE prompt.

hive> show tables;
OK
callcentervoicelogs
mine
Time taken: 0.035 seconds, Fetched: 2 row(s)
//As you can see, HIVE is showing the table "mine" in the default schema.

STEP 4: HERE IS THE PROBLEM.

hive> select * from mine;
OK
Time taken: 0.354 seconds
hive>
//Where is the data ???

STEP 5:

See the below command on HIVE

describe formatted mine;
OK
# col_name             data_type           comment

policyid             int
statecode           string
socialid             string
county               string
eq_site_limit       decimal(10,2)
hu_site_limit       decimal(10,2)
fl_site_limit       decimal(10,2)
fr_site_limit       decimal(10,2)
tiv_2014             decimal(10,2)
tiv_2015             decimal(10,2)
eq_site_deductible   int
hu_site_deductible   int
fl_site_deductible   int
fr_site_deductible   int
latitude             decimal(6,6)
longitude           decimal(6,6)
line                 string
construction         string
point_granularity   int

# Detailed Table Information
Database:           default
Owner:               ravishankarnair
CreateTime:         Sun Feb 11 00:26:40 EST 2018
LastAccessTime:     UNKNOWN
Protect Mode:       None
Retention:           0
Location:           file:/Users/ravishankarnair/spark-warehouse/mine
Table Type:         MANAGED_TABLE
Table Parameters:
spark.sql.sources.provider parquet
spark.sql.sources.schema.numParts 1
spark.sql.sources.schema.part.0 {\"type\":\"struct\",\"fields\":[{\"name\":\"policyid\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"policyid\",\"scale\":0}},{\"name\":\"statecode\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"name\":\"statecode\",\"scale\":0}},{\"name\":\"Socialid\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"name\":\"Socialid\",\"scale\":0}},{\"name\":\"county\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"name\":\"county\",\"scale\":0}},{\"name\":\"eq_site_limit\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"eq_site_limit\",\"scale\":2}},{\"name\":\"hu_site_limit\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"hu_site_limit\",\"scale\":2}},{\"name\":\"fl_site_limit\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"fl_site_limit\",\"scale\":2}},{\"name\":\"fr_site_limit\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"fr_site_limit\",\"scale\":2}},{\"name\":\"tiv_2014\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"tiv_2014\",\"scale\":2}},{\"name\":\"tiv_2015\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"tiv_2015\",\"scale\":2}},{\"name\":\"eq_site_deductible\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"eq_site_deductible\",\"scale\":0}},{\"name\":\"hu_site_deductible\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"hu_site_deductible\",\"scale\":0}},{\"name\":\"fl_site_deductible\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"fl_site_deductible\",\"scale\":0}},{\"name\":\"fr_site_deductible\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"fr_site_deductible\",\"scale\":0}},{\"name\":\"latitude\",\"type\":\"decimal(6,6)\",\"nullable\":true,\"metadata\":{\"name\":\"latitude\",\"scale\":6}},{\"name\":\"longitude\",\"type\":\"decimal(6,6)\",\"nullable\":true,\"metadata\":{\"name\":\"longitude\",\"scale\":6}},{\"name\":\"line\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"name\":\"line\",\"scale\":0}},{\"name\":\"construction\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"name\":\"construction\",\"scale\":0}},{\"name\":\"point_granularity\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"point_granularity\",\"scale\":0}}]}
transient_lastDdlTime 1518326800

# Storage Information
SerDe Library:       org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
InputFormat:         org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
OutputFormat:       org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
Compressed:         No
Num Buckets:         -1
Bucket Columns:     []
Sort Columns:       []
Storage Desc Params:
path                 hdfs://localhost:8020/user/hive/warehouse/mine
serialization.format 1
Time taken: 0.077 seconds, Fetched: 48 row(s)

Now I seek your advice and support. What's the issue? Am I doing something wrong, or is it a bug? I am using Spark 2.2.1, HIVE 1.2.1, HADOOP 2.7.3. All classpath and configuration are set properly.

Best,

Ravion

On Fri, Feb 9, 2018 at 1:29 PM, Nicholas Hakobian <nicholas.hakobian@rallyhealth.com> wrote:
It's possible that the format of your table is not compatible with your version of Hive, so Spark saved it in a way such that only Spark can read it. When this happens, it prints out a very visible warning letting you know.

We've seen it most frequently when trying to save a Parquet file with a column in date format into a Hive table. In older versions of Hive, the Parquet reader/writer did not support date formats (among a couple of others).
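For example, if one of the columns had been a date, a workaround on older Hive versions would be to cast it before saving (a sketch with a hypothetical column name):

import org.apache.spark.sql.functions.col
// "created" is a hypothetical DATE column; casting it to TIMESTAMP (or STRING)
// keeps the written Parquet readable by older Hive versions.
val compatDf = passion_df.withColumn("created", col("created").cast("timestamp"))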

Nicholas Szandor Hakobian, Ph.D.
Staff Data Scientist
Rally Health
nicholas.hakobian@rallyhealth.com


On Fri, Feb 9, 2018 at 9:59 AM, Prakash Joshi <prakashcjoshi5@gmail.com> wrote:
Ravi,

Can you send the result of
Show create table your_table_name

Thanks
Prakash

On Feb 9, 2018 8:20 PM, "☼ R Nair (रविशंकर नायर)" <ravishankar.nair@gmail.com> wrote:
All,

I have been on this issue continuously for three days. Not getting any clue.

Environment: Spark 2.2.x, all configurations are correct. hive-site.xml is in spark's conf.

1) I created a data frame DF1 by reading a CSV file.

2) Did manipulations on DF1. The resulting frame is passion_df.

3) passion_df.write.format("orc").saveAsTable("sampledb.passion")

4) The metastore shows the Hive table; when I do "show tables" in HIVE, I can see the table name.

5) I can't select in HIVE, though I can select from SPARK with spark.sql("select * from sampledb.passion")

What's going on here? Please help. Why am I not seeing data from the HIVE prompt?
The "describe formatted" command on the table in HIVE shows the data is in the default warehouse location (/user/hive/warehouse), since I set it.
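For reference, the location Spark itself records for the table can be checked from the same spark-shell (a sketch):

// The "Location" row of DESCRIBE FORMATTED shows where the data actually
// landed; compare it with what the Hive CLI reports for the same table.
spark.sql("DESCRIBE FORMATTED sampledb.passion").filter("col_name = 'Location'").show(false)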

I am not getting any definite answer anywhere. Many suggestions and answers are given on Stack Overflow et al. Nothing really works.

So asking experts here for some light on this, thanks

Best,
Ravion















--
Shmuel Blitz
Big Data Developer
Email: shmuel.blitz@similarweb.com
www.similarweb.com






--
Thanks
Deepak
www.bigdatabig.com
www.keosha.net



--
Thanks
Deepak
www.bigdatabig.com
www.keosha.net






--
Thanks
Deepak
www.bigdatabig.com
www.keosha.net



--
Shmuel Blitz
Big Data Developer
Email: shmuel.blitz@similarweb.com
www.similarweb.com


