spark-issues mailing list archives

From "ABHISHEK KUMAR GUPTA (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SPARK-24176) The hdfs file path with wildcard can not be identified when loading data
Date Fri, 04 May 2018 05:23:00 GMT
ABHISHEK KUMAR GUPTA created SPARK-24176:
--------------------------------------------

             Summary: The hdfs file path with wildcard can not be identified when loading data
                 Key: SPARK-24176
                 URL: https://issues.apache.org/jira/browse/SPARK-24176
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.3.0
         Environment: OS: SUSE11

Spark Version:2.3
            Reporter: ABHISHEK KUMAR GUPTA


Steps to reproduce:

1. Launch spark-sql.
2. Create the table:
   create table wild1 (time timestamp, name string, isright boolean, datetoday date, num binary,
   height double, score float, decimaler decimal(10,0), id tinyint, age int, license bigint,
   length smallint) row format delimited fields terminated by ',' stored as textfile;
3. Load data into the table using the wildcard paths below. The results are not consistent:
   some patterns succeed and others fail.
   load data inpath '/user/testdemo1/user1/?ype* ' into table wild1; - Success
   load data inpath '/user/testdemo1/user1/t??eddata60.txt' into table wild1; - *Failed*
   load data inpath '/user/testdemo1/user1/?ypeddata60.txt' into table wild1; - Success

The exception is shown below:

> load data inpath '/user/testdemo1/user1/t??eddata61.txt' into table wild1;
2018-05-04 13:16:25 INFO HiveMetaStore:746 - 0: get_database: one
2018-05-04 13:16:25 INFO audit:371 - ugi=spark/hadoop@HADOOP.COM ip=unknown-ip-addr cmd=get_database: one
2018-05-04 13:16:25 INFO HiveMetaStore:746 - 0: get_table : db=one tbl=wild1
2018-05-04 13:16:25 INFO audit:371 - ugi=spark/hadoop@HADOOP.COM ip=unknown-ip-addr cmd=get_table : db=one tbl=wild1
2018-05-04 13:16:25 INFO HiveMetaStore:746 - 0: get_table : db=one tbl=wild1
2018-05-04 13:16:25 INFO audit:371 - ugi=spark/hadoop@HADOOP.COM ip=unknown-ip-addr cmd=get_table : db=one tbl=wild1
*Error in query: LOAD DATA input path does not exist: /user/testdemo1/user1/t??eddata61.txt;*
spark-sql>

The behavior is inconsistent. Wildcard handling in LOAD DATA paths needs to be fixed so that
all combinations of wildcard characters are resolved the same way.
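
For reference, the expected glob semantics can be sketched as follows: '?' matches exactly
one character and '*' matches any run of characters. This is only an illustration using
Python's fnmatch module, not Spark's or Hadoop's actual path resolution. The file name
typeddata60.txt is an assumption inferred from the patterns in the report, and the trailing
space in the first pattern is dropped here.

```python
from fnmatch import fnmatchcase

# Hypothetical file name, inferred from the patterns in the report (an assumption).
FILE_NAME = "typeddata60.txt"

# Patterns taken from the LOAD DATA statements in the report.
PATTERNS = ["?ype*", "t??eddata60.txt", "?ypeddata60.txt"]


def matches(pattern: str, name: str = FILE_NAME) -> bool:
    """Return True if name matches the glob pattern (case-sensitive)."""
    return fnmatchcase(name, pattern)


if __name__ == "__main__":
    for pattern in PATTERNS:
        print(f"{pattern!r} matches {FILE_NAME!r}: {matches(pattern)}")
```

Under these semantics all three patterns match the assumed file name, so a consistent
implementation would be expected to treat the three LOAD DATA statements alike.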



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

