impala-user mailing list archives

From Alex Behm <alex.b...@cloudera.com>
Subject Re: Impala Failed to read file from HDFS
Date Mon, 13 Mar 2017 18:25:00 GMT
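This might be an issue with the 'fs.defaultFS' setting in core-site.xml. The
'file:/' table location suggests the default filesystem is resolving to the
local filesystem; 'fs.defaultFS' should point to your NameNode.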
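
For illustration only, a core-site.xml entry along the following lines is what is usually meant. The host name `namenode-host` and port 8020 are placeholders (assumptions, not values from this thread); substitute the address of your own NameNode. When the property is unset, Hadoop falls back to its default of `file:///`, which would be consistent with the 'file:/' locations seen in this thread.

```xml
<!-- core-site.xml (sketch): host and port below are hypothetical placeholders -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode-host:8020</value>
  </property>
</configuration>
```

The same file presumably needs to be visible to catalogd, impalad, and the Hive metastore, followed by a restart. Databases and tables already created with a 'file:/' location would likely keep that location until they are recreated or altered.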

On Sun, Mar 12, 2017 at 7:38 PM, 俊杰陈 <cjjnjust@gmail.com> wrote:

> The issue might be due to the original parquet_data schema having been
> created against a local path. But I tried again, creating a new schema
> without specifying the LOCATION parameter, and "desc database parquet_data"
> shows that it is stored at an HDFS location. I'm not sure how I created a
> database stored on the local file system.
>
> 2017-03-13 9:42 GMT+08:00 俊杰陈 <cjjnjust@gmail.com>:
>
>> Hi
>> Please see the following:
>> [bdpe30-cjj:21000] > create table test2 like parquet 'hdfs:///data/2.parquet' stored as parquet;
>> Query: create table test2 like parquet 'hdfs:///data/2.parquet' stored as parquet
>>
>> Fetched 0 row(s) in 0.21s
>> [bdpe30-cjj:21000] > show create table test2;
>> Query: show create table test2
>> +------------------------------------------------------------+
>> | result                                                     |
>> +------------------------------------------------------------+
>> | CREATE TABLE parquet_data.test2 (                          |
>> |   a STRING COMMENT 'Inferred from Parquet file.',          |
>> |   b STRING COMMENT 'Inferred from Parquet file.',          |
>> |   c BIGINT COMMENT 'Inferred from Parquet file.',          |
>> |   d INT COMMENT 'Inferred from Parquet file.',             |
>> |   e INT COMMENT 'Inferred from Parquet file.',             |
>> |   f BIGINT COMMENT 'Inferred from Parquet file.',          |
>> |   g INT COMMENT 'Inferred from Parquet file.',             |
>> |   aa STRING COMMENT 'Inferred from Parquet file.',         |
>> |   bb BIGINT COMMENT 'Inferred from Parquet file.',         |
>> |   cc INT COMMENT 'Inferred from Parquet file.',            |
>> |   dd STRING COMMENT 'Inferred from Parquet file.',         |
>> |   ee BIGINT COMMENT 'Inferred from Parquet file.',         |
>> |   ff BIGINT COMMENT 'Inferred from Parquet file.',         |
>> |   gg BIGINT COMMENT 'Inferred from Parquet file.',         |
>> |   h BIGINT COMMENT 'Inferred from Parquet file.',          |
>> |   i STRING COMMENT 'Inferred from Parquet file.',          |
>> |   j STRING COMMENT 'Inferred from Parquet file.',          |
>> |   k INT COMMENT 'Inferred from Parquet file.',             |
>> |   l STRING COMMENT 'Inferred from Parquet file.',          |
>> |   m STRING COMMENT 'Inferred from Parquet file.',          |
>> |   n STRING COMMENT 'Inferred from Parquet file.',          |
>> |   hh STRING COMMENT 'Inferred from Parquet file.',         |
>> |   ii STRING COMMENT 'Inferred from Parquet file.',         |
>> |   jj INT COMMENT 'Inferred from Parquet file.',            |
>> |   kk INT COMMENT 'Inferred from Parquet file.',            |
>> |   ll BIGINT COMMENT 'Inferred from Parquet file.',         |
>> |   mm INT COMMENT 'Inferred from Parquet file.',            |
>> |   nn BIGINT COMMENT 'Inferred from Parquet file.',         |
>> |   o STRING COMMENT 'Inferred from Parquet file.',          |
>> |   p BIGINT COMMENT 'Inferred from Parquet file.',          |
>> |   q BIGINT COMMENT 'Inferred from Parquet file.',          |
>> |   r BIGINT COMMENT 'Inferred from Parquet file.',          |
>> |   s INT COMMENT 'Inferred from Parquet file.',             |
>> |   t INT COMMENT 'Inferred from Parquet file.',             |
>> |   u INT COMMENT 'Inferred from Parquet file.',             |
>> |   v INT COMMENT 'Inferred from Parquet file.',             |
>> |   w INT COMMENT 'Inferred from Parquet file.',             |
>> |   oo INT COMMENT 'Inferred from Parquet file.',            |
>> |   pp BIGINT COMMENT 'Inferred from Parquet file.',         |
>> |   qq INT COMMENT 'Inferred from Parquet file.',            |
>> |   rr BIGINT COMMENT 'Inferred from Parquet file.',         |
>> |   ss INT COMMENT 'Inferred from Parquet file.',            |
>> |   tt BIGINT COMMENT 'Inferred from Parquet file.',         |
>> |   u1 INT COMMENT 'Inferred from Parquet file.',            |
>> |   v1 INT COMMENT 'Inferred from Parquet file.',            |
>> |   w1 INT COMMENT 'Inferred from Parquet file.',            |
>> |   x STRING COMMENT 'Inferred from Parquet file.',          |
>> |   y INT COMMENT 'Inferred from Parquet file.',             |
>> |   z INT COMMENT 'Inferred from Parquet file.',             |
>> |   uu STRING COMMENT 'Inferred from Parquet file.',         |
>> |   vv STRING COMMENT 'Inferred from Parquet file.',         |
>> |   ww INT COMMENT 'Inferred from Parquet file.',            |
>> |   xx INT COMMENT 'Inferred from Parquet file.',            |
>> |   yy BIGINT COMMENT 'Inferred from Parquet file.',         |
>> |   zz BIGINT COMMENT 'Inferred from Parquet file.',         |
>> |   aaa STRING COMMENT 'Inferred from Parquet file.',        |
>> |   bbb STRING COMMENT 'Inferred from Parquet file.',        |
>> |   ccc STRING COMMENT 'Inferred from Parquet file.'         |
>> | )                                                          |
>> | STORED AS PARQUET                                          |
>> | LOCATION 'file:/user/hive/warehouse/parquet_data.db/test2' |
>> |                                                            |
>> +------------------------------------------------------------+
>> Fetched 1 row(s) in 3.22s
>> [bdpe30-cjj:21000] > select * from test2 limit 5;
>> Query: select * from test2 limit 5
>> Query submitted at: 2017-03-13 02:05:40 (Coordinator:
>> http://bdpe30-cjj:25000)
>> Query progress can be monitored at:
>> http://bdpe30-cjj:25000/query_plan?query_id=e444cb64be71c69:96546e2600000000
>>
>> Fetched 0 row(s) in 0.28s
>>
>>
>> 2017-03-10 17:12 GMT+08:00 Jeszy <jeszyb@gmail.com>:
>>
>>> The above looks like you accidentally created the table in a different
>>> database - can you repro the 'file:/' error and paste 'show create
>>> table' of that table?
>>>
>>> On Fri, Mar 10, 2017 at 8:25 AM, 俊杰陈 <cjjnjust@gmail.com> wrote:
>>> > Hi
>>> > Please see the following output. On node bdpe822n2 it worked well. I
>>> > don't know why it looks weird today.
>>> >
>>> > [bdpe822n2:21000] > create table test like parquet 'hdfs:///data/1.parquet' stored as parquet;
>>> > Query: create table test like parquet 'hdfs:///data/1.parquet' stored as parquet
>>> >
>>> > Fetched 0 row(s) in 0.14s
>>> > [bdpe822n2:21000] > load data inpath "hdfs:///data/1.parquet" into table test;
>>> > Query: load data inpath "hdfs:///data/1.parquet" into table test
>>> > +----------------------------------------------------------+
>>> > | summary                                                  |
>>> > +----------------------------------------------------------+
>>> > | Loaded 1 file(s). Total files in destination location: 2 |
>>> > +----------------------------------------------------------+
>>> > Fetched 1 row(s) in 3.39s
>>> > [bdpe822n2:21000] > refresh test;
>>> > Query: refresh test
>>> > Query submitted at: 2017-03-10 14:46:54 (Coordinator:
>>> > http://bdpe822n2:25000)
>>> > Query progress can be monitored at:
>>> > http://bdpe822n2:25000/query_plan?query_id=4d4ad8038a0362d3:8c7b326a00000000
>>> >
>>> > Fetched 0 row(s) in 0.09s
>>> > [bdpe822n2:21000] > show create table parquet_data.test;
>>> > Query: show create table parquet_data.test
>>> > ERROR: AnalysisException: Table does not exist: parquet_data.test
>>> >
>>> > [bdpe822n2:21000] > use parquet_data;
>>> > Query: use parquet_data
>>> > [bdpe822n2:21000] > show tables;
>>> > Query: show tables
>>> >
>>> > Fetched 0 row(s) in 0.02s
>>> >
>>> >
>>> >
>>> >
>>> > 2017-03-10 15:13 GMT+08:00 Sailesh Mukil <sailesh@cloudera.com>:
>>> >>
>>> >> Hi,
>>> >>
>>> >> Can you run 'show create table parquet_data.test;' and paste the
>>> >> output?
>>> >>
>>> >> On Thu, Mar 9, 2017 at 11:09 PM, 俊杰陈 <cjjnjust@gmail.com> wrote:
>>> >>>
>>> >>> Plus:
>>> >>>
>>> >>> In my root directory I found
>>> >>> user/hive/warehouse/parquet_data.db/test/2.parquet. So it seems impalad
>>> >>> is operating on the local file system. How do I configure this?
>>> >>>
>>> >>> 2017-03-10 15:03 GMT+08:00 俊杰陈 <cjjnjust@gmail.com>:
>>> >>>>
>>> >>>> Thanks for the quick reply :)
>>> >>>>
>>> >>>> 1.parquet has always been in HDFS. I also ran the following commands for
>>> >>>> your reference; please note the URI, which starts with file:. It looks
>>> >>>> weird.
>>> >>>>
>>> >>>> [bdpe30-cjj:21000] > use parquet_data;
>>> >>>> Query: use parquet_data
>>> >>>> [bdpe30-cjj:21000] > load data inpath "hdfs:///data/2.parquet" into table test;
>>> >>>> Query: load data inpath "hdfs:///data/2.parquet" into table test
>>> >>>> +----------------------------------------------------------+
>>> >>>> | summary                                                  |
>>> >>>> +----------------------------------------------------------+
>>> >>>> | Loaded 1 file(s). Total files in destination location: 2 |
>>> >>>> +----------------------------------------------------------+
>>> >>>> Fetched 1 row(s) in 0.50s
>>> >>>> [bdpe30-cjj:21000] > select count(*) from test;
>>> >>>> Query: select count(*) from test
>>> >>>> Query submitted at: 2017-03-10 07:14:45 (Coordinator:
>>> >>>> http://bdpe30-cjj:25000)
>>> >>>> Query progress can be monitored at:
>>> >>>> http://bdpe30-cjj:25000/query_plan?query_id=5d4ecce7d21182cc:e2dd7f5700000000
>>> >>>> WARNINGS:
>>> >>>> Failed to open HDFS file
>>> >>>> file:/user/hive/warehouse/parquet_data.db/test/1.parquet
>>> >>>> Error(2): No such file or directory
>>> >>>>
>>> >>>>
>>> >>>> It seems the load operation read the data from HDFS but did not put it
>>> >>>> into the right place for the query. Also, impalad seems to access the
>>> >>>> file on the local file system.
>>> >>>>
>>> >>>>
>>> >>>> 2017-03-10 14:48 GMT+08:00 Jeszy <jeszyb@gmail.com>:
>>> >>>>>
>>> >>>>> Hello,
>>> >>>>>
>>> >>>>> Sounds like Impala expected 1.parquet to be in the folder, but it
>>> >>>>> wasn't.
>>> >>>>> You probably forgot to do 'refresh <table>' after altering data from
>>> >>>>> the outside.
>>> >>>>>
>>> >>>>> HTH
>>> >>>>>
>>> >>>>> On Fri, Mar 10, 2017 at 7:30 AM, 俊杰陈 <cjjnjust@gmail.com> wrote:
>>> >>>>> > Hi,
>>> >>>>> > I'm using the latest Impala built from GitHub, and set up an Impala
>>> >>>>> > cluster with 2 nodes as below:
>>> >>>>> > node-1: statestored, catalogd, namenode, datanode.
>>> >>>>> > node-2: impalad, datanode.
>>> >>>>> >
>>> >>>>> > Then I created a database and table, and loaded data from an external
>>> >>>>> > Parquet file into the table. Everything was OK, but when I executed a
>>> >>>>> > query it failed with the following message:
>>> >>>>> >
>>> >>>>> > Failed to open HDFS file
>>> >>>>> > file:/user/hive/warehouse/parquet_data.db/test/1.parquet
>>> >>>>> > Error(2): No such file or directory
>>> >>>>> >
>>> >>>>> > But I can still ‘desc test’. Has anyone run into this? Thanks in
>>> >>>>> > advance.
>>> >>>>> >
>>> >>>>> >
>>> >>>>> >
>>> >>>>> > --
>>> >>>>> > Thanks & Best Regards
