phoenix-dev mailing list archives

From "Aritra Nayak (Jira)" <j...@apache.org>
Subject [jira] [Updated] (PHOENIX-5361) FileNotFoundException found when schema is in lowercase
Date Fri, 06 Dec 2019 03:57:00 GMT

     [ https://issues.apache.org/jira/browse/PHOENIX-5361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Aritra Nayak updated PHOENIX-5361:
----------------------------------
    Description: 
The table name (DUMMY_DATA) is in uppercase, but the schema name (s01) is in lowercase.

 

Steps to reproduce:

 1. Create the Phoenix table:
{code:sql}
CREATE TABLE IF NOT EXISTS "s01"."DUMMY_DATA"("id" BIGINT PRIMARY KEY, "firstName" VARCHAR, "lastName" VARCHAR);
{code}

 2. Upload the CSV file to your preferred HDFS location:
{code}
/data/s01/DUMMY_DATA/1.csv
{code}

 3. Run the hadoop jar command to bulk load:
{code:bash}
hadoop jar /opt/phoenix/phoenix4.13-cdh5.9.2-marin-1.5.1/phoenix4.13-cdh5.9.2-marin-1.5.1-client.jar \
    org.apache.phoenix.mapreduce.CsvBulkLoadTool --s \"\"s01\"\" --t DUMMY_DATA \
    --input /data/s01/DUMMY_DATA/1.csv --zookeeper zk-journalnode-lv-101:2181
{code}
The following error is thrown:
{code:java}
Exception in thread "main" java.io.FileNotFoundException: Bulkload dir /tmp/94ea4875-3453-4ed6-823d-3544ff05fd56/s01.DUMMY_DATA not found
    at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.visitBulkHFiles(LoadIncrementalHFiles.java:194)
    at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.discoverLoadQueue(LoadIncrementalHFiles.java:289)
    at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:393)
    at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:339)
    at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.completebulkload(AbstractBulkLoadTool.java:355)
    at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.submitJob(AbstractBulkLoadTool.java:332)
    at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.loadData(AbstractBulkLoadTool.java:270)
    at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.run(AbstractBulkLoadTool.java:183)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at org.apache.phoenix.mapreduce.CsvBulkLoadTool.main(CsvBulkLoadTool.java:109)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{code}
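The failure pattern is consistent with identifier normalization: Phoenix treats unquoted identifiers as case-insensitive and upper-cases them, while double-quoted identifiers keep their exact case. The sketch below is a simplified stand-in for that rule (it is an illustration, not Phoenix's actual SchemaUtil code), showing why a lowercase schema name survives only when it reaches Phoenix still wrapped in double quotes:

```java
public class IdentifierNormalization {
    // Simplified sketch of Phoenix-style identifier normalization:
    // unquoted identifiers are case-insensitive and normalized to
    // upper case; double-quoted identifiers keep their exact case.
    static String normalize(String identifier) {
        if (identifier.length() >= 2
                && identifier.startsWith("\"")
                && identifier.endsWith("\"")) {
            // Quoted: strip the quotes, preserve the original case.
            return identifier.substring(1, identifier.length() - 1);
        }
        // Unquoted: normalize to upper case.
        return identifier.toUpperCase();
    }

    public static void main(String[] args) {
        System.out.println(normalize("s01"));        // unquoted -> S01
        System.out.println(normalize("\"s01\""));    // quoted   -> s01
        System.out.println(normalize("DUMMY_DATA")); // unquoted -> DUMMY_DATA
    }
}
```

If the shell strips or mangles the quoting around the `--s` argument, the two phases of the bulk load can end up with differently-cased schema names.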

  The MapReduce job reads 1,000,000 records but does not write any:

 
{code:java}
19/06/18 20:06:24 INFO mapreduce.Job: Counters: 50
    File System Counters
        FILE: Number of bytes read=20
        FILE: Number of bytes written=315801
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=41666811
        HDFS: Number of bytes written=0
        HDFS: Number of read operations=4
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=0
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=39894
        Total time spent by all reduces in occupied slots (ms)=56216
        Total time spent by all map tasks (ms)=19947
        Total time spent by all reduce tasks (ms)=14054
        Total vcore-seconds taken by all map tasks=19947
        Total vcore-seconds taken by all reduce tasks=14054
        Total megabyte-seconds taken by all map tasks=40851456
        Total megabyte-seconds taken by all reduce tasks=57565184
    Map-Reduce Framework
        Map input records=1000000
        Map output records=0   <----- see here
        Map output bytes=0
        Map output materialized bytes=16
        Input split bytes=123
        Combine input records=0
        Combine output records=0
        Reduce input groups=0
        Reduce shuffle bytes=16
        Reduce input records=0
        Reduce output records=0
        Spilled Records=0
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=914
        CPU time spent (ms)=49240
        Physical memory (bytes) snapshot=2022809600
        Virtual memory (bytes) snapshot=8064647168
        Total committed heap usage (bytes)=3589275648
    Phoenix MapReduce Import
        Upserts Done=1000000
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=41666688
    File Output Format Counters
        Bytes Written=0
{code}
   {color:#14892c}When the same steps (1-3) are followed with schema name S01, they pass and the data is successfully uploaded into the table.{color}
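For illustration, the mismatch can be reduced to the directory name the two phases compute. The helper below is hypothetical (the real path construction lives in the Phoenix/HBase internals), but it shows how a `schema.table` subdirectory built from the as-typed lowercase schema can never match one built from the upper-cased schema:

```java
public class BulkloadDirMismatch {
    // Hypothetical helper mirroring how the bulkload output subdirectory
    // under /tmp/<job-uuid>/ appears to be named (schema.table); this is
    // an illustration, not Phoenix's actual internal API.
    static String bulkloadSubdir(String schema, String table) {
        return schema + "." + table;
    }

    public static void main(String[] args) {
        // The directory the load step reports as missing (from the trace):
        String searched = bulkloadSubdir("s01", "DUMMY_DATA");
        // The directory that would match if the schema were upper-cased:
        String normalized = bulkloadSubdir("s01".toUpperCase(), "DUMMY_DATA");
        System.out.println(searched);                    // s01.DUMMY_DATA
        System.out.println(normalized);                  // S01.DUMMY_DATA
        System.out.println(searched.equals(normalized)); // false
    }
}
```

This is consistent with the observation that the identical run with schema S01 succeeds: upper-casing makes both phases compute the same name.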


> FileNotFoundException found when schema is in lowercase
> -------------------------------------------------------
>
>                 Key: PHOENIX-5361
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5361
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.13.0
>         Environment: *Hadoop*: 2.6.0-cdh5.9.2
> *Phoenix*: 4.13
> *HBase*: 1.2.0-cdh5.9.2
> *Java*: 8
>            Reporter: Aritra Nayak
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
