hive-user mailing list archives

From Roshan Pradeep <codeva...@gmail.com>
Subject Re: Incremental import from PostgreSQL to Hive having issues
Date Fri, 13 Apr 2012 12:42:00 GMT
Hadoop - 0.20.2
Hive - 0.8.1

Thanks.

On Fri, Apr 13, 2012 at 5:03 PM, Nitin Pawar <nitinpawar432@gmail.com> wrote:

> Can you tell us which versions you are using?
> 1) Hive version
> 2) Hadoop version
>
> On Fri, Apr 13, 2012 at 12:23 PM, Roshan Pradeep <codevally@gmail.com> wrote:
>
>> Hi
>>
>> I want to import only the updated rows from my source database
>> (PostgreSQL) into Hive, based on a column (lastmodifiedtime) in PostgreSQL.
>>
>> *The command I am using*
>>
>> /app/sqoop/bin/sqoop import --hive-table users --connect
>> jdbc:postgresql://<server_url>/<database> --table users --username XXXXXXX
>> --password YYYYYY --hive-home /app/hive --hive-import --incremental
>> lastmodified --check-column lastmodifiedtime
>>
>> *With the above command, I get the following error*
>>
>> 12/04/13 16:31:21 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/11ce8600a5656ed49e631a260c387692/users.jar
>> 12/04/13 16:31:21 INFO tool.ImportTool: Incremental import based on column "lastmodifiedtime"
>> 12/04/13 16:31:21 INFO tool.ImportTool: Upper bound value: '2012-04-13 16:31:21.865429'
>> 12/04/13 16:31:21 WARN manager.PostgresqlManager: It looks like you are importing from postgresql.
>> 12/04/13 16:31:21 WARN manager.PostgresqlManager: This transfer can be faster! Use the --direct
>> 12/04/13 16:31:21 WARN manager.PostgresqlManager: option to exercise a postgresql-specific fast path.
>> 12/04/13 16:31:21 INFO mapreduce.ImportJobBase: Beginning import of users
>> 12/04/13 16:31:23 ERROR tool.ImportTool: Encountered IOException running import job: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory users already exists
>>         at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:123)
>>         at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:770)
>>         at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
>>         at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
>>         at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:141)
>>         at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:201)
>>         at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:413)
>>         at org.apache.sqoop.manager.PostgresqlManager.importTable(PostgresqlManager.java:102)
>>         at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:380)
>>         at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:453)
>>         at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>         at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
>>         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
>>         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
>>         at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
>>         at com.cloudera.sqoop.Sqoop.main(Sqoop.java:57)
>>
>> According to the log, Sqoop identifies the updated data in PostgreSQL, but
>> then reports that the output directory already exists. Could someone
>> please help me correct this issue?
>>
>> Thanks.
>>
>
>
>
> --
> Nitin Pawar
>
>
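
A possible workaround for the FileAlreadyExistsException quoted above, offered as a sketch rather than a confirmed fix: unless --target-dir is given, Sqoop writes the import to an HDFS directory named after the table (here "users", relative to the HDFS home directory) before loading it into Hive, so a directory left behind by an earlier run would trigger this error. That cause is an assumption; nothing in this thread confirms it. The STAGING_DIR and DRY_RUN variables and the run helper below are illustrative names, not part of Sqoop or Hadoop. By default the script only prints the command it would execute.

```shell
#!/bin/sh
# Sketch only: assumes the leftover HDFS directory is the relative path
# "users" named in the error message. STAGING_DIR, DRY_RUN and run()
# are hypothetical helpers for this example.
STAGING_DIR="${STAGING_DIR:-users}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the command, 0 = really execute it

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hadoop 0.20.x spells recursive delete as "-rmr".
# Inspect the directory first (hadoop fs -ls) if its contents might matter.
run hadoop fs -rmr "$STAGING_DIR"
```

After checking the printed command, running the script with DRY_RUN=0 performs the delete, and the original sqoop import can then be retried. An alternative that avoids deleting anything is to pass --target-dir with a path that does not yet exist.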
