hive-user mailing list archives

From Abhishek Parolkar <abhis...@viki.com>
Subject Re: Postgres JDBC + dboutput UDF to export from Hive to remote Postgres
Date Fri, 30 Mar 2012 08:03:11 GMT
Forgot to add the screenshot in the last email :)


On Fri, Mar 30, 2012 at 4:02 PM, Abhishek Parolkar <abhishek@viki.com> wrote:

> I even tried Sqoop, but with no luck. It complains about the connection
> manager even though my MySQL connector jar is in the lib folder of Sqoop's
> installation dir.
>
> Any help?
>
> If Sqoop's purpose is to allow import/export from an RDBMS, why aren't the
> basic MySQL/PG connectors bundled with it?
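>
> (From the Sqoop docs, the connection manager can be forced down the
> generic JDBC path with the --driver flag; a sketch, with the connect
> string, table, and export dir as placeholders:
>
>   sqoop export \
>     --connect jdbc:postgresql://localhost:5432/test \
>     --driver org.postgresql.Driver \
>     --table test_tbl \
>     --export-dir /user/hive/warehouse/some_hive_table
>
> Passing --driver makes Sqoop fall back to its GenericJdbcManager instead
> of probing for a vendor-specific manager.)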
>
> -v_abhi_v
>
>
> On Fri, Mar 30, 2012 at 9:50 AM, Abhishek Parolkar <abhishek@viki.com> wrote:
>
>> I am definitely getting the no-driver error:
>> http://screencast.com/t/OipV14n9FgF
>> So it's not even at the point of executing statements; my return value
>> from the UDF is 2.
>>
>> I can confirm that the Postgres driver jar is added to my
>> HADOOP_CLASSPATH. Also, to get this working, I am running a single-node
>> local cluster.
>>
>> -v_abhi_v
>>
>>
>> On Thu, Mar 29, 2012 at 10:11 PM, Edward Capriolo <edlinuxguru@gmail.com> wrote:
>>
>>> You have to look at the code to see what the return numbers mean for
>>> the UDF. In some cases a non-zero return is normal. For example, Hive map
>>> tasks use speculative execution, so the same insert can happen twice and
>>> violate a primary key. The second insert "fails" and produces a non-zero
>>> return, but in reality all that means is the row was already inserted.
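>>>
>>> If duplicate inserts from speculative attempts are the problem, one
>>> option is to switch speculative execution off before running the insert
>>> query (a sketch, using the standard Hadoop 1.x property names):
>>>
>>>   set mapred.map.tasks.speculative.execution=false;
>>>   set mapred.reduce.tasks.speculative.execution=false;
>>>
>>> That gives up straggler handling for the job, but each row is then
>>> attempted only once, so a non-zero return is more likely a real failure.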
>>>
>>> On Thu, Mar 29, 2012 at 6:23 AM, Abhishek Parolkar <abhishek@viki.com>
>>> wrote:
>>> > My situation requires me to run a Hive query every hour and insert
>>> > selected records into a Postgres table. It would be nice if dboutput
>>> > worked, so that the reduce jobs (created by Hive) could write directly
>>> > to the DB.
>>> >
>>> > With Sqoop, I would have to create a table in Hive every time and
>>> > export it to a table in the DB. Wondering if that can be avoided?
>>> >
>>> > -v_abhi_v
>>> >
>>> >
>>> > On Thu, Mar 29, 2012 at 6:12 PM, Bejoy KS <bejoy_ks@yahoo.com> wrote:
>>> >>
>>> >> Hi Abhishek
>>> >> To transfer data between an RDBMS and Hadoop, Sqoop is the preferred
>>> >> and recommended option. Once you have the processing done in Hive, the
>>> >> output data can be exported to PG with the sqoop export command.
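>>> >>
>>> >> A minimal sketch of the export (credentials, warehouse path, and table
>>> >> names are placeholders; point --export-dir at the Hive table's
>>> >> warehouse directory):
>>> >>
>>> >>   sqoop export \
>>> >>     --connect jdbc:postgresql://localhost:5432/test \
>>> >>     --username pguser --password pgpass \
>>> >>     --table test_tbl \
>>> >>     --export-dir /user/hive/warehouse/some_hive_table \
>>> >>     --input-fields-terminated-by '\001'
>>> >>
>>> >> Hive's default field delimiter is \001 (Ctrl-A), hence the
>>> >> --input-fields-terminated-by setting.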
>>> >> Regards
>>> >> Bejoy KS
>>> >>
>>> >> Sent from handheld, please excuse typos.
>>> >> ________________________________
>>> >> From: Abhishek Parolkar <abhishek@viki.com>
>>> >> Date: Thu, 29 Mar 2012 16:25:08 +0800
>>> >> To: <user@hive.apache.org>
>>> >> ReplyTo: user@hive.apache.org
>>> >> Subject: Postgres JDBC + dboutput UDF to export from Hive to remote
>>> >> Postgres
>>> >>
>>> >> Hi There,
>>> >>   I am trying to get the dboutput() UDF to work so that it can write
>>> >> results to a PG DB table.
>>> >>
>>> >> ==This is what I did in hive shell==
>>> >>
>>> >>   add jar /location/hive_contrib.jar;
>>> >>   add jar /location/postgresql9jdbc3.jar;
>>> >>   set jdbc.drivers = org.postgresql.Driver;
>>> >>
>>> >>   CREATE TEMPORARY FUNCTION dboutput AS
>>> >>     'org.apache.hadoop.hive.contrib.genericudf.example.GenericUDFDBOutput';
>>> >>
>>> >>   select dboutput('jdbc:postgresql//localhost:5432/test', '', '',
>>> >>     'insert into test_tbl(cnt) values(?)', hex(count(*)))
>>> >>   from some_hive_table;
>>> >>
>>> >> ===========end of snip=======
>>> >>
>>> >> 1.) I am on a single-node cluster
>>> >> 2.) I am using Hive 0.8.1
>>> >> 3.) I am on Hadoop 1.0.0
>>> >> 4.) The query runs fine but doesn't write to the DB; it returns the
>>> >> number 2 (http://screencast.com/t/eavnbBHR1x)
>>> >>
>>> >> I get a "no suitable driver" error
>>> >> (http://screencast.com/t/OipV14n9FgF). Can someone tell me how I can
>>> >> load the Postgres JDBC driver so that dboutput recognizes my Postgres?
>>> >>
>>> >> Any help?
>>> >>
>>> >> -v_abhi_v
>>> >
>>> >
>>>
>>
>>
>
