hive-user mailing list archives

From Ruslan Al-Fakikh
Subject Re: Quering RDBMS table in a Hive query
Date Fri, 15 Jun 2012 17:28:13 GMT
Thanks Jan

On Fri, Jun 15, 2012 at 4:35 PM, Jan Dolinár wrote:
> On 6/15/12, Ruslan Al-Fakikh wrote:
>> I didn't know InputFormat and LineReader could help, though I didn't
>> look at them closely. I was thinking about implementing a
>> Table-Generating Function (UDTF) if there is no already implemented
>> solution.
> Both are possible, InputFormat and/or UD(T)F. It all depends on what
> you need. I actually use both: in the InputFormat I load lists of
> allowed values to check the data, and in a UDF I query some other
> database for values necessary only in some queries. Generally, I'd use
> an InputFormat for situations where all jobs over a given table would
> require the additional data from the RDBMS. Conversely, in situations
> where only a few jobs out of many require the RDBMS connection, I would
> use a UDF.
> I think that the difference in performance between the two is rather
> small, if any. Also, a UDF is easier to write, so it might be the "weapon
> of choice", at least if you don't already use a custom InputFormat.
> Jan
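
A minimal sketch of the UDF route described above, written in plain Java so it runs without Hive on the classpath. The class name RdbmsLookup, the fetch function, and the in-memory cache are assumptions made for illustration; none of them come from the thread. In an actual Hive UDF, the evaluate method would presumably live in a class extending org.apache.hadoop.hive.ql.exec.UDF, and fetch would wrap a JDBC query against the RDBMS.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper: the per-row lookup logic that a Hive UDF's
// evaluate() method could delegate to. Not taken from the thread.
class RdbmsLookup {
    // Assumption: stands in for a JDBC query returning one value per key.
    private final Function<String, String> fetch;
    // Assumption: cache results so repeated keys do not hit the database again.
    private final Map<String, String> cache = new HashMap<>();
    private int fetchCount = 0;

    RdbmsLookup(Function<String, String> fetch) {
        this.fetch = fetch;
    }

    // Called once per row; null input yields null, as is common for UDFs.
    String evaluate(String key) {
        if (key == null) {
            return null;
        }
        // containsKey is used so that "not found" (null) results are cached too.
        if (!cache.containsKey(key)) {
            fetchCount++;
            cache.put(key, fetch.apply(key));
        }
        return cache.get(key);
    }

    // Number of times the backing store was actually queried.
    int fetchCount() {
        return fetchCount;
    }
}
```

Because the lookup sits in a function that only some queries call, jobs that do not need the RDBMS data never open a connection, which matches the case where Jan suggests a UDF over a custom InputFormat.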
