hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From bharath vissapragada <bharathvissapragada1...@gmail.com>
Subject Re: Automatic creation of Hive output table based on SELECT statement
Date Mon, 23 Dec 2013 18:07:43 GMT
I had the same problem sometime back and I found a dirty workaround that
actually works.

1) create view foo as <select statement> # this automatically creates foo
view with select output schema, this can contain joins, sel conditions,
group/count operators etc.
2) create table bar like foo # creates bar with schema like foo
3) drop view foo # you don't need the view anymore, so jus drop it

I just used these steps to automate selects in a script. Maybe you can find
it useful too though it doesn't make any sense.

 As already said, I just used it for automating a few select jobs on my
test cluster, if you intend to use it in production, I would suggest to
look into the code of how "view" creation works as it has this logic and


On Mon, Dec 23, 2013 at 10:12 PM, Sean Colgan <Sean.Colgan@nasdaqomx.com>wrote:

> Very interesting.  At least for the time being, I think this may be our
> way forward.  Too bad the create-as-select statement wont work for
> external tables.  Because that would really be our solution.
>
> Thanks for the help! :)
>
> Sean
>
> *Sean Colgan*
> NASDAQ OMX® | FinQloud
>
> Direct: +1 720 889 5169
> Mobile: +1 720 775 7390
> sean.colgan@nasdaqomx.com <mailto:sean.colgan@nasdaqomx.com>
> www.nasdaqomx.com/finqloud <http://www.nasdaqomx.com/finqloud>
>
>
>
> On Fri 20 Dec 2013 04:54:15 PM MST, Stephen Sprague wrote:
> > its a not via an API but what if you experimented with "create table
> > XXX as select ..... where 1=0" followed by "desc XXX".
> >
> > Maybe something like that can tide you over until a more programmatic
> > way comes to light?
> >
> >
> > On Fri, Dec 20, 2013 at 3:01 PM, Sean Colgan
> > <Sean.Colgan@nasdaqomx.com <mailto:Sean.Colgan@nasdaqomx.com>> wrote:
> >
> >     Hi all,
> >     I'm looking for a way to automatically generate an output table
> >     schema based on the input and query.  Basically, a given input
> >     schema and SELECT statement produce a deterministic result
> >     schema.  And Hive will complain if a given output table does not
> >     have a matching schema.  Like so:
> >
> >     FAILED: SemanticException [Error 10044]: Line 3:23 Cannot insert
> >     into target table because column number/types are different
> >     'test_out': Table insclause-0 has 1 columns, but query has 3 columns.
> >
> >     I am developing a process that will accept input data with an
> >     arbitrary known schema, and an HQL SELECT statement to query that
> >     data.  I'm looking for a way to automatically generate the correct
> >     output table schema.  I'm starting to poke through the code in
> >     SemanticAnalyzer.java (which generates the above error string) to
> >     see if I can use something in there.  But I certainly welcome any
> >     thoughts or suggestions on this topic.
> >
> >     Thanks!
> >     Sean
> >
> >     --
> >
> >     sean.colgan@nasdaqomx.com <mailto:sean.colgan@nasdaqomx.com>
> >     www.nasdaqomx.com/finqloud <http://www.nasdaqomx.com/finqloud>
> >
> >

Mime
View raw message