hive-user mailing list archives

From Chandeep Singh ...@chandeep.com>
Subject Re: how to create an array from two columns?
Date Sun, 13 Mar 2016 01:55:57 GMT
Writing your own UDF is always an option :)

> On Mar 13, 2016, at 1:46 AM, Chandeep Singh <cs@chandeep.com> wrote:
> 
> Since the data is stored in HDFS, you have very limited scope to append directly.
> 
> As a workaround, you could read the elements of the original array by index and build a new array from them. This only makes sense if you know the number of elements in the array and it doesn't change across rows.
> 
> select array(ab[0],ab[1],"blah") from table2;
> OK
> ["temp1","temp2","blah"]
> 
> 
>> On Mar 13, 2016, at 1:26 AM, Rex X <dnsring@gmail.com> wrote:
>> 
>> Thank you, Chandeep. Yes, my first problem is solved.
>> How about the second one? Is there any way to append an element to an existing array?
>> 
>> 
>> 
>> On Sat, Mar 12, 2016 at 5:10 PM, Chandeep Singh <cs@chandeep.com> wrote:
>> If you only want the array while you're querying table1, your example should work. If you want to add AB to the table itself, you'll probably need to create a new table by selecting everything you need from table1.
>> 
>> hive> select * from table1 limit 1;
>> OK
>> temp1	temp2	temp3
>> 
>> hive> select f1, array(f2, f3) AS AB from table1 limit 1;
>> OK
>> temp1	["temp2","temp3"]
>> 
>> 
>>> On Mar 13, 2016, at 12:33 AM, Rex X <dnsring@gmail.com> wrote:
>>> 
>>> How to make the following work?
>>> 
>>> 1. combine columns A and B to make one array as a new column AB. Both columns A and B are string types.
>>> 
>>>   select 
>>> string_columnA, 
>>> string_columnB, 
>>> array(string_columnA, string_columnB) as AB
>>> from Table1;
>>> 
>>> 2. append columnA to an existing array-type column B
>>> 
>>> select
>>> string_columnA,
>>> array_columnB,
>>> array_flatmerge(string_columnA, array_columnB) as AB
>>> from Table2;
>>> 
>>> In fact, I should say "set" instead of "array" above, since I expect no duplicates.
>>> 
>>> Any idea?
>>> 
>> 
>> 
> 
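
The thread leaves the "no duplicates" requirement open, and the index-based workaround only fits arrays of a fixed, known length. Below is a rough HiveQL sketch of one way to handle both cases with built-in functions only (if, array, explode, collect_set). It assumes Table2 has a column that uniquely identifies each row; the name row_id is hypothetical and does not appear in the original question. The element order of the result is not guaranteed, and a row whose string_columnA is NULL and whose array is empty or NULL produces no output row.

```sql
-- Case 1: two string columns into a duplicate-free array.
-- If both values are equal, keep only one of them.
-- Note: if either column is NULL, the comparison is not true,
-- so the two-element array (containing the NULL) is returned.
SELECT
  string_columnA,
  string_columnB,
  if(string_columnA = string_columnB,
     array(string_columnA),
     array(string_columnA, string_columnB)) AS AB
FROM Table1;

-- Case 2: append string_columnA to array_columnB without duplicates,
-- for arrays of any length.
-- row_id is an ASSUMED unique row key; substitute the real key column(s).
SELECT
  u.row_id,
  collect_set(u.elem) AS AB          -- collect_set drops duplicates and NULLs
FROM (
  -- one output row per existing array element
  SELECT t.row_id, e.elem
  FROM Table2 t
  LATERAL VIEW explode(t.array_columnB) e AS elem
  UNION ALL
  -- plus one output row for the value to append
  SELECT t.row_id, t.string_columnA AS elem
  FROM Table2 t
) u
GROUP BY u.row_id;
```

As noted earlier in the thread, neither query modifies the stored data. To persist the result, the SELECT would need to be wrapped in a CREATE TABLE ... AS SELECT or an INSERT OVERWRITE into a new table.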

