hive-user mailing list archives

From John Omernik
Subject Re: Combine multiple row values based upon a condition.
Date Sun, 03 Feb 2013 12:05:14 GMT
Well, there are some methods that may work, but I'd have to understand your
data and your constraints better. As it sounds, you want to be able to sort
by offset, then look at one row and the next row to determine whether the
two items should be joined. It "looks" like you are doing a string
comparison between numbers: from "100" to "104", only one "position" out of
three is different (0 vs 4). Trouble is, look at id 3 and id 4: 150 to 160
is only one position different as well, so are you looking for Klaas Jan?
Also, is the ID field filled from the first match? It seems like you have
some very odd data here. I don't think you've provided enough information
on the data for us to be able to help you.
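
To make the "sort by offset, then compare each row with the next" idea
concrete, here is a minimal Python sketch of the logic, outside Hive. It rests
on an assumption that the thread does not confirm: that "no more than 1
character apart" means the gap between the end of one name (offset plus name
length) and the start of the next name is at most one character. The function
name combine_adjacent and the max_gap parameter are hypothetical, and dropping
unmatched rows such as Klaas is also an assumption based on the sample output.

```python
from collections import OrderedDict

# Sample rows from the question: (id, name, offset).
rows = [
    (1, "Jan", 100),
    (2, "Janssen", 104),
    (3, "Klaas", 150),
    (4, "Jan", 160),
    (5, "Janssen", 164),
]


def combine_adjacent(rows, max_gap=1):
    """Pair each row with the next one (in offset order) when the gap between
    the end of the first name and the start of the second is at most max_gap.

    Assumption: a name occupies len(name) characters starting at its offset.
    Returns an ordered mapping of full name -> list of starting offsets.
    """
    ordered = sorted(rows, key=lambda r: r[2])  # sort by offset
    combined = OrderedDict()
    i = 0
    while i < len(ordered) - 1:
        _, name, offset = ordered[i]
        _, next_name, next_offset = ordered[i + 1]  # "peek" at the next row
        gap = next_offset - (offset + len(name))
        if 0 <= gap <= max_gap:
            combined.setdefault(name + " " + next_name, []).append(offset)
            i += 2  # both rows are consumed by this pair
        else:
            i += 1  # no match; unmatched rows are dropped (assumption)
    return combined


result = combine_adjacent(rows)
```

Under this reading, Klaas (150) followed by Jan (160) leaves a gap of five
characters, so those two are not joined, while both Jan/Janssen pairs are.
Whether that matches the real data is exactly the open question above.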

On Sat, Feb 2, 2013 at 1:21 PM, Martijn van Leeuwen wrote:

> Hi all,
> I am new to Apache Hive and I am doing some tests to see if it fits my
> needs. One of the questions I have is whether it is possible to "peek" at
> the next row in order to find out if the values should be combined. Let me
> explain with an example.
> Let's say my data looks like this:
> Id name offset
> 1 Jan 100
> 2 Janssen 104
> 3 Klaas 150
> 4 Jan 160
> 5 Janssen 164
> And my output to another table should be this:
> Id fullname offsets
> 1 Jan Janssen [ 100, 160 ]
> I would like to combine the name values from two rows where the offsets of
> the two rows are no more than 1 character apart.
> Is this type of data manipulation possible, and if it is, could someone
> point me in the right direction, hopefully with some explanation?
> Kind regards
> Martijn
