hadoop-user mailing list archives

From Sam Mohamed <sam.moha...@voltage.com>
Subject RE: Hive Query with UDF
Date Thu, 18 Oct 2012 01:12:49 GMT
Thanks for the quick response.

The idea is that we are selling an encryption product to customers who use HDFS, so
encryption is a requirement.

Any other suggestions?

Sam
________________________________________
From: Michael Segel [michael_segel@hotmail.com]
Sent: Wednesday, October 17, 2012 6:10 PM
To: user@hadoop.apache.org
Subject: Re: Hive Query with UDF

You don't need a UDF...

You encrypt the string 'Ann' first, then use that encrypted value in the Select statement.

That should make things a bit simpler.
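
A minimal sketch of the suggestion above, assuming the encryption routine can be called from a client program before the query is submitted. The class `EncryptedQueryBuilder`, the method `buildQuery`, and the `encryptor` parameter are hypothetical names used only for illustration; `ParamData.encrypt` from the original message would be passed in as the encryptor. The backslash escaping of the literal is an assumption about the string-literal syntax of the Hive version in use.

```java
import java.util.function.UnaryOperator;

// Hypothetical helper: encrypts the query parameter once on the client,
// then embeds the result as a plain string literal in the query text.
class EncryptedQueryBuilder {

    static String buildQuery(String plaintext, UnaryOperator<String> encryptor) {
        // The encryption runs exactly once here, not once per table row.
        String encrypted = encryptor.apply(plaintext);

        // Assumption: backslash and single quote are escaped with a backslash
        // inside a single-quoted Hive string literal.
        String literal = encrypted.replace("\\", "\\\\").replace("'", "\\'");

        return "select * from cc_details where first_name = '" + literal + "'";
    }
}
```

The resulting string can then be submitted through whichever client is already in use (the Hive CLI or JDBC, for example), so Hive only compares `first_name` against a constant.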



On Oct 17, 2012, at 8:04 PM, Sam Mohamed <sam.mohamed@voltage.com> wrote:

> I have some encrypted data in an HDFS csv, that I've created a Hive table for, and I
> want to run a Hive query that first encrypts the query param, then does the lookup.  I have
> a UDF that does encryption as follows:
>
> import org.apache.hadoop.hive.ql.exec.UDF;
> import org.apache.hadoop.io.Text;
>
> public class ParamEncrypt extends UDF {
>
>  // Encrypts the given name; a null input yields a null result.
>  public Text evaluate(String name) throws Exception {
>
>      if (name == null) { return null; }
>
>      return new Text(ParamData.encrypt(name));
>  }
> }
>
> Then I run the Hive query as:
>
>  select * from cc_details where first_name = encrypt('Ann');
>
> The problem is, it's running encrypt('Ann') across every single record in the table.
> I want it to do the encryption once, then do the matchup.  I've tried:
>
>  select * from cc_details where first_name in (select encrypt('Ann') from cc_details limit 1);
>
> But Hive doesn't support **IN** or select queries in the where clause.
>
> What can I do?
>
> Can I do something like:
>
>  select encrypt('Ann') as ann from cc_details where first_name = ann;
>
> That also doesn't work because the query parser throws an error saying **ann** is not
> a known column.
>
> Thanks,
>
> Sam

