hadoop-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Sam Mohamed <sam.moha...@voltage.com>
Subject Hive Query with UDF
Date Thu, 18 Oct 2012 01:04:07 GMT
I have some encrypted data in an HDFS csv, that I've created a Hive table for, and I want to
run a Hive query that first encrypts the query param, then does the lookup.  I have a UDF
that does encryption as follows:

public class ParamEncrypt extends UDF {

  public Text evaluate(String name) throws Exception {

      String result = new String();

      if (name == null) { return null; }

      result = ParamData.encrypt(name);

      return new Text(result);
  }
}

Then I run the Hive query as:

  select * from cc_details where first_name = encrypt('Ann');

The problem is, it's running encrypt('Ann') across every single record in the table.  I want
it do the encryption once, then do the matchup.  I've tried:

  select * from cc_details where first_name in (select encrypt('Ann') from cc_details limit
1);

But Hive doesn't support **IN** or select queries in the where clause.

What can I do?

Can I do something like:

  select encrypt('Ann') as ann from cc_details where first_name = ann;

That also doesn't work because the query parser throws an error saying **ann** is not a known
column

Thanks,

Sam
Mime
View raw message