hive-user mailing list archives

From Mich Talebzadeh <mich.talebza...@gmail.com>
Subject Re: De-identification_in Hive
Date Thu, 17 Mar 2016 14:43:56 GMT
Are you loading your CSV file from an external table into a Hive table?

Basically, you want to scramble that column before putting it into the Hive table?
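
One way to scramble the column before the data ever reaches Hive is to mask it in a small pre-processing step and load the masked file instead. The following is only a minimal sketch under stated assumptions: the CSV has no header row, the sensitive column is identified by position, and a keyed SHA-256 hash (HMAC) is an acceptable form of masking. The function names `mask_value` and `mask_csv_column`, the secret key, and the sample rows are hypothetical and not taken from the thread.

```python
import csv
import hashlib
import hmac
import io


def mask_value(value: str, key: bytes) -> str:
    """Return a keyed SHA-256 digest (hex) of value.

    The result is deterministic for a given key, so equal inputs still
    match after masking, but the original value is not recoverable
    without brute force.
    """
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def mask_csv_column(src, dst, column_index: int, key: bytes) -> None:
    """Copy CSV rows from src to dst, masking one column by position.

    Assumes every row has at least column_index + 1 fields and that
    there is no header row (both are assumptions of this sketch).
    """
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    for row in reader:
        row[column_index] = mask_value(row[column_index], key)
        writer.writerow(row)


if __name__ == "__main__":
    # Hypothetical sample data; column 1 holds the sensitive value.
    source = io.StringIO("1,alice@example.com,NY\n2,bob@example.com,CA\n")
    masked = io.StringIO()
    mask_csv_column(source, masked, column_index=1, key=b"replace-with-a-secret")
    print(masked.getvalue())
```

The masked output file could then be loaded into the Hive table in place of the original. Because the hash is keyed and deterministic, joins and group-bys on the masked column still work, while the key itself has to be kept out of the cluster.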

Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com



On 17 March 2016 at 14:37, Ajay Chander <hadoopdev18@gmail.com> wrote:

> Tustin, is there any way I can de-identify it in Hive?
>
>
> On Thursday, March 17, 2016, Marcin Tustin <mtustin@handybook.com> wrote:
>
>> This is a classic transform-load problem. You'll want to anonymise it
>> once before making it available for analysis.
>>
>> On Thursday, March 17, 2016, Ajay Chander <hadoopdev18@gmail.com> wrote:
>>
>>> Hi Everyone,
>>>
>>> I have a CSV file which has some sensitive data in a particular column.
>>> Now I have to create a table in Hive and load the data into it, but
>>> when loading the data I have to make sure that the data is masked. Is there
>>> any built-in function which supports this, or do I have to write a UDF?
>>> Any suggestions are appreciated. Thanks
>>
>>
