hive-user mailing list archives

From Adriaan Tijsseling
Subject Re: How to load lines into Hive while breaking them by words?
Date Tue, 27 Sep 2011 07:20:53 GMT
Use a RegexSerDe to split the text into words. There's documentation on the Hive wiki.
But it might be better to use a script. See the post by Shouguo Li earlier on this mailing
list. After all, when you use a Python script, for example, you could use the Natural
Language Toolkit (NLTK) to get a much better function for splitting text into a list of words.
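
For what it's worth, here is a rough sketch of the script approach, not a tested production job. It assumes the book has already been loaded one line per row, and that Hive streams those rows to the script on standard input and reads one output row per line back. The names split_words and emit_words and the regular expression are made up for illustration; the regex is a crude stand-in for a real tokenizer such as the one in NLTK.

```python
import re

# Crude tokenizer: runs of letters/digits, optionally followed by an
# apostrophe suffix (so "don't" stays one word). This is an assumption
# for illustration; an NLTK tokenizer would handle far more cases.
WORD_RE = re.compile(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?")


def split_words(line):
    """Return the lower-cased words found in one line of text."""
    return [w.lower() for w in WORD_RE.findall(line)]


def emit_words(lines, out):
    """Write one word per output line, so each word becomes one row."""
    for line in lines:
        for word in split_words(line):
            out.write(word + "\n")
```

To use it as a streaming script you would add "import sys" and a final call to emit_words(sys.stdin, sys.stdout), then invoke it from Hive with something along the lines of ADD FILE split_words.py; followed by SELECT TRANSFORM(line) USING 'python split_words.py' AS (word) FROM book_lines; where the file, table and column names are placeholders.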


On 2011/09/27, at 05:40, Mark Kerzner wrote:

> Hi,
> a simple question - if I have a book as a text, and I want to load it into a
> Hive table, with one word forming one entry, how should I do it?
> Thank you,
> Mark
