hive-user mailing list archives

From: David Novogrodsky
Subject: using Hive to create tables from unstructured data.
Date: Wed, 12 Nov 2014 20:07:31 GMT

I am trying to ingest unstructured data into Hive so it can be queried.  I
am trying to follow the steps in Tutorial Exercise 3 in the Cloudera
Quickstart VM.  I have not changed any of the configurations of the VM.

I am having some problems.  The created table has no data in it.  Here is
a sample of the unstructured data:

560)211-5250 437)810-5830 04:35 21 May 2014 17:26:39
356)539-2237 889)650-7326 30:29 26 Feb 2014 11:56:08

The data is tab-delimited.

Here are the steps I am following:

1. a. Make the destination folder:
sudo -u hdfs hadoop fs -mkdir /user/cloudera/vector/callRecords

b. Copy the data into the destination folder:
sudo -u hdfs hadoop fs -copyFromLocal ~/Desktop/CDRecords.txt

2. Create Hive tables using the command line:

CREATE EXTERNAL TABLE intermediate_call_records (
callFrom STRING,
callTo STRING,
callDuration STRING,
date STRING,
timeOfCall STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
"input.regex" = "([^\t]*)\t([^\t]*)\t([^\t]*)\t([^\t]*)\t([^\t]*)\n",
"output.format.string" = "%1$s %2$s %3$s %4$s %5$s"
)
LOCATION '/user/cloudera/vector/callRecords';
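
To see how the pattern behaves outside Hive, it can be tried against one of the sample lines. The sketch below uses Python rather than Hive and rests on two assumptions that are not confirmed for this VM: that the fields in CDRecords.txt are separated by single tab characters, and that the SerDe needs the pattern to match an entire line that arrives without its trailing newline. The name split_record is only illustrative.

```python
import re

# Sample line from the message, with fields joined by tabs (assumed layout).
SAMPLE = "560)211-5250\t437)810-5830\t04:35\t21 May 2014\t17:26:39"

# Pattern as written in the table definition, ending in a newline.
WITH_NEWLINE = r"([^\t]*)\t([^\t]*)\t([^\t]*)\t([^\t]*)\t([^\t]*)\n"
# The same pattern without the trailing newline.
WITHOUT_NEWLINE = r"([^\t]*)\t([^\t]*)\t([^\t]*)\t([^\t]*)\t([^\t]*)"


def split_record(pattern, line):
    """Return the five captured fields, or None if the whole line does not match."""
    match = re.fullmatch(pattern, line)
    return list(match.groups()) if match else None


if __name__ == "__main__":
    print(split_record(WITH_NEWLINE, SAMPLE))
    print(split_record(WITHOUT_NEWLINE, SAMPLE))
```

Under those assumptions, the pattern ending in \n does not match a line that has no newline, while the version without it splits the line into five fields. This only checks the regular expression; it says nothing about whether the file actually landed in /user/cloudera/vector/callRecords.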

David Novogrodsky
