incubator-hcatalog-user mailing list archives

From Joe Crobak
Subject pig schema empty - how to debug?
Date Sun, 15 Jul 2012 19:36:37 GMT
I have a custom InputFormat/SerDe for Hive, and I'd like to access the data
via Pig. I was able to get the classpaths set up correctly, so I can view
the table via the hcat command line and load the table from Pig.
Unfortunately, the schema for the relation I load is empty. If I dump the
data, I get back empty tuples, and I can't project any columns because
HCat says the field doesn't exist in the schema.

Has anyone seen this problem before? What's the best way to debug it? I've
tried tweaking numerous logging settings, but I can't get the hcat server
or Pig to output any debug-level logging from HCatalog. Do I need to run a
trunk build with HCATALOG-68, or is there another way? Looking at the MR
job, my SerDe and input format are definitely being used: I can see
INFO-level logging statements from them in the task attempt logs.

Here's an example session:

grunt> checkins = LOAD 'checkins' USING
2012-07-15 19:19:49,739 [main] INFO  hive.metastore - Trying to connect to
metastore with URI thrift://localhost:9999
2012-07-15 19:19:49,780 [main] INFO  hive.metastore - Connected to
grunt> describe checkins;
checkins: {}
grunt> checkins = LOAD 'default.checkins' USING
grunt> describe checkins;
checkins: {}
grunt> x = limit checkins 5
>> ;
grunt> dump x;

