hive-user mailing list archives

From Stephen Sprague <sprag...@gmail.com>
Subject Re: msck repair table and hive v2.1.0
Date Fri, 15 Jul 2016 03:26:46 GMT
Hi Rajesh,
Sure, I'll give that setting a try. Thanks.

Re: s3 vs. hdfs: indeed. I figured I'd eliminate the s3 angle when posting
here, given that msck repair table failed in both cases. But yeah, my real
use case is s3.

OK, just tried that setting and got a slightly different stack trace, but
the end result was still the NPE.

It's a strange one.

Cheers,
Stephen.

2016-07-15T03:13:08,102 DEBUG [main]: parse.ParseDriver (:()) - Parse
Completed
2016-07-15T03:13:08,119 INFO  [main]: ql.Driver (:()) - Semantic Analysis
Completed
2016-07-15T03:13:08,119 INFO  [main]: ql.Driver (:()) - Returning Hive
schema: Schema(fieldSchemas:null, properties:null)
2016-07-15T03:13:08,119 INFO  [main]: metadata.Hive (:()) - Dumping
metastore api call timing information for : compilation phase
2016-07-15T03:13:08,119 DEBUG [main]: metadata.Hive (:()) - Total time
spent in each metastore function (ms): {isCompatibleWith_(HiveConf, )=0,
getTable_(String, String, )=16, flushCache_()=0}
2016-07-15T03:13:08,119 INFO  [main]: ql.Driver (:()) - Completed compiling
command(queryId=ubuntu_20160715031308_bdf29227-ee7e-417f-834d-dae397d4eb9b);
Time taken: 0.018 seconds
2016-07-15T03:13:08,119 INFO  [main]: ql.Driver (:()) - Executing
command(queryId=ubuntu_20160715031308_bdf29227-ee7e-417f-834d-dae397d4eb9b):
msck repair table foo
2016-07-15T03:13:08,119 INFO  [main]: ql.Driver (:()) - Starting task
[Stage-0:DDL] in serial mode
2016-07-15T03:13:08,138 DEBUG [main]: ipc.Client (:()) - The ping interval
is 60000 ms.
2016-07-15T03:13:08,138 DEBUG [main]: ipc.Client (:()) - Connecting to /
10.12.15.12:8020
2016-07-15T03:13:08,140 DEBUG [IPC Parameter Sending Thread #3]: ipc.Client
(:()) - IPC Client (1990733619) connection to /10.12.15.12:8020 from ubuntu
sending #35
2016-07-15T03:13:08,138 DEBUG [IPC Client (1990733619) connection to /
10.12.15.12:8020 from ubuntu]: ipc.Client (:()) - IPC Client (1990733619)
connection to /10.12.15.12:8020 from ubuntu: starting, having connections 1
2016-07-15T03:13:08,140 DEBUG [IPC Client (1990733619) connection to /
10.12.15.12:8020 from ubuntu]: ipc.Client (:()) - IPC Client (1990733619)
connection to /10.12.15.12:8020 from ubuntu got value #35
2016-07-15T03:13:08,144 DEBUG [main]: ipc.ProtobufRpcEngine (:()) - Call:
getFileInfo took 7ms
2016-07-15T03:13:08,144 DEBUG [main]: metadata.HiveMetaStoreChecker
(:()) - *Not-using
threaded version of MSCK-GetPaths*
2016-07-15T03:13:08,144 DEBUG [IPC Parameter Sending Thread #3]: ipc.Client
(:()) - IPC Client (1990733619) connection to /10.12.15.12:8020 from ubuntu
sending #36
2016-07-15T03:13:08,145 DEBUG [IPC Client (1990733619) connection to /
10.12.15.12:8020 from ubuntu]: ipc.Client (:()) - IPC Client (1990733619)
connection to /10.12.15.12:8020 from ubuntu got value #36
2016-07-15T03:13:08,145 DEBUG [main]: ipc.ProtobufRpcEngine (:()) - Call:
getListing took 1ms
2016-07-15T03:13:08,146 ERROR [main]: exec.DDLTask (:()) -
java.lang.NullPointerException
        at
java.util.concurrent.ConcurrentHashMap.putVal(ConcurrentHashMap.java:1011)
        at
java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:1006)
        at
org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.getAllLeafDirs(HiveMetaStoreChecker.java:409)
        at
org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.getAllLeafDirs(HiveMetaStoreChecker.java:388)
        at
org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.findUnknownPartitions(HiveMetaStoreChecker.java:309)
        at
org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.checkTable(HiveMetaStoreChecker.java:285)
        at
org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.checkTable(HiveMetaStoreChecker.java:230)
        at
org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.checkMetastore(HiveMetaStoreChecker.java:109)
        at org.apache.hadoop.hive.ql.exec.DDLTask.msck(DDLTask.java:1814)
        at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:403)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
        at
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084)


table ddl:

CREATE EXTERNAL TABLE `foo`(
  `a` int)
PARTITIONED BY (
  `date_key` bigint)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.SequenceFileInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat'
LOCATION
  'hdfs://10.12.15.12:8020/tmp/foo'
TBLPROPERTIES (
  'transient_lastDdlTime'='1468469502')

On Thu, Jul 14, 2016 at 6:55 PM, Rajesh Balamohan <
rajesh.balamohan@gmail.com> wrote:

> Hi Stephen,
>
> Can you try by turning off multi-threaded approach by setting
> "hive.mv.files.thread=0"?  You mentioned that your tables are in s3,
> but the external table created was pointing to HDFS. Was that intentional?
>
> ~Rajesh.B
>
> On Fri, Jul 15, 2016 at 6:58 AM, Stephen Sprague <spragues@gmail.com>
> wrote:
>
>> in the meantime, given my tables are in s3, I've written a utility to do an
>> 'aws s3 ls' on the bucket and folder in question, change the folder syntax
>> to partition syntax, and then issue my own 'alter table ... add partition'
>> for each partition.
>>
>>
>> so essentially it does what msck repair table does, but in a non-portable
>> way.  oh well.  gotta do what ya gotta do.
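
The manual repair described above can be sketched roughly as follows. This
is a hypothetical illustration, not Stephen's actual utility: the table
name, the folder names, and the assumption of single-level key=value
partition folders (matching the one-key `date_key` DDL in this thread) are
all placeholders.

```python
# Hypothetical sketch of the manual msck-repair workaround: take the
# partition-style folder names that an "aws s3 ls" of the table location
# would return (e.g. "date_key=20160714/") and emit an
# ALTER TABLE ... ADD PARTITION statement for each one.

def folder_to_add_partition(table, folder):
    """Convert a single key=value folder name into an ADD PARTITION statement."""
    key, _, value = folder.rstrip("/").partition("=")
    return ("ALTER TABLE %s ADD IF NOT EXISTS PARTITION (%s='%s');"
            % (table, key, value))

# Folder names as they might appear in an "aws s3 ls" listing (placeholders).
folders = ["date_key=20160713/", "date_key=20160714/"]
for f in folders:
    print(folder_to_add_partition("foo", f))
```

The emitted statements could then be fed back to the hive CLI (e.g. via
`hive -f`), which is the non-portable part Stephen alludes to.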
>>
>> On Wed, Jul 13, 2016 at 9:29 PM, Stephen Sprague <spragues@gmail.com>
>> wrote:
>>
>>> hey guys,
>>> I'm using hive version 2.1.0 and I can't seem to get msck repair table
>>> to work.  No matter what I try, I get the ol' NPE.  I've set the log level
>>> to DEBUG, but I'm still not seeing any smoking gun.
>>>
>>> would anyone here have any pointers or suggestions to figure out what's
>>> going wrong?
>>>
>>> thanks,
>>> Stephen.
>>>
>>>
>>>
>>> hive> create external table foo (a int) partitioned by (date_key bigint)
>>> location 'hdfs:/tmp/foo';
>>> OK
>>> Time taken: 3.359 seconds
>>>
>>> hive> msck repair table foo;
>>> FAILED: Execution Error, return code 1 from
>>> org.apache.hadoop.hive.ql.exec.DDLTask
>>>
>>>
>>> from the log...
>>>
>>> 2016-07-14T04:08:02,431 DEBUG [MSCK-GetPaths-1]:
>>> httpclient.RestStorageService (:()) - Found 13 objects in one batch
>>> 2016-07-14T04:08:02,431 DEBUG [MSCK-GetPaths-1]:
>>> httpclient.RestStorageService (:()) - Found 0 common prefixes in one batch
>>> 2016-07-14T04:08:02,433 ERROR [main]: metadata.HiveMetaStoreChecker
>>> (:()) - java.lang.NullPointerException
>>> 2016-07-14T04:08:02,434 WARN  [main]: exec.DDLTask (:()) - Failed to run
>>> metacheck:
>>> org.apache.hadoop.hive.ql.metadata.HiveException:
>>> java.lang.NullPointerException
>>>         at
>>> org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.getAllLeafDirs(HiveMetaStoreChecker.java:444)
>>>         at
>>> org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.getAllLeafDirs(HiveMetaStoreChecker.java:448)
>>>         at
>>> org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.getAllLeafDirs(HiveMetaStoreChecker.java:388)
>>>         at
>>> org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.findUnknownPartitions(HiveMetaStoreChecker.java:309)
>>>         at
>>> org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.checkTable(HiveMetaStoreChecker.java:285)
>>>         at
>>> org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.checkTable(HiveMetaStoreChecker.java:230)
>>>         at
>>> org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.checkMetastore(HiveMetaStoreChecker.java:109)
>>>         at org.apache.hadoop.hive.ql.exec.DDLTask.msck(DDLTask.java:1814)
>>>         at
>>> org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:403)
>>>         at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
>>>         at
>>> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
>>>         at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858)
>>>         at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562)
>>>         at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313)
>>>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084)
>>>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1072)
>>>         at
>>> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
>>>         at
>>> org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
>>>         at
>>> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
>>>         at
>>> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
>>>         at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
>>>         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>         at
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>
>>
>>
>
>
> --
> ~Rajesh.B
>
