hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Neil Guo <guomaof...@gmail.com>
Subject Re: hive jobs pending so long
Date Wed, 19 Dec 2012 18:15:07 GMT
Yes,  hadoop replication factor is already set to 1

Space on the datenode,

*11291 files and directories, 39205 blocks = 50496 total. Heap Size is
858.94 MB / 2.6 GB (32%)
*
Configured Capacity : 9.92 TB DFS Used : 2.12 TB Non DFS Used : 6.35 TB DFS
Remaining : 1.45 TB DFS Used% : 21.39 % DFS Remaining% : 14.58 % Live Nodes
: 1 Dead Nodes : 0 Decommissioning Nodes : 0 Number of Under-Replicated
Blocks : 164
Result of jmap NN,

Attaching to process ID 15949, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 19.0-b09

using thread-local object allocation.
Parallel GC with 13 thread(s)

Heap Configuration:
   MinHeapFreeRatio = 40
   MaxHeapFreeRatio = 70
   MaxHeapSize      = 3145728000 (3000.0MB)
   NewSize          = 1310720 (1.25MB)
   MaxNewSize       = 17592186044415 MB
   OldSize          = 5439488 (5.1875MB)
   NewRatio         = 2
   SurvivorRatio    = 8
   PermSize         = 21757952 (20.75MB)
   MaxPermSize      = 174063616 (166.0MB)

Heap Usage:
PS Young Generation
Eden Space:
   capacity = 527171584 (502.75MB)
   used     = 138223192 (131.81990814208984MB)
   free     = 388948392 (370.93009185791016MB)
   26.219772877591218% used
>From Space:
   capacity = 21954560 (20.9375MB)
   used     = 8190336 (7.8109130859375MB)
   free     = 13764224 (13.1265869140625MB)
   37.30585354477612% used
To Space:
   capacity = 21954560 (20.9375MB)
   used     = 0 (0.0MB)
   free     = 21954560 (20.9375MB)
   0.0% used
PS Old Generation
   capacity = 351535104 (335.25MB)
   used     = 33554448 (32.00001525878906MB)
   free     = 317980656 (303.24998474121094MB)
   9.545120136849832% used
PS Perm Generation
   capacity = 21757952 (20.75MB)
   used     = 20201176 (19.265342712402344MB)
   free     = 1556776 (1.4846572875976562MB)
   92.8450251200113% used


Resutes of JT,

Attaching to process ID 16342, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 19.0-b09

using thread-local object allocation.
Parallel GC with 13 thread(s)

Heap Configuration:
   MinHeapFreeRatio = 40
   MaxHeapFreeRatio = 70
   MaxHeapSize      = 3145728000 (3000.0MB)
   NewSize          = 1310720 (1.25MB)
   MaxNewSize       = 17592186044415 MB
   OldSize          = 5439488 (5.1875MB)
   NewRatio         = 2
   SurvivorRatio    = 8
   PermSize         = 21757952 (20.75MB)
   MaxPermSize      = 174063616 (166.0MB)

Heap Usage:
PS Young Generation
Eden Space:
   capacity = 131792896 (125.6875MB)
   used     = 72740936 (69.37116241455078MB)
   free     = 59051960 (56.31633758544922MB)
   55.1933664163507% used
>From Space:
   capacity = 21954560 (20.9375MB)
   used     = 4555392 (4.3443603515625MB)
   free     = 17399168 (16.5931396484375MB)
   20.749183768656717% used
To Space:
   capacity = 21954560 (20.9375MB)
   used     = 0 (0.0MB)
   free     = 21954560 (20.9375MB)
   0.0% used
PS Old Generation
   capacity = 351535104 (335.25MB)
   used     = 0 (0.0MB)
   free     = 351535104 (335.25MB)
   0.0% used
PS Perm Generation
   capacity = 24313856 (23.1875MB)
   used     = 24162376 (23.04303741455078MB)
   free     = 151480 (0.14446258544921875MB)
   99.37698076356132% used


Not very familiar with Java, if I miss something, please just let me know.

Thanks.







On Thu, Dec 20, 2012 at 1:47 AM, Nitin Pawar <nitinpawar432@gmail.com>wrote:

> i hope hadoop replication factor is set to 1
> apart from this can you check how much disk space is empty on the datanode?
>
> how are the memory stats on JT and NN?
>
>
>
> On Wed, Dec 19, 2012 at 10:57 PM, Neil Guo <guomaofeng@gmail.com> wrote:
>
>> No, actually there's only one datanode.
>>
>> On Thu, Dec 20, 2012 at 1:03 AM, Nitin Pawar <nitinpawar432@gmail.com>wrote:
>>
>>> Did you retire/remove few datanodes from your cluster in hurry ?
>>>
>>>
>>> On Wed, Dec 19, 2012 at 9:57 PM, Neil Guo <guomaofeng@gmail.com> wrote:
>>>
>>>> hi,
>>>>
>>>> My hive jobs became very slow from yesterday,  it keeps about 5 minutes
>>>> on pending, but it was only  40sec before. I didn't modify any
>>>> configuration and there's no other jobs on hadoop cluster.
>>>>
>>>>
>>>> My envirment,
>>>> hadoop-0.20.203.0
>>>> hive-0.8.1
>>>>
>>>> [neil@host logs@master]$ hive -e 'select count(*) from neiltest;'
>>>>
>>>> WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated.
>>>> Please use org.apache.hadoop.log.metrics.EventCounter in all the
>>>> log4j.properties files.
>>>> Logging initialized using configuration in
>>>> jar:file:/opt/hadoop/hive-0.8.1/lib/hive-common-0.8.1.jar!/hive-log4j.properties
>>>> Hive history file=/tmp/neil/hive_job_log_neil_201212192350_640804332.txt
>>>> Total MapReduce jobs = 1
>>>> Launching Job 1 out of 1
>>>> Number of reduce tasks determined at compile time: 1
>>>> In order to change the average load for a reducer (in bytes):
>>>>   set hive.exec.reducers.bytes.per.reducer=<number>
>>>> In order to limit the maximum number of reducers:
>>>>   set hive.exec.reducers.max=<number>
>>>> In order to set a constant number of reducers:
>>>>   set mapred.reduce.tasks=<number>
>>>> Starting Job = job_201212191724_0012, Tracking URL =
>>>> http://localhost:50030/jobdetails.jsp?jobid=job_201212191724_0012
>>>> Kill Command = /opt/hadoop/hadoop/bin/../bin/hadoop job
>>>>  -Dmapred.job.tracker=localhost:9001 -kill job_201212191724_0012
>>>> Hadoop job information for Stage-1: number of mappers: 1; number of
>>>> reducers: 1
>>>> 2012-12-19 23:53:21,894 Stage-1 map = 0%,  reduce = 0%
>>>> 2012-12-19 23:53:27,940 Stage-1 map = 100%,  reduce = 0%
>>>> 2012-12-19 23:53:37,001 Stage-1 map = 100%,  reduce = 33%
>>>> 2012-12-19 23:53:40,026 Stage-1 map = 100%,  reduce = 100%
>>>> Ended Job = job_201212191724_0012
>>>> MapReduce Jobs Launched:
>>>> Job 0: Map: 1  Reduce: 1   HDFS Read: 22131 HDFS Write: 4 SUCESS
>>>> Total MapReduce CPU Time Spent: 0 msec
>>>> OK
>>>> 100
>>>> Time taken: 305.199 seconds
>>>>
>>>>
>>>> When the job was running, the datanode log file got this,
>>>>
>>>> 2012-12-20 00:00:18,408 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>>>> succeeded for blk_-5694413837496700253_48905
>>>> 2012-12-20 00:01:11,408 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>>>> succeeded for blk_1151904922466051083_48776
>>>> 2012-12-20 00:01:51,855 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 39205
>>>> blocks took 7578 msec to generate and 124 msecs for RPC and NN processing
>>>> 2012-12-20 00:02:04,608 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>>>> succeeded for blk_-7641590199699424252_48775
>>>> 2012-12-20 00:02:57,607 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>>>> succeeded for blk_-7818849592980221590_48773
>>>> 2012-12-20 00:03:50,608 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>>>> succeeded for blk_9074790179047774257_48905
>>>> 2012-12-20 00:04:43,608 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>>>> succeeded for blk_-3172554843058932436_17003
>>>> 2012-12-20 00:05:36,608 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>>>> succeeded for blk_8421435495571059078_48776
>>>> 2012-12-20 00:06:29,808 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>>>> succeeded for blk_-4414438232543819004_48775
>>>> 2012-12-20 00:07:22,808 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>>>> succeeded for blk_-6640428384205363780_48773
>>>> 2012-12-20 00:08:15,808 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>>>> succeeded for blk_-5620587770589188815_48905
>>>> 2012-12-20 00:09:08,808 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>>>> succeeded for blk_8043462665698224618_48776
>>>> 2012-12-20 00:10:02,008 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>>>> succeeded for blk_8925335995905328229_16987
>>>>
>>>>
>>>> Does this slow dows the hive jobs or any other suggestion ?
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Neil
>>>>
>>>
>>>
>>>
>>> --
>>> Nitin Pawar
>>>
>>>
>>
>
>
> --
> Nitin Pawar
>
>

Mime
View raw message