From: Andre Kelpe
To: user@hadoop.apache.org
Date: Wed, 17 Sep 2014 13:03:30 +0200
Subject: Re: hadoop cluster crash problem

VirtualBox is known for causing instabilities in the host kernel (or at
least, it used to). You might be better off asking for support there:
https://www.virtualbox.org/wiki/Bugtracker

- André

On Wed, Sep 17, 2014 at 4:25 AM, Li Li <fancyerii@gmail.com> wrote:

> Hi all,
>     I know this problem is not related to Hadoop, but our administrator
> cannot find any clues.
>     I have a machine with 24 cores and 64GB of memory running Ubuntu
> 12.04 LTS. We use VirtualBox to create 4 virtual machines. Each VM has
> 10GB of memory and 6 cores.
>     I have set up a small Hadoop 1.2.1 cluster with one
> jobtracker/namenode and 3 tasktracker/datanodes. Each tasktracker has 4
> mapper slots and 4 reducer slots.
>     But it always crashes (the host machine crashes, not the VMs).
> Sometimes it crashes on the first map-reduce job. Sometimes it can
> run a few jobs.
>     Are there any clues? I have checked the sys log and cannot find
> anything useful. According to our monitoring system, the CPU and IO are
> not abnormal. The only abnormal phenomenon is that context switching is
> high, about 40k.

-- 
André Kelpe
andre@concurrentinc.com
http://concurrentinc.com