From: Eduardo Cusa <eduardo.cusa@gmail.com>
To: user@cassandra.apache.org
Date: Thu, 8 Oct 2015 15:58:40 -0300
Subject: Re: [cassandra 2.1.3] Missing host ID

Hello Paulo,

This issue started today and has always happened on the same node.

Running the following command seems to solve the problem:

$ nodetool truncatehints

Now the node is up.

Regards
Eduardo

2015-10-08 15:41 GMT-03:00 Paulo Motta <pauloricardomg@gmail.com>:

> Hello Eduardo,
>
> Your node is trying to write a hint to another node (after a timed-out
> write), but because of some race condition its token table has not been
> updated yet soon after startup, so it cannot locate the node with that ID.
> You should not be worried, as the only consequence is that one hint was
> lost, and data consistency can be fixed with a simple repair (or during
> read repairs).
>
> Some other people have reported a similar condition, so I opened a JIRA
> ticket: https://issues.apache.org/jira/browse/CASSANDRA-10485
>
> Some questions to help troubleshooting:
>
> - Does it always happen with the same node, or with any node that you restart?
> - Was that node ever replaced or upgraded?
> - How frequently does it happen?
>
> Thanks,
>
> Paulo
>
> 2015-10-08 10:45 GMT-07:00 Eduardo Cusa <eduardo.cusa@gmail.com>:
>
>> Hi Guys, I have a cluster with 12 nodes.
>>
>> When I restart one of them I receive the error "Missing host ID":
>>
>> WARN  [SharedPool-Worker-1] 2015-10-08 13:15:33,882 AbstractTracingAwareExecutorService.java:169 - Uncaught exception on thread Thread[SharedPool-Worker-1,5,main]: {}
>> java.lang.AssertionError: Missing host ID for 63.251.156.141
>>         at org.apache.cassandra.service.StorageProxy.writeHintForMutation(StorageProxy.java:978) ~[apache-cassandra-2.1.3.jar:2.1.3]
>>         at org.apache.cassandra.service.StorageProxy$6.runMayThrow(StorageProxy.java:950) ~[apache-cassandra-2.1.3.jar:2.1.3]
>>         at org.apache.cassandra.service.StorageProxy$HintRunnable.run(StorageProxy.java:2235) ~[apache-cassandra-2.1.3.jar:2.1.3]
>>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_60]
>>         at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) ~[apache-cassandra-2.1.3.jar:2.1.3]
>>         at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-2.1.3.jar:2.1.3]
>>         at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
>>
>> If I run nodetool status, the problematic node has an ID:
>>
>> UN  10.10.10.12  1.3 TB  1  ?  4d5c8fd2-a909-4f09-a23c-4cd6040f338a  rack3
>>
>> Any idea what could be happening?
>>
>> Regards
>> Eduardo
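
The host ID check from the original report can be scripted. The following is a minimal sketch and not something posted in the thread: it assumes the `nodetool status` row layout quoted in the report, with the Host ID as the second-to-last whitespace-separated field, and the variable names are made up for illustration.

```shell
#!/bin/sh
# Sample row copied from the `nodetool status` output quoted in the thread.
status_line='UN  10.10.10.12  1.3 TB  1  ?  4d5c8fd2-a909-4f09-a23c-4cd6040f338a  rack3'

# Assumption: the Host ID is the second-to-last field of the row.
host_id=$(printf '%s\n' "$status_line" | awk '{ print $(NF-1) }')

# Basic sanity check that the extracted field looks like a UUID.
if printf '%s\n' "$host_id" | grep -Eq '^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$'; then
    host_id_ok=yes
else
    host_id_ok=no
fi

echo "host_id=$host_id valid=$host_id_ok"
```

On a live node the row would come from `nodetool status` itself. Note that `nodetool truncatehints` discards the hints stored on that node, so running `nodetool repair` afterwards, as Paulo suggests for the lost hint, is probably a sensible follow-up.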