cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gurpreet Singh <gurpreet.si...@gmail.com>
Subject Re: Bootstrapping stays stuck
Date Tue, 14 Sep 2010 18:40:26 GMT
Hi Vineet,
I have tracked the nodetool streams to completion each time. Below are the
logs on the source and destination node. There are 3 sstables being
transferred, and the transfer seems to be successful. However, after the
streams finish, the source prints out messages about the dropped messages,
which may point to the problem. ideas? I checked port 7000 is open for
communication. 9160 is not up on the node being bootstrapped, but that comes
up after the node is bootstrapped, is that right?

Thanks a ton,
/G

*Logs on the source node (IP2):*
*
*
INFO [STREAM-STAGE:1] 2010-09-14 09:54:07,900 StreamOut.java (line 79)
Flushing memtables for userdata...
 INFO [STREAM-STAGE:1] 2010-09-14 09:54:07,900 StreamOut.java (line 95)
Performing anticompaction ...
 INFO [COMPACTION-POOL:1] 2010-09-14 09:54:07,900 CompactionManager.java
(line 339) AntiCompacting
[org.apache.cassandra.io.SSTableReader(path='/data/cassandra/datadir/cassandradb/userdata/user_list_items-5823-Data.db')]
 INFO [GC inspection] 2010-09-14 09:56:54,712 GCInspector.java (line 129) GC
for ParNew: 212 ms, 29033016 reclaimed leaving 579419360 used; max is
4415946752
 INFO [COMPACTION-POOL:1] 2010-09-14 10:18:06,508 CompactionManager.java
(line 396) AntiCompacted to
/data/cassandra/datadir/cassandradb/userdata/stream/user_list_items-5825-Data.db.
 49074138589/36770836242 bytes for 5990912 keys.  Time: 1438607ms.
 INFO [COMPACTION-POOL:1] 2010-09-14 10:18:06,528 CompactionManager.java
(line 339) AntiCompacting
[org.apache.cassandra.io.SSTableReader(path='/data/cassandra/datadir/cassandradb/userdata/user-22-Data.db')]
 INFO [COMPACTION-POOL:1] 2010-09-14 10:18:08,839 CompactionManager.java
(line 396) AntiCompacted to
/data/mysql/cassandrastorage/userdata/stream/user-24-Data.db.
 28185244/21126422 bytes for 47722 keys.  Time: 2310ms.
 INFO [COMPACTION-POOL:1] 2010-09-14 10:18:08,840 CompactionManager.java
(line 339) AntiCompacting
[org.apache.cassandra.io.SSTableReader(path='/data/cassandra/datadir/cassandradb/userdata/user_lists-502-Data.db')]
 INFO [COMPACTION-POOL:1] 2010-09-14 10:21:08,606 CompactionManager.java
(line 396) AntiCompacted to
/data/mysql/cassandrastorage/userdata/stream/user_lists-504-Data.db.
 2927724285/2195768325 bytes for 3976118 keys.  Time: 179766ms.
 INFO [STREAM-STAGE:1] 2010-09-14 10:21:08,607 StreamOut.java (line 127)
Stream context metadata
/data/cassandra/datadir/cassandradb/userdata/stream/user_list_items-5825-Index.db:522051369,
 3
sstables./data/cassandra/datadir/cassandradb/userdata/stream/user_list_items-5825-Filter.db:7489045,
 3
sstables./data/cassandra/datadir/cassandradb/userdata/stream/user_list_items-5825-Data.db:36770836242,
 3
sstables./data/mysql/cassandrastorage/userdata/stream/user-24-Index.db:3373143,
 3
sstables./data/mysql/cassandrastorage/userdata/stream/user-24-Filter.db:59965,
 3
sstables./data/mysql/cassandrastorage/userdata/stream/user-24-Data.db:21126422,
 3
sstables./data/mysql/cassandrastorage/userdata/stream/user_lists-504-Index.db:282956452,
 3
sstables./data/mysql/cassandrastorage/userdata/stream/user_lists-504-Filter.db:4970125,
 3
sstables./data/mysql/cassandrastorage/userdata/stream/user_lists-504-Data.db:2195768325
 INFO [STREAM-STAGE:1] 2010-09-14 10:21:08,608 StreamOut.java (line 132)
Sending a stream initiate message to IP1...
 INFO [STREAM-STAGE:1] 2010-09-14 10:21:08,608 StreamOut.java (line 137)
Waiting for transfer to IP1 to complete
 *WARN [DroppedMessagesLogger] 2010-09-14 10:28:00,592 MessagingService.java
(line 501) Dropped 9 messages in the last 1000ms*
* *INFO [STREAM-STAGE:1] 2010-09-14 10:28:00,605 StreamOut.java (line 141)
Done with transfer to IP1
 INFO [SSTABLE-CLEANUP-TIMER] 2010-09-14 10:28:00,670
SSTableDeletingReference.java (line 104) Deleted
/data/cassandra/datadir/cassandradb/system/LocationInfo-17-Data.db
 *WARN [DroppedMessagesLogger] 2010-09-14 10:28:01,602 MessagingService.java
(line 501) Dropped 1 messages in the last 1000ms*
* *INFO [SSTABLE-CLEANUP-TIMER] 2010-09-14 10:28:06,133
SSTableDeletingReference.java (line 104) Deleted
/data/mysql/cassandrastorage/system/LocationInfo-19-Data.db
 INFO [SSTABLE-CLEANUP-TIMER] 2010-09-14 10:28:06,134
SSTableDeletingReference.java (line 104) Deleted
/data/cassandra/datadir/cassandradb/system/LocationInfo-18-Data.db
 INFO [SSTABLE-CLEANUP-TIMER] 2010-09-14 10:28:06,134
SSTableDeletingReference.java (line 104) Deleted
/data/cassandra/datadir/cassandradb/system/LocationInfo-20-Data.db



*Logs on new node being bootstrapped (IP1):*

INFO [main] 2010-09-14 09:53:37,788 BootStrapper.java (line 104) New token
will be 149298847080838649048722691811739653739 to assume load from IP2
 INFO [main] 2010-09-14 09:53:37,789 StorageService.java (line 388) Joining:
sleeping 30000 ms for pending range setup
 INFO [main] 2010-09-14 09:54:07,792 StorageService.java (line 388)
Bootstrapping
 INFO [Thread-17] 2010-09-14 10:26:40,699 SSTableReader.java (line 120)
Sampling index for
/data/mysql/cassandrastorage/userdata/user_list_items-2-Data.db
 INFO [Thread-17] 2010-09-14 10:26:51,453 StreamCompletionHandler.java (line
64) Streaming added
/data/mysql/cassandrastorage/userdata/user_list_items-2-Data.db
 INFO [Thread-23] 2010-09-14 10:26:52,219 SSTableReader.java (line 120)
Sampling index for
/data/cassandra/datadir/cassandradb/userdata/user-2-Data.db
 INFO [Thread-23] 2010-09-14 10:26:52,311 StreamCompletionHandler.java (line
64) Streaming added
/data/cassandra/datadir/cassandradb/userdata/user-2-Data.db
 INFO [Timer-0] 2010-09-14 10:27:13,676 Gossiper.java (line 180) InetAddress
IP2 is now dead.
 INFO [Thread-26] 2010-09-14 10:27:18,858 SSTableReader.java (line 120)
Sampling index for
/data/cassandra/datadir/cassandradb/userdata/user_lists-2-Data.db
 INFO [Thread-26] 2010-09-14 10:27:24,645 StreamCompletionHandler.java (line
64) Streaming added
/data/cassandra/datadir/cassandradb/userdata/user_lists-2-Data.db
 INFO [HINTED-HANDOFF-POOL:1] 2010-09-14 10:28:00,605
HintedHandOffManager.java (line 165) Started hinted handoff for endPoint IP2
 INFO [GMFD:1] 2010-09-14 10:28:00,605 Gossiper.java (line 578) InetAddress
IP2 is now UP
 INFO [HINTED-HANDOFF-POOL:1] 2010-09-14 10:28:00,606
HintedHandOffManager.java (line 222) Finished hinted handoff of 0 rows to
endpoint IP2
 INFO [FLUSH-TIMER] 2010-09-14 10:57:52,381 ColumnFamilyStore.java (line
357) LocationInfo has reached its threshold; switching in a fresh Memtable
at
CommitLogContext(file='/data/cassandra/datadir/commitlog/CommitLog-1284483127473.log',
position=1353)
 INFO [FLUSH-TIMER] 2010-09-14 10:57:52,382 ColumnFamilyStore.java (line
609) Enqueuing flush of Memtable-LocationInfo@2057079871(210 bytes, 6
operations)
 INFO [FLUSH-WRITER-POOL:1] 2010-09-14 10:57:52,382 Memtable.java (line 148)
Writing Memtable-LocationInfo@2057079871(210 bytes, 6 operations)
 INFO [FLUSH-WRITER-POOL:1] 2010-09-14 10:57:52,464 Memtable.java (line 162)
Completed flushing
/data/cassandra/datadir/cassandradb/system/LocationInfo-10-Data.db


On Tue, Sep 14, 2010 at 10:52 AM, vineet daniel <vineetdaniel@gmail.com>wrote:

> Hi Gurpreet
>
> What is the output of  nodetool -h <hostname/IP> streams  -->( to see what
> is going on between the nodes) . If you dont see anything happening try
> switching off firewall or iptables.
>
>
> Regards
> Vineet Daniel
> Cell          : +918106217121
> Websites :
> Blog <http://vinetedaniel.blogspot.com>   |   Linkedin<http://in.linkedin.com/in/vineetdaniel>
> |  Twitter <https://twitter.com/vineetdaniel>
>
>
>
>
>
> On Tue, Sep 14, 2010 at 11:11 PM, Gurpreet Singh <gurpreet.singh@gmail.com
> > wrote:
>
>> I tried this again, it happenned yet again.
>> This time while the transfer messages seemed tobe in order, i also noticed
>> that the source logs talk about having 9 dropped messages in the last 1000
>> ms. The only activity on the whole cluster is this bootstrapping, there is
>> no read/write traffic going on.
>>
>> /G
>>
>> On Tue, Sep 14, 2010 at 10:05 AM, Gurpreet Singh <
>> gurpreet.singh@gmail.com> wrote:
>>
>>> I am using cassandra 0.6.5.
>>>
>>>
>>> On Tue, Sep 14, 2010 at 9:16 AM, Gurpreet Singh <
>>> gurpreet.singh@gmail.com> wrote:
>>>
>>>> Hi,
>>>> I have a cassandra cluster of 4 machines, and I am trying to bootstrap 2
>>>> more machines, one at a time.
>>>> For both these machines, the bootstrapping stays stuck after the
>>>> streaming is done.
>>>>
>>>> When the nodes come up for bootstrapping, I see all the relevant
>>>> messages about getting a new token, assuming load from a particular host.
I
>>>> see a couple of nodes anticompacting data to send, and at a later point the
>>>> node that is bootstrapping prints the right streaming mesgs. However, once
>>>> the streaming is over, the node just doesnt do anything. Previously while
>>>> bootstrapping, I have seen that after the streaming is done, the node
>>>> restarts and becomes part of the ring by itself. I dont see this behaviour
>>>> with both the nodes I tried today.
>>>> I even restarted all the nodes in the cluster, and tried bootstrapping
>>>> one of the nodes again, but it again was stuck after streaming. It seems
to
>>>> have copied the relevant load as well.
>>>> Any ideas as to what could be going on here?
>>>>
>>>> /G
>>>>
>>>
>>>
>>
>

Mime
View raw message