Sure... now that I understand what is going on, it is easy to see the signs (look in the data/usertable/stream directory, then look for the tmp files). Even a small script (or some special logic in nodetool) that just looked for those signs and said “things are in progress” would be helpful.
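
For what it's worth, here is a minimal sketch of what such a script could look like. It is only an illustration, not anything that exists in nodetool: the function names are made up, the directory names "stream" and "streaming" are taken from this thread, and matching "tmp" anywhere in a file name is an assumption about how the temporary files are named.

```python
import os
import sys


def bootstrap_signs(data_dir, stream_names=("stream", "streaming")):
    """Return (stream_dirs, tmp_files) found anywhere under data_dir.

    Assumption: a directory named 'stream' or 'streaming', or a file
    with 'tmp' in its name, means anti-compaction/streaming is under way.
    """
    stream_dirs, tmp_files = [], []
    for root, dirs, files in os.walk(data_dir):
        for d in dirs:
            if d in stream_names:
                stream_dirs.append(os.path.join(root, d))
        for f in files:
            if "tmp" in f.lower():
                tmp_files.append(os.path.join(root, f))
    return sorted(stream_dirs), sorted(tmp_files)


def report(data_dir):
    """Summarize the signs as a one-line status message."""
    stream_dirs, tmp_files = bootstrap_signs(data_dir)
    if stream_dirs or tmp_files:
        return "things are in progress"
    return "no signs of activity"


if __name__ == "__main__":
    # Hypothetical usage: python check_bootstrap.py /path/to/cassandra/data
    print(report(sys.argv[1] if len(sys.argv) > 1 else "data"))
```

Run against the data directory on an existing node, this would only say whether those signs are present; it says nothing about how far along the work is.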

As a data point, anti-compaction of 70 GB of data down to 35 GB for transfer to the new node took about 40 minutes on an unloaded system, and about 8 hours on a heavily loaded system. Hopefully streaming directly will reduce this time...
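
To put those numbers in perspective, here is a rough back-of-the-envelope calculation. It assumes the rate is measured against the 70 GB read, that 1 GB = 1024 MB, and that the "about" timings are exact, so treat the results as approximate.

```python
def throughput_mb_per_s(gigabytes, minutes):
    """Average rate in MB/s for processing `gigabytes` in `minutes`."""
    return gigabytes * 1024 / (minutes * 60)


unloaded = throughput_mb_per_s(70, 40)     # about 40 minutes, unloaded
loaded = throughput_mb_per_s(70, 8 * 60)   # about 8 hours, heavily loaded

print(f"unloaded: {unloaded:.1f} MB/s")    # roughly 29.9 MB/s
print(f"loaded:   {loaded:.1f} MB/s")      # roughly 2.5 MB/s
print(f"slowdown: {unloaded / loaded:.0f}x")  # roughly 12x
```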

Thanks...

brian


On 3/3/10 5:19 AM, "Jonathan Ellis" <jbellis@gmail.com> wrote:

Providing "what is going on, nothing seems to be happening" visibility
is something we have struggled with here.

When we get https://issues.apache.org/jira/browse/CASSANDRA-579 done
for 0.7 we won't have the big "waiting to stream" problem since we'll
stream directly from the data files w/o anticompaction first.

Maybe there is something simpler we can do for 0.6 though.

-Jonathan

On Wed, Mar 3, 2010 at 12:26 AM, Brian Frank Cooper
<cooperb@yahoo-inc.com> wrote:
> Oops, looks like I just wasn’t patient enough:
>
> INFO - Sampling index for
> /home/cooperb/cassandra/data/usertable/data-1-Data.db
> INFO - Streaming added /home/cooperb/cassandra/data/usertable/data-1-Data.db
> INFO - Bootstrap/move completed! Now serving reads.
> INFO - Cassandra starting up...
>
> Thanks and sorry for the list noise...
>
> brian
>
>
> On 3/2/10 10:06 PM, "Brian Frank Cooper" <cooperb@yahoo-inc.com> wrote:
>
> Thanks for the idea, but I don’t see ‘streaming’ on either of the existing
> nodes:
>
> % ls -l
> total 12
> drwxr-xr-x  2 cooperb users 4096 Mar  2 21:33 commitlog
> drwxr-xr-x  4 cooperb users 4096 Mar  1 20:51 data
> drwxr-xr-x  2 cooperb users 4096 Mar  2 21:00 logs
>
> I see the anti-compaction, but then nothing:
>
> INFO - Node /98.137.30.39 is now part of the cluster
> INFO - InetAddress /98.137.30.39 is now UP
> INFO - AntiCompacting
> [org.apache.cassandra.io.SSTableReader(path='/home/cooperb/cassandra/data/usertable/data-706-Data.db'),org.apache.cassandra.io.SSTableReader(path='/home/cooperb/cassandra/data/usertable/data-711-Data.db'),org.apache.cassandra.io.SSTableReader(path='/home/cooperb/cassandra/data/usertable/data-717-Data.db'),org.apache.cassandra.io.SSTableReader(path='/home/cooperb/cassandra/data/usertable/data-586-Data.db'),org.apache.cassandra.io.SSTableReader(path='/home/cooperb/cassandra/data/usertable/data-685-Data.db'),org.apache.cassandra.io.SSTableReader(path='/home/cooperb/cassandra/data/usertable/data-623-Data.db'),org.apache.cassandra.io.SSTableReader(path='/home/cooperb/cassandra/data/usertable/data-327-Data.db'),org.apache.cassandra.io.SSTableReader(path='/home/cooperb/cassandra/data/usertable/data-716-Data.db'),org.apache.cassandra.io.SSTableReader(path='/home/cooperb/cassandra/data/usertable/data-675-Data.db')]
> INFO - Node /98.137.30.39 state jump to leaving
>
> Thanks...
>
> brian
>
>
> On 3/2/10 10:00 PM, "Stu Hood" <stu.hood@rackspace.com> wrote:
>
> You are probably in the portion of bootstrap where data to be transferred is
> split out to disk, which can take a while: see
> https://issues.apache.org/jira/browse/CASSANDRA-579
>
> Look for a 'streaming' subdirectory in your data directories to confirm.
>
> -----Original Message-----
> From: "Brian Frank Cooper" <cooperb@yahoo-inc.com>
> Sent: Tuesday, March 2, 2010 11:50pm
> To: "cassandra-user@incubator.apache.org"
> <cassandra-user@incubator.apache.org>
> Subject: Re: Connect during bootstrapping?
>
> Thanks for the note.
>
> Can you help me with something else? I can't seem to get any data to
> transfer during bootstrapping...I must be doing something wrong.
>
> Here is what I did: I took 0.6.0-beta2, loaded 2 machines with 60-70GB each.
> Then I started a third node, with AutoBootstrap true. The node claims it is
> bootstrapping:
>
> INFO - Auto DiskAccessMode determined to be mmap
> INFO - Saved Token not found. Using Rb0mePN3PheW3haA
> INFO - Creating new commitlog segment
> /home/cooperb/cassandra/commitlog/CommitLog-1267594407761.log
> INFO - Starting up server gossip
> INFO - Joining: getting load information
> INFO - Sleeping 90000 ms to wait for load information...
> INFO - Node /98.137.30.37 is now part of the cluster
> INFO - Node /98.137.30.38 is now part of the cluster
> INFO - InetAddress /98.137.30.37 is now UP
> INFO - InetAddress /98.137.30.38 is now UP
> INFO - Joining: getting bootstrap token
> INFO - New token will be user148315419 to assume load from /98.137.30.38
> INFO - Joining: sleeping 30000 for pending range setup
> INFO - Bootstrapping
>
> But when I run nodetool streams, no streams are transferring:
>
> Mode: Bootstrapping
> Not sending any streams.
> Not receiving any streams.
>
> And it doesn't look like the node is getting any data. Any ideas?
>
> Thanks for the help...
>
> Brian
>
>
> On 3/2/10 12:22 PM, "Jonathan Ellis" <jbellis@gmail.com> wrote:
>
> On Tue, Mar 2, 2010 at 1:54 PM, Brian Frank Cooper
> <cooperb@yahoo-inc.com> wrote:
>> Hi folks,
>>
>> I'm running 0.5 and I had 2 nodes up and running, then added a 3rd node in
>> bootstrap mode. I understand from other discussion list threads that the
>> new
>> node doesn't serve reads while it is bootstrapping, but does that mean it
>> won't connect at all?
>
> it doesn't start the thrift listener until it is bootstrapped, so yes.
>
> (you can tell when it's bootstrapped by when it appears in nodeprobe
> ring.  0.6 also adds bootstrap progress reporting via jmx.)
>
>> When I try to connect from my java client, or
>> cassandra-cli, I get the exception below. Is it the expected behavior?
>> (Also, cassandra-cli says "Connected to xxx.yahoo.com" even though it
>> isn't
>> really connected...)
>
> This is fixed in https://issues.apache.org/jira/browse/CASSANDRA-807
> for 0.6, fwiw.
>
> -Jonathan
>
>
> --
> Brian Cooper
> Principal Research Scientist
> Yahoo! Research


--
Brian Cooper
Principal Research Scientist
Yahoo! Research