From: "Christopher Wirt" <chris.wirt@struq.com>
Reply-To: user@cassandra.apache.org
Subject: RE: 1.2.10 -> 2.0.1 migration issue
Date: Wed, 25 Sep 2013 14:03:45 +0100
Message-ID: <00b301ceb9ef$ab6972e0$023c58a0$@struq.com>

Hi Marcus,

I've seen your patch. It fits with what I'm seeing: the first data directory only contained the JSON manifest at that time.

As a workaround I've made sure that each of the snapshot directories now exists before starting up.

I still end up with the second exception I posted, regarding a duplicate hard link. Possibly these are two unrelated exceptions.

After getting this error, I looked at the data dirs:

Data1 contains
    JSON manifests
    Loads of data files
    Snapshot directory
Data2 contains
    Just the snapshot directory
Data3 contains
    Just the snapshot directory

INFO 12:56:22,766 Migrating manifest for struqrealtime/impressionstorev2
INFO 12:56:22,767 Snapshotting struqrealtime, impressionstorev2 to pre-sstablemetamigration
ERROR 12:56:22,787 Exception encountered during startup
java.lang.RuntimeException: Tried to create duplicate hard link to /disk1/cassandra/data/struqrealtime/impressionstorev2/snapshots/pre-sstablemetamigration/impressionstorev2.json
        at org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:71)
        at org.apache.cassandra.db.compaction.LegacyLeveledManifest.snapshotWithoutCFS(LegacyLeveledManifest.java:138)
        at org.apache.cassandra.db.compaction.LegacyLeveledManifest.migrateManifests(LegacyLeveledManifest.java:91)
        at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:247)
        at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:443)
        at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:486)
java.lang.RuntimeException: Tried to create duplicate hard link to /disk1/cassandra/data/struqrealtime/impressionstorev2/snapshots/pre-sstablemetamigration/impressionstorev2.json
        at org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:71)
        at org.apache.cassandra.db.compaction.LegacyLeveledManifest.snapshotWithoutCFS(LegacyLeveledManifest.java:138)
        at org.apache.cassandra.db.compaction.LegacyLeveledManifest.migrateManifests(LegacyLeveledManifest.java:91)
        at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:247)
        at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:443)
        at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:486)
Exception encountered during startup: Tried to create duplicate hard link to /disk1/cassandra/data/struqrealtime/impressionstorev2/snapshots/pre-sstablemetamigration/impressionstorev2.json

Thanks,
Chris

From: Marcus Eriksson [mailto:krummas@gmail.com]
Sent: 25 September 2013 13:11
To: user@cassandra.apache.org
Subject: Re: 1.2.10 -> 2.0.1 migration issue

you are probably reading trunk NEWS.txt

read the ticket for an explanation of what the issue was (it is a proper bug)

On Wed, Sep 25, 2013 at 12:59 PM, Christopher Wirt <chris.wirt@struq.com> wrote:

Hi Marcus,

Thanks for having a look at this.

Just noticed this in the NEWS.txt:

    For leveled compaction users, 2.0 must be atleast started before
    upgrading to 2.1 due to the fact that the old JSON leveled
    manifest is migrated into the sstable metadata files on startup
    in 2.0 and this code is gone from 2.1.

Basically, my fault for skimming over this too quickly.

We will move from 1.2.10 -> 2.0 -> 2.1

Thanks,
Chris

From: Marcus Eriksson [mailto:krummas@gmail.com]
Sent: 25 September 2013 09:37
To: user@cassandra.apache.org
Subject: Re: 1.2.10 -> 2.0.1 migration issue

can't really reproduce, could you update the ticket with a bit more info about your setup?

do you have multiple .json files in your data dirs?

On Wed, Sep 25, 2013 at 10:07 AM, Marcus Eriksson <krummas@gmail.com> wrote:

this is most likely a bug, filed https://issues.apache.org/jira/browse/CASSANDRA-6093 and will try to have a look today.

On Wed, Sep 25, 2013 at 1:48 AM, Christopher Wirt <chris.wirt@struq.com> wrote:

Hi,

Just had a go at upgrading a node to the latest stable c* 2 release and think I ran into some issues with manifest migration.

On initial start up I hit this error as it starts to load the first of my CFs.

INFO [main] 2013-09-24 22:56:01,018 LegacyLeveledManifest.java (line 89) Migrating manifest for struqrealtime/impressionstorev2
INFO [main] 2013-09-24 22:56:01,019 LegacyLeveledManifest.java (line 119) Snapshotting struqrealtime, impressionstorev2 to pre-sstablemetamigration
ERROR [main] 2013-09-24 22:56:01,030 CassandraDaemon.java (line 459) Exception encountered during startup
FSWriteError in /disk1/cassandra/data/struqrealtime/impressionstorev2/snapshots/pre-sstablemetamigration/impressionstorev2.json
        at org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:83)
        at org.apache.cassandra.db.compaction.LegacyLeveledManifest.snapshotWithoutCFS(LegacyLeveledManifest.java:138)
        at org.apache.cassandra.db.compaction.LegacyLeveledManifest.migrateManifests(LegacyLeveledManifest.java:91)
        at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:246)
        at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:442)
        at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:485)
Caused by: java.nio.file.NoSuchFileException: /disk1/cassandra/data/struqrealtime/impressionstorev2/snapshots/pre-sstablemetamigration/impressionstorev2.json -> /disk1/cassandra/data/struqrealtime/impressionstorev2/impressionstorev2.json
        at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
        at sun.nio.fs.UnixFileSystemProvider.createLink(UnixFileSystemProvider.java:474)
        at java.nio.file.Files.createLink(Files.java:1037)
        at org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:79)
        ... 5 more

I had already successfully run a test migration on our dev server. The only real differences I can see are the number of data directories defined and the amount of data being held.

I've run upgradesstables under 1.2.10. I have always been using vnodes and CQL3. I recently moved to using LZ4 instead of Snappy.

I tried to start up again and it gave me a slightly different error:

INFO [main] 2013-09-24 22:58:28,218 LegacyLeveledManifest.java (line 89) Migrating manifest for struqrealtime/impressionstorev2
INFO [main] 2013-09-24 22:58:28,218 LegacyLeveledManifest.java (line 119) Snapshotting struqrealtime, impressionstorev2 to pre-sstablemetamigration
ERROR [main] 2013-09-24 22:58:28,222 CassandraDaemon.java (line 459) Exception encountered during startup
java.lang.RuntimeException: Tried to create duplicate hard link to /disk3/cassandra/data/struqrealtime/impressionstorev2/snapshots/pre-sstablemetamigration/struqrealtime-impressionstorev2-ic-1030-TOC.txt
        at org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:71)
        at org.apache.cassandra.db.compaction.LegacyLeveledManifest.snapshotWithoutCFS(LegacyLeveledManifest.java:129)
        at org.apache.cassandra.db.compaction.LegacyLeveledManifest.migrateManifests(LegacyLeveledManifest.java:91)
        at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:246)
        at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:442)
        at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:485)

Will have a go at recreating this tomorrow.

Any insight or guesses at what the issue might be are always welcome.

Thanks,
Chris
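
For anyone hitting the same thing: the two startup errors in this thread (a missing-file error when the snapshot directory is absent, then a "duplicate hard link" error on the next start) and the workaround of pre-creating the snapshot directories can be reproduced with plain hard links. The Python sketch below is not Cassandra's code, only an illustration of the filesystem behaviour the stack traces suggest. It assumes a POSIX filesystem that supports hard links. The helper names (ensure_snapshot_dirs, try_link, demo) are made up for this example, and the directory layout is rebuilt inside a temporary directory rather than under the real /disk1 to /disk3 paths.

```python
import os
import tempfile

# Snapshot name taken from the log lines in this thread.
SNAPSHOT = "pre-sstablemetamigration"


def ensure_snapshot_dirs(data_dirs, keyspace, table):
    """Create <data_dir>/<keyspace>/<table>/snapshots/<SNAPSHOT> in every data dir.

    Hypothetical helper mirroring the manual workaround described above.
    Safe to call repeatedly: existing directories are left alone.
    """
    created = []
    for data_dir in data_dirs:
        path = os.path.join(data_dir, keyspace, table, "snapshots", SNAPSHOT)
        os.makedirs(path, exist_ok=True)
        created.append(path)
    return created


def try_link(src, dst):
    """Hard-link src to dst and describe the outcome as a short string."""
    try:
        os.link(src, dst)
        return "ok"
    except FileNotFoundError:
        # Raised when src or the directory that should hold dst does not exist.
        return "missing directory or source"
    except FileExistsError:
        # Raised when dst is already present, e.g. left over from an earlier run.
        return "duplicate link"


def demo():
    """Replay the sequence from the thread inside a throwaway directory."""
    with tempfile.TemporaryDirectory() as root:
        data_dirs = [os.path.join(root, "disk%d" % i) for i in (1, 2, 3)]
        table_dir = os.path.join(data_dirs[0], "struqrealtime", "impressionstorev2")
        os.makedirs(table_dir)

        # Only the first data directory holds the JSON manifest.
        manifest = os.path.join(table_dir, "impressionstorev2.json")
        with open(manifest, "w") as handle:
            handle.write("{}")

        target = os.path.join(table_dir, "snapshots", SNAPSHOT, "impressionstorev2.json")

        first = try_link(manifest, target)   # snapshot directory not created yet
        ensure_snapshot_dirs(data_dirs, "struqrealtime", "impressionstorev2")
        second = try_link(manifest, target)  # directory exists, link succeeds
        third = try_link(manifest, target)   # link already there from the previous step
        return first, second, third


if __name__ == "__main__":
    print(demo())
```

The third call is the interesting one: once a link has been left behind by a partial run, retrying the same snapshot step fails on the existing file, so a leftover pre-sstablemetamigration snapshot from a failed start may be worth checking for before restarting.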