incubator-cassandra-user mailing list archives

From Manu Zhang <owenzhang1...@gmail.com>
Subject Re: unable to read saved rowcache from disk
Date Wed, 14 Nov 2012 07:38:57 GMT
Actually, I suspect this is a bug or something similar.


On Wed, Nov 14, 2012 at 3:13 PM, Edward Capriolo <edlinuxguru@gmail.com> wrote:

> http://wiki.apache.org/cassandra/LargeDataSetConsiderations
>
> A negative side-effect of a large row-cache is start-up time. The
> periodic saving of the row cache information only saves the keys that
> are cached; the data has to be pre-fetched on start-up. On a large
> data set, this is probably going to be seek-bound and the time it
> takes to warm up the row cache will be linear with respect to the row
> cache size (assuming sufficiently large amounts of data that the seek
> bound I/O is not subject to optimization by disks)
>
> Assuming a 15MB row cache and an average row of 300 bytes, that could
> be 50,000 entries. 4 hours seems like a long time to read back 50K
> entries, unless the source table was very large and you can only do a
> small number of reads/sec.
>
> On Tue, Nov 13, 2012 at 9:47 PM, Manu Zhang <owenzhang1990@gmail.com>
> wrote:
> > "incorrect"... what do you mean? I think it's only 15MB, which is not
> > big.
> >
> >
> > On Wed, Nov 14, 2012 at 10:38 AM, Edward Capriolo <edlinuxguru@gmail.com>
> > wrote:
> >>
> >> Yes, the row cache "could be" incorrect, so on startup Cassandra verifies
> >> the saved row cache by re-reading it. That takes a long time, so do not
> >> save a big row cache.
> >>
> >>
> >> On Tuesday, November 13, 2012, Manu Zhang <owenzhang1990@gmail.com>
> >> wrote:
> >> > I have a row cache provided by SerializingCacheProvider.
> >> > The data that has been read into it is about 500MB, according to
> >> > jconsole. After saving the cache, it is around 15MB on disk. Hence, I
> >> > suppose the size reported by jconsole is before serialization.
> >> > Now, while restarting, Cassandra is unable to read the saved row cache
> >> > back. By "unable", I mean it ran for around 4 hours and I had to abort
> >> > it and remove the cache so as not to hold up other tasks.
> >> > Since the data isn't huge, why can't Cassandra read it back?
> >> > My Cassandra is 1.2.0-beta2.
> >
> >
>
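
The back-of-the-envelope estimate in the quoted reply can be written out as a short Python sketch. It uses only the figures from the thread (15MB saved cache, an assumed 300-byte average row, a restart aborted after about 4 hours). The function name and the decimal-megabyte convention are illustrative assumptions, not anything taken from Cassandra itself. Because the restart was aborted, the resulting rate is an upper bound on what the node was actually achieving.

```python
def estimate_warmup(cache_bytes, avg_row_bytes, elapsed_seconds):
    """Return (estimated entry count, implied reads per second).

    Hypothetical helper: it only mirrors the arithmetic in the thread
    and does not reflect how Cassandra sizes or loads its row cache.
    """
    entries = cache_bytes // avg_row_bytes   # rows that fit in the cache
    reads_per_sec = entries / elapsed_seconds
    return entries, reads_per_sec


# Figures from the thread; 1MB is taken as 1,000,000 bytes (assumption).
entries, rate = estimate_warmup(
    cache_bytes=15_000_000,
    avg_row_bytes=300,
    elapsed_seconds=4 * 3600,
)
print(f"{entries} entries, at most {rate:.2f} reads/sec")
```

If each pre-fetch were a single disk seek, a rate this low would be far below what a typical disk sustains, which is consistent with the suspicion that something other than seek-bound I/O was slowing the load.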
