ignite-dev mailing list archives

From Dmitriy Pavlov <dpavlov....@gmail.com>
Subject Re: Cache scan efficiency
Date Tue, 18 Sep 2018 21:09:44 GMT
Hi,

I totally support the idea of cache preload.

IMO it can be expanded. We can iterate over local partitions of the cache
group and preload each.

But these methods should be clearly documented, so that users are aware of
when such preloading is beneficial (e.g. if the RAM region is big enough).
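
A minimal sketch of that loop, under stated assumptions: the partition list
and the preloader callback are hypothetical stand-ins (the callback plays the
role of the IgniteCache.preloadPartition(partId) method proposed further down
in this thread), so this shows only the shape of the iteration, not actual
Ignite code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;

// Hypothetical sketch: warm up a whole cache by preloading each local partition.
final class CachePreloadSketch {
    /**
     * Invokes the preloader once for every local partition, in the order given.
     * The preloader is an assumed stand-in for the proposed per-partition preload call.
     *
     * @return the number of partitions handed to the preloader
     */
    static int preloadLocalPartitions(int[] localPartitions, IntConsumer preloader) {
        int count = 0;
        for (int partId : localPartitions) {
            preloader.accept(partId);
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Record the calls instead of touching a real cache.
        List<Integer> preloaded = new ArrayList<>();
        int n = preloadLocalPartitions(new int[] {7, 0, 3}, p -> preloaded.add(p));
        System.out.println(n + " partitions preloaded: " + preloaded);
    }
}
```

Whether such a loop is worth running depends on whether the data region can
hold the preloaded pages, which is exactly what the documentation should state.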

Sincerely,
Dmitriy Pavlov

On Tue, 18 Sep 2018 at 21:36, Denis Magda <dmagda@apache.org> wrote:

> Folks,
>
> Since we're adding a method that would preload a certain partition, can we
> add the one which will preload the whole cache? Ignite persistence users
> I've been working with look puzzled once they realize there is no way to
> warm up RAM after the restart. There are use cases that require this.
>
> Can the current optimizations be expanded to the cache preloading use case?
>
> --
> Denis
>
> On Tue, Sep 18, 2018 at 3:58 AM Alexei Scherbakov <
> alexey.scherbakoff@gmail.com> wrote:
>
> > Summing up, I suggest adding a new public method,
> > IgniteCache.preloadPartition(partId).
> >
> > I will start preparing a PR for IGNITE-8873
> > <https://issues.apache.org/jira/browse/IGNITE-8873> if there are no
> > further objections.
> >
> >
> >
> > On Tue, 18 Sep 2018 at 10:50, Alexey Goncharuk <alexey.goncharuk@gmail.com> wrote:
> >
> > > Dmitriy,
> > >
> > > In my understanding, the proper fix for the scan query looks like a big
> > > change, and it is unlikely that we will include it in Ignite 2.7. On the
> > > other hand, the method suggested by Alexei is quite simple and definitely
> > > fits Ignite 2.7, where it will provide a better user experience. Even
> > > with a proper scan query implemented, this method can be useful in some
> > > specific scenarios, so we will not have to deprecate it.
> > >
> > > --AG
> > >
> > > On Mon, 17 Sep 2018 at 19:15, Dmitriy Pavlov <dpavlov.spb@gmail.com> wrote:
> > >
> > > > As I understand it, this is not a hack; it is an advanced feature for
> > > > warming up a partition. We can build warm-up of the whole cache by
> > > > calling warm-up on each of its partitions. Users often ask about this
> > > > feature and are not comfortable with our lazy loading.
> > > >
> > > > Please correct me if I misunderstood the idea.
> > > >
> > > > On Mon, 17 Sep 2018 at 18:37, Dmitriy Setrakyan <dsetrakyan@apache.org> wrote:
> > > >
> > > > > I would rather fix the scan than hack the scan. Is there any
> > > > > technical reason for hacking it now instead of fixing it properly?
> > > > > Can some of the experts in this thread provide an estimate of
> > > > > complexity and difference in work that would be required for each
> > > > > approach?
> > > > >
> > > > > D.
> > > > >
> > > > > On Mon, Sep 17, 2018 at 4:42 PM Alexey Goncharuk <alexey.goncharuk@gmail.com> wrote:
> > > > >
> > > > > > I think it would be beneficial for some Ignite users if we added
> > > > > > such a partition warmup method to the public API. The method should
> > > > > > be well-documented and state that it may invalidate existing page
> > > > > > cache. It will be a very effective instrument until we add the
> > > > > > proper scan ability that Vladimir was referring to.
> > > > > >
> > > > > > On Mon, 17 Sep 2018 at 13:05, Maxim Muzafarov <maxmuzaf@gmail.com> wrote:
> > > > > >
> > > > > > > Folks,
> > > > > > >
> > > > > > > Such warming up can be an effective technique for calculations
> > > > > > > that require large reads of cache data, but I think it is a single
> > > > > > > narrow use case among all the ways the Ignite store is used. Like
> > > > > > > all other powerful techniques, it should be used wisely. In the
> > > > > > > general case, I think we should consider the other techniques
> > > > > > > mentioned by Vladimir, and perhaps create something like `global
> > > > > > > statistics of cache data usage` to choose the best technique in
> > > > > > > each case.
> > > > > > >
> > > > > > > For instance, it's not obvious what would take longer: multi-block
> > > > > > > reads or 50 single-block reads issued sequentially. It strongly
> > > > > > > depends on the underlying hardware and might also depend on the
> > > > > > > load on system resources (CPU-intensive calculations and I/O
> > > > > > > access). But `statistics` will help us choose the right way.
> > > > > > >
> > > > > > >
> > > > > > > On Sun, 16 Sep 2018 at 23:59 Dmitriy Pavlov <dpavlov.spb@gmail.com> wrote:
> > > > > > >
> > > > > > > > Hi Alexei,
> > > > > > > >
> > > > > > > > I did not find any PRs associated with the ticket, so I could not
> > > > > > > > check the code changes behind this idea. Are there any PRs?
> > > > > > > >
> > > > > > > > If we create some forward scan of pages, it should be a very
> > > > > > > > intelligent algorithm that takes many parameters into account (how
> > > > > > > > much RAM is free, how likely we are to need the next page, etc.).
> > > > > > > > We had a private talk about such an idea some time ago.
> > > > > > > >
> > > > > > > > In my experience, Linux systems already do such read-ahead of file
> > > > > > > > data (for file descriptors flagged as sequential), but some
> > > > > > > > prefetching of data at the application level may be useful for
> > > > > > > > O_DIRECT file descriptors.
> > > > > > > >
> > > > > > > > One more concern of mine is choosing the right place in the system
> > > > > > > > to do such prefetching.
> > > > > > > >
> > > > > > > > Sincerely,
> > > > > > > > Dmitriy Pavlov
> > > > > > > >
> > > > > > > > On Sun, 16 Sep 2018 at 19:54, Vladimir Ozerov <vozerov@gridgain.com> wrote:
> > > > > > > >
> > > > > > > > > Hi Alex,
> > > > > > > > >
> > > > > > > > > It is good that you observed a speedup, but I do not think this
> > > > > > > > > solution works for the product in the general case. The amount of
> > > > > > > > > RAM is limited, and even a single partition may need more space
> > > > > > > > > than the RAM available. Moving a lot of pages into page memory for
> > > > > > > > > a scan means that you evict a lot of other pages, which will
> > > > > > > > > ultimately lead to bad performance of subsequent queries and defeat
> > > > > > > > > LRU algorithms, which are of great importance for good database
> > > > > > > > > performance.
> > > > > > > > >
> > > > > > > > > Database vendors choose another approach: skip BTrees, iterate
> > > > > > > > > directly over data pages, read them in a multi-block fashion, and
> > > > > > > > > use a separate scan buffer to avoid excessive eviction of other hot
> > > > > > > > > pages. A corresponding ticket for SQL exists [1], but the idea is
> > > > > > > > > common to all parts of the system that require scans.
> > > > > > > > >
> > > > > > > > > As for the proposed solution, it might be a good idea to add a
> > > > > > > > > special API to "warm up" a partition, with a clear explanation of
> > > > > > > > > the pros (fast scan after warmup) and cons (slowdown of any other
> > > > > > > > > operations). But I think we should not make this approach part of
> > > > > > > > > normal scans.
> > > > > > > > >
> > > > > > > > > Vladimir.
> > > > > > > > >
> > > > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-6057
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Sun, Sep 16, 2018 at 6:44 PM Alexei Scherbakov <alexey.scherbakoff@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Igniters,
> > > > > > > > > >
> > > > > > > > > > My use case involves a scenario where it's necessary to iterate
> > > > > > > > > > over a large (many TBs) persistent cache, doing some calculation
> > > > > > > > > > on the data read.
> > > > > > > > > >
> > > > > > > > > > The basic solution is to iterate over the cache using ScanQuery.
> > > > > > > > > >
> > > > > > > > > > This turns out to be slow because iteration over the cache
> > > > > > > > > > involves a lot of random disk access for reading data pages
> > > > > > > > > > referenced by links from leaf pages.
> > > > > > > > > >
> > > > > > > > > > This is especially true when data is stored on disks with slow
> > > > > > > > > > random access, like SAS disks. In my case, on a modern SAS disk
> > > > > > > > > > array, the read speed was several MB/sec, while the sequential
> > > > > > > > > > read speed in a perf test was on the order of GB/sec.
> > > > > > > > > >
> > > > > > > > > > I was able to fix the issue by using ScanQuery with an explicit
> > > > > > > > > > partition set and running simple warmup code before each
> > > > > > > > > > partition scan.
> > > > > > > > > >
> > > > > > > > > > The code pins cold pages in memory in sequential order, thus
> > > > > > > > > > eliminating random disk access. The speedup was on the order of
> > > > > > > > > > 100x.
> > > > > > > > > >
> > > > > > > > > > I suggest adding the improvement to the product's core by always
> > > > > > > > > > sequentially preloading pages for all internal partition
> > > > > > > > > > iterations (cache iterators, scan queries, SQL queries with a
> > > > > > > > > > scan plan) if the partition is cold (low number of pinned pages).
> > > > > > > > > >
> > > > > > > > > > This should also speed up rebalancing from cold partitions.
> > > > > > > > > >
> > > > > > > > > > Ignite JIRA ticket [1]
> > > > > > > > > >
> > > > > > > > > > Thoughts?
> > > > > > > > > >
> > > > > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-8873
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > >
> > > > > > > > > > Best regards,
> > > > > > > > > > Alexei Scherbakov
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > --
> > > > > > > Maxim Muzafarov
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> > --
> >
> > Best regards,
> > Alexei Scherbakov
> >
>
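
Vladimir's concern quoted above, that loading many pages for a scan evicts
other hot pages, can be illustrated with a toy simulation. The sketch below
assumes a plain LRU page cache built on java.util.LinkedHashMap; the class
names, the page-id offset and the capacities are illustrative assumptions, and
this is not how Ignite's page memory chooses pages for replacement.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model (an assumption, not Ignite code): a plain LRU page cache hit by a scan.
final class ScanEvictionSketch {
    /** Access-ordered map that drops the least recently used page when over capacity. */
    static final class LruPages extends LinkedHashMap<Integer, Boolean> {
        private final int capacity;

        LruPages(int capacity) {
            super(16, 0.75f, true); // true = iterate in access order (LRU first)
            this.capacity = capacity;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<Integer, Boolean> eldest) {
            return size() > capacity;
        }

        /** Touches a page: returns true on a hit, otherwise loads the page and returns false. */
        boolean touch(int pageId) {
            if (get(pageId) != null)
                return true;
            put(pageId, Boolean.TRUE);
            return false;
        }
    }

    /** Loads hot pages 0..hot-1, scans scanPages cold pages, then counts hot pages still resident. */
    static int hotPagesLeftAfterScan(int capacity, int hot, int scanPages) {
        LruPages cache = new LruPages(capacity);
        for (int p = 0; p < hot; p++)
            cache.touch(p);
        for (int p = 0; p < scanPages; p++)
            cache.touch(1_000_000 + p); // cold page ids, disjoint from the hot ones
        int left = 0;
        for (int p = 0; p < hot; p++)
            if (cache.containsKey(p)) // containsKey does not change the access order
                left++;
        return left;
    }

    public static void main(String[] args) {
        System.out.println("hot pages left after scanning 50 cold pages: "
            + hotPagesLeftAfterScan(100, 100, 50));
        System.out.println("hot pages left after scanning 100 cold pages: "
            + hotPagesLeftAfterScan(100, 100, 100));
    }
}
```

In this model a scan as large as the cache leaves no hot pages resident, which
is the trade-off the documentation of a warm-up method would need to spell out.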
