ignite-dev mailing list archives

From Dmitriy Pavlov <dpavlov....@gmail.com>
Subject Re: Cache scan efficiency
Date Tue, 18 Sep 2018 21:26:20 GMT
Even better: if RAM is exhausted, the page replacement process will start.
https://cwiki.apache.org/confluence/display/IGNITE/Ignite+Durable+Memory+-+under+the+hood#IgniteDurableMemory-underthehood-Pagereplacement(rotationwithdisk)

The effect of preloading will still be noticeable, but not as pronounced as
when the data fits fully into RAM. I can review or improve the javadocs later
if necessary.
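
For illustration only, the whole-cache preload discussed in this thread
(iterate over local partitions, preload each, stop with a warning if the
memory budget would be exceeded) could be sketched roughly as below. This is
a minimal sketch under assumptions, not Ignite code: CachePreloadSketch,
preloadLocalPartitions, the size estimator and the preloadPart callback are
all hypothetical names, and the callback only stands in for the proposed
IgniteCache.preloadPartition(partId). It shows the halt-and-warn variant
Denis suggested; if page replacement is relied on instead, the loop would
simply keep going.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;
import java.util.function.IntToLongFunction;

/** Illustrative only: none of these names exist in Ignite. */
final class CachePreloadSketch {
    /** Outcome of a preload run. */
    static final class Result {
        final List<Integer> preloaded; // partitions loaded, in order
        final boolean halted;          // true if the budget stopped the run early

        Result(List<Integer> preloaded, boolean halted) {
            this.preloaded = preloaded;
            this.halted = halted;
        }
    }

    /**
     * Preloads partitions one by one until the next one would not fit.
     *
     * @param localParts    local partition ids (assumed to be known to the caller)
     * @param partSizeBytes hypothetical estimate of a partition's size
     * @param preloadPart   hypothetical stand-in for the proposed preloadPartition(partId)
     * @param regionBytes   assumed memory budget of the data region
     */
    static Result preloadLocalPartitions(int[] localParts, IntToLongFunction partSizeBytes,
        IntConsumer preloadPart, long regionBytes) {
        List<Integer> done = new ArrayList<>();
        long used = 0;

        for (int part : localParts) {
            long size = partSizeBytes.applyAsLong(part);

            // Halt before loading a partition that would exceed the budget.
            if (used + size > regionBytes) {
                System.err.println("WARN: memory budget exhausted, preload halted before partition " + part);
                return new Result(done, true);
            }

            preloadPart.accept(part);
            used += size;
            done.add(part);
        }

        return new Result(done, false);
    }

    public static void main(String[] args) {
        // Four partitions of 40 bytes each against a 100-byte budget.
        Result r = preloadLocalPartitions(new int[] {0, 1, 2, 3}, p -> 40L, p -> {}, 100L);
        System.out.println(r.preloaded + " halted=" + r.halted);
    }
}
```

With four partitions of 40 bytes each and a 100-byte budget, the sketch
preloads partitions 0 and 1 and halts before partition 2.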

On Wed, Sep 19, 2018 at 0:18, Denis Magda <dmagda@apache.org> wrote:

> Agree, it's just a matter of the documentation. If a user stores 100% in
> RAM and on disk, and just wants to warm RAM up after a restart then he
> knows everything will fit there. If during the preloading we detect that
> the RAM is exhausted we can halt it and print out a warning.
>
> --
> Denis
>
> On Tue, Sep 18, 2018 at 2:10 PM Dmitriy Pavlov <dpavlov.spb@gmail.com>
> wrote:
>
> > Hi,
> >
> > I totally support the idea of cache preload.
> >
> > IMO it can be expanded. We can iterate over local partitions of the cache
> > group and preload each.
> >
> > But the methods should be clearly documented so a user is aware of the
> > benefits of such a method (e.g. if the RAM region is big enough, etc).
> >
> > Sincerely,
> > Dmitriy Pavlov
> >
> > On Tue, Sep 18, 2018 at 21:36, Denis Magda <dmagda@apache.org> wrote:
> >
> > > Folks,
> > >
> > > Since we're adding a method that would preload a certain partition, can
> > > we add one which will preload the whole cache? Ignite persistence users
> > > I've been working with look puzzled once they realize there is no way to
> > > warm up RAM after the restart. There are use cases that require this.
> > >
> > > Can the current optimizations be expanded to the cache preloading use
> > > case?
> > >
> > > --
> > > Denis
> > >
> > > On Tue, Sep 18, 2018 at 3:58 AM Alexei Scherbakov <
> > > alexey.scherbakoff@gmail.com> wrote:
> > >
> > > > > Summing up, I suggest adding a new public
> > > > > method IgniteCache.preloadPartition(partId).
> > > > >
> > > > > I will start preparing a PR for IGNITE-8873
> > > > > <https://issues.apache.org/jira/browse/IGNITE-8873> if no more
> > > > > objections follow.
> > > >
> > > >
> > > >
> > > > > On Tue, Sep 18, 2018 at 10:50, Alexey Goncharuk <alexey.goncharuk@gmail.com> wrote:
> > > >
> > > > > Dmitriy,
> > > > >
> > > > > > In my understanding, the proper fix for the scan query looks like a
> > > > > > big change and it is unlikely that we include it in Ignite 2.7. On
> > > > > > the other hand, the method suggested by Alexei is quite simple and it
> > > > > > definitely fits Ignite 2.7, which will provide a better user
> > > > > > experience. Even having a proper scan query implemented, this method
> > > > > > can be useful in some specific scenarios, so we will not have to
> > > > > > deprecate it.
> > > > >
> > > > > --AG
> > > > >
> > > > > > On Mon, Sep 17, 2018 at 19:15, Dmitriy Pavlov <dpavlov.spb@gmail.com> wrote:
> > > > >
> > > > > > > As I understand it, this is not a hack; it is an advanced feature
> > > > > > > for warming up a partition. We can build warm-up of the whole cache
> > > > > > > by calling warm-up on each of its partitions. Users often ask about
> > > > > > > this feature and are not comfortable with our lazy upload.
> > > > > >
> > > > > > Please correct me if I misunderstood the idea.
> > > > > >
> > > > > > > On Mon, Sep 17, 2018 at 18:37, Dmitriy Setrakyan <dsetrakyan@apache.org> wrote:
> > > > > >
> > > > > > > > I would rather fix the scan than hack the scan. Is there any
> > > > > > > > technical reason for hacking it now instead of fixing it properly?
> > > > > > > > Can some of the experts in this thread provide an estimate of
> > > > > > > > complexity and difference in work that would be required for each
> > > > > > > > approach?
> > > > > > >
> > > > > > > D.
> > > > > > >
> > > > > > > On Mon, Sep 17, 2018 at 4:42 PM Alexey Goncharuk <
> > > > > > > alexey.goncharuk@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > > I think it would be beneficial for some Ignite users if we added
> > > > > > > > > such a partition warmup method to the public API. The method
> > > > > > > > > should be well-documented and state that it may invalidate
> > > > > > > > > existing page cache. It will be a very effective instrument until
> > > > > > > > > we add the proper scan ability that Vladimir was referring to.
> > > > > > > >
> > > > > > > > > On Mon, Sep 17, 2018 at 13:05, Maxim Muzafarov <maxmuzaf@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Folks,
> > > > > > > > >
> > > > > > > > > > Such warming up can be an effective technique for performing
> > > > > > > > > > calculations that require large cache data reads, but I think
> > > > > > > > > > it's a single narrow use case among all Ignite store usages.
> > > > > > > > > > Like all other powerful techniques, we should use it wisely. In
> > > > > > > > > > the general case, I think we should consider other techniques
> > > > > > > > > > mentioned by Vladimir and may create something like `global
> > > > > > > > > > statistics of cache data usage` to choose the best technique in
> > > > > > > > > > each case.
> > > > > > > > > >
> > > > > > > > > > For instance, it's not obvious what would take longer:
> > > > > > > > > > multi-block reads or 50 single-block reads issued sequentially.
> > > > > > > > > > It strongly depends on the hardware used under the hood and
> > > > > > > > > > might depend on the workload's use of system resources
> > > > > > > > > > (CPU-intensive calculations and I/O access) as well. But
> > > > > > > > > > `statistics` will help us choose the right way.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > On Sun, 16 Sep 2018 at 23:59 Dmitriy Pavlov <dpavlov.spb@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Hi Alexei,
> > > > > > > > > >
> > > > > > > > > > > I did not find any PRs associated with the ticket to check the
> > > > > > > > > > > code changes behind this idea. Are there any PRs?
> > > > > > > > > > >
> > > > > > > > > > > If we create some forward scan of pages, it should be a very
> > > > > > > > > > > intelligent algorithm that takes a lot of parameters into
> > > > > > > > > > > account (how much RAM is free, how likely it is that we will
> > > > > > > > > > > need the next page, etc). We had a private talk about such an
> > > > > > > > > > > idea some time ago.
> > > > > > > > > > >
> > > > > > > > > > > In my experience, Linux systems already do such read-ahead of
> > > > > > > > > > > file data (for file descriptors flagged as sequential), but
> > > > > > > > > > > some prefetching of data at the application level may be
> > > > > > > > > > > useful for O_DIRECT file descriptors.
> > > > > > > > > > >
> > > > > > > > > > > One more concern from me is about selecting the right place in
> > > > > > > > > > > the system to do such a prefetch.
> > > > > > > > > >
> > > > > > > > > > Sincerely,
> > > > > > > > > > Dmitriy Pavlov
> > > > > > > > > >
> > > > > > > > > > > On Sun, Sep 16, 2018 at 19:54, Vladimir Ozerov <vozerov@gridgain.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > > Hi Alex,
> > > > > > > > > > > >
> > > > > > > > > > > > It is good that you observed a speedup. But I do not think
> > > > > > > > > > > > this solution works for the product in the general case. The
> > > > > > > > > > > > amount of RAM is limited, and even a single partition may
> > > > > > > > > > > > need more space than the RAM available. Moving a lot of pages
> > > > > > > > > > > > to page memory for a scan means that you evict a lot of other
> > > > > > > > > > > > pages, which will ultimately lead to bad performance of
> > > > > > > > > > > > subsequent queries and defeat LRU algorithms, which are of
> > > > > > > > > > > > great importance for good database performance.
> > > > > > > > > > > >
> > > > > > > > > > > > Database vendors choose another approach: skip BTrees,
> > > > > > > > > > > > iterate directly over data pages, read them in multi-block
> > > > > > > > > > > > fashion, and use a separate scan buffer to avoid excessive
> > > > > > > > > > > > evictions of other hot pages. A corresponding ticket for SQL
> > > > > > > > > > > > exists [1], but the idea is common to all parts of the system
> > > > > > > > > > > > that require scans.
> > > > > > > > > > > >
> > > > > > > > > > > > As for the proposed solution, it might be a good idea to add
> > > > > > > > > > > > a special API to "warmup" a partition, with a clear
> > > > > > > > > > > > explanation of pros (fast scan after warmup) and cons
> > > > > > > > > > > > (slowdown of any other operations). But I think we should not
> > > > > > > > > > > > make this approach part of normal scans.
> > > > > > > > > > >
> > > > > > > > > > > Vladimir.
> > > > > > > > > > >
> > > > > > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-6057
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > On Sun, Sep 16, 2018 at 6:44 PM Alexei Scherbakov <alexey.scherbakoff@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Igniters,
> > > > > > > > > > > >
> > > > > > > > > > > > > My use case involves a scenario where it's necessary to
> > > > > > > > > > > > > iterate over a large (many TBs) persistent cache, doing
> > > > > > > > > > > > > some calculation on the read data.
> > > > > > > > > > > > >
> > > > > > > > > > > > > The basic solution is to iterate over the cache using
> > > > > > > > > > > > > ScanQuery.
> > > > > > > > > > > > >
> > > > > > > > > > > > > This turns out to be slow because iteration over the cache
> > > > > > > > > > > > > involves a lot of random disk access for reading data pages
> > > > > > > > > > > > > referenced from leaf pages by links.
> > > > > > > > > > > > >
> > > > > > > > > > > > > This is especially true when data is stored on disks with
> > > > > > > > > > > > > slow random access, like SAS disks. In my case, on a modern
> > > > > > > > > > > > > SAS disk array, the reading speed was several MB/sec, while
> > > > > > > > > > > > > the sequential read speed in a perf test was about GB/sec.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I was able to fix the issue by using ScanQuery with an
> > > > > > > > > > > > > explicit partition set and running simple warmup code
> > > > > > > > > > > > > before each partition scan.
> > > > > > > > > > > > >
> > > > > > > > > > > > > The code pins cold pages in memory in sequential order,
> > > > > > > > > > > > > thus eliminating random disk access. The speedup was about
> > > > > > > > > > > > > x100.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I suggest adding the improvement to the product's core by
> > > > > > > > > > > > > always sequentially preloading pages for all internal
> > > > > > > > > > > > > partition iterations (cache iterators, scan queries, sql
> > > > > > > > > > > > > queries with scan plan) if the partition is cold (low
> > > > > > > > > > > > > number of pinned pages).
> > > > > > > > > > > > >
> > > > > > > > > > > > > This should also speed up rebalancing from cold partitions.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Ignite JIRA ticket [1]
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thoughts ?
> > > > > > > > > > > > >
> > > > > > > > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-8873
> > > > > > > > > > > >
> > > > > > > > > > > > --
> > > > > > > > > > > >
> > > > > > > > > > > > Best regards,
> > > > > > > > > > > > Alexei Scherbakov
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > --
> > > > > > > > > --
> > > > > > > > > Maxim Muzafarov
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Best regards,
> > > > Alexei Scherbakov
> > > >
> > >
> >
>
