ignite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ilya Suntsov <isunt...@gridgain.com>
Subject Re: Reconsider default WAL mode: we need something between LOG_ONLY and FSYNC
Date Wed, 11 Apr 2018 06:57:07 GMT
Dmitriy,

I've measured performance on the current master and haven't found any
problems with in-memory mode.

On Tue, Apr 10, 2018, 20:33 Dmitriy Setrakyan <dsetrakyan@apache.org> wrote:

> I am not convinced that the performance degradation is only due to the new
> change that fixes the incorrect behavior. To my knowledge, there is also a
> drop in memory-only mode. Can someone explain why do we have such a drop?
>
> D.
>
> On Tue, Apr 10, 2018 at 9:08 AM, Vladimir Ozerov <vozerov@gridgain.com>
> wrote:
>
> > 16% looks perfectly ok to me provided that we compare correct
> > implementation with incorrect one.
> >
> > вт, 10 апр. 2018 г. в 18:24, Dmitriy Setrakyan <d@gridgain.com>:
> >
> > > Ilya, can we find out why pure in-memory scenario also had a
> performance
> > > drop and which commit caused it? It should not be affected by changes
> in
> > > persistence at all.
> > >
> > > D.
> > >
> > > On Tue, Apr 10, 2018 at 7:56 AM, Ilya Suntsov <isuntsov@gridgain.com>
> > > wrote:
> > >
> > > > Igniters,
> > > >
> > > > Looks like commit:
> > > >
> > > > d0adb61ecd9af0d9907e480ec747ea1465f97cd7 is the first bad commit
> > > > > commit d0adb61ecd9af0d9907e480ec747ea1465f97cd7
> > > > > Author: Ivan Rakov <ivan.glukos@gmail.com>
> > > > > Date:   Tue Mar 27 20:11:52 2018 +0300
> > > > >     IGNITE-7754 WAL in LOG_ONLY mode doesn't execute fsync on
> > > checkpoint
> > > > > begin - Fixes #3656.
> > > >
> > > >
> > > > was the cause of performance drop ( > 10% vs AI 2.4.0) on the
> following
> > > > benchmarks (LOG_ONLY):
> > > >
> > > >    - atomic-put  (16 %)
> > > >    - atomic-putAll (14 %)
> > > >    - tx-putAll (11 %)
> > > >
> > > > As I understand it is greater than initial assessment.
> > > >
> > > > Thoughts?
> > > >
> > > > 2018-03-27 20:13 GMT+03:00 Dmitry Pavlov <dpavlov.spb@gmail.com>:
> > > >
> > > > > Ivan, sure :)
> > > > >
> > > > > Thank you for this contribution, merged to master.
> > > > >
> > > > > вт, 27 мар. 2018 г. в 20:08, Ivan Rakov <ivan.glukos@gmail.com>:
> > > > >
> > > > > > Dmitry,
> > > > > >
> > > > > > Firstly PR contained dirty fix for performance measurement,
but
> now
> > > it
> > > > > > contains good fix. :) Sorry for inconvenience.
> > > > > > I've renamed the PR.
> > > > > >
> > > > > > Best Regards,
> > > > > > Ivan Rakov
> > > > > >
> > > > > > On 27.03.2018 19:40, Dmitry Pavlov wrote:
> > > > > > > Hi Eduard, thank you for review.
> > > > > > >
> > > > > > > Hi Ivan,
> > > > > > >
> > > > > > > I'm confused on PR naming
> > > > > > > https://github.com/apache/ignite/pull/3656
> > > > > > >
> > > > > > > Could you rename?
> > > > > > >
> > > > > > > Sincerely,
> > > > > > > Dmitriy Pavlov
> > > > > > >
> > > > > > > вт, 27 мар. 2018 г. в 19:38, Eduard Shangareev <
> > > > > > eduard.shangareev@gmail.com
> > > > > > >> :
> > > > > > >> Ivan, I have reviewed your changes, looks good.
> > > > > > >>
> > > > > > >> On Tue, Mar 27, 2018 at 2:56 PM, Ivan Rakov <
> > > ivan.glukos@gmail.com>
> > > > > > wrote:
> > > > > > >>
> > > > > > >>> Igniters,
> > > > > > >>>
> > > > > > >>> I've completed development of https://issues.apache.org/jira
> > > > > > >>> /browse/IGNITE-7754. TeamCity state is ok. Please,
review my
> > > > changes.
> > > > > > >>> Please note that it will be possible to track time
of WAL
> fsync
> > > on
> > > > > > >>> checkpoint begin by *walCpRecordFsyncDuration *metric
in
> > > > "Checkpoint
> > > > > > >>> started" message.
> > > > > > >>>
> > > > > > >>> Also, I've created https://issues.apache.org/
> > > > jira/browse/IGNITE-8057
> > > > > > >> with
> > > > > > >>> description of possible further improvement of
WAL fsync on
> > > > > checkpoint
> > > > > > >>> begin.
> > > > > > >>>
> > > > > > >>> Best Regards,
> > > > > > >>> Ivan Rakov
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> On 26.03.2018 23:45, Valentin Kulichenko wrote:
> > > > > > >>>
> > > > > > >>>> Ivan,
> > > > > > >>>>
> > > > > > >>>> It's all good then :) Thanks!
> > > > > > >>>>
> > > > > > >>>> -Val
> > > > > > >>>>
> > > > > > >>>> On Mon, Mar 26, 2018 at 1:50 AM, Ivan Rakov
<
> > > > ivan.glukos@gmail.com>
> > > > > > >>>> wrote:
> > > > > > >>>>
> > > > > > >>>> Val,
> > > > > > >>>>> There's no any sense to use WalMode.NONE
in production
> > > > environment,
> > > > > > >> it's
> > > > > > >>>>> kept for testing and debugging purposes
(including possible
> > > user
> > > > > > >>>>> activities
> > > > > > >>>>> like capacity planning).
> > > > > > >>>>> We already print a warning at node start
in case
> WalMode.NONE
> > > is
> > > > > set:
> > > > > > >>>>>
> > > > > > >>>>> U.quietAndWarn(log,"Started write-ahead
log manager in NONE
> > > mode,
> > > > > > >>>>>
> > > > > > >>>>>> persisted data may be lost in " +
> > > > > > >>>>>>        "a case of unexpected node failure.
Make sure to
> > > > deactivate
> > > > > > the
> > > > > > >>>>>> cluster before shutdown.");
> > > > > > >>>>>>
> > > > > > >>>>>> Best Regards,
> > > > > > >>>>> Ivan Rakov
> > > > > > >>>>>
> > > > > > >>>>>
> > > > > > >>>>> On 24.03.2018 1:40, Valentin Kulichenko
wrote:
> > > > > > >>>>>
> > > > > > >>>>> Dmitry,
> > > > > > >>>>>> Thanks for clarification. So it sounds
like if we fix all
> > > other
> > > > > > modes
> > > > > > >> as
> > > > > > >>>>>> we
> > > > > > >>>>>> discuss here, NONE would be the only
one allowing
> > corruption.
> > > I
> > > > > also
> > > > > > >>>>>> don't
> > > > > > >>>>>> see much sense in this and I think
we should clearly state
> > > this
> > > > in
> > > > > > the
> > > > > > >>>>>> doc,
> > > > > > >>>>>> as well print out a warning if NONE
mode is used.
> > Eventually,
> > > if
> > > > > > it's
> > > > > > >>>>>> confirmed that there are no reasonable
use cases for it,
> we
> > > can
> > > > > > >>>>>> deprecate
> > > > > > >>>>>> it.
> > > > > > >>>>>>
> > > > > > >>>>>> -Val
> > > > > > >>>>>>
> > > > > > >>>>>> On Fri, Mar 23, 2018 at 3:26 PM, Dmitry
Pavlov <
> > > > > > dpavlov.spb@gmail.com
> > > > > > >>>>>> wrote:
> > > > > > >>>>>>
> > > > > > >>>>>> Hi Val,
> > > > > > >>>>>>
> > > > > > >>>>>>> NONE means that the WAL log is
disabled and not written
> at
> > > all.
> > > > > Use
> > > > > > >> of
> > > > > > >>>>>>> the
> > > > > > >>>>>>> mode is at your own risk. It is
possible that restore
> state
> > > > after
> > > > > > the
> > > > > > >>>>>>> crash
> > > > > > >>>>>>> at the middle of checkpoint will
not succeed. I do not
> see
> > > much
> > > > > > sence
> > > > > > >>>>>>> in
> > > > > > >>>>>>> it, especially in production.
> > > > > > >>>>>>>
> > > > > > >>>>>>> BACKGROUND is full functional WAL
mode, but allows some
> > delay
> > > > > > before
> > > > > > >>>>>>> flush
> > > > > > >>>>>>> to disk.
> > > > > > >>>>>>>
> > > > > > >>>>>>> Sincerely,
> > > > > > >>>>>>> Dmitriy Pavlov
> > > > > > >>>>>>>
> > > > > > >>>>>>> сб, 24 мар. 2018 г. в 1:07,
Valentin Kulichenko <
> > > > > > >>>>>>> valentin.kulichenko@gmail.com>:
> > > > > > >>>>>>>
> > > > > > >>>>>>> I agree. In my view, any possibility
to get a corrupted
> > > storage
> > > > > is
> > > > > > a
> > > > > > >>>>>>> bug
> > > > > > >>>>>>>
> > > > > > >>>>>>>> which needs to be fixed.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> BTW, can someone explain semantics
of NONE mode? What is
> > the
> > > > > > >>>>>>>> difference
> > > > > > >>>>>>>> from BACKGROUND from user's
perspective? Is there any
> > > > particular
> > > > > > use
> > > > > > >>>>>>>> case
> > > > > > >>>>>>>> where it can be used?
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> -Val
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> On Fri, Mar 23, 2018 at 2:49
AM, Dmitry Pavlov <
> > > > > > >> dpavlov.spb@gmail.com
> > > > > > >>>>>>>> wrote:
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> Hi Ivan,
> > > > > > >>>>>>>>
> > > > > > >>>>>>>>> IMO we have to add extra
FSYNCS for BACKGROUND WAL.
> > Agree?
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> Sincerely,
> > > > > > >>>>>>>>> Dmitriy Pavlov
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> пт, 23 мар. 2018 г.
в 12:23, Ivan Rakov <
> > > > ivan.glukos@gmail.com
> > > > > >:
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> Igniters, there's another
important question about this
> > > > matter.
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>>> Do we want to add extra
FSYNCS for BACKGROUND WAL
> mode?
> > I
> > > > > think
> > > > > > >> that
> > > > > > >>>>>>>>>> we
> > > > > > >>>>>>>> have to do it: it will cause
similar performance drop,
> but
> > > if
> > > > we
> > > > > > >>>>>>>>
> > > > > > >>>>>>>>> consider LOG_ONLY broken
without these fixes,
> BACKGROUND
> > is
> > > > > > broken
> > > > > > >> as
> > > > > > >>>>>>>>>> well.
> > > > > > >>>>>>>>> Best Regards,
> > > > > > >>>>>>>>>> Ivan Rakov
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> On 23.03.2018 10:27,
Ivan Rakov wrote:
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> Fixes are quite simple.
> > > > > > >>>>>>>>>>> I expect them to
be merged in master in a week in
> worst
> > > > case.
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>> Best Regards,
> > > > > > >>>>>>>>>>> Ivan Rakov
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>> On 22.03.2018 17:49,
Denis Magda wrote:
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>> Ivan,
> > > > > > >>>>>>>>>>>> How quick are
you going to merge the fix into the
> > > master?
> > > > > Many
> > > > > > >>>>>>>>>>>> persistence
> > > > > > >>>>>>>>>>>> related optimizations
have already stacked up.
> > Probably,
> > > > we
> > > > > > can
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>> release
> > > > > > >>>>>>>>>> them sooner if the
community agrees.
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>>> --
> > > > > > >>>>>>>>>>>> Denis
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>> On Thu, Mar
22, 2018 at 5:22 AM, Ivan Rakov <
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>> ivan.glukos@gmail.com>
> > > > > > >>>>>>>>>> wrote:
> > > > > > >>>>>>>>> Thanks all!
> > > > > > >>>>>>>>>>>>> We seem
to have reached a consensus on this issue.
> > I'll
> > > > > just
> > > > > > >> add
> > > > > > >>>>>>>>>>>>> necessary
> > > > > > >>>>>>>>>>>>> fsyncs
under IGNITE-7754.
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>> Best Regards,
> > > > > > >>>>>>>>>>>>> Ivan Rakov
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>> On 22.03.2018
15:13, Ilya Lantukh wrote:
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>> +1 for
fixing LOG_ONLY. If current implementation
> > > doesn't
> > > > > > >>>>>>>>>>>>> protect
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>> from
> > > > > > >>>>>>>>> data
> > > > > > >>>>>>>>>>> corruption, it
doesn't make sence.
> > > > > > >>>>>>>>>>>>>> On
Wed, Mar 21, 2018 at 10:38 PM, Denis Magda <
> > > > > > >>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>> dmagda@apache.org>
> > > > > > >>>>>>>>>>>> wrote:
> > > > > > >>>>>>>>> +1 for the fix of LOG_ONLY
> > > > > > >>>>>>>>>>>>>> On
Wed, Mar 21, 2018 at 11:23 AM, Alexey
> Goncharuk <
> > > > > > >>>>>>>>>>>>>>>
alexey.goncharuk@gmail.com> wrote:
> > > > > > >>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>
+1 for fixing LOG_ONLY to enforce corruption
> safety
> > > > given
> > > > > > the
> > > > > > >>>>>>>>>>>>>>>
provided
> > > > > > >>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>
performance results.
> > > > > > >>>>>>>>>>>>>>>>
2018-03-21 18:20 GMT+03:00 Vladimir Ozerov <
> > > > > > >>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>
vozerov@gridgain.com
> > > > > > >>>>>>>>>>>>>> :
> > > > > > >>>>>>>>>> +1 for accepting drop
in LOG_ONLY. 7% is not that much
> > and
> > > > > > >>>>>>>>>>>> not a
> > > > > > >>>>>>>>>>>>>> drop
> > > > > > >>>>>>>>> at
> > > > > > >>>>>>>>>>>>>>>>
all, provided that we fixing a bug. I.e. should
> we
> > > > > > implement
> > > > > > >>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>
it
> > > > > > >>>>>>>>>>>>>> correctly
> > > > > > >>>>>>>>> in the first place we would
never notice any "drop".
> > > > > > >>>>>>>>>>>>>>>>
I do not understand why someone would like to
> use
> > > > > current
> > > > > > >>>>>>>>>>>>>>>>>
broken
> > > > > > >>>>>>>>>>>>>>>
mode.
> > > > > > >>>>>>>>>> On Wed, Mar 21, 2018
at 6:11 PM, Dmitry Pavlov
> > > > > > >>>>>>>>>>>>>>>>>
<dpavlov.spb@gmail.com>
> > > > > > >>>>>>>>>>>>>>>>>
wrote:
> > > > > > >>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>
Hi, I think option 1 is better. As Val said any
> > > mode
> > > > > that
> > > > > > >>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>
allows
> > > > > > >>>>>>>>>>>>>>>
corruption
> > > > > > >>>>>>>>>> does not make much
sense.
> > > > > > >>>>>>>>>>>>>>>>>>
What Ivan mentioned here as drop, in relation
> to
> > > old
> > > > > > mode
> > > > > > >>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>
DEFAULT
> > > > > > >>>>>>>>>>>>>>>>
(FSYNC
> > > > > > >>>>>>>>>>> now), is still
significant perfromance boost.
> > > > > > >>>>>>>>>>>>>>>>>
Sincerely,
> > > > > > >>>>>>>>>>>>>>>>>>
Dmitriy Pavlov
> > > > > > >>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>
ср, 21 мар. 2018 г. в 17:56, Ivan Rakov <
> > > > > > >>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>
ivan.glukos@gmail.com
> > > > > > >>>>>>>>>>>>>>>>
:
> > > > > > >>>>>>>>>> I've attached benchmark
results to the JIRA ticket.
> > > > > > >>>>>>>>>>>> We observe
~7% drop in "fair" LOG_ONLY_SAFE mode,
> > > > > > >>>>>>>>>>>>>>>>>>>
independent
> > > > > > >>>>>>>>>>>>>>>>>
of
> > > > > > >>>>>>>>> WAL
> > > > > > >>>>>>>>>>> compaction enabled
flag. It's pretty significant
> drop:
> > > WAL
> > > > > > >>>>>>>>>>>>>>>>>
compaction
> > > > > > >>>>>>>>>>>>>>>>>
itself gives only ~3% drop.
> > > > > > >>>>>>>>>>>>>>>>>
I see two options here:
> > > > > > >>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>
1) Change LOG_ONLY behavior. That implies that
> > > we'll
> > > > > be
> > > > > > >>>>>>>>>>>>>>>>>>>
ready
> > > > > > >>>>>>>>>>>>>>>>>
to
> > > > > > >>>>>>>>> release
> > > > > > >>>>>>>>>>> AI 2.5 with 7%
drop.
> > > > > > >>>>>>>>>>>>>>>>>>
2) Introduce LOG_ONLY_SAFE, make it default,
> add
> > > > > release
> > > > > > >>>>>>>>>>>>>>>>>>>
note
> > > > > > >>>>>>>>>>>>>>>>>
to AI
> > > > > > >>>>>>>>> 2.5
> > > > > > >>>>>>>>>>>>>>>>>>
that we added power loss durability in default
> > > mode,
> > > > > but
> > > > > > >>>>>>>>>>>>>>>>>
user
> > > > > > >>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>
may
> > > > > > >>>>>>>>>>>>>>>
fallback to previous LOG_ONLY in order to retain
> > > > > > >>>>>>>>>> performance.
> > > > > > >>>>>>>>>>>>>>>>>
Thoughts?
> > > > > > >>>>>>>>> Best Regards,
> > > > > > >>>>>>>>>>>>>>>>>>>
Ivan Rakov
> > > > > > >>>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>>
On 20.03.2018 16:00, Ivan Rakov wrote:
> > > > > > >>>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>>
Val,
> > > > > > >>>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>>>
If a storage is in
> > > > > > >>>>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>>>
corrupted state, does it mean that it needs
> to
> > > be
> > > > > > >>>>>>>>>>>>>>>>>>>>>
completely
> > > > > > >>>>>>>>>>>>>>>>>>>
removed
> > > > > > >>>>>>>>>> and
> > > > > > >>>>>>>>>>>>>>>>>>
cluster needs to be restarted without data?
> > > > > > >>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>>
Yes, there's a chance that in LOG_ONLY all
> > local
> > > > data
> > > > > > >> will
> > > > > > >>>>>>>>>>>>>>>>>>>>
be
> > > > > > >>>>>>>>>>>>>>>>>>
lost,
> > > > > > >>>>>>>>>> but only in *power
loss**/ OS crash* case.
> > > > > > >>>>>>>>>>>>>>>>>
kill -9, JVM crash, death of critical system
> > thread
> > > > and
> > > > > > >>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>
all
> > > > > > >>>>>>>>>>>>>>>>>>
other
> > > > > > >>>>>>>>> cases that usually take
place are variations of
> *process
> > > > > > >>>>>>>>>>>>>>>>>>>>
crash*.
> > > > > > >>>>>>>>>>>>>>>>>>
All
> > > > > > >>>>>>>>>>> WAL modes (except
NONE, of course) ensure
> > > corruption-safety
> > > > > > >>>>>>>>>>>>>>>>>
in
> > > > > > >>>>>>>>>>>>>>>
case
> > > > > > >>>>>>>>> of
> > > > > > >>>>>>>>>>>>>>>>>
process crash.
> > > > > > >>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>
If so, I'm not sure any mode
> > > > > > >>>>>>>>>>>>>>>>>>>>
that allows corruption makes much sense to
> me.
> > > > > > >>>>>>>>>>>>>>>>>>>>>
It depends on performance impact of
> enforcing
> > > > > > >> power-loss
> > > > > > >>>>>>>>>>>>>>>>>>>>
corruption
> > > > > > >>>>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>>
safety. Price of full protection from power
> > loss
> > > is
> > > > > > high
> > > > > > >> -
> > > > > > >>>>>>>>>>>>>>>>>
FSYNC
> > > > > > >>>>>>>>>>>>>> is
> > > > > > >>>>>>>>> way slower (2-10 times)
than other WAL modes. The
> > question
> > > is
> > > > > > >>>>>>>>>>>>>>>>>
whether
> > > > > > >>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>
ensuring weaker guarantees (corruption can't
> > > happen,
> > > > > but
> > > > > > >>>>>>>>>>>>>>>>>>
loss
> > > > > > >>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>
of
> > > > > > >>>>>>>>>>>>>>>
last
> > > > > > >>>>>>>>>> updates can) will affect
performance as badly as
> strong
> > > > > > >>>>>>>>>>>>>>>>>>
guarantees.
> > > > > > >>>>>>>>>>>>>>>>>>
I'll share benchmark results soon.
> > > > > > >>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>
Best Regards,
> > > > > > >>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>
Ivan Rakov
> > > > > > >>>>>>>>>>>>>>>>>>>>
On 20.03.2018 5:09, Valentin Kulichenko
> wrote:
> > > > > > >>>>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>>>
Guys,
> > > > > > >>>>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>>>>
What do we understand under "data
> corruption"
> > > > here?
> > > > > > If
> > > > > > >> a
> > > > > > >>>>>>>>>>>>>>>>>>>>>
storage
> > > > > > >>>>>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>>>>
is
> > > > > > >>>>>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>>>
in
> > > > > > >>>>>>>>>>>>>>>>>>
corrupted state, does it mean that it needs to
> > be
> > > > > > >> completely
> > > > > > >>>>>>>>>>>>>>>>>>
removed
> > > > > > >>>>>>>>>>>>>>>>>>>
and
> > > > > > >>>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>
cluster needs to be restarted without data? If
> > so,
> > > > I'm
> > > > > > not
> > > > > > >>>>>>>>>>>>>>>>>>
sure
> > > > > > >>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>
any
> > > > > > >>>>>>>>>> mode
> > > > > > >>>>>>>>>>>>>>>>>>
that allows corruption makes much sense to me.
> > How
> > > > am
> > > > > I
> > > > > > >>>>>>>>>>>>>>>>>>
supposed
> > > > > > >>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>
to
> > > > > > >>>>>>>>>>> use a
> > > > > > >>>>>>>>>>>>>>>>>>
database, if virtually any failure can end
> with
> > > > > complete
> > > > > > >>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>>
loss of
> > > > > > >>>>>>>>>>>>>>>>>>>>>
data?
> > > > > > >>>>>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>>>
In any case, this definitely should not be a
> > > > default
> > > > > > >>>>>>>>>>>>>>>>>>
behavior.
> > > > > > >>>>>>>>>>>>>>>>
If
> > > > > > >>>>>>>>> user ever
> > > > > > >>>>>>>>>>>>>>>>>>
switches to corruption-unsafe mode, there
> should
> > > be
> > > > a
> > > > > > >>>>>>>>>>>>>>>>>>
clear
> > > > > > >>>>>>>>>>>>>>>>>>>
warning
> > > > > > >>>>>>>>> about
> > > > > > >>>>>>>>>>>>>>>>>>
this.
> > > > > > >>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>>
-Val
> > > > > > >>>>>>>>>>>>>>>>>>>>>
On Fri, Mar 16, 2018 at 1:06 AM, Ivan
> Rakov <
> > > > > > >>>>>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>>>>
ivan.glukos@gmail.com>
> > > > > > >>>>>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>>>
wrote:
> > > > > > >>>>>>>>>>>>>>>>>>
Ticket to track changes:
> > > > > > >>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>>
> > > https://issues.apache.org/jira/browse/IGNITE-7754
> > > > > > >>>>>>>>>>>>>>>>>>>>>>
Best Regards,
> > > > > > >>>>>>>>>>>>>>>>>>>>>>
Ivan Rakov
> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>>>>>
On 16.03.2018 10:58, Dmitriy Setrakyan
> > wrote:
> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>>>>>
On Fri, Mar 16, 2018 at 12:55 AM, Ivan
> > Rakov <
> > > > > > >>>>>>>>>>>>>>>>>>>>>>
ivan.glukos@gmail.com
> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>>>>>
wrote:
> > > > > > >>>>>>>>>>>>>>>>>>>>
Vladimir,
> > > > > > >>>>>>>>>>>>>>>>>>>>
Unlike BACKGROUND, LOG_ONLY provides strict
> > > write
> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
guarantees
> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>>>>>
unless power
> > > > > > >>>>>>>>>>> loss has happened.
> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>
Seems like we need to measure
> performance
> > > > > > difference
> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>
to
> > > > > > >>>>>>>>>>>>>>>>>>>>>>
decide
> > > > > > >>>>>>>>> whether do
> > > > > > >>>>>>>>>>>>>>>>>>>>>
we need separate WAL mode. If it will be
> > > > invisible,
> > > > > > >>>>>>>>>>>>>>>>>>
we'll
> > > > > > >>>>>>>>>>>>>>>>>>>>>>
just
> > > > > > >>>>>>>>>> fix
> > > > > > >>>>>>>>>>>>>>>>>>>>>
these
> > > > > > >>>>>>>>>>>>>>>>>>
bugs without introducing new mode; if it will
> be
> > > > > > >>>>>>>>>>>>>>>>>>>>
perceptible,
> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>
we'll
> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
continue the discussion about introducing
> > > > > > >>>>>>>>>>>>>>>>>>>>>>
LOG_ONLY_SAFE.
> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>>>>
Makes sense?
> > > > > > >>>>>>>>>>>>>>>>>>>>
Yes, this sounds like the right approach.
> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>
> > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Best Regards,
> > > > Ilya Suntsov
> > > > email: isuntsov@gridgain.com
> > > > *GridGain Systems*
> > > > www.gridgain.com
> > > >
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message