ignite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Павлухин Иван <vololo...@gmail.com>
Subject Re: Consistency check and fix (review request)
Date Tue, 16 Apr 2019 08:39:07 GMT
Anton,

Thank you for your effort for improving consistency guarantees
provided by Ignite.

The subject sounds really vital. Could you please elaborate why it
comes as an on-demand enabled proxy but not as a mode enabled by some
configuration property (or even as a default behavior)? How do you see
the future development of such consistency checks? As for me it will
be great if we can improve consistency guarantees provided by default.

Also thinking loud a bit:
1. It sounds suspicious that reads can cause writes (unexpected
deadlocks might be possible).
2. I do not believe that it is possible to implement a (bugless?)
feature which will fix other bugs.
3. A storage (or database) product (Ignite in our case) consistency is
not equal to a user application consistency. So, it might be that
introduced checks are insufficient to make business applications
happy.

пн, 15 апр. 2019 г. в 19:27, Andrey Gura <agura@apache.org>:
>
> Anton,
>
> I'm trying tell you that this proxy can produce false positive result,
> incorrect result and just hide bugs. What will the next solution?
> withNoBugs proxy?
>
> You can perform consistency check using idle verify utility. Recovery
> tool is good idea but user should trigger this process, not some cache
> proxy implementation.
>
> On Mon, Apr 15, 2019 at 5:34 PM Anton Vinogradov <av@apache.org> wrote:
> >
> > Seems, we already fixed all bugs caused this feature, but there is no
> > warranty we will not create new :)
> > This proxy is just checker that consistency is ok.
> >
> > >> reaching bugless implementation
> > Not sure it's possible. Once you have software it contains bugs.
> > This proxy will tell you whether these bugs lead to inconsistency.
> >
> > On Mon, Apr 15, 2019 at 5:19 PM Andrey Gura <agura@apache.org> wrote:
> >
> > > Method name is minor problem. I still believe that there is no need
> > > for this proxy because there are no any guarantees about bugless
> > > implementation this functionality. Better way is reaching bugless
> > > implementation of current functionality.
> > >
> > > On Mon, Apr 15, 2019 at 4:51 PM Anton Vinogradov <av@apache.org> wrote:
> > > >
> > > > Andrey,
> > > >
> > > > >> It means also that at least method name is bad.
> > > > Agreed, already discussed with Aleksey Plekhanov.
> > > > Decided that ".withConsistencyCheck()" is a proper name.
> > > >
> > > > >> What is the profit?
> > > > This proxy allows to check (and fix) is there any consistency violation
> > > > across the topology.
> > > > The proxy will check all backups contain the same values as primary.
> > > > So, when it's possible (you're ready to spend resources for this check)
> > > you
> > > > will be able to read-with-consistency-check.
> > > > This will decrease the amount of "inconsistency caused
> > > > war/strikes/devastation" situations, which is important for financial
> > > > systems.
> > > >
> > > > On Mon, Apr 15, 2019 at 3:58 PM Andrey Gura <agura@apache.org> wrote:
> > > >
> > > > > Anton,
> > > > >
> > > > > what does expression "withConsistency" mean? From user's standpoint
it
> > > > > means that all operations performed without this proxy are not
> > > > > consistent. It means also that at least method name is bad.
> > > > >
> > > > > Are there any guarantees that withConsistency proxy will not contain
> > > > > bugs that will lead to inconsistent write after inconsistency was
> > > > > found? I think there are no such guarantees. Bugs still are possible.
> > > > > So I always must use withConsistency proxy because I doesn't have
> > > > > other choice - all ways are unreliable and withConsistency just sounds
> > > > > better.
> > > > >
> > > > > Eventually we will have two different ways for working with cache
> > > > > values with different bugs set. What is the profit?
> > > > >
> > > > >
> > > > >
> > > > > On Fri, Apr 12, 2019 at 2:49 PM Anton Vinogradov <av@apache.org>
> > > wrote:
> > > > > >
> > > > > > Folks,
> > > > > >
> > > > > > I've checked the tx benchmarks and found no performance drop.
> > > > > > Also, see no issues at TC results.
> > > > > > So, seems, code ready to be merged.
> > > > > >
> > > > > > Everyone interested, please share any objections about
> > > > > > - public API
> > > > > > - test coverage
> > > > > > - implementation approach
> > > > > >
> > > > > > On Wed, Apr 3, 2019 at 5:46 PM Anton Vinogradov <av@apache.org>
> > > wrote:
> > > > > >
> > > > > > > Nikolay,
> > > > > > >
> > > > > > > This is not a PoC, but the final solution (I hope so:)
) required
> > > the
> > > > > > > review.
> > > > > > > LWW means Last Write Wins, detailed explanation can be
found at
> > > IEP-31.
> > > > > > >
> > > > > > > On Wed, Apr 3, 2019 at 5:24 PM Nikolay Izhikov <
> > > nizhikov@apache.org>
> > > > > > > wrote:
> > > > > > >
> > > > > > >> Hello, Anton.
> > > > > > >>
> > > > > > >> Thanks for the PoC.
> > > > > > >>
> > > > > > >> > finds correct values according to LWW strategy
> > > > > > >>
> > > > > > >> Can you, please, clarify what is LWW strategy?
> > > > > > >>
> > > > > > >> В Ср, 03/04/2019 в 17:19 +0300, Anton Vinogradov
пишет:
> > > > > > >> > Ilya,
> > > > > > >> >
> > > > > > >> > This is impossible due to a conflict between some
isolation
> > > levels
> > > > > and
> > > > > > >> > get-with-consistency expectations.
> > > > > > >> > Basically, it's impossible to perform get-with-consistency
> > > after the
> > > > > > >> other
> > > > > > >> > get at !READ_COMMITTED transaction.
> > > > > > >> > The problem here is that value should be cached
according to the
> > > > > > >> isolation
> > > > > > >> > level, so get-with-consistency is restricted in
this case.
> > > > > > >> > Same problem we have at case get-with-consistency
after put, so
> > > we
> > > > > have
> > > > > > >> > restriction here too.
> > > > > > >> > So, the order matter. :)
> > > > > > >> >
> > > > > > >> > See OperationRestrictionsCacheConsistencyTest
[1] for details.
> > > > > > >> >
> > > > > > >> > [1]
> > > > > > >> >
> > > > > > >>
> > > > >
> > > https://github.com/apache/ignite/blob/8b0b0c3e1bde93ff9c4eb5667d794dd64a8b06f0/modules/core/src/test/java/org/apache/ignite/internal/processors/cache/consistency/OperationRestrictionsCacheConsistencyTest.java
> > > > > > >> >
> > > > > > >> > On Wed, Apr 3, 2019 at 4:54 PM Ilya Kasnacheev
<
> > > > > > >> ilya.kasnacheev@gmail.com>
> > > > > > >> > wrote:
> > > > > > >> >
> > > > > > >> > > Hello!
> > > > > > >> > >
> > > > > > >> > > Sounds useful especially for new feature
development.
> > > > > > >> > >
> > > > > > >> > > Can you do a run of all tests with cache.forConsistency(),
> > > see if
> > > > > > >> there are
> > > > > > >> > > cases that fail?
> > > > > > >> > >
> > > > > > >> > > Regards,
> > > > > > >> > > --
> > > > > > >> > > Ilya Kasnacheev
> > > > > > >> > >
> > > > > > >> > >
> > > > > > >> > > ср, 3 апр. 2019 г. в 16:17, Anton
Vinogradov <av@apache.org>:
> > > > > > >> > >
> > > > > > >> > > > Igniters,
> > > > > > >> > > >
> > > > > > >> > > > Sometimes, at real deployment, we're
faced with inconsistent
> > > > > state
> > > > > > >> across
> > > > > > >> > > > the topology.
> > > > > > >> > > > This means that somehow we have different
values for the
> > > same
> > > > > key at
> > > > > > >> > > > different nodes.
> > > > > > >> > > > This is an extremely rare situation,
but, when you have
> > > > > thousands of
> > > > > > >> > > > terabytes of data, this can be a real
problem.
> > > > > > >> > > >
> > > > > > >> > > > Apache Ignite provides a consistency
guarantee, each
> > > affinity
> > > > > node
> > > > > > >> should
> > > > > > >> > > > contain the same value for the same
key, at least
> > > eventually.
> > > > > > >> > > > But this guarantee can be violated because
of bugs, see
> > > IEP-31
> > > > > [1]
> > > > > > >> for
> > > > > > >> > > > details.
> > > > > > >> > > >
> > > > > > >> > > > So, I created the issue [2] to handle
such situations.
> > > > > > >> > > > The main idea is to have a special cache.withConsistency()
> > > proxy
> > > > > > >> allows
> > > > > > >> > > > checking a fix inconsistency on get
operation.
> > > > > > >> > > >
> > > > > > >> > > > I've created PR [3] with following improvements
(when
> > > > > > >> > > > cache.withConsistency() proxy used):
> > > > > > >> > > >
> > > > > > >> > > > - PESSIMISTIC && !READ_COMMITTED
transaction
> > > > > > >> > > > -- checks values across the topology
(under locks),
> > > > > > >> > > > -- finds correct values according to
LWW strategy,
> > > > > > >> > > > -- records special event in case consistency
violation found
> > > > > > >> (contains
> > > > > > >> > > > inconsistent map <Node, <K,V>>
and last values <K,V>),
> > > > > > >> > > > -- enlists writes with latest value
for each inconsistent
> > > key,
> > > > > so
> > > > > > >> it will
> > > > > > >> > > > be written on tx.commit().
> > > > > > >> > > >
> > > > > > >> > > > - OPTIMISTIC || READ_COMMITTED transactions
> > > > > > >> > > > -- checks values across the topology
(not under locks, so
> > > > > > >> false-positive
> > > > > > >> > > > case is possible),
> > > > > > >> > > > -- starts PESSIMISTIC && SERIALIZABLE
(at separate thread)
> > > > > > >> transaction
> > > > > > >> > >
> > > > > > >> > > for
> > > > > > >> > > > each possibly broken key and fixes it
on a commit if
> > > necessary.
> > > > > > >> > > > -- original transaction performs get-after-fix
and can be
> > > > > continued
> > > > > > >> if
> > > > > > >> > >
> > > > > > >> > > the
> > > > > > >> > > > fix does not conflict with isolation
level.
> > > > > > >> > > >
> > > > > > >> > > > Future plans
> > > > > > >> > > > - Consistency guard (special process
periodically checks we
> > > > > have no
> > > > > > >> > > > inconsistency).
> > > > > > >> > > > - MVCC support.
> > > > > > >> > > > - Atomic caches support.
> > > > > > >> > > > - Thin client support.
> > > > > > >> > > > - SQL support.
> > > > > > >> > > > - Read-with-consistency before the write
operation.
> > > > > > >> > > > - Single key read-with-consistency optimization,
now the
> > > > > collection
> > > > > > >> > > > approach used each time.
> > > > > > >> > > > - Do not perform read-with-consistency
for the key in case
> > > it
> > > > > was
> > > > > > >> > > > consistently read some time ago.
> > > > > > >> > > >
> > > > > > >> > > > [1]
> > > > > > >> > > >
> > > > > > >> > > >
> > > > > > >> > >
> > > > > > >> > >
> > > > > > >>
> > > > >
> > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-31+Consistency+check+and+fix
> > > > > > >> > > > [2] https://issues.apache.org/jira/browse/IGNITE-10663
> > > > > > >> > > > [3] https://github.com/apache/ignite/pull/5656
> > > > > > >> > > >
> > > > > > >>
> > > > > > >
> > > > >
> > >



-- 
Best regards,
Ivan Pavlukhin

Mime
View raw message