ignite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alexey Goncharuk <alexey.goncha...@gmail.com>
Subject Re: IGNITE-1948 ClusterTopologyCheckedException can return null for retryReadyFuture()
Date Fri, 10 Feb 2017 08:06:55 GMT
Alexander,

I do not see why you would need to overwrite the cause field.

What I meant in the previous email is that instead of setting a retry ready
future in the checked exception, you would set a correct affinity topology
version here. Then, during a construction of CacheException (unchecked,
which is guaranteed to be thrown locally and will not be serialized), you
would create the retry ready future based on the topology version set.

Hope this helps,
AG

2017-02-07 17:18 GMT+03:00 Александр Меньшиков <sharplermc@gmail.com>:

> Alexey, I ran into some difficulties.
>
> Look at the method: GridNearTxFinishFuture.CheckBackupMiniFuture#
> onDhtFinishResponse(GridDhtTxFinishResponse res)
>
>             *Throwable err* = res.checkCommittedError();
>
>             if (*err* != null) {
>                 if (*err* instanceof IgniteCheckedException) {
>                     *ClusterTopologyCheckedException cause* =
>                         ((IgniteCheckedException)*err*).
> *getCause(ClusterTopologyCheckedException.class)*;
>
>                     if (*cause* != null)
>                         *cause.retryReadyFuture(*cctx.
> nextAffinityReadyFuture(tx.topologyVersion()));
> *                                   //^_____Here update the readyFut in
> some first "cause". *
>                 }
>
>                 onDone(*err*);
>             }
>
>
> So I can't rewrite "cause" field in exception without reflection. It means
> if I try to use two exception: one with "readyFut" and second without
> "readyFut", I see no way to do it in this place.
>
> Perhaps I misunderstood your original idea. Or maybe this code some kind
> of a crutch and should be removed. What do you think?
>
> 2017-01-30 16:54 GMT+03:00 Alexey Goncharuk <alexey.goncharuk@gmail.com>:
>
>> In the example above there is no point of setting the retry future in the
>> exception because it will be serialized and sent to a remote node.
>>
>> I see the only possible way to ensure this: make this be verified at
>> compile time. This looks like a big change, but the draft may look like
>> so:
>> 1) Introduce new exception type, like CacheTopology*Checked*Exception
>> which
>> *must* take the minimum topology version to wait for
>> 2) In all cases when a cache operation fails because of topology change,
>> provide the appropriate exception
>> 3) When CacheTopologyException is being constructed, create a
>> corresponding
>> local future based on wait version and throw the exception.
>>
>> Even though this still does not give us 100% guarantee that we did not
>> miss
>> anything, it should cover a lot more cases than now.
>>
>> 2017-01-30 14:20 GMT+03:00 Александр Меньшиков <sharplermc@gmail.com>:
>>
>> > Alexey,
>> >
>> > For GridCacheIoManager#sendNoRetry it's not necessary because exception
>> > will be ignored (or will cast to String). Perhaps my message was
>> unclear.
>> > I try to say that three methods can throw "ClusterTopologyCheckedExcepti
>> on"
>> > without any problem.
>> >
>> > But you are right, in some methods I can't set "retryFuture". We must
>> > ensure that exception doesn't escape away. The best option is to ignore
>> it
>> > (or cast to String).
>> >
>> > But any way, look here:
>> >
>> > IgniteTxHandler#sendReply(UUID nodeId, GridDhtTxFinishRequest req,
>> boolean
>> > committed, GridCacheVersion nearTxId)
>> >
>> > Inside you can found next lines:
>> > __________________________
>> > ClusterTopologyCheckedException *cause* =
>> >     new ClusterTopologyCheckedException("Primary node left grid.");
>> >
>> > res.checkCommittedError(new IgniteTxRollbackCheckedException("Failed to
>> > commit transaction " +
>> >     "(transaction has been rolled back on backup node): " +
>> req.version(),
>> > *cause*));
>> > __________________________
>> >
>> > How do you unsure that *"cause"* can't come to
>> GridCacheUtils#retryTopologySafe(final
>>
>> > Callable<S> c) ?
>> >
>> >
>> > Perhaps I don't know some rules which you use to maintain it now?
>> >
>> >
>> > 2017-01-30 12:24 GMT+03:00 Alexey Goncharuk <alexey.goncharuk@gmail.com
>> >:
>> >
>> >> Alexander,
>> >>
>> >> Do you have a use-case in mind which results in IgniteTopologyException
>> >> thrown from public API with retryFuture unset?
>> >>
>> >> retryFuture makes sense only for certain cache operations and may be
>> set
>> >> only in certain places in the code where we already know what to wait
>> for.
>> >> I do not see how you can initialize this future, for example, in
>> >>  GridCacheIoManager#sendNoRetry(ClusterNode node, GridCacheMessage
>> msg,
>> >> byte
>> >> plc)
>> >>
>> >> --AG
>> >>
>> >> 2017-01-27 15:45 GMT+03:00 Александр Меньшиков <sharplermc@gmail.com>:
>> >>
>> >> > I found a lot of using "ClusterTopologyCheckedException" exception.
>> In
>> >> > most
>> >> > use-case these exceptions were just ignored. It's easier to see if
>> >> > issue-4612 will be applied (I'm waiting for code review by my team
>> >> leader)
>> >> > where I renamed all unused exceptions in the "ignored".
>> >> >
>> >> > It means that in some case "retryReadyFuture" can left unset. But
>> there
>> >> are
>> >> > code where exception is being getting from "get()" method in some
>> >> "future"
>> >> > class and don't ignored (exception is sending to method
>> >> > "GridFutureAdapter#onDone()" for example).
>> >> >
>> >> > So I decided not to do a deep global analysis of code and make some
>> >> simple
>> >> > rules.
>> >> > 1. If some method "X" throws ClusterTopologyCheckedException and
>> >> ignores
>> >> > it
>> >> > (or converts to String, there are some variants to lose exception
>> like
>> >> > object) in all methods that call the method "X", then we can don't
>> set
>> >> > "retryReadyFuture" for optimization. But this should be clarified in
>> >> > javadoc.
>> >> > 2. But if exception escapes at least once like object: we must set
>> >> > "retryReadyFuture" in that method.
>> >> >
>> >> > A few times when call tree was simple, I went deeper.
>> >> >
>> >> > So I found only three methods which can throw
>> >> > "ClusterTopologyCheckedException" without setting
>> "retryReadyFuture":
>> >> >
>> >> > 1. IgfsFragmentizerManager#sendWithRetries(UUID nodeId,
>> >> > IgfsCommunicationMessage msg)
>> >> > 2. GridCacheIoManager#sendNoRetry(ClusterNode node, GridCacheMessage
>> >> msg,
>> >> > byte plc)
>> >> > 3. GridCacheIoManager#sendOrderedMessage(ClusterNode node, Object
>> >> topic,
>> >> > GridCacheMessage msg, byte plc, long timeout)
>> >> >
>> >> > In all other methods "ClusterTopologyCheckedException" escapes away
>> too
>> >> > far.
>> >> >
>> >> > What do you think about this approach to make compromise between
>> >> > optimization and correctness?
>> >> >
>> >>
>> >
>> >
>>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message