ignite-dev mailing list archives

From Alexey Goncharuk <alexey.goncha...@gmail.com>
Subject Re: ignite 1.4 status
Date Sat, 19 Sep 2015 01:07:24 GMT
Yakov,

Valentin and I debugged the issue with ignite-1171 and I think we got to
the bottom of it. First, pending messages were not reset to the correct
collection on the joining node, which resulted in skipped custom event
notifications. Second, the check that you added to avoid discarding
custom messages was checking the wrong variable and the wrong type :) After
we fixed those two issues, the test seems to pass. Please review my changes again.
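
For anyone following the thread without the branch open, the idea discussed below (hold custom messages back while a node join is in progress, then release them in order) can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the class, enum and field names are hypothetical, it models a single join at a time, and it is not the actual discovery code in ignite-1171.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical sketch only; not the actual Ignite discovery implementation. */
final class DelayedCustomMessages {
    /** Message kinds, named loosely after the messages mentioned in this thread. */
    enum Type { NODE_ADDED, NODE_ADD_FINISHED, CUSTOM }

    /** True between a NODE_ADDED and the matching NODE_ADD_FINISHED (one join at a time). */
    private boolean joinInProgress;

    /** Custom messages held back while a join is in progress. */
    private final Queue<String> delayed = new ArrayDeque<>();

    /** Messages handed to listeners, in delivery order. */
    final List<String> delivered = new ArrayList<>();

    void onMessage(Type type, String payload) {
        switch (type) {
            case NODE_ADDED:
                joinInProgress = true;
                delivered.add(payload);
                break;

            case NODE_ADD_FINISHED:
                joinInProgress = false;
                delivered.add(payload);

                // Release the held-back custom messages in their original order.
                while (!delayed.isEmpty())
                    delivered.add(delayed.poll());
                break;

            case CUSTOM:
                if (joinInProgress)
                    delayed.add(payload); // Do not let it overtake the join.
                else
                    delivered.add(payload);
                break;
        }
    }

    public static void main(String[] args) {
        DelayedCustomMessages d = new DelayedCustomMessages();

        d.onMessage(Type.NODE_ADDED, "node-added");
        d.onMessage(Type.CUSTOM, "custom-1");
        d.onMessage(Type.NODE_ADD_FINISHED, "node-add-finished");

        System.out.println(d.delivered);
    }
}
```

The sketch leaves out the open question from the thread, namely what should happen to delayed messages if the coordinator crashes; in the real protocol that is covered by the guaranteed re-delivery Yakov mentions.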

--AG

2015-09-18 14:10 GMT-07:00 Yakov Zhdanov <yzhdanov@apache.org>:

> Igniters,
>
> While working on ignite-1171 we discovered a couple more issues in discovery
> that might have threatened custom event processing under some circumstances
> (we have continuous processes based on this logic, for example).
>
> Alexey Goncharuk has picked this up.
>
> Another critical issue discovered today -
> https://issues.apache.org/jira/browse/IGNITE-1516 - performance drop in
> offheap query benchmark. Semyon will be fixing it.
>
> https://issues.apache.org/jira/browse/IGNITE-973 - Sergi has come to the
> conclusion that a race is still present in the cache offheap swap logic.
> Currently this is assigned to Semyon, too.
>
> We need to postpone the release until the very beginning of next week.
>
> --Yakov
>
> 2015-09-18 12:01 GMT+03:00 Yakov Zhdanov <yzhdanov@apache.org>:
>
> > Alex, I think that your approach with delaying custom messages will work.
> > As for coordinator crash protection, we guarantee delivery of certain
> > message types (including custom messages). This logic was implemented long
> > ago and seems to work. So, the message just gets resent.
> >
> > Semyon, can you please take a look at Alex's changes?
> >
> > --Yakov
> >
> > 2015-09-18 3:24 GMT+03:00 Alexey Goncharuk <alexey.goncharuk@gmail.com>:
> >
> >> Yakov,
> >>
> >> The approach with collecting discovery data on the NodeAddFinished message
> >> does not work because this message gets relayed to clients before it
> >> passes the whole ring. If we make it pass the ring and relay it to
> >> clients on the second round, we get the same race as I was fixing.
> >>
> >> I think the correct approach here is to delay custom event messages when
> >> a node join is in progress - basically do not allow custom messages between
> >> NodeAddedMessage and the NodeAddFinished message. I implemented a very simple
> >> fix in ignite-1171, however I need you or someone else with good expertise in
> >> the discovery protocol to take a look at my changes because I am sure I missed
> >> something - e.g. I am not sure how delayed messages should be handled in
> >> case the coordinator node crashes.
> >>
> >> 2015-09-17 8:52 GMT-07:00 Yakov Zhdanov <yzhdanov@gridgain.com>:
> >>
> >> > Alex, I think it makes sense to continue investigating this. We can discuss
> >> > whether we include or skip the fix once the fix is ready.
> >> >
> >> > As far as other tickets:
> >> >
> >> > https://issues.apache.org/jira/issues/?jql=project%20%3D%20IGNITE%20AND%20fixVersion%20%3D%20ignite-1.4%20AND%20status%20!%3D%20closed%20ORDER%20BY%20assignee%20ASC%2C%20due%20ASC%2C%20priority%20DESC%2C%20created%20ASC
> >> >
> >> > IGNITE-1171 Getting affinity for topology version earlier than affinity is
> >> > calculated - is on Alex Goncharuk.
> >> > IGNITE-973 Failed to get value for key: 13791. at
> >> > o.a.i.i.processors.query.h2.opt.GridH2AbstractKeyValueRow.getValue(GridH2AbstractKeyValueRow.java:223)
> >> > - assigned to Sergi. There seems to be a problem in offheap indexing which
> >> > can be reproduced from time to time. This is an old issue and I think it can
> >> > be postponed if it does not fit.
> >> >
> >> > +1 IGFS issue
> >> > and the rest are vert.x issues
> >> >
> >> > I hope IGNITE-1171 will be fixed today so the picture becomes much clearer.
> >> >
> >> > --
> >> > Yakov Zhdanov, Director R&D
> >> > *GridGain Systems*
> >> > www.gridgain.com
> >> >
> >> > 2015-09-17 0:59 GMT+03:00 Alexey Goncharuk <alexey.goncharuk@gmail.com>:
> >> >
> >> > > Yakov, Igniters,
> >> > >
> >> > > I have found at least one issue related to the ignite-1171 hang; it is caused
> >> > > by a race between a discovery custom message and the collectDiscoveryData()
> >> > > call (updated the ticket). I remember we wanted to call collectDiscoveryData()
> >> > > during the NodeAddFinishedMessage processing, however it was not
> >> > > implemented - do we think that this is a correct change and do we want it
> >> > > to be fixed in 1.4? Discovery changes are quite sensitive and I would
> >> > > prefer them to be tested thoroughly.
> >> > >
> >> > > 2015-09-16 9:09 GMT-07:00 Yakov Zhdanov <yzhdanov@apache.org>:
> >> > >
> >> > > > Guys,
> >> > > >
> >> > > > I want to update release status.
> >> > > >
> >> > > > Testing has revealed some cache issues which should be fixed with the
> >> > > > release. Moreover, it turned out that these issues block the vert.x release.
> >> > > > So, if we fix them we can consider including vert.x in the 1.4 release,
> >> > > > which is good I think.
> >> > > >
> >> > > > I think that Alex Goncharuk is the best person to look into the vert.x
> >> > > > issues. Alex, please first of all pay attention to IGNITE-1171 - Getting
> >> > > > affinity for topology version earlier than affinity is calculated. A test
> >> > > > reproducing the issue has been added to ignite1.4. Alex, please let us know
> >> > > > if this can be fixed.
> >> > > >
> >> > > > These issues are on Semyon Boikov:
> >> > > >
> >> > > > IGNITE-973 Failed to get value for key: 13791. at
> >> > > > o.a.i.i.processors.query.h2.opt.GridH2AbstractKeyValueRow.getValue(GridH2AbstractKeyValueRow.java:223)
> >> > > > - We need more time to finish with this. Some race in swap is still there.
> >> > > > IGNITE-1452 OptimizedMarshaller.unmarshal hangs in
> >> > > > IgniteCacheQueryNodeRestartSelfTest2 - Need to check TC and merge.
> >> > > >
> >> > > > The rest of the tickets are vert.x related. Here is the link -
> >> > > > https://issues.apache.org/jira/issues/?jql=project%20%3D%20IGNITE%20AND%20fixVersion%20%3D%20ignite-1.4%20AND%20status%20!%3D%20closed%20ORDER%20BY%20assignee%20ASC%2C%20due%20ASC%2C%20priority%20DESC%2C%20created%20ASC
> >> > > >
> >> > > > Andrey Gura, please provide as much information as you can for the rest of
> >> > > > the vert.x tickets.
> >> > > >
> >> > > > Thanks!
> >> > > >
> >> > > > --Yakov
> >> > > >
> >> > > > 2015-09-15 19:12 GMT+03:00 Yakov Zhdanov <yzhdanov@apache.org>:
> >> > > >
> >> > > > > Raul, what is your status with the streamer? I think there is no reason to
> >> > > > > rush. We can put it into 1.5. Please let me know what you think.
> >> > > > >
> >> > > > > As for release status, here are the open tickets -
> >> > > > > https://issues.apache.org/jira/issues/?jql=project%20%3D%20IGNITE%20AND%20fixVersion%20%3D%20ignite-1.4%20AND%20status%20!%3D%20closed%20ORDER%20BY%20assignee%20ASC%2C%20due%20ASC%2C%20priority%20DESC%2C%20created%20ASC
> >> > > > >
> >> > > > > https://issues.apache.org/jira/browse/IGNITE-1239 - Alex Goncharuk, can
> >> > > > > you please let us know if this will be finished today?
> >> > > > > https://issues.apache.org/jira/browse/IGNITE-1490 - Ilya Suntsov is working on
> >> > > > > reproducing this. I suspect we may have problems with near cache evictions.
> >> > > > > Can Val or Alex proceed with this after Ilya finishes the test run? Ilya,
> >> > > > > please respond in the ticket with your results.
> >> > > > >
> >> > > > > Thanks!
> >> > > > >
> >> > > > > --Yakov
> >> > > > >
> >> > > > > 2015-09-15 11:15 GMT+03:00 Raul Kripalani <raul@evosent.com>:
> >> > > > >
> >> > > > >> Hi guys,
> >> > > > >>
> >> > > > >> The MQTT streamer I'm working on will be ready this week. Hopefully as
> >> > > > >> soon as today or tomorrow.
> >> > > > >>
> >> > > > >> It's not important for the 1.4 release, but it seems like it'll make the
> >> > > > >> timeline to get potentially merged.
> >> > > > >>
> >> > > > >> Regards,
> >> > > > >> Raúl.
> >> > > > >> On 15 Sep 2015 00:05, "Yakov Zhdanov" <yzhdanov@apache.org> wrote:
> >> > > > >>
> >> > > > >> > Guys,
> >> > > > >> >
> >> > > > >> > Current status is the following:
> >> > > > >> >
> >> > > > >> > 1. Sam needs to merge his fixes after TC is finished.
> >> > > > >> > 2. Some minor changes pending from Denis + release notes fix pointed out
> >> > > > >> > by Dmitry.
> >> > > > >> > 3. Several suites are still red on TC
> >> > > > >> >
> >> > > > >> > I have moved plenty of tickets to ignite-1.5. Here is the link to the
> >> > > > >> > currently open tickets that I want everyone (esp. assignees) to look through
> >> > > > >> > and tell me whether each ticket can be moved or should be fixed -
> >> > > > >> > https://issues.apache.org/jira/issues/?jql=project%20%3D%20IGNITE%20AND%20fixVersion%20%3D%20ignite-1.4%20AND%20status%20!%3D%20closed%20ORDER%20BY%20due%20ASC%2C%20priority%20DESC
> >> > > > >> >
> >> > > > >> > Alex Goncharuk has 5 tickets.
> >> > > > >> > Semyon Boikov has 5 tickets.
> >> > > > >> > Valentin has 4
> >> > > > >> > Sergi has 4
> >> > > > >> > Vladimir has 3
> >> > > > >> > Ivan V. has 3
> >> > > > >> >
> >> > > > >> > Guys, please look through your tickets and let us know your decision.
> >> > > > >> >
> >> > > > >> > --Yakov
> >> > > > >> >
> >> > > > >> > 2015-09-14 21:04 GMT+03:00 Dmitriy Setrakyan <dsetrakyan@apache.org>:
> >> > > > >> >
> >> > > > >> > > Yakov,
> >> > > > >> > >
> >> > > > >> > > I know you were managing the 1.4 release. Can you please provide an
> >> > > > >> > > update on what goes into the release at this point and what the overall
> >> > > > >> > > plan is?
> >> > > > >> > >
> >> > > > >> > > Thanks,
> >> > > > >> > > D.
> >> > > > >> > >
> >> > > > >> >
> >> > > > >>
> >> > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> >
> >
>
