geode-dev mailing list archives

From Dan Smith <dsm...@pivotal.io>
Subject Re: Lucene upgrade
Date Wed, 06 Nov 2019 22:42:38 GMT
>
> 1.) We add some product code/lucene listener to detect whether we have old
> versions of geode and if so, do not write to lucene on the newly updated
> node until all versions are up to date.


Elaborating on this option a little more: this might be as simple as adding
something like the check below at the beginning of LuceneEventListener.process.
There may be a better way to cache or check whether old members are present.

The danger with this approach is that the queues will grow until the
upgrade is complete. But maybe that is the only way to successfully do a
rolling upgrade with lucene indexes.

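// True while any member still runs a Geode version older than 1.11.0.
boolean hasOldMember = cache.getMembers().stream()
    .map(InternalDistributedMember.class::cast)
    .map(InternalDistributedMember::getVersionObject)
    .anyMatch(version -> version.compareTo(Version.GEODE_1_11_0) < 0);

if (hasOldMember) {
  // Skip the Lucene write for now; the events stay queued until the
  // upgrade is complete.
  return false;
}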
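
For anyone who wants to try the gating logic without a running cluster, here
is a minimal, self-contained sketch of the same idea. MemberVersion and
LuceneWriteGate are hypothetical stand-ins invented for illustration, not
Geode classes, and the sketch assumes that returning false leaves the batch on
the queue to be retried later.

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical gate around the index write; it only models the decision.
final class LuceneWriteGate {
  private final MemberVersion minimumVersion;

  LuceneWriteGate(MemberVersion minimumVersion) {
    this.minimumVersion = minimumVersion;
  }

  // True when at least one member is older than the minimum version.
  boolean hasOldMember(List<MemberVersion> memberVersions) {
    return memberVersions.stream().anyMatch(v -> v.compareTo(minimumVersion) < 0);
  }

  // Returns false without writing while old members remain, so the caller
  // keeps the batch queued; otherwise runs the write and returns true.
  boolean process(List<MemberVersion> memberVersions, Runnable writeToIndex) {
    if (hasOldMember(memberVersions)) {
      return false;
    }
    writeToIndex.run();
    return true;
  }
}

// Hypothetical stand-in for a product version; not Geode's Version class.
final class MemberVersion implements Comparable<MemberVersion> {
  private static final Comparator<MemberVersion> ORDER =
      Comparator.comparingInt((MemberVersion v) -> v.major)
          .thenComparingInt(v -> v.minor)
          .thenComparingInt(v -> v.patch);

  final int major;
  final int minor;
  final int patch;

  MemberVersion(int major, int minor, int patch) {
    this.major = major;
    this.minor = minor;
    this.patch = patch;
  }

  @Override
  public int compareTo(MemberVersion other) {
    return ORDER.compare(this, other);
  }
}
```

With a minimum of 1.11.0, a member list that still contains 1.10.0 makes
process return false without invoking the write. Once every member is at
1.11.0 or later, the write runs and process returns true.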


On Wed, Nov 6, 2019 at 2:16 PM Jason Huynh <jhuynh@pivotal.io> wrote:

> Jake - from my understanding, geode-lucene is implemented using a
> partitioned region as a "file system" for the Lucene files.  As servers are
> rolled, the issue is that the new servers have the new codec.  As puts occur
> on the user's data region, the async listeners process events on new and old
> servers alike.  If a new server writes using the new codec, the file is
> written into the partitioned region, but if an old server with the old codec
> needs to read that file, it will blow up because it doesn't know about the
> new codec.
> Option 1 is for the new servers not to process/write if they detect
> members running a different Geode version (pre-codec change).
> Option 2 is similar but requires users to pause the AEQ/Lucene listeners.
>
> Deleting the indexes and recreating them can be quite expensive, mostly
> due to tombstone creation when creating a new Lucene index, but it could be
> considered Option 3.  It would also probably require
> https://issues.apache.org/jira/browse/GEODE-3924 to be completed.
>
> Gester - I may be wrong but I think option 1 is still doable.  We just need
> to not write using the new codec until after all servers are upgraded.
>
> There was also some upgrade challenge with scoring from what I remember,
> but that's a different topic...
>
>
> On Wed, Nov 6, 2019 at 1:00 PM Xiaojian Zhou <gzhou@pivotal.io> wrote:
>
> > He tried to upgrade the Lucene version from the current 6.6.4 to 8.2.
> > There are some challenges. One is that the codec changed, which also
> > changed the index format.
> >
> > That's why we did not implement it.
> >
> > If he resolves the coding challenges, then a rolling upgrade will probably
> > need option 2 to work around it.
> >
> > Regards
> > Gester
> >
> >
> > On Wed, Nov 6, 2019 at 11:47 AM Jacob Barrett <jbarrett@pivotal.io> wrote:
> >
> > > What about “versioning” the region that backs the indexes? Old servers
> > > with the old Lucene would continue to read/write the old region. New
> > > servers would start re-indexing with the new version. Given the async
> > > nature of the indexing, would the mismatch in indexing for some period
> > > of time have an impact?
> > >
> > > Not an ideal solution but it’s something.
> > >
> > > In my previous life we just deleted the indexes and rebuilt them on
> > > upgrade but that was specific to our application.
> > >
> > > -Jake
> > >
> > >
> > > > On Nov 6, 2019, at 11:18 AM, Jason Huynh <jhuynh@pivotal.io> wrote:
> > > >
> > > > Hi Mario,
> > > >
> > > > > I think there are a few ways to accomplish what Dan was suggesting...
> > > > > Dan or others, please chime in with more options/solutions.
> > > > >
> > > > > 1.) We add some product code/lucene listener to detect whether we
> > > > > have old versions of geode and if so, do not write to lucene on the
> > > > > newly updated node until all versions are up to date.
> > > > >
> > > > > 2.)  We document it and provide instructions (and a way) to pause
> > > > > lucene indexing before someone attempts to do a rolling upgrade.
> > > > >
> > > > > I'd prefer option 1 or some other robust solution, because I think
> > > > > option 2 has many possible issues.
> > > >
> > > >
> > > > -Jason
> > > >
> > > >
> > > > >> On Wed, Nov 6, 2019 at 1:03 AM Mario Kevo <mario.kevo@est.tech> wrote:
> > > >>
> > > >> Hi Dan,
> > > >>
> > > > >> thanks for the suggestions.
> > > > >> I didn't find a way to write Lucene indexes in the older format.
> > > > >> Lucene only supports reading old-format indexes with a newer
> > > > >> version, by using lucene-backward-codecs.
> > > > >>
> > > > >> Regarding freezing writes to the Lucene index, that means that we
> > > > >> need to start locators and servers, create the Lucene index on the
> > > > >> server, roll it to current, and then do puts. In this case the
> > > > >> tests passed. Is that OK?
> > > >>
> > > >>
> > > >> BR,
> > > >> Mario
> > > >>
> > > >>
> > > >>> On Mon, 2019-11-04 at 17:07 -0800, Dan Smith wrote:
> > > > >>> I think the issue probably has to do with doing a rolling upgrade
> > > > >>> from an old version of geode (with an old version of lucene) to the
> > > > >>> new version of geode.
> > > > >>>
> > > > >>> Geode's lucene integration works by writing the lucene index to a
> > > > >>> colocated region. So lucene index data that was generated on one
> > > > >>> server can be replicated or rebalanced to other servers.
> > > > >>>
> > > > >>> I think what may be happening is that data written by a geode member
> > > > >>> with a newer version is being read by a geode member with an old
> > > > >>> version. Because this is a rolling upgrade test, members with
> > > > >>> multiple versions will be running as part of the same cluster.
> > > > >>>
> > > > >>> I think to really fix this rolling upgrade issue we would need to
> > > > >>> somehow configure the new version of lucene to write data in the old
> > > > >>> format, at least until the rolling upgrade is complete. I'm not sure
> > > > >>> if that is possible with lucene or not - but perhaps? Another option
> > > > >>> might be to freeze writes to the lucene index during the rolling
> > > > >>> upgrade process. Lucene indexes are asynchronous, so this wouldn't
> > > > >>> necessarily require blocking all puts. But it would require queueing
> > > > >>> up a lot of updates.
> > > >>> -Dan
> > > >>>
> > > > >>> On Mon, Nov 4, 2019 at 12:05 AM Mario Kevo <mario.kevo@est.tech> wrote:
> > > >>>
> > > >>>> Hi geode dev,
> > > >>>>
> > > > >>>> I'm working on upgrading Lucene to a newer version
> > > > >>>> (https://issues.apache.org/jira/browse/GEODE-7309).
> > > > >>>>
> > > > >>>> I followed the instructions from
> > > > >>>> https://cwiki.apache.org/confluence/display/GEODE/Upgrading+to+Lucene+7.1.0
> > > > >>>> and also added some other changes that are needed for Lucene 8.2.0.
> > > >>>>
> > > > >>>> I found some problems with tests:
> > > > >>>> * geode-lucene/src/test/java/org/apache/geode/cache/lucene/internal/distributed/DistributedScoringJUnitTest.java:
> > > > >>>>
> > > > >>>> * geode-lucene/src/upgradeTest/java/org/apache/geode/cache/lucene/RollingUpgradeQueryReturnsCorrectResultsAfterClientAndServersAreRolledOver.java:
> > > > >>>> * geode-lucene/src/upgradeTest/java/org/apache/geode/cache/lucene/RollingUpgradeQueryReturnsCorrectResultAfterTwoLocatorsWithTwoServersAreRolled.java:
> > > > >>>> * ./geode-lucene/src/upgradeTest/java/org/apache/geode/cache/lucene/RollingUpgradeQueryReturnsCorrectResultsAfterServersRollOverOnPartitionRegion.java:
> > > > >>>> * ./geode-lucene/src/upgradeTest/java/org/apache/geode/cache/lucene/RollingUpgradeQueryReturnsCorrectResultsAfterServersRollOverOnPersistentPartitionRegion.java:
> > > >>>>
> > > > >>>>      -> failed due to
> > > > >>>> Caused by: org.apache.lucene.index.IndexFormatTooOldException:
> > > > >>>> Format version is not supported (resource
> > > > >>>> BufferedChecksumIndexInput(segments_1)): 6 (needs to be between 7
> > > > >>>> and 9). This version of Lucene only supports indexes created with
> > > > >>>> release 6.0 and later.
> > > > >>>>        at org.apache.lucene.codecs.CodecUtil.checkHeaderNoMagic(CodecUtil.java:213)
> > > > >>>>        at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:305)
> > > > >>>>        at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:289)
> > > > >>>>        at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:846)
> > > > >>>>        at org.apache.geode.cache.lucene.internal.IndexRepositoryFactory.finishComputingRepository(IndexRepositoryFactory.java:123)
> > > > >>>>        at org.apache.geode.cache.lucene.internal.IndexRepositoryFactory.computeIndexRepository(IndexRepositoryFactory.java:66)
> > > > >>>>        at org.apache.geode.cache.lucene.internal.PartitionedRepositoryManager.computeRepository(PartitionedRepositoryManager.java:151)
> > > > >>>>        at org.apache.geode.cache.lucene.internal.PartitionedRepositoryManager.lambda$computeRepository$1(PartitionedRepositoryManager.java:170)
> > > > >>>>        ... 16 more
> > > >>>>
> > > >>>>
> > > >>>> *
> > > >>>> geode-
> > > >>>>
> lucene/src/upgradeTest/java/org/apache/geode/cache/lucene/RollingUp
> > > >>>>
> gradeQueryReturnsCorrectResultsAfterClientAndServersAreRolledOverAl
> > > >>>> lBucketsCreated.java:
> > > >>>>
> > > >>>>      -> failed with the same exception as previous tests
> > > >>>>
> > > >>>>
> > > > >>>> I found this on the web:
> > > > >>>> https://stackoverflow.com/questions/47454434/solr-indexing-issue-after-upgrading-from-4-7-to-7-1
> > > > >>>> but I don't have an idea how to proceed with it.
> > > > >>>>
> > > > >>>> Does anyone have any idea how to fix it?
> > > >>>>
> > > >>>> BR,
> > > >>>> Mario
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>
> > >
> >
>
