archiva-dev mailing list archives

From Olivier Lamy <ol...@apache.org>
Subject Re: maven-indexer / Lucene
Date Thu, 06 Jul 2017 02:39:26 GMT
Yup.
The idea is to have the maven-indexer build produce an extra jar with a shaded
Lucene version.
So the Lucene classes (the version used by Maven Indexer) will be relocated into
a package called org.apache.maven.index.shaded.lucene (such as
org.apache.maven.index.shaded.lucene.search.BooleanClause).
Then you exclude the Lucene dependencies used by Maven Indexer and voilà.
The voilà is a bit optimistic and it's not so easy, but anyway I'm working on it ATM.
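
To make the relocation concrete, here is a minimal sketch in Java of the
package-name rewrite that shading performs. It is only an illustration, not the
maven-indexer build itself: the class name ShadedNames and the method relocate
are hypothetical, and the source prefix org.apache.lucene is assumed to be the
package of the unshaded Lucene classes.

```java
public final class ShadedNames {
    // Assumption: package prefix of the unshaded Lucene classes.
    static final String ORIGINAL_PREFIX = "org.apache.lucene";
    // Target package named in the message above.
    static final String SHADED_PREFIX = "org.apache.maven.index.shaded.lucene";

    // Hypothetical helper: maps a class name to its relocated name.
    // Names outside the Lucene package are returned unchanged.
    static String relocate(String className) {
        if (className.equals(ORIGINAL_PREFIX)
                || className.startsWith(ORIGINAL_PREFIX + ".")) {
            return SHADED_PREFIX + className.substring(ORIGINAL_PREFIX.length());
        }
        return className;
    }

    public static void main(String[] args) {
        // prints org.apache.maven.index.shaded.lucene.search.BooleanClause
        System.out.println(relocate("org.apache.lucene.search.BooleanClause"));
        // prints org.example.Unrelated (not a Lucene class, left alone)
        System.out.println(relocate("org.example.Unrelated"));
    }
}
```

With such a jar, the indexer would refer to the relocated class names only, so
the unshaded org.apache.lucene package would stay free for whatever Lucene
version another library brings in.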


On 6 July 2017 at 07:08, Martin <martin_s@apache.org> wrote:

> What do you mean exactly by shading? Moving to another package name?
>
> On Wednesday, 5 July 2017, 01:19:17 CEST, Olivier Lamy wrote:
> > Maybe an option is to use some shading?
> > I'm thinking of shading the Lucene packages used by Maven Indexer. I can
> > easily provide a build for that.
> > WDYT?
> >
> > On 26 June 2017 at 11:49, Olivier Lamy <olamy@apache.org> wrote:
> > > Hi
> > > Graph/document storage could be convenient (but not possible with
> > > neo4j, as it's GPL-licensed [1]).
> > > Well, we can add Solr as an additional webapp with our Jetty
> > > distribution, but this will be a pain for users who want to use Tomcat
> > > or any other servlet container...
> > > We still need to investigate a new storage model :-)
> > >
> > > Olivier
> > > [1] https://neo4j.com/licensing/
> > >
> > > On 25 June 2017 at 06:26, Martin <martin_s@apache.org> wrote:
> > >> Yes, you are right. The Lucene dependency causes a lot of trouble and
> > >> will cause headaches with each version change of one of the dependencies.
> > >> What are the requirements for a replacement?
> > >> - Do we want to store hierarchical data?
> > >> - Do we want to store metadata for nodes?
> > >> - Full-text search (only metadata, or for artifacts too)?
> > >> - Blob / artifact storage (I don't think so, but I'm not so familiar with
> > >> the Archiva artifact model)?
> > >>
> > >> Maybe a graph database could be an alternative. I don't know if the
> > >> license of neo4j is compatible with the Apache license, and I think it
> > >> brings in Lucene as a dependency too. I will have a look.
> > >> The problem is, if full-text search is needed, I think most of the
> > >> frameworks will give us a Lucene dependency if they are embedded.
> > >>
> > >> Other alternatives:
> > >> - Implement full-text search on our own (an index of the metadata stored
> > >> via the Archiva API) and use the Lucene dependency that comes from
> > >> maven-indexer
> > >> - JCR Oak with Solr. Solr is not embedded; it must run as its own
> > >> application (war).
> > >>
> > >> Greetings
> > >>
> > >> Martin
> > >>
> > >> On Saturday, 24 June 2017, 14:05:26 CEST, Olivier Lamy wrote:
> > >> > Well, this is going to be a pain.
> > >> > IMHO we need to find a new alternative to JCR Oak.
> > >> > And something not using Lucene, as it's a real pain to have different
> > >> > libraries using Lucene when they do not update at the same time (and
> > >> > Lucene breaks backward compat so quickly...)
> > >> > Any ideas? I'd like to have something embedded (but with a possible
> > >> > external server configuration).
> > >> > There is currently a Cassandra implementation. I was not satisfied
> > >> > with its performance, but I guess I did that 4 years ago, so it can be
> > >> > improved for sure :-)
> > >> > Maybe OrientDB?
> > >> > What else?
> > >> >
> > >> > On 24 June 2017 at 09:50, Olivier Lamy <olamy@apache.org> wrote:
> > >> > > Well, the issue is incompatible versions of Lucene for Maven Indexer
> > >> > > and Oak (well, I can try to push a patch to Oak for upgrading...)
> > >> > >
> > >> > >> On 24 June 2017 at 08:41, Olivier Lamy <olamy@apache.org> wrote:
> > >> > >> Hi
> > >> > >> Maven Indexer 6.0-SNAPSHOT doesn't need the Plexus bridge anymore.
> > >> > >> I'm working on it in the branch (feature/jcr_oak).
> > >> > >> Not sure why, but I have intermittent failures with the store-jcr
> > >> > >> module.
> > >> > >> I definitely agree on the upgrade.
> > >> > >> Well, we can simply detect that it's not Oak compatible and schedule a
> > >> > >> full reindex (maybe with a message in logs and UI?)
> > >> > >> But we need to be sure we can still read the central index, and I'm not
> > >> > >> sure about a possible Lucene conflict between Oak and Maven Indexer.
> > >> > >> Can we work on this branch? (I created a Jenkins job for it:
> > >> > >> https://builds.apache.org/view/A-D/view/Archiva/job/archiva-jcr-oak-branch/)
> > >> > >> If you prefer master, I would say no worries either.
> > >> > >> Something else to look at is upgrading maven-core etc...
> > >> > >> Anyway
> > >> > >> Cheers
> > >> > >> Olivier
> > >> > >>
> > >> > >> On 22 June 2017 at 19:16, Martin <martin_s@apache.org> wrote:
> > >> > >>> Hi,
> > >> > >>>
> > >> > >>> upgrading the maven indexer leads to some major changes.
> > >> > >>> Lucene is used by maven-indexer and also by Jackrabbit. Jackrabbit
> > >> > >>> sticks to the old 3.x version and, as I see it, they will not move to
> > >> > >>> a newer version.
> > >> > >>> There is Jackrabbit Oak as an alternative.
> > >> > >>> I tried a proof of concept and could replace the Jackrabbit
> > >> > >>> implementation of metadata-store-jcr with an Oak implementation. At
> > >> > >>> least I got all the unit tests of this module to pass.
> > >> > >>> But switching to Oak has some drawbacks:
> > >> > >>> - The repository format changed and we must provide a way to migrate
> > >> > >>> (either migrate the existing repository or create a new one by
> > >> > >>> reindexing)
> > >> > >>> - The Lucene version used is newer but does not match the version
> > >> > >>> from the maven-indexer dependencies. Some incompatibilities may come
> > >> > >>> up that are not solvable without using a modified version of one of
> > >> > >>> the two. Or there may be the possibility to switch to Solr (as a
> > >> > >>> separate component) and get rid of the Lucene dependencies for JCR
> > >> > >>> inside the Archiva project.
> > >> > >>>
> > >> > >>> Switching to maven-indexer 6.0-SNAPSHOT means some changes too:
> > >> > >>> - The Plexus-Sisu-Bridge does not work as before.
> > >> > >>> - We must migrate from the NexusIndexer to the indexer API.
> > >> > >>>
> > >> > >>> So switching to the new indexer and Oak means more work than expected
> > >> > >>> and some risks regarding new incompatibility problems. And I think
> > >> > >>> this cannot be done without broken master builds for some time.
> > >> > >>>
> > >> > >>> So, what should we do? I think maven indexer is one of the core
> > >> > >>> components of Archiva, and we should use the 3.x version to migrate
> > >> > >>> to the new indexer version, even if this means switching to JCR Oak.
> > >> > >>> Otherwise it would mean sticking to the old version for the next
> > >> > >>> years.
> > >> > >>> @Olivier, regarding the maven-indexer / Sisu bridge API changes, I
> > >> > >>> hope you can provide useful help.
> > >> > >>>
> > >> > >>> I committed the PoC to the branch feature/jcr_oak. There are some
> > >> > >>> modules where the tests do not pass (mainly because of the indexer
> > >> > >>> API changes).
> > >> > >>>
> > >> > >>> Any comments?
> > >> > >>>
> > >> > >>> Cheers
> > >> > >>>
> > >> > >>> Martin
> > >> > >>>
> > >> > >>> On Tuesday, 13 June 2017, 09:07:35 CEST, Olivier Lamy wrote:
> > >> > >>> > forget it but we need to ensure we can read maven index files....
> > >> > >>> >
> > >> > >>> > On 13 June 2017 at 17:06, Olivier Lamy <olamy@apache.org> wrote:
> > >> > >>> > > Hi,
> > >> > >>> > > Remember Jackrabbit depends on Lucene as well, so upgrading Lucene
> > >> > >>> > > can be a problem here.
> > >> > >>> > > Regarding maven-indexer, yes, we can depend on a snapshot until the
> > >> > >>> > > release.
> > >> > >>> > > I can release it ;-)
> > >> > >>> > >
> > >> > >>> > > On 13 June 2017 at 06:06, Martin <martin_s@apache.org> wrote:
> > >> > >>> > >> Hi,
> > >> > >>> > >>
> > >> > >>> > >> the Lucene version depends on the maven indexer. But I'm not sure
> > >> > >>> > >> about the current state of maven-indexer. The version has not
> > >> > >>> > >> changed since some time in 2013.
> > >> > >>> > >>
> > >> > >>> > >> There are commits on the master branch since then, and the Lucene
> > >> > >>> > >> version has been changed too, but no releases were tagged.
> > >> > >>> > >> Does it make sense to switch to the maven-indexer 6.0-SNAPSHOT?
> > >> > >>> > >>
> > >> > >>> > >> As far as I know, there are new compact index formats with new
> > >> > >>> > >> Lucene versions, but I'm not sure if this is relevant for the
> > >> > >>> > >> Maven indexes.
> > >> > >>> > >>
> > >> > >>> > >> Cheers
> > >> > >>> > >>
> > >> > >>> > >> Martin
> > >> > >>> > >
> > >> > >>> > > --
> > >> > >>> > > Olivier Lamy
> > >> > >>> > > http://twitter.com/olamy | http://linkedin.com/in/olamy


-- 
Olivier Lamy
http://twitter.com/olamy | http://linkedin.com/in/olamy
