manifoldcf-dev mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: [VOTE] Release Apache ManifoldCF 2.11, RC3
Date Tue, 25 Sep 2018 12:15:57 GMT
Excellent!! I'll create a new RC.

Thanks again,
Karl


On Tue, Sep 25, 2018 at 8:13 AM Julien Massiera <
julien.massiera@francelabs.com> wrote:

> This new fix seems to work. Ingestions and deletions are working and the
> image file with huge metadata is indexed!
>
> Julien
>
>
> On 25/09/2018 13:59, Karl Wright wrote:
> > I've committed a hack to trunk.  It has been tested for Solr Cell
> > documents, deletions, and for tika-connector-extracted documents that
> > don't have a lot of metadata.  I'm asking Julien to test it with his
> > specific image that has lots of metadata to see if the pathway for that
> > case works properly.  If it does, I'll spin another RC.
> >
> > Long term, since I'm a Lucene/Solr committer, I think I'm going to have
> > to take SolrJ under my wing if we expect it to work for ManifoldCF.  I
> > don't have a lot of time to do stuff like this anymore but clearly
> > neither does the Solr team.
> >
> > Karl
> >
> >
> > On Tue, Sep 25, 2018 at 6:14 AM Karl Wright <daddywri@gmail.com> wrote:
> >
> >> The back-and-forth is not going well.  Mr. Noble needs to be convinced
> >> that it is a valid use case for Solr to have metadata longer than 4096
> >> characters.  In fact it seems like the Solr folks have deliberately
> >> been trying to get rid of support for multipart posts for a while,
> >> because they don't see the need for them.  I'm still hoping to convince
> >> them otherwise but I'm not getting a positive feel.
> >>
> >> I'm still trying to figure out if multipart posts have any fundamental
> >> conflict with their RequestWriter architecture.  If not I can perhaps
> >> override the RequestWriter implementation and add multipart support that
> >> way.  But it's not going to be a quick process by any means.
> >>
> >>
> >> On Mon, Sep 24, 2018 at 12:13 PM Karl Wright <daddywri@gmail.com> wrote:
> >>
> >>> Hi Julien,
> >>>
> >>> This has nothing to do with the new Tika.
> >>>
> >>> It is not normal; it means that UpdateRequests are not being sent as
> >>> multipart form posts.  It's going to require work from the Solr team
> >>> to fix this problem, however, because everything I do to work around
> >>> the issue nonetheless seems to fail. :-(
> >>>
> >>> I'm having a back-and-forth with Paul Noble right now.  I'll update
> >>> accordingly when I know more.
> >>>
> >>> Karl
> >>>
> >>>
> >>> On Mon, Sep 24, 2018 at 11:33 AM Julien Massiera <
> >>> julien.massiera@francelabs.com> wrote:
> >>>
> >>>> After testing it, it is a +1 for me.
> >>>>
> >>>> However, I found an interesting new issue that comes with the new Tika
> >>>> version. I have a jpg file for which some metadata was not extracted
> >>>> before, such as RedTRC, BlueTRC and GreenTRC, which contain
> >>>> approximately 2048 bytes of data each. As the metadata is passed to
> >>>> Solr through the URI, I get the following error: URI is too large >8192
> >>>>
> >>>> Do we consider it a "normal issue", or is it worth checking the
> >>>> metadata length before sending the ingest request?
> >>>>
> >>>>
> >>>> On 24/09/2018 16:43, Karl Wright wrote:
> >>>>> Please vote on whether to release ManifoldCF 2.11, RC3.  This release
> >>>>> contains a number of fixes/improvements/additions, described in the
> >>>>> CHANGES.txt file.  In addition, it includes Tika 1.19, which has a
> >>>>> number of fixes for classpath issues specifically requested by
> >>>>> ManifoldCF.
> >>>>>
> >>>>> This completely fixes a SolrJ related problem with the Solr Connector
> >>>>> found in RC3.  All tests pass.
> >>>>>
> >>>>> The release artifact can be found at:
> >>>>>
> >>>>> https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.11
> >>>>>
> >>>>> There is also a tag at:
> >>>>>
> >>>>> https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.11-RC3
> >>>>>
> >>>>> Thanks again,
> >>>>> Karl Wright
> >>>>>
> >>>> --
> >>>> Julien MASSIERA
> >>>> Product Development Director
> >>>> France Labs – The Search Experts
> >>>> Meet us at the Enterprise Search & Discovery Summit in Washington DC
> >>>> www.francelabs.com
> >>>>
> >>>>
>
> --
> Julien MASSIERA
> Product Development Director
> France Labs – The Search Experts
> Meet us at the Enterprise Search & Discovery Summit in Washington DC
> www.francelabs.com
>
>
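For readers following the thread, here is a minimal sketch of the kind of pre-flight check Julien asks about: estimating how long the request URI would become once the metadata is URL-encoded into the query string. It is not ManifoldCF or SolrJ code. The class name, the method names and the "/update/extract" path are hypothetical, and the 8192 limit is simply taken from the error message quoted above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of ManifoldCF or SolrJ.
final class UriLengthCheck {
    // Assumption: limit taken from the quoted error "URI is too large >8192".
    public static final int MAX_URI_LENGTH = 8192;

    // Length of "path?k1=v1&k2=v2..." after URL-encoding every key and value.
    public static int encodedUriLength(String path, Map<String, String> params) {
        int length = path.length();
        for (Map.Entry<String, String> e : params.entrySet()) {
            length += 1; // '?' before the first parameter, '&' before the others
            length += URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8).length();
            length += 1; // '='
            length += URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8).length();
        }
        return length;
    }

    public static boolean fitsInUri(String path, Map<String, String> params) {
        return encodedUriLength(path, params) <= MAX_URI_LENGTH;
    }

    public static void main(String[] args) {
        // Three fields of 2048 characters each, as in the jpg described above.
        // '/' is percent-encoded to "%2F", so each value triples in size.
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("RedTRC", "/".repeat(2048));
        metadata.put("GreenTRC", "/".repeat(2048));
        metadata.put("BlueTRC", "/".repeat(2048));

        String path = "/update/extract"; // hypothetical handler path
        System.out.println("encoded length: " + encodedUriLength(path, metadata));
        System.out.println("fits in URI:    " + fitsInUri(path, metadata));
    }
}
```

Note that the encoded length, not the raw length, is what matters: values containing characters that must be percent-encoded can grow to three times their original size, so three fields of roughly 2048 bytes can exceed the limit even though their raw total does not.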

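To illustrate why Karl points at multipart form posts, the sketch below builds a multipart/form-data request body by hand using only the JDK. It is an illustration of the wire format, not the SolrJ RequestWriter API and not the hack committed to trunk. The class name, the method names and the boundary string are hypothetical, and field names are assumed not to contain quotes or line breaks.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the multipart/form-data layout; not SolrJ code.
final class MultipartFormSketch {
    // Each field becomes one part in the body, so the URI stays short
    // no matter how large the metadata values are.
    public static byte[] buildBody(String boundary, Map<String, String> fields) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            write(out, "--" + boundary + "\r\n");
            // Assumption: names need no escaping (no quotes, CR or LF).
            write(out, "Content-Disposition: form-data; name=\"" + e.getKey() + "\"\r\n");
            write(out, "\r\n");
            write(out, e.getValue() + "\r\n");
        }
        write(out, "--" + boundary + "--\r\n"); // closing delimiter
        return out.toByteArray();
    }

    private static void write(ByteArrayOutputStream out, String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        out.write(bytes, 0, bytes.length);
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("RedTRC", "x".repeat(2048));
        metadata.put("GreenTRC", "x".repeat(2048));
        metadata.put("BlueTRC", "x".repeat(2048));

        byte[] body = buildBody("example-boundary", metadata);
        // The request would carry the header
        // Content-Type: multipart/form-data; boundary=example-boundary
        System.out.println("body bytes: " + body.length);
    }
}
```

With this layout the large metadata values travel in the request body, which is not subject to the URI length limit reported earlier in the thread.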