manifoldcf-dev mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Call for trunk pipeline testers
Date Wed, 11 Jun 2014 23:07:36 GMT
Hi Ahmet,

For 1.x we can't drop backwards compatibility, so we can't drop our custom
solr server implementation.  For 2.0 we might be able to drop things that
are no longer useful, though.

Karl



On Wed, Jun 11, 2014 at 6:42 PM, Ahmet Arslan <iorixxx@yahoo.com.invalid>
wrote:

> Hi,
>
> bq. we are going to also modify the solr connector for not using the
> extract update handler
>
> +1 to this. With this, we can support a wide range of Solr versions by just
> sending XML update messages. Solr setups will be simpler, i.e. they won't
> need the Solr Cell jars. We can drop our custom Solr server implementation.
>
> Ahmet
>
>
>
> On Wednesday, June 11, 2014 7:10 PM, Rafa Haro <rharo@apache.org> wrote:
> Hi Karl,
>
> We (at Zaizi) also had this requirement. We initially addressed it by
> creating a sort of "Processor Connector", mainly for semantically
> enhancing the repository documents before indexing them. We would be
> very happy to give this a try and provide feedback, because our current
> approach is strictly temporary. Apart from processing the document, we
> also had a special requirement, which is to produce different instances
> of repository documents, because we populate more than one index at the
> same time. We would also need to check how we can do exactly the same
> with this processing pipeline.
>
> Apart from this, Karl, we can also take care of the Tika integration
> (actually we already did it) and eventually take care of CONNECTORS-954
> then. Because we already use Tika as a "processor connector", we are also
> going to modify the Solr connector so that it does not use the extract
> update handler, which also presents some problems. Would that also be of
> interest to the community?
>
> Cheers,
> Rafa
>
>
>
>
> On 11/06/14 16:09, Karl Wright wrote:
> > Hi folks,
> >
> > ManifoldCF finally has a pipeline!  All tests pass.  Now I'm looking for
> > people to try things out by hand to see if there are any rough edges,
> > before we get too far along in the 1.7 development cycle to fix them.
> >
> > Trunk has all the necessary moving parts and documentation as well.  There
> > are two transformation connectors available -- one that does nothing but
> > pass data through, and one that forces metadata (just like the framework
> > "Forced metadata" tab).  But since you can have more than one of each kind
> > of connector in a pipeline, this should be enough to exercise things fairly
> > completely.
> >
> > We still need to address a couple of things in the medium and long term.
> > First, we need a Tika transformation connector, that extracts metadata from
> > binary files.  There's an existing ticket for that: CONNECTORS-954.  If
> > anyone wants to take a crack at that, please let me know.  (Takumi Yoshida
> > would be the obvious choice.)  Second, we need to come up with a strategy
> > of removing obsolete tabs/features, like the aforementioned general job
> > Forced Metadata tab.  We've got a fair number of such features around now.
> > These kinds of things cannot be removed without either a comprehensive
> > automatic upgrade, or loss of backwards compatibility.  I am thinking maybe
> > we break with backwards compatibility and work towards cleaning out
> > duplicate features for ManifoldCF 2.0.
> >
> > Thoughts?
> >
> > Karl
> >
>
