manifoldcf-dev mailing list archives

From "Shinichiro Abe (JIRA)" <>
Subject [jira] [Commented] (CONNECTORS-1219) Lucene Output Connector
Date Tue, 14 Jul 2015 18:23:04 GMT


Shinichiro Abe commented on CONNECTORS-1219:

Thanks for reviewing my commits. I added a term vector option and replaced the string
arguments with readers; since then I have not seen any OOM errors. I also ran large crawls
against my chaotic ManifoldCF directory many times, and memory usage is fine as long as a
proper max doc length is set.
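
To illustrate the max doc length point: below is a minimal sketch, not the code in the patch, of capping how many characters a reader hands to the indexer, so that a huge document is neither held in memory as one string nor indexed in full. The names BoundedReader and drain are hypothetical and exist only for this example; the sketch uses only java.io and assumes the indexing side accepts a Reader as a field value.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/** Hypothetical helper: passes through at most maxChars characters of the wrapped reader. */
public class BoundedReader extends Reader {
    private final Reader in;
    private long remaining;

    public BoundedReader(Reader in, long maxChars) {
        if (maxChars < 0) {
            throw new IllegalArgumentException("maxChars must be >= 0");
        }
        this.in = in;
        this.remaining = maxChars;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        if (remaining <= 0) {
            return -1; // limit reached: report end of stream
        }
        // Never ask the underlying reader for more than the remaining budget.
        int n = in.read(buf, off, (int) Math.min(len, remaining));
        if (n > 0) {
            remaining -= n;
        }
        return n;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }

    /** Reads a reader to the end in small chunks; used here only to demonstrate the cap. */
    static String drain(Reader r) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4];
        int n;
        while ((n = r.read(buf, 0, buf.length)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        Reader limited = new BoundedReader(new StringReader("abcdefghij"), 6);
        System.out.println(drain(limited));
    }
}
```

In a real connector the wrapped reader would come from the document's content stream rather than a StringReader, and the cap would come from the configured max doc length.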
I'm ready to merge this, but before merging, please check this patch, which implements a
simple search handler running on Jetty. I confirmed a few hours ago that it works well (and
I want to add a highlighting response this week). In the future I plan to create a search
servlet and its API to handle distributed search across multiple MCF instances on multiple
nodes. The servlet will have to send requests to more than one Jetty search handler. I'd
like to add this, but is it too much of a feature for users?

> Lucene Output Connector
> -----------------------
>                 Key: CONNECTORS-1219
>                 URL:
>             Project: ManifoldCF
>          Issue Type: New Feature
>            Reporter: Shinichiro Abe
>            Assignee: Shinichiro Abe
>         Attachments: CONNECTORS-1219-v0.1patch.patch, CONNECTORS-1219-v0.2.patch
> An output connector that writes directly to a local Lucene index, not via a remote search
> engine. It would be nice if we could use the various Lucene APIs on the index directly,
> even though we could do the same thing with a Solr or Elasticsearch index. I assume we
> could do things like classification, categorization, and tagging, using e.g. the
> lucene-classification package.

This message was sent by Atlassian JIRA
