lucene-solr-user mailing list archives

From Alexandre Rafalovitch <arafa...@gmail.com>
Subject Re: apply document filter to solr index
Date Mon, 04 Jan 2016 14:22:31 GMT
Well, you already have a crawling and extraction pipeline. You can probably
inject a classification step somewhere in there, possibly an NLP classifier
trained on a manually labeled seed set. Or just start with a list of typical
recipe words.

This is more of a pre-Solr stage, though: the filtering happens before the
documents reach the index.
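
As a rough illustration of the "list of typical words" idea, here is a minimal Python sketch of a keyword check that could run in the pipeline before documents are sent to Solr. The term list, the 0.05 threshold, the "content" field name, and the function names are all made up for the example; they would need tuning against a real crawl.

```python
import re

# Hypothetical seed vocabulary; a real list would be built from your own data.
RECIPE_TERMS = {
    "recipe", "ingredients", "tablespoon", "teaspoon", "bake", "oven",
    "simmer", "stir", "cup", "cups", "preheat", "serves",
}


def recipe_score(text):
    """Return the fraction of word tokens found in the seed vocabulary."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in RECIPE_TERMS)
    return hits / len(tokens)


def looks_like_recipe(text, threshold=0.05):
    """Assumed cutoff: at least 5% of the tokens are recipe words."""
    return recipe_score(text) >= threshold


def filter_for_indexing(docs, threshold=0.05):
    """Keep only documents whose 'content' field passes the keyword check."""
    return [
        doc for doc in docs
        if looks_like_recipe(doc.get("content", ""), threshold)
    ]
```

Only the documents returned by filter_for_indexing would then be posted to Solr. A plain word-ratio like this is crude, so it is best treated as a starting point, and the documents it accepts or rejects can later serve as seed data for a trained classifier.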

Regards,
    Alex
On 4 Jan 2016 7:37 pm, <liviuchristian@yahoo.com.invalid> wrote:

> Hi everyone, I'm working on a search engine based on Solr which indexes
> documents from a large variety of websites.
> The engine is focused on cooking recipes. However, one problem is that these
> websites provide not only content related to cooking recipes but also
> content related to fashion, travel, politics, liberty rights, etc., which
> is not what the user expects to find on a search engine dedicated to
> cooking recipes.
> Is there any way to filter out content which is not related to the core
> business of the search engine?
> Something like parental control software, maybe?
> Kind regards,
> Christian
> Christian Fotache Tel: 0728.297.207 Fax: 0351.411.570
