manifoldcf-dev mailing list archives

From "Karl Wright (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CONNECTORS-1306) how to include URL inside the source tag
Date Thu, 28 Apr 2016 13:29:12 GMT

    [ https://issues.apache.org/jira/browse/CONNECTORS-1306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15262128#comment-15262128 ]

Karl Wright commented on CONNECTORS-1306:
-----------------------------------------

You have some choices here for implementation.

(1) We can add a fixed field in the WebConnector.  That makes sense because the URL may be
specifically useful for web crawling, whereas it is less useful in other cases.
(2) We could add support in the Metadata Adjuster for forcing document IDs into the attribute
of your choice.  This would give people general access to document IDs.
(3) We could add this support solely to the Elastic Search connector.

Please help us decide.
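
All three options come down to the same operation: copying the document ID (for a web crawl, the URL) into a regular field of the indexed document so that it can be searched. A minimal sketch of that operation follows. It does not use any ManifoldCF or Elasticsearch API; the helper name `with_document_id` and the default field name `url` are assumptions made for illustration only.

```python
from typing import Any, Dict


def with_document_id(doc_id: str, source: Dict[str, Any], field: str = "url") -> Dict[str, Any]:
    """Return a copy of `source` with the document ID copied into `field`.

    Hypothetical helper for illustration; not part of ManifoldCF or Elasticsearch.
    """
    # Refuse to silently overwrite metadata the crawler already extracted.
    if field in source:
        raise ValueError(f"field {field!r} already present in source")
    enriched = dict(source)  # shallow copy, the caller's dict stays untouched
    enriched[field] = doc_id
    return enriched


# For a web crawl the document ID is the page URL (example value, assumed).
doc_id = "https://example.com/page.html"
body = with_document_id(doc_id, {"title": "Example page"})
print(body)
```

Where this copy happens is the actual decision: in the repository connector for option (1), in a transformation step for option (2), or in the output connector for option (3).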


> how to include URL inside the source tag
> ----------------------------------------
>
>                 Key: CONNECTORS-1306
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1306
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Elastic Search connector
>    Affects Versions: ManifoldCF 2.2
>         Environment: Production
>            Reporter: Arunkumar 
>            Assignee: Karl Wright
>             Fix For: ManifoldCF 2.5
>
>
> We are crawling a website and pushing the documents to Elasticsearch. During crawling, the
> URLs are converted to IDs in Elasticsearch. In Elasticsearch we cannot search against the
> ID field, so we need to include the URLs in the source tag (meta tag).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
