manifoldcf-dev mailing list archives

From "Karl Wright (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CONNECTORS-917) SharePoint connector would benefit from site discovery
Date Sun, 30 Nov 2014 09:16:12 GMT

     [ https://issues.apache.org/jira/browse/CONNECTORS-917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright updated CONNECTORS-917:
-----------------------------------
    Fix Version/s:     (was: ManifoldCF 2.0)
                   ManifoldCF next

> SharePoint connector would benefit from site discovery
> ------------------------------------------------------
>
>                 Key: CONNECTORS-917
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-917
>             Project: ManifoldCF
>          Issue Type: Improvement
>          Components: SharePoint connector
>    Affects Versions: ManifoldCF 1.7
>            Reporter: Karl Wright
>            Assignee: Karl Wright
>             Fix For: ManifoldCF next
>
>
> The current SharePoint connector can crawl only a single SharePoint site, but SharePoint
> can support multiple sites; in some cases there are hundreds of them.  Setting up a
> connection and jobs for each one would be a difficult task.
> The SharePoint admin site allows you to discover the sites that exist.  Using this feature
> as part of the crawl would allow a much more automated way of handling large SharePoint
> installations.
> Some notes:
>    - Not yet clear how "one site" vs. "many sites" should coexist in one connector
>      - Form of document identifier must change
>      - Each document identifier must include the site path first
>      - Since subsite path can be just "/", also needs to be resilient against that
>      - Something like: <site_path>//<current_subsite_doc_list_item_etc_path>.
>        But "//" will collide with old-style identifiers.
>      - If an old-style document identifier must always start with a "/", then we can simply
>        start a new-style one with (say) a "+" to signal that it is a new-style identifier
>      - Not yet clear whether there's a new form that would allow us to know if a doc
>        identifier is old form or not
>    - The native authority also currently needs to know what site it is working with
>      - Site discovery must therefore also be run in the authority, and tokens for each
>        discovered site must be returned
>      - Native tokens must therefore be qualified with a site ID
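
The identifier notes above can be illustrated with a minimal sketch. It is not the connector's implementation: the function names, the "+" marker (which the notes offer only as "say"), and the normalization of a root site path "/" to an empty prefix are all assumptions made for illustration. It also assumes, as the notes do conditionally, that every old-style identifier starts with "/".

```python
from typing import Optional, Tuple

# Hypothetical constants: the notes suggest "+" only as an example marker.
NEW_STYLE_MARKER = "+"
SEPARATOR = "//"


def encode_identifier(site_path: str, item_path: str) -> str:
    """Build a new-style identifier: <marker><site_path>//<item_path>."""
    # Drop trailing slashes so a root site ("/") becomes an empty prefix
    # and cannot merge with the "//" separator.
    site = site_path.rstrip("/")
    if SEPARATOR in site:
        raise ValueError("site path must not contain '//'")
    return NEW_STYLE_MARKER + site + SEPARATOR + item_path


def decode_identifier(identifier: str) -> Tuple[Optional[str], str]:
    """Return (site_path, item_path); site_path is None for old-style ids."""
    if not identifier.startswith(NEW_STYLE_MARKER):
        # Assumed old-style: passed through untouched, with no site attached.
        return None, identifier
    body = identifier[len(NEW_STYLE_MARKER):]
    # The first "//" ends the site path, because encode_identifier
    # guarantees the site part itself never contains "//".
    site, sep, item = body.partition(SEPARATOR)
    if not sep:
        raise ValueError("new-style identifier is missing the '//' separator")
    return (site or "/"), item
```

Because the marker is checked before anything else, an old-style identifier that itself contains "//" is never split, which is one way to sidestep the collision mentioned in the notes.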



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
