manifoldcf-dev mailing list archives

From "Karl Wright (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CONNECTORS-1203) Erratic handling of Sharepoint 2010 _ModerationStatus metadata
Date Sat, 06 Jun 2015 10:42:00 GMT

    [ https://issues.apache.org/jira/browse/CONNECTORS-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575661#comment-14575661 ]

Karl Wright commented on CONNECTORS-1203:
-----------------------------------------

Hi Dale,

In order to actually see what's going on, we will need to change a header in the request that
HttpClient is making to the server.

It turns out that HttpClient's default behavior is now to send an "Accept-Encoding:
gzip,deflate" header on all requests unless told otherwise, so the server is free to compress
its responses.  So we will need to tell it otherwise.

In the file:
framework/connector-common/src/main/java/org/apache/manifoldcf/connectorcommon/common/CommonsHTTPSender.java


around line 312, you will see:

{code}
    method.setHeader(new BasicHeader("Accept","*/*"));
{code}

Please add a line:

{code}
    method.setHeader(new BasicHeader("Accept-Encoding",""));
{code}

You will then need to rebuild, and repeat your data gathering.  You should see XML going back
and forth then, rather than gobbledegook.
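
To illustrate why the wire log is unreadable while compression is negotiated, here is a minimal sketch using only the JDK. It is not ManifoldCF or HttpClient code: the class name, the method names, and the sample XML are all made up for illustration. It gzips a small XML body, prints the bytes roughly the way a wire log would, and then prints the decompressed text.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; not part of ManifoldCF or HttpClient.
public class WireLogDemo {

    // Compress text the way a server may when the client advertises gzip support.
    static byte[] gzip(String text) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Reverse of gzip(): recover the original text from the compressed bytes.
    static String gunzip(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gz.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Render raw bytes one character per byte, roughly as a wire log would print them.
    static String asWireLog(byte[] body) {
        return new String(body, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Made-up sample payload, not taken from a real SharePoint exchange.
        String xml = "<GetListItems><listName>Documents</listName></GetListItems>";
        byte[] compressed = gzip(xml);
        System.out.println("compressed: " + asWireLog(compressed));
        System.out.println("plain:      " + gunzip(compressed));
    }
}
```

With the empty "Accept-Encoding" header in place, the expectation is that the server skips the compression step, so the log shows the second form directly.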

Thanks!


> Erratic handling of Sharepoint 2010 _ModerationStatus metadata
> --------------------------------------------------------------
>
>                 Key: CONNECTORS-1203
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1203
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: SharePoint connector
>    Affects Versions: ManifoldCF 1.7.2
>            Reporter: Dale Dreiske
>            Assignee: Karl Wright
>             Fix For: ManifoldCF 1.10, ManifoldCF 2.2
>
>         Attachments: debug7.log
>
>
> The ManifoldCF Sharepoint 2010 connector handles the Approval Status metadata inconsistently.
> In some cases it does not provide access to Approval Status at all.
> On /mcf-crawler-ui/execute.jsp#metapathwidget :
> * The field name appears in the drop down list as "Approval Status" when adding a new rule.
> * The field name is NOT available in the drop down list for top level sites.
> * The field name is listed as "_ModerationStatus" for existing rules.
> With connector debug turned on, the ManifoldCF logs show the field coming from Sharepoint
> as "ows__ModerationStatus". This is consistent across all pages, even when the field is
> not added to the metadata rules.
> When sent to Solr, it appears in one of these 4 forms:
> * "ows__ModerationStatus"
> * "_ModerationStatus"
> * "_moderationstatus"
> * In some cases it does not get passed at all.
> This issue is most troublesome when this field is not displayed for creating new metadata
> rules. It appears it is only available when creating rules for pages in low level sites. Example
> paths:
> * /abc  - does not work for top level sharepoint sites
> * /abc/xyz  - works but passes name inconsistently;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
