manifoldcf-dev mailing list archives

From "James Thomas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CONNECTORS-1517) Documentum Connector uses different "unconstrained" a_content_type filters depending on whether the Content Types tab has been edited
Date Mon, 13 Aug 2018 14:45:00 GMT

    [ https://issues.apache.org/jira/browse/CONNECTORS-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16578357#comment-16578357
] 

James Thomas commented on CONNECTORS-1517:
------------------------------------------

Hi Karl, I have now had a chance to try out the patches. I'll attach a transcript which shows
the queries executed (from manifoldcf.log) when I ran a job with particular configuration
in the Content Types tab of the Documentum Connector.

My observations and thoughts:
 * The core bug that I reported (that editing the Content Types tab and then resetting it
results in different semantics at search time) appears fixed.
 * The default search is still unconstrained.
 * It is surprising to be able to have both "No content type restriction" and any other checkbox
checked at the same time. I wonder whether "No content type restriction" checked should disable
all of the others?
 * It is surprising to be able to submit with no checkboxes checked and have a query constrained
with "1<0", as it looks like this can never succeed.
 * Generalising the above, I think I'd prefer to see some more restriction on which combinations
of boxes can be checked at the same time.
 * It would be convenient as a user to have a control for check all/uncheck all.
 * I haven't been using ManifoldCF/Documentum long enough to know whether there are likely
to be backwards compatibility issues in changing the UI this way.
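
To make the combinations above concrete, here is a minimal sketch of one possible rule set. It is not the connector's actual code: the class and method names are hypothetical, and it assumes that "No content type restriction" overrides any other checked box and that single quotes in type names are escaped by doubling them.

```java
import java.util.Collection;
import java.util.StringJoiner;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not the Documentum connector's code.
final class ContentTypeFilter {
    private ContentTypeFilter() {}

    /**
     * Returns the fragment to AND into the query, or "" for an unconstrained search.
     * Assumption: the "No content type restriction" box wins over any other box.
     */
    static String clause(boolean noRestriction, Collection<String> selectedTypes) {
        if (noRestriction) {
            return ""; // no a_content_type constraint at all
        }
        if (selectedTypes.isEmpty()) {
            return "1<0"; // always false, so the search can never match
        }
        StringJoiner in = new StringJoiner(", ", "a_content_type IN (", ")");
        // Sort and de-duplicate so the clause does not depend on click order.
        for (String type : new TreeSet<>(selectedTypes)) {
            // Assumption: SQL-style doubling of single quotes is sufficient here.
            in.add("'" + type.replace("'", "''") + "'");
        }
        return in.toString();
    }

    public static void main(String[] args) {
        System.out.println(clause(true, java.util.Arrays.asList("acad")));
        System.out.println(clause(false, java.util.Collections.<String>emptyList()));
        System.out.println(clause(false, java.util.Arrays.asList("zip_pub_html", "123w", "acad")));
    }
}
```

Under these assumptions the three states (unrestricted, nothing checked, some types checked) map to exactly three query shapes, which is the kind of restriction on combinations I have in mind.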

[^Notes.txt]

> Documentum Connector uses different "unconstrained" a_content_type filters depending
on whether the Content Types tab has been edited
> -------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CONNECTORS-1517
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1517
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Documentum connector
>    Affects Versions: ManifoldCF 2.10
>            Reporter: James Thomas
>            Assignee: Karl Wright
>            Priority: Major
>             Fix For: ManifoldCF 2.11
>
>         Attachments: CONNECTORS-1517-2.patch, CONNECTORS-1517.patch, Notes.txt
>
>
> I am using Manifold 2.10 patched for issue https://issues.apache.org/jira/browse/CONNECTORS-1512
> I find that the "unconstrained" query submitted to Documentum differs depending on whether
the Content Types in the job have been edited or not. This can dramatically affect which files
are fetched. After editing, there are likely to be fewer.
> For example, having simply created a job connecting to DM and setting only the Paths
value to Administrator/james the following request is generated. (Taken from manifoldcf.log).
> Note that there are no a_content_type constraints (and my line break for readability):
> {code:java}
> DEBUG 2018-07-26T05:52:56,422 (Startup thread) - DCTM: About to execute query= (select
for READ distinct i_chronicle_id from dm_document where r_modify_date >= date('01/01/1970
01:00:00','mm/dd/yyyy hh:mi:ss') and r_modify_date<=date('07/26/2018 05:52:56','mm/dd/yyyy
hh:mi:ss') AND (i_is_deleted=TRUE Or (i_is_deleted=FALSE AND a_full_text=TRUE AND r_content_size>0))
> AND ( Folder('/Administrator/james', DESCEND) ))
> {code}
> Once the Content Types tab has been edited (e.g. to remove the 123w type) it looks like
this, i.e. the search constrains to only the selected types (my ellipsis for readability):
> {code:java}
> DEBUG 2018-07-26T05:58:36,755 (Startup thread) - DCTM: About to execute query= (select
for READ distinct i_chronicle_id from dm_document where r_modify_date >= date('01/01/1970
01:00:00','mm/dd/yyyy hh:mi:ss') and r_modify_date<=date('07/26/2018 05:58:36','mm/dd/yyyy
hh:mi:ss') AND (i_is_deleted=TRUE Or (i_is_deleted=FALSE AND a_full_text=TRUE AND r_content_size>0
> AND a_content_type IN ('acad', ... 'zip_pub_html')))
> AND ( Folder('/Administrator/james', DESCEND) ))
> {code}
> If the 123w type is now reselected in the Content Types tab, the search adds it to the
list of a_content_type entries, but doesn't return to the unconstrained initial search:
> {code:java}
> DEBUG 2018-07-26T05:59:16,863 (Startup thread) - DCTM: About to execute query= (select
for READ distinct i_chronicle_id from dm_document where r_modify_date >= date('01/01/1970
01:00:00','mm/dd/yyyy hh:mi:ss') and r_modify_date<=date('07/26/2018 05:59:16','mm/dd/yyyy
hh:mi:ss') AND (i_is_deleted=TRUE Or (i_is_deleted=FALSE AND a_full_text=TRUE AND r_content_size>0
> AND a_content_type IN ('123w', ... 'zip_pub_html')))
> AND ( Folder('/Administrator/james', DESCEND) ))
> {code}
> This means that running what appears to be an equivalent job several times may not fetch
the same set of documents from Documentum.
> I expect that the same configuration in the UI produces the same search to Documentum,
regardless of how the configuration was arrived at.
> If the selected items in the Content Types list are treated as the only set of files to
fetch (i.e. the initial unconstrained search is considered incorrect here) then I guess I
might also like to have flexibility to fetch file types not on the checklist in the Content
Types tab.
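
The expectation in the description (same configuration, same search, however it was arrived at) amounts to deriving the query only from the final checkbox state. A rough sketch of that property follows, using a hypothetical `ContentTypeSelection` class that is not part of ManifoldCF and a made-up `queryKey` string standing in for the real query:

```java
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical model of a job's Content Types selection; not ManifoldCF code.
final class ContentTypeSelection {
    // Sorted set: the final state is independent of the order of edits.
    private final TreeSet<String> checked = new TreeSet<>();

    ContentTypeSelection(String... initiallyChecked) {
        checked.addAll(Arrays.asList(initiallyChecked));
    }

    void check(String type) {
        checked.add(type);
    }

    void uncheck(String type) {
        checked.remove(type);
    }

    /** A stand-in for the generated query; depends only on what is checked now. */
    String queryKey() {
        return String.join(",", checked);
    }

    public static void main(String[] args) {
        ContentTypeSelection untouched = new ContentTypeSelection("123w", "acad");

        ContentTypeSelection edited = new ContentTypeSelection("123w", "acad");
        edited.uncheck("123w"); // remove the 123w type ...
        edited.check("123w");   // ... then reselect it

        // Both selections end in the same state, so they produce the same key.
        System.out.println(untouched.queryKey().equals(edited.queryKey()));
    }
}
```

Whether the connector stores its selection in a form that allows this is not something the log output shows; the sketch only states the property being asked for.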



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
