manifoldcf-user mailing list archives

From Will Parkinson <parkinson.w...@gmail.com>
Subject Re: Sharepoint SID extraction for groups
Date Sat, 16 Nov 2013 15:09:31 GMT
Hi Karl,

Yeah, that seems to be the case. To get ManifoldCF to work in my case, I just
created a separate class to obtain all the user SIDs directly from AD if
the group assigned in SharePoint is an AD group.  This seems to work fine
for now, but it seems to be causing a few database issues.

First of all, some of the SID lists are up to 1.5MB, which seems to be
causing the carrydown table to become huge.  I am also getting errors like

1C159E0: ERROR: could not serialize access due to read/write dependencies
among transactions
  Detail: Reason code: Canceled on identification as a pivot, during
conflict in checking.
  Hint: The transaction might succeed if retried.; sleeping for 56816 ms
org.apache.manifoldcf.core.interfaces.ManifoldCFException: ERROR: could not
serialize access due to read/write dependencies among transactions
  Detail: Reason code: Canceled on identification as a pivot, during
conflict in checking.
  Hint: The transaction might succeed if retried.
        at
org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.reinterpretException(DBInterfacePostgreSQL.java:622)
        at
org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:651)
        at
org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performInsert(DBInterfacePostgreSQL.java:187)
        at
org.apache.manifoldcf.core.database.BaseTable.performInsert(BaseTable.java:68)
        at
org.apache.manifoldcf.crawler.jobs.Carrydown.recordCarrydownDataMultiple(Carrydown.java:343)
        at
org.apache.manifoldcf.crawler.jobs.JobManager.addDocuments(JobManager.java:4174)
        at
org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.processDocumentReferences(WorkerThread.java:2017)
        at
org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.flush(WorkerThread.java:1948)
        at
org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:562)
Caused by: org.postgresql.util.PSQLException: ERROR: could not serialize
access due to read/write dependencies among transactions
  Detail: Reason code: Canceled on identification as a pivot, during
conflict in checking.
  Hint: The transaction might succeed if retried.
        at
org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2102)
        at
org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1835)
        at
org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
        at
org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:500)
        at
org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:388)
        at
org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:334)
        at
org.apache.manifoldcf.core.database.Database.execute(Database.java:883)
        at
org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.run(Database.java:683)
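
For reference, the "might succeed if retried" hint in the trace points at the usual pattern for PostgreSQL serialization failures (SQLSTATE 40001): roll back, wait, and run the transaction again. The log line "sleeping for 56816 ms" suggests ManifoldCF already does something along these lines. The following is only a minimal sketch of that general pattern, not ManifoldCF's actual code; the class name `SerializationRetry`, the method names, and the linear back-off are assumptions made for illustration.

```java
import java.sql.SQLException;
import java.util.function.Supplier;

// Hypothetical helper, not part of ManifoldCF: retries a unit of work when the
// database reports a serialization failure.
final class SerializationRetry {
    // PostgreSQL reports "could not serialize access ..." with SQLSTATE 40001.
    static final String SERIALIZATION_FAILURE = "40001";

    // Walks the cause chain looking for an SQLException carrying SQLSTATE 40001.
    static boolean isSerializationFailure(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof SQLException
                    && SERIALIZATION_FAILURE.equals(((SQLException) c).getSQLState())) {
                return true;
            }
        }
        return false;
    }

    // Runs txn, retrying up to maxAttempts times on serialization failures.
    // The sleep grows linearly with the attempt number (an arbitrary choice).
    static <T> T runWithRetry(Supplier<T> txn, int maxAttempts, long baseSleepMs) {
        for (int attempt = 1; ; attempt++) {
            try {
                return txn.get();
            } catch (RuntimeException e) {
                if (!isSerializationFailure(e) || attempt >= maxAttempts) {
                    throw e;
                }
                try {
                    Thread.sleep(baseSleepMs * attempt);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw e;
                }
            }
        }
    }
}
```

Retrying only helps if each attempt is short. With 1.5MB carrydown values in every insert, each retry repeats a large write, so the conflicts are likely to recur.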

And then I eventually get a warning like this:

 WARN 2013-11-17 00:41:09,058 (Finisher thread) - Found a long-running
query (77260 ms): [SELECT id FROM jobs WHERE status IN (?,?,?,?,?) FOR
UPDATE]
 WARN 2013-11-17 00:41:09,059 (Finisher thread) -   Parameter 0: 'A'
 WARN 2013-11-17 00:41:09,059 (Finisher thread) -   Parameter 1: 'W'
 WARN 2013-11-17 00:41:09,059 (Finisher thread) -   Parameter 2: 'R'
 WARN 2013-11-17 00:41:09,059 (Finisher thread) -   Parameter 3: 'O'
 WARN 2013-11-17 00:41:09,059 (Finisher thread) -   Parameter 4: 'U'
 WARN 2013-11-17 00:41:09,060 (Finisher thread) -  Plan: LockRows
(cost=0.00..3.34 rows=5 width=14) (actual time=0.026..0.027 rows=1 loops=1)
 WARN 2013-11-17 00:41:09,060 (Finisher thread) -  Plan:   ->  Seq Scan on
jobs  (cost=0.00..3.29 rows=5 width=14) (actual time=0.024..0.024 rows=1
loops=1)
 WARN 2013-11-17 00:41:09,060 (Finisher thread) -  Plan:         Filter:
(status = ANY ('{A,W,R,O,U}'::bpchar[]))
 WARN 2013-11-17 00:41:09,060 (Finisher thread) -  Plan:         Rows
Removed by Filter: 17
 WARN 2013-11-17 00:41:09,060 (Finisher thread) -  Plan: Total runtime:
0.058 ms
 WARN 2013-11-17 00:41:09,060 (Finisher thread) -

And then the update stops completely, even though the status on the "Status
and job management page" is still set as "running".  Do you have any ideas
on how I can fix this?

I am doing some research at the moment on the best way to store permissions
information without storing hundreds of SIDs.
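
One direction, in line with the group-level authorization Karl describes below, is to store only the SID of each granted group on the document and to resolve the user's group memberships at query time. The following is a rough sketch of that idea under stated assumptions: the class `AclTokens`, its methods, and the example SIDs are all made up for illustration and are not ManifoldCF APIs.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: one access token per granted group instead of one per member.
final class AclTokens {
    // Tokens stored with the document: the de-duplicated group SIDs, so the
    // size is bounded by the number of groups, not by the number of users.
    static List<String> documentTokens(List<String> grantedGroupSids) {
        return new ArrayList<>(new LinkedHashSet<>(grantedGroupSids));
    }

    // userTokens is assumed to hold the user's own SID plus the SIDs of every
    // group the user belongs to, as resolved by an authority at query time.
    static boolean isAuthorized(Set<String> userTokens, List<String> documentTokens) {
        for (String token : documentTokens) {
            if (userTokens.contains(token)) {
                return true;
            }
        }
        return false;
    }
}
```

With this layout, a document granted to a single AD group carries one token regardless of how many members the group has, which would keep the carrydown data small.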

Cheers,

Will


On Wed, Nov 6, 2013 at 11:42 PM, Karl Wright <daddywri@gmail.com> wrote:

> I should also add that, as far as ActiveDirectory groups go, my
> understanding is that in non-Claim-Space versions of SharePoint, there's a
> SharePoint group created for each AD group.  So a SharePoint user will
> belong to some native SharePoint groups, but also to some "mirrored"
> SharePoint groups that are created because of the user's group
> relationships in AD.
>
> Claim Space seems to change this in the following way: SharePoint groups
> no longer mirror AD groups.  Instead, the Claim Space authorization tokens
> implicitly describe the relationships.  So you have to talk to both
> SharePoint AND AD in order to fully understand what documents in SharePoint
> are authorized for what users.
>
> Karl
>
>
>
> On Wed, Nov 6, 2013 at 8:37 AM, Karl Wright <daddywri@gmail.com> wrote:
>
>> Hi Will,
>>
>> The current connector indeed maps SharePoint groups to individual user
>> SIDs.  That is not terribly scalable, and it is one reason why I've created
>> dedicated SharePoint authorities in the CONNECTORS-754-2 branch, so that we
>> can authorize documents at a group level.
>>
>> I've also done considerable research on the ClaimSpace security model.
>> Supporting it fully has required some modifications to the basic
>> authorization model that ManifoldCF uses to tie documents to authorities.
>> This basic work is done and is now part of trunk as well.  And the
>> documentation has been updated to describe the revised authorization model.
>>
>> If you want to try working with the CONNECTORS-754-2 branch, I'd be very
>> happy to interact with you to iron out any problems.  What you will need to
>> do if you try it out is the following:
>>
>> (1) Create an authority group for your SharePoint instance
>> (2) Create a "SharePoint/Native" authority tied to that authority group
>> (3) If this is a claim-space SharePoint instance, then also create a
>> "SharePoint/Active Directory" authority tied to the same authority group
>> (4) Create your SharePoint repository connection, making sure to select
>> "Native" mode
>>
>> The implementation is currently the best I can do in the absence of a
>> full-blown Claim Space instance.  Even so, there are still questions in my
>> mind that, if I could solve them, would help clarify the implementation.
>> For example, what "Role Definitions" do - are they essentially just another
>> form of group?  And, whether it is better to use a user, group, or role
>> definition's name for an access token, or the ID?  Perhaps you can clarify
>> a bit, I don't know...
>>
>>
>> Thanks,
>> Karl
>>
>>
>>
>> On Wed, Nov 6, 2013 at 8:14 AM, Will Parkinson <parkinson.will@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I am just wondering how the extraction of the groups permissions works
>>> for the sharepoint connector.  From what I can see, it seems that the group
>>> is determined via the MCPermissions.asmx web service and then each user in
>>> that group is iterated over and the SID for those users are extracted.
>>>
>>> Is this the case?  If so, are groups created in Sharepoint and AD groups
>>> treated the same way?
>>>
>>> Cheers,
>>>
>>> Will
>>>
>>
>>
>
