manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject RE: Sharepoint SID extraction for groups
Date Sat, 16 Nov 2013 19:06:06 GMT
Hi Will,
The long running query is not fatal - it is just a warning.

Avoiding the very long SID lists requires a SharePoint authority, as discussed.

The pivot error sounds like it is something that can be addressed
though.  Please create a ticket and put the full exception into it,
and I will look at it either tomorrow or Monday.

Thanks,
Karl


-----Original Message-----
From: Will Parkinson
Sent: 11/16/2013 10:10 AM
To: user@manifoldcf.apache.org
Subject: Re: Sharepoint SID extraction for groups

Hi Karl,


Yeah, that seems to be the case. To get ManifoldCF to work in my case, I
just created a separate class to obtain all the user SIDs directly
from AD if the group assigned in SharePoint is an AD group.  This
works for now, but it seems to be causing a few database
issues.

First of all, some of the SID lists are up to 1.5 MB, which seems to be
causing the carrydown table to become huge.  I am also getting errors
like this:

1C159E0: ERROR: could not serialize access due to read/write dependencies among transactions
  Detail: Reason code: Canceled on identification as a pivot, during conflict in checking.
  Hint: The transaction might succeed if retried.; sleeping for 56816 ms
org.apache.manifoldcf.core.interfaces.ManifoldCFException: ERROR: could not serialize access due to read/write dependencies among transactions
  Detail: Reason code: Canceled on identification as a pivot, during conflict in checking.
  Hint: The transaction might succeed if retried.
        at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.reinterpretException(DBInterfacePostgreSQL.java:622)
        at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:651)
        at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performInsert(DBInterfacePostgreSQL.java:187)
        at org.apache.manifoldcf.core.database.BaseTable.performInsert(BaseTable.java:68)
        at org.apache.manifoldcf.crawler.jobs.Carrydown.recordCarrydownDataMultiple(Carrydown.java:343)
        at org.apache.manifoldcf.crawler.jobs.JobManager.addDocuments(JobManager.java:4174)
        at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.processDocumentReferences(WorkerThread.java:2017)
        at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.flush(WorkerThread.java:1948)
        at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:562)
Caused by: org.postgresql.util.PSQLException: ERROR: could not serialize access due to read/write dependencies among transactions
  Detail: Reason code: Canceled on identification as a pivot, during conflict in checking.
  Hint: The transaction might succeed if retried.
        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2102)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1835)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:500)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:388)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:334)
        at org.apache.manifoldcf.core.database.Database.execute(Database.java:883)
        at org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.run(Database.java:683)
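
For readers following along: the "might succeed if retried" hint in the trace above is PostgreSQL's serialization failure, which is reported with SQLSTATE 40001. The usual response is to roll back and rerun the whole transaction after a pause, and the "sleeping for 56816 ms" in the log suggests ManifoldCF is already doing something along these lines. The following is only a minimal sketch of that general pattern, not ManifoldCF's code; the class name, method names and backoff values are hypothetical.

```java
import java.sql.SQLException;
import java.util.concurrent.Callable;

public class SerializationRetry {
    // SQLSTATE 40001 is the standard code for serialization_failure.
    static final String SERIALIZATION_FAILURE = "40001";

    /** Walks the cause chain looking for a SQLException with SQLSTATE 40001. */
    static boolean isSerializationFailure(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof SQLException
                    && SERIALIZATION_FAILURE.equals(((SQLException) c).getSQLState())) {
                return true;
            }
        }
        return false;
    }

    /**
     * Runs txn, retrying only on serialization failures.
     * Assumption: txn opens its own transaction and rolls it back before
     * throwing, so each attempt starts from a clean state.
     */
    static <T> T runWithRetry(Callable<T> txn, int maxAttempts, long baseSleepMs)
            throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                return txn.call();
            } catch (Exception e) {
                if (!isSerializationFailure(e) || attempt >= maxAttempts) {
                    throw e; // not retryable, or out of attempts
                }
                // Simple linear backoff; the multiplier is an arbitrary choice.
                Thread.sleep(baseSleepMs * attempt);
            }
        }
    }
}
```

Retrying only helps if the conflicting transactions are short. With carrydown rows of the size described here, each attempt holds its locks longer, so conflicts become more likely and the retries more expensive.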

And then I eventually get an error like this:

WARN 2013-11-17 00:41:09,058 (Finisher thread) - Found a long-running query (77260 ms): [SELECT id FROM jobs WHERE status IN (?,?,?,?,?) FOR UPDATE]
WARN 2013-11-17 00:41:09,059 (Finisher thread) -   Parameter 0: 'A'
WARN 2013-11-17 00:41:09,059 (Finisher thread) -   Parameter 1: 'W'
WARN 2013-11-17 00:41:09,059 (Finisher thread) -   Parameter 2: 'R'
WARN 2013-11-17 00:41:09,059 (Finisher thread) -   Parameter 3: 'O'
WARN 2013-11-17 00:41:09,059 (Finisher thread) -   Parameter 4: 'U'
WARN 2013-11-17 00:41:09,060 (Finisher thread) -  Plan: LockRows (cost=0.00..3.34 rows=5 width=14) (actual time=0.026..0.027 rows=1 loops=1)
WARN 2013-11-17 00:41:09,060 (Finisher thread) -  Plan:   ->  Seq Scan on jobs  (cost=0.00..3.29 rows=5 width=14) (actual time=0.024..0.024 rows=1 loops=1)
WARN 2013-11-17 00:41:09,060 (Finisher thread) -  Plan:         Filter: (status = ANY ('{A,W,R,O,U}'::bpchar[]))
WARN 2013-11-17 00:41:09,060 (Finisher thread) -  Plan:         Rows Removed by Filter: 17
WARN 2013-11-17 00:41:09,060 (Finisher thread) -  Plan: Total runtime: 0.058 ms
WARN 2013-11-17 00:41:09,060 (Finisher thread) -

And then the update stops completely, even though the status on the
"Status and Job Management" page is still shown as "running".  Do you
have any ideas on how I can fix this?

I am doing some research at the moment on the best way to store
permissions information without storing hundreds of SIDs.

Cheers,

Will

On Wed, Nov 6, 2013 at 11:42 PM, Karl Wright <daddywri@gmail.com> wrote:


I should also add that, as far as Active Directory groups go, my
understanding is that in non-Claim-Space versions of SharePoint,
a SharePoint group is created for each AD group.  So a SharePoint
user will belong to some native SharePoint groups, but also to some
"mirrored" SharePoint groups that exist because of the user's
group relationships in AD.

Claim Space seems to change this in the following way: SharePoint
groups no longer mirror AD groups.  Instead, the Claim Space
authorization tokens implicitly describe the relationships.  So you
have to talk to both SharePoint AND AD in order to fully understand
what documents in SharePoint are authorized for what users.

Karl

On Wed, Nov 6, 2013 at 8:37 AM, Karl Wright <daddywri@gmail.com> wrote:

Hi Will,



The current connector indeed maps SharePoint groups to individual user
SIDs.  That is not terribly scalable, and it is one reason why I've
created dedicated SharePoint authorities in the CONNECTORS-754-2
branch, so that we can authorize documents at a group level.
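
To illustrate why per-user expansion scales poorly compared with group-level authorization, here is a minimal sketch. It is not the connector's code: the class, the method names and the token strings are all hypothetical, and it only models the idea of attaching either every member SID or a single group token to a document.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class AclTokens {
    /**
     * Per-user expansion: the document carries the SID of every member of
     * every group that has access, so the token list grows with group size.
     */
    static Set<String> expandToUserSids(List<String> groups,
                                        Map<String, List<String>> membersByGroup) {
        Set<String> sids = new LinkedHashSet<>();
        for (String group : groups) {
            sids.addAll(membersByGroup.getOrDefault(group, Collections.emptyList()));
        }
        return sids;
    }

    /**
     * Group-level authorization: the document carries one token per group,
     * and the user's group tokens are resolved by an authority at query time.
     * Access is granted if the two token sets share at least one entry.
     */
    static boolean isAuthorized(Set<String> documentTokens, Set<String> userTokens) {
        return !Collections.disjoint(documentTokens, userTokens);
    }
}
```

With the first approach, a document shared with one large group stores one token per member, and any membership change means the document has to be recrawled. With the second, the document stores a single group token and membership is looked up when the user searches.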


I've also done considerable research on the Claim Space security model.
Supporting it fully has required some modifications to the basic
authorization model that ManifoldCF uses to tie documents to
authorities.  This basic work is done and is now part of trunk as
well, and the documentation has been updated to describe the revised
authorization model.

If you want to try working with the CONNECTORS-754-2 branch, I'd be
very happy to interact with you to iron out any problems.  What you
will need to do if you try it out is the following:

(1) Create an authority group for your SharePoint instance
(2) Create a "SharePoint/Native" authority tied to that authority group
(3) If this is a claim-space SharePoint instance, then also create a
"SharePoint/Active Directory" authority tied to the same authority
group
(4) Create your SharePoint repository connection, making sure to
select "Native" mode

The implementation is currently the best I can do in the absence of a
full-blown Claim Space instance.  Even so, there are still questions
in my mind whose answers would help clarify the implementation.  For
example, what do "Role Definitions" do?  Are they essentially just
another form of group?  And is it better to use the name of a user,
group, or role definition for an access token, or its ID?  Perhaps you
can clarify a bit; I don't know.


Thanks,
Karl

On Wed, Nov 6, 2013 at 8:14 AM, Will Parkinson <parkinson.will@gmail.com> wrote:

Hello,


I am just wondering how the extraction of group permissions works
for the SharePoint connector.  From what I can see, it seems that the
group is determined via the MCPermissions.asmx web service, and then
each user in that group is iterated over and the SID for each of those
users is extracted.

Is this the case?  If so, are groups created in SharePoint and AD
groups treated the same way?

Cheers,

Will
