From: "Silvia, Daniel [USA]"
To: Karl Wright
CC: "connectors-user@incubator.apache.org"
Subject: RE: ManifoldCF's dist/shapoint-integration dir
Date: Tue, 31 Jan 2012 17:00:03 +0000

Hi Karl

Ok, I added a number of Paths using File, Site, and Library to the main site, and sub site under the main site. I looked at the log file and it appears I am getting an axis configuration exception:

No service named UserGroupSoap is available and No service named http://....../GetUserCollectionFromGroup is available.

My site admin for the SharePoint Portal has given me Full Control access, so there shouldn't be any issue with authentication to the SharePoint services.

Also, I went to the properties.xml file to modify the org.apache.manifoldcf.connectors property; however, this property didn't exist.
I added the property which looks like

Thanks
________________________________________
From: Karl Wright [daddywri@gmail.com]
Sent: Tuesday, January 31, 2012 10:52 AM
To: Silvia, Daniel [USA]
Cc: connectors-user@incubator.apache.org
Subject: Re: ManifoldCF's dist/shapoint-integration dir

It's been a while since I've set up a SharePoint job but I think what you are missing is a file rule (instead of just a library rule). Here's what the end-user documentation says on the matter:

"Each rule consists of a path, a rule type, and an action. The actions are "Include" and "Exclude". The rule type tells the connection what kind of SharePoint entity it is allowed to exactly match. For example, a "File" rule will only exactly match SharePoint paths that represent files - it cannot exactly match sites or libraries. The path itself is just a sequence of characters, where the "*" character has the special meaning of being able to match any number of any kind of characters, and the "?" character matches exactly one character of any kind.

The rule matcher extends strict, exact matching by introducing a concept of implicit inclusion rules. If your rule action is "Include", and you specify (say) a "File" rule, the matcher presumes implicit inclusion rules for the corresponding site and library. So, if you create an "Include File" rule that matches (for example) "/MySite/MyLibrary/MyFile", there is an implied "Site Include" rule for "/MySite", and an implied "Library Include" rule for "/MySite/MyLibrary". Similarly, if you create a "Library Include" rule, there is an implied "Site Include" rule that corresponds to it. Note that these shortcuts only apply to "Include" rules - there are no corresponding implied "Exclude" rules."

What this means is that you should probably be declaring file rules with "*" as the file name for each library, rather than a library rule. You might want to just try this.
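
To make the quoted rule semantics concrete, here is a minimal sketch in Python. It only illustrates the documented behavior and is not the connector's actual code: the `Rule` class and the `matches` and `implied_rules` helpers are hypothetical names, and the sketch assumes a file path has the simple shape /site/library/file used in the documentation's example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    path: str    # e.g. "/MySite/MyLibrary/*"
    kind: str    # "site", "library", or "file"
    action: str  # "include" or "exclude"


def wildcard_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate '*' (any run of characters) and '?' (exactly one character)."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def matches(rule: Rule, path: str, kind: str) -> bool:
    """A rule exactly matches only entities of its own type."""
    return rule.kind == kind and wildcard_to_regex(rule.path).fullmatch(path) is not None


def implied_rules(rule: Rule) -> list:
    """Implicit parent rules; they exist only for 'include' rules.

    Assumption (hypothetical): the last path segment is the file and the
    segment before it is the library, as in the documentation's example.
    """
    if rule.action != "include":
        return []
    segments = rule.path.strip("/").split("/")
    implied = []
    if rule.kind == "file" and len(segments) >= 3:
        implied.append(Rule("/" + "/".join(segments[:-2]), "site", "include"))
        implied.append(Rule("/" + "/".join(segments[:-1]), "library", "include"))
    elif rule.kind == "library" and len(segments) >= 2:
        implied.append(Rule("/" + "/".join(segments[:-1]), "site", "include"))
    return implied
```

Under this sketch, a library rule such as "/Shared Documents" never matches a file path, while a file rule such as "/Shared Documents/*" does, which is the point of the suggestion above.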
If you still have trouble, you can try setting the "org.apache.manifoldcf.connectors" property to "DEBUG" in the properties.xml file and restarting ManifoldCF before your next crawl. The manifoldcf.log file will then have output describing the decisions the SharePoint connector made about each site, library, file, or folder it encountered.

Thanks,
Karl

On Tue, Jan 31, 2012 at 10:27 AM, Silvia, Daniel [USA] wrote:
> Hi Karl
>
> The Path Rules are:
>
> Path Match: /Shared Documents
> Type: library
> Action: include
>
> Path Match: /IDD/Shared Documents
> Type: library
> Action: include
>
> Path Match: /IDD/Documents
> Type: library
> Action: include
>
> Path Match: /manifoldcf/Shared Documents
> Type: library
> Action: include
>
> I hope this helps.
>
> I really appreciate your help.
>
> ________________________________________
> From: Karl Wright [daddywri@gmail.com]
> Sent: Tuesday, January 31, 2012 10:01 AM
> To: Silvia, Daniel [USA]
> Cc: connectors-user@incubator.apache.org
> Subject: Re: ManifoldCF's dist/shapoint-integration dir
>
> "When I select only the fetch activity, I don't see anything in the
> events, when I select the Document Ingest activity, I don't see
> anything in the events."
>
> So either you've already run the job and the documents were accessed
> the first time (and won't be accessed again until they change), or the
> problem is likely that your SharePoint Path Rules are not including
> any documents. It would be very helpful at this point to include a
> screen shot of the job you've created. Since you are not on the net,
> perhaps you can jot down your SharePoint path rules for me to have a
> look at, as they are displayed when you view the job.
>
> Thanks,
> Karl
>
> On Tue, Jan 31, 2012 at 9:44 AM, Silvia, Daniel [USA]
> wrote:
>> Hi Karl
>>
>> Ok, I have created a new job and ran the job and went to the Simple History Report.
>>
>> I see the Events.
>> If all the Activities in the Simple History Report, Document Deletion(SolrPipeline), Document Ingest(SolrPipeline), and Fetch are selected, I see a start job and end job for events. When I get to the Simple History Report I can select the "Connection", but I don't have an option to select the Activities until I run the report first.
>> When I select only the fetch activity, I don't see anything in the events, when I select the Document Ingest activity, I don't see anything in the events.
>>
>> My solr output connection has the following information:
>> Protocol: http
>> Server: "the server name"
>> Port: 8080 (we are running solr on Jboss port 8080)
>> Web Application Name: solr
>> Core Name: collection1
>> Update Handler: update/extract
>> Remove Handler: /update
>> Status Handler: /admin/ping
>>
>> ________________________________________
>> From: Karl Wright [daddywri@gmail.com]
>> Sent: Tuesday, January 31, 2012 9:00 AM
>> To: Silvia, Daniel [USA]; connectors-user@incubator.apache.org
>> Subject: Re: ManifoldCF's dist/shapoint-integration dir
>>
>> Ok, let's do one thing at a time.
>>
>> First:
>>
>> "For the Path tab where there are Path Rules, are these the paths we
>> want ManifoldCF to follow? Each site, and each Library like Documents
>> and Shared Documents. And in the Metadata tab, this is the tab where
>> you indicate for each "Site" and "Library" you want to include
>> specific metadata or include all metadata?"
>>
>> For SharePoint, there are Path Rules and Metadata Rules. The Path
>> Rules describe what documents you want to include or exclude. The
>> Metadata Rules describe what metadata you want to include or exclude.
>> For right now I would ignore the Metadata Rules and just make sure you
>> have Path Rules that mean that you have included documents.
>>
>> "As I run the report, I see "Documents", "Active", and "Processed"
>> where the numbers change under the "Active" column as well as the
>> "Document" and "Processed" column (these just get larger, where Active
>> changes). "
>>
>> This "report" we actually call the Job Status screen. The fact that
>> the numbers get larger and the job doesn't just end indicates that you
>> are successfully crawling your SharePoint, and you have set up the job
>> to include at least some documents. This is good news. However, this
>> is NOT the "Simple History" report I was alluding to earlier. To get
>> to that report, click on the "Simple History" link on the left-hand
>> navigation area. This report will show the events of your choice
>> (default - ALL recorded events) over a given time window (default: the
>> last hour). If you've done this right you should at least see a "Job
>> start" event. The events you are most interested in are the "fetch"
>> (which describes all attempts to fetch documents from SharePoint) and
>> "document ingest", which describe attempts to get documents into Solr.
>> You can refresh the displayed events by clicking the "Go" button in
>> the middle of the screen whenever you wish.
>>
>> I'd like you to delete your job, create it again, and start it. Then,
>> while it is running, I'd like you to go to the "Simple History"
>> screen, and select the appropriate connection (your SharePoint
>> repository connection), and click the "Go" button. So as not to skip
>> anything basic:
>>
>> (1) What event types do you see?
>> (2) Are there "fetch" events?
>> (3) Are there "document ingest" events?
>>
>> If you see no "fetch" events, that implies you have either not
>> specified any documents to include in your job, OR your Solr
>> connection is configured to reject too many document types so they are
>> all getting filtered out.
>>
>> If you see "document ingest" events, but those have errors, it implies
>> that the configuration of your Solr connection is incorrect and does
>> not match the way your Solr is configured. If you send me a specific
>> error code and/or text I can help you figure out what is happening.
>>
>> If you see "document ingest" events with NO errors, but the Solr
>> instance is not getting documents, you are describing an impossible
>> situation. While your Solr instance may not be configured to have the
>> Extracting Update Handler active, or it may be at a different URL than
>> what you pointed at, that would definitely yield errors or
>> notifications in the Simple History.
>>
>> Please let me know what you actually see.
>> Karl
>>
>> On Tue, Jan 31, 2012 at 7:53 AM, Silvia, Daniel [USA]
>> wrote:
>>> Hi Karl
>>>
>>> I am trying to figure out why I can't see anything being indexed into our Solr index. I was looking at another post where you were working with "Martijn" and that individual was not able to see info getting into Solr. In the report that I have set up, I have included all metadata associated to each site, Shared Documents, and Documents. In the Solr Field Mapping, I am associating metadata fields that are indicated in the MetaData tab to fields that exist in our solr index.
>>>
>>> For the Path tab where there are Path Rules, are these the paths we want ManifoldCF to follow? Each site, and each Library like Documents and Shared Documents. And in the Metadata tab, this is the tab where you indicate for each "Site" and "Library" you want to include specific metadata or include all metadata?
>>>
>>> As I run the report, I see "Documents", "Active", and "Processed" where the numbers change under the "Active" column as well as the "Document" and "Processed" column (these just get larger, where Active changes).
>>> While I was researching why I may not be seeing something over on the Solr side, I saw your communication with another individual indicating that I should see something like literal.xxx=yyy in the Solr log. This is an older post so there may be something else I should see. But the only thing I see when I look at the Solr log is "[ ] webapp=/solr path=/update/extract params={commit=true} status=0 QTime=0".
>>>
>>> Any ideas?
>>>
>>> Thanks
>>>
>>> ________________________________________
>>> From: Karl Wright [daddywri@gmail.com]
>>> Sent: Monday, January 30, 2012 10:40 AM
>>> To: Silvia, Daniel [USA]
>>> Subject: Re: ManifoldCF's dist/shapoint-integration dir
>>>
>>> The default time range for the Simple History is the last hour. I
>>> suspect you are unaware of that. If you want a different time range
>>> you will have to modify the start and end time pulldowns accordingly.
>>>
>>> Karl
>>>
>>> On Mon, Jan 30, 2012 at 10:34 AM, Silvia, Daniel [USA]
>>> wrote:
>>>> Hi Karl
>>>>
>>>> I am looking at the Simple History in the UI and there isn't much to see, unless I am not getting what I am supposed to. I see the Start Time, Activity, Identifier, Bytes, and Time, but I don't get anything for Result Code or Result Description. I looked in the documentation and we should be getting something in those fields, I believe.
>>>>
>>>> Anyway, I will look through the mail list to see what I can find.
>>>>
>>>> Thanks for the help.
>>>>
>>>> Dan
>>>>
>>>> ________________________________________
>>>> From: Karl Wright [daddywri@gmail.com]
>>>> Sent: Monday, January 30, 2012 8:24 AM
>>>> To: Silvia, Daniel [USA]
>>>> Subject: Re: ManifoldCF's dist/shapoint-integration dir
>>>>
>>>> So just to be clear, I'm NOT talking about the ManifoldCF logging.
>>>> For the Solr connector you probably won't need to turn that on; it's
>>>> pretty simple and you can look at the Simple History in the UI to see
>>>> what the request and response look like from Solr. I was talking
>>>> instead about Solr logging - when you run the Solr Webapp, by default
>>>> all requests against the Extracting Update Handler are logged to
>>>> standard error, so you will see them appear in the process window in
>>>> which Solr is running.
>>>>
>>>> My suggestion to you is to first have a look at the Simple History for
>>>> the job you are trying to run. If you are getting back 500 errors
>>>> from Solr, that means you have not set up Solr properly to work with
>>>> ManifoldCF. In recent versions of Solr, the example works fine out of
>>>> the box, but when you try to deploy any other way you are often
>>>> missing the jar that contains the extracting update handler, so of
>>>> course nothing works. Several people on the connectors-user list have
>>>> run into this and if you search the list (go to the ManifoldCF site
>>>> and click through to the mailing list page and there are links at the
>>>> bottom for this purpose) you will find posts that describe exactly
>>>> what is wrong and how to fix it.
>>>>
>>>> Hope this helps.
>>>>
>>>> Karl
>>>>
>>>> On Sun, Jan 29, 2012 at 2:30 PM, Silvia, Daniel [USA]
>>>> wrote:
>>>>> Yeah, but for some reason the logging isn't coming through. The logging is set for info and I will have to change the logging level to DEBUG.
>>>>>
>>>>> Thanks again for your help.
>>>>>
>>>>> ________________________________________
>>>>> From: Karl Wright [daddywri@gmail.com]
>>>>> Sent: Friday, January 27, 2012 5:06 PM
>>>>> To: Silvia, Daniel [USA]
>>>>> Subject: Re: ManifoldCF's dist/shapoint-integration dir
>>>>>
>>>>> Actually, the best thing for debugging the Solr connection is looking
>>>>> at standard-output on the Solr instance. You will see all the posts
>>>>> that are made and what the arguments were.
>>>>> Also, this is the kind of
>>>>> question you'd get a lot of benefit from posting to the list. The
>>>>> end-user documentation I pointed you at before describes some of this
>>>>> but the Solr connector has grown beyond the doc to some extent at this
>>>>> point.
>>>>>
>>>>> Karl
>>>>>
>>>>> On Fri, Jan 27, 2012 at 9:51 AM, Silvia, Daniel [USA]
>>>>> wrote:
>>>>>> Hi Karl
>>>>>>
>>>>>> Is there a log level other than Wire-level debugging to view log statements for trying to send output to a Solr instance in the Jobs List/Creation section? We are having an issue getting content to Solr. Is there a document anywhere which defines the fields for the Jobs sections for the Solr Field Mapping tab and the Paths and MetaData tabs?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>> ________________________________________
>>>>>> From: Karl Wright [daddywri@gmail.com]
>>>>>> Sent: Thursday, January 26, 2012 10:44 AM
>>>>>> To: Silvia, Daniel [USA]
>>>>>> Subject: Re: ManifoldCF's dist/shapoint-integration dir
>>>>>>
>>>>>> I am afraid I don't know the answer to that. I'm sure it's infinitely
>>>>>> configurable but it's not clear what the SharePoint web services need
>>>>>> to do under the hood, so anything I tell you would be just a guess.
>>>>>>
>>>>>> Karl
>>>>>>
>>>>>> On Thu, Jan 26, 2012 at 10:43 AM, Silvia, Daniel [USA]
>>>>>> wrote:
>>>>>>> Hi Karl
>>>>>>>
>>>>>>> One more question. Do you know the minimum permissions needed to crawl the SharePoint instance and all sites under the instance? The individual who set my permissions set me up as the "site collection admin" for the top most site. Is there a specific admin role, other than "Farm Admin", that the user crawling the SharePoint instance should be given?
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> ________________________________________
>>>>>>> From: Karl Wright [daddywri@gmail.com]
>>>>>>> Sent: Thursday, January 26, 2012 9:53 AM
>>>>>>> To: Silvia, Daniel [USA]
>>>>>>> Subject: Re: ManifoldCF's dist/shapoint-integration dir
>>>>>>>
>>>>>>> Good news! Please keep in touch; we'd like to hear how things work
>>>>>>> for you (it helps keep the software fresh ;-) ).
>>>>>>>
>>>>>>> Karl
>>>>>>>
>>>>>>> On Thu, Jan 26, 2012 at 9:48 AM, Silvia, Daniel [USA]
>>>>>>> wrote:
>>>>>>>> Hey Karl
>>>>>>>>
>>>>>>>> (1) was the issue. When requesting access to the SharePoint instance I indicated that I needed to be able to crawl SharePoint, I guess the problem was on my end indicating that I also needed privileges to crawl the site.
>>>>>>>>
>>>>>>>> Anyway, thank you for your help. When I change the SharePoint version to v 3 I get a message indicating "Connection Working".
>>>>>>>>
>>>>>>>> Appreciate the help.
>>>>>>>>
>>>>>>>> Dan
>>>>>>>>
>>>>>>>> ________________________________________
>>>>>>>> From: Karl Wright [daddywri@gmail.com]
>>>>>>>> Sent: Thursday, January 26, 2012 9:19 AM
>>>>>>>> To: Silvia, Daniel [USA]
>>>>>>>> Subject: Re: ManifoldCF's dist/shapoint-integration dir
>>>>>>>>
>>>>>>>> The error message "axisFault=Server, detail=Server was unable to
>>>>>>>> process request --> Requested Registry access is not allowed" is Axis
>>>>>>>> interpreting an error message from SharePoint. What it is saying is
>>>>>>>> that the user you are trying to crawl with is unable to read the
>>>>>>>> SharePoint machine's registry but needs to. There are two possible
>>>>>>>> causes for this:
>>>>>>>>
>>>>>>>> (1) The user you gave doesn't have enough permissions to crawl SharePoint
>>>>>>>> (2) When you installed the SharePoint MCPermissions plugin, you
>>>>>>>> installed it logged in as a user that did not have enough permissions to do
>>>>>>>> what it needs to do.
>>>>>>>>
>>>>>>>> You can tell the difference between the two by selecting "SharePoint
>>>>>>>> 2.0" in the sharepoint version pulldown. If a connection saved in
>>>>>>>> this way says "Connection working", it means that the MCPermissions
>>>>>>>> plugin has the permission problem, not your user.
>>>>>>>>
>>>>>>>> Karl
>>>>>>>>
>>>>>>>> On Thu, Jan 26, 2012 at 9:14 AM, Silvia, Daniel [USA]
>>>>>>>> wrote:
>>>>>>>>> Hi Karl
>>>>>>>>>
>>>>>>>>> When I try to use option (1) and don't put anything in the Site field, I get an error message "axisFault=Server, detail=Server was unable to process request --> Requested Registry access is not allowed" and when I put a "/" in the site field I get a GUI error indicating that the site field can't end with a "/".
>>>>>>>>>
>>>>>>>>> Anyway, do you have any ideas? Or maybe the SharePoint instance is not configured properly for us to crawl?
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>> ________________________________________
>>>>>>>>> From: Karl Wright [daddywri@gmail.com]
>>>>>>>>> Sent: Thursday, January 26, 2012 8:52 AM
>>>>>>>>> To: Silvia, Daniel [USA]
>>>>>>>>> Subject: Re: ManifoldCF's dist/shapoint-integration dir
>>>>>>>>>
>>>>>>>>> SharePoint has two kinds of site:
>>>>>>>>>
>>>>>>>>> (1) the root site, which can be reached by the path http://server:port
>>>>>>>>> (2) a number of sites under the 'virtual path', with URLs of the form:
>>>>>>>>>
>>>>>>>>> http://server:port/something/sitename
>>>>>>>>>
>>>>>>>>> The "something" is, by default, the string "site", so
>>>>>>>>> http://server:port/site/xyz might be the URL of one such virtual site.
>>>>>>>>>
>>>>>>>>> The form of the "site" field in the SharePoint connection for the
>>>>>>>>> first is either blank or "/" (can't remember which right now), and the
>>>>>>>>> form of the "site" field for the second is "/site/xyz".
>>>>>>>>> On no account
>>>>>>>>> does the connector expect to see default.aspx attached to that path,
>>>>>>>>> so you should not do this; it cannot work.
>>>>>>>>>
>>>>>>>>> FWIW, my recommendation to try setting the connection type to
>>>>>>>>> "SharePoint 2.0" was to rule out any possible installation issue with
>>>>>>>>> the ManifoldCF sharepoint plugin. The connection check for 2.0 does
>>>>>>>>> not look for it; only the connection check for 3.0 does.
>>>>>>>>>
>>>>>>>>> Karl
>>>>>>>>>
>>>>>>>>> On Thu, Jan 26, 2012 at 8:41 AM, Silvia, Daniel [USA]
>>>>>>>>> wrote:
>>>>>>>>>> Hey Karl
>>>>>>>>>>
>>>>>>>>>> I am also getting an "HTTP Error 401.2: Unauthorized: Access is denied due to server configuration" when setting the Site field to /default.aspx. Do most SharePoint instances have the urls set to something like http://server:port/sites/...... instead of http://server:port/? When I use the "/default.aspx" I see in the log files that ManifoldCF is trying to go to the Lists.asmx service with the url http://server:port/default.aspx/_vti_bin/Lists.asmx, where nothing is found.
>>>>>>>>>>
>>>>>>>>>> As you can tell I am not much of a SharePoint user or installer.
>>>>>>>>>>
>>>>>>>>>> Also, I don't think the issue is with the connector in ManifoldCF, I am just trying to
>>>>>>>>>>
>>>>>>>>>> ________________________________________
>>>>>>>>>> From: Silvia, Daniel [USA]
>>>>>>>>>> Sent: Thursday, January 26, 2012 7:23 AM
>>>>>>>>>> To: Karl Wright
>>>>>>>>>> Subject: RE: ManifoldCF's dist/shapoint-integration dir
>>>>>>>>>>
>>>>>>>>>> Hey Karl
>>>>>>>>>>
>>>>>>>>>> The issue I am having is that the SharePoint instance url is something like http://server:port/default.aspx. If I don't put anything in the site field I get a message indicating "Requested Registry Access is not allowed". I was putting "/default.aspx" as my Site field which I believe may have been the issue.
>>>>>>>>>> However, what do you put in your Site field when the site is the top most site, as in http://server:port/default.aspx?
>>>>>>>>>>
>>>>>>>>>> I would love to send you the log messages, but I am working on a network which is not connected to the outside.
>>>>>>>>>>
>>>>>>>>>> Thanks for your help.
>>>>>>>>>>
>>>>>>>>>> ________________________________________
>>>>>>>>>> From: Karl Wright [daddywri@gmail.com]
>>>>>>>>>> Sent: Wednesday, January 25, 2012 6:12 PM
>>>>>>>>>> To: Silvia, Daniel [USA]
>>>>>>>>>> Subject: Re: ManifoldCF's dist/shapoint-integration dir
>>>>>>>>>>
>>>>>>>>>> Daniel,
>>>>>>>>>>
>>>>>>>>>> FWIW, I can help you diagnose the issue, but to do so you really need
>>>>>>>>>> to give me some concrete data. I'm happy to grovel over the whole
>>>>>>>>>> wire log if you feel you can send it to me; something that may not
>>>>>>>>>> seem important to you will likely stand out strongly to me. I can,
>>>>>>>>>> for example, see whether you are getting back HTML because of an
>>>>>>>>>> authentication error, for instance. And if you ARE getting back valid
>>>>>>>>>> SOAP, I would then be sure that something was wrong with the Axis
>>>>>>>>>> client configuration, and I could pursue that here with the data
>>>>>>>>>> provided. The problem with software like SharePoint running on IIS is
>>>>>>>>>> that it can be configured a nearly infinite number of ways, so
>>>>>>>>>> diagnosis is more of an art than a science. I strongly suspect that
>>>>>>>>>> you're laboring under a pretty straightforward misconception which is
>>>>>>>>>> likely blocking progress, rather than there being an issue with the
>>>>>>>>>> SharePoint connector itself. But I can't tell that without more
>>>>>>>>>> detailed communication.
>>>>>>>>>>
>>>>>>>>>> Also, you mentioned that the Lists.asmx service was right where you
>>>>>>>>>> expected it to be. Have you read the SharePoint Connector part of the
>>>>>>>>>> end-user documentation?
>>>>>>>>>> To wit:
>>>>>>>>>>
>>>>>>>>>> "Select the server protocol, and enter the server name and port, based
>>>>>>>>>> on what you recorded from the URL for your SharePoint site. For the
>>>>>>>>>> "Site path" field, type in the portion of the root site URL that
>>>>>>>>>> includes everything after the server and port, except for the final
>>>>>>>>>> "aspx" file. For example, if the SharePoint URL is
>>>>>>>>>> "http://myserver:81/sites/somewhere/index.asp", the site path would be
>>>>>>>>>> "/sites/somewhere"." The Lists.asmx service in this example would be
>>>>>>>>>> expected to be found at
>>>>>>>>>> "http://myserver:81/sites/somewhere/_vti_bin/Lists.asmx". And the URL
>>>>>>>>>> you would start with would be the URL you see in the browser when you
>>>>>>>>>> log into the SharePoint web client and go to the site you wish to
>>>>>>>>>> crawl. Is this what you are doing?
>>>>>>>>>>
>>>>>>>>>> Thanks again,
>>>>>>>>>> Karl
>>>>>>>>>>
>>>>>>>>>> On Wed, Jan 25, 2012 at 12:33 PM, Karl Wright wrote:
>>>>>>>>>>> The code that parses the SOAP response is Apache Axis. This hasn't
>>>>>>>>>>> changed in several years.
>>>>>>>>>>>
>>>>>>>>>>> Can you answer the following questions:
>>>>>>>>>>>
>>>>>>>>>>> (1) When the SharePoint connector makes a request to SharePoint, is
>>>>>>>>>>> the response HTML, or is it XML? Does it have an XML header which
>>>>>>>>>>> describes a Microsoft XML namespace? It sure sounds like it is
>>>>>>>>>>> responding with HTML. The SharePoint connector is expecting to
>>>>>>>>>>> communicate using SOAP. Is the response valid SOAP?
>>>>>>>>>>>
>>>>>>>>>>> (2) What version of SharePoint are you trying to connect to? Is this
>>>>>>>>>>> SharePoint 2007? SharePoint 2010?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Karl
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Jan 25, 2012 at 12:26 PM, Silvia, Daniel [USA]
>>>>>>>>>>> wrote:
>>>>>>>>>>>> Hi Karl
>>>>>>>>>>>>
>>>>>>>>>>>> I have added the specific log4j lines for Http Client wire and I restarted the ManifoldCF instance. I was also able to see the web service Lists.asmx through IE. When reviewing the log files I was able to see some of the content that resides in the SharePoint instance in the content coming back from the request. However, I am still seeing the error messages in the ManifoldCF GUI as well as in the log file indicating "Bad Envelope: HTML", "No service named ListsSoap is available" and "No service named http://schemas.microsoft.com/sharepoint/soap/GetListCollection is available".
>>>>>>>>>>>>
>>>>>>>>>>>> Could there be something going on with the way the services are being built on the client side?
>>>>>>>>>>>>
>>>>>>>>>>>> Appreciate your help.
>>>>>>>>>>>>
>>>>>>>>>>>> Dan
>>>>>>>>>>>>
>>>>>>>>>>>> ________________________________________
>>>>>>>>>>>> From: Karl Wright [daddywri@gmail.com]
>>>>>>>>>>>> Sent: Tuesday, January 24, 2012 4:52 PM
>>>>>>>>>>>> To: Silvia, Daniel [USA]; connectors-user@incubator.apache.org
>>>>>>>>>>>> Subject: Re: ManifoldCF's dist/shapoint-integration dir
>>>>>>>>>>>>
>>>>>>>>>>>> I have not seen this exact problem before.
>>>>>>>>>>>>
>>>>>>>>>>>> The "Bad envelope tag: HTML" indicates that the SOAP request the
>>>>>>>>>>>> SharePoint connector is attempting to perform is, in fact, returning
>>>>>>>>>>>> an HTML response. This usually indicates that the server or path
>>>>>>>>>>>> parameters you've used to set up the connection are not set correctly,
>>>>>>>>>>>> and SharePoint is not actually being engaged.
>>>>>>>>>>>>
>>>>>>>>>>>> But usually when that happens I don't recall a ConfigurationException
>>>>>>>>>>>> logged, unless it's what Axis does in response to the HTML.
>>>>>>>>>>>>
>>>>>>>>>>>> The best thing to do at this point is turn on Http Client wire
>>>>>>>>>>>> logging, restart ManifoldCF, and view the connection. The log will
>>>>>>>>>>>> then contain a record of the exact SOAP requests and the responses,
>>>>>>>>>>>> and we can see what's wrong. The technique is described here:
>>>>>>>>>>>>
>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/CONNECTORS/Debugging+Connections
>>>>>>>>>>>>
>>>>>>>>>>>> You can also confirm that the right SharePoint web services are
>>>>>>>>>>>> functioning on the machine in question by trying to access them
>>>>>>>>>>>> directly. For the Lists web service, which is the one it sounds like
>>>>>>>>>>>> it was complaining about, try using IE (not Firefox etc because you
>>>>>>>>>>>> want NTLM support) to go to the url where you think the web service
>>>>>>>>>>>> lives. This will be http: or https:, plus the server, plus the port,
>>>>>>>>>>>> plus the path, plus "_vti_bin/Lists.asmx". You should see an
>>>>>>>>>>>> unequivocal SharePoint response. For an example from the Microsoft
>>>>>>>>>>>> demo service, try http://www.wssdemo.com/_vti_bin/Lists.asmx.
>>>>>>>>>>>>
>>>>>>>>>>>> Please let me know how it goes, and cc the dev list (as I have) so a
>>>>>>>>>>>> record of what you're encountering can be made available to others.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks!
>>>>>>>>>>>> Karl
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Jan 24, 2012 at 1:52 PM, Silvia, Daniel [USA]
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> Hi Karl
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have downloaded the newest version of ManifoldCF, v0.4, and have run the necessary ant scripts to download dependencies and then built the entire project. I have also had the SharePoint web service MetaCarta.SharePoint.MCPermissionsService.wsp deployed on the SharePoint instance due to running version 3 of SharePoint (SharePoint 2007).
>>>>>>>>>>>>> When I try to create a Repository Connection and select "Save" I get a message on the ManifoldCF front end of "org.xml.sax.SAXException Bad envelope tag: HTML". When I look at the log file I see an error message "org.apache.axis.ConfigurationException: No service named ListsSoap is available".
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can you tell me if you have seen this issue before and what may be causing this issue?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for your help.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Dan
>>>>>>>>>>>>> ________________________________________
>>>>>>>>>>>>> From: Karl Wright [daddywri@gmail.com]
>>>>>>>>>>>>> Sent: Friday, January 20, 2012 7:31 AM
>>>>>>>>>>>>> To: Silvia, Daniel [USA]
>>>>>>>>>>>>> Subject: Re: ManifoldCF's dist/shapoint-integration dir
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Daniel,
>>>>>>>>>>>>>
>>>>>>>>>>>>> In order for the SharePoint connector to build, you need to have the
>>>>>>>>>>>>> wsdls in place in the right area. We cannot ship those because of
>>>>>>>>>>>>> potential copyright issues. The easiest way to obtain the right
>>>>>>>>>>>>> dependencies is:
>>>>>>>>>>>>>
>>>>>>>>>>>>> ant download-dependencies
>>>>>>>>>>>>>
>>>>>>>>>>>>> Then, just build normally:
>>>>>>>>>>>>>
>>>>>>>>>>>>> ant build
>>>>>>>>>>>>>
>>>>>>>>>>>>> This will only work for ManifoldCF-0.4-incubating, or trunk.
>>>>>>>>>>>>> 0.4-incubating is still in the process of being signed off by the
>>>>>>>>>>>>> incubator, but you can find the release candidate here:
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://people.apache.org/~kwright
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Jan 20, 2012 at 7:02 AM, Silvia, Daniel [USA]
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> Hi Karl
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I work with Matt Parker and we are in the process of developing a pipeline
>>>>>>>>>>>>>> that uses ManifoldCF at the beginning.
>>>>>>>>>>>>>> I just subscribed to the
>>>>>>>>>>>>>> connectors-user-subscribe@incubator.apache.org
>>>>>>>>>>>>>> group yesterday and submitted an e-mail question to the group. Can you help
>>>>>>>>>>>>>> us with the below issue?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I downloaded MCF and started playing with the default setup under Jetty and
>>>>>>>>>>>>>> Derby. It starts up without any issue. I am trying to configure a SharePoint
>>>>>>>>>>>>>> connector, connecting to SharePoint Service 3. I have been following the
>>>>>>>>>>>>>> instructions and I am at the point of deploying the custom SharePoint web
>>>>>>>>>>>>>> service to the SharePoint instance. The instructions indicate that I should
>>>>>>>>>>>>>> get the web service from dist/sharepoint-integration after building MCF.
>>>>>>>>>>>>>> However, after looking through the entire directory structure, I am unable
>>>>>>>>>>>>>> to find the service to deploy.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can someone tell me where to find this service?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for your help.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Daniel Silvia
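
For reference, the "Site path" rule quoted in this thread (everything after the server and port, minus the final page file) and the resulting Lists.asmx location can be sketched as follows. This is only an illustration under stated assumptions, not ManifoldCF code: `site_path` and `lists_service_url` are hypothetical helper names, the sketch treats a trailing ".asp" or ".aspx" segment as the page to drop, and it assumes the root site maps to a blank site path, which is what the thread suggests since the UI rejected "/".

```python
from urllib.parse import urlsplit


def site_path(browser_url: str) -> str:
    """Derive the connection's site path from the URL shown in the browser."""
    segments = [s for s in urlsplit(browser_url).path.split("/") if s]
    # Drop the final page file, e.g. default.aspx or index.asp (assumption).
    if segments and segments[-1].lower().endswith((".aspx", ".asp")):
        segments.pop()
    # Root site: blank site path (assumption based on the thread).
    return "/" + "/".join(segments) if segments else ""


def lists_service_url(browser_url: str) -> str:
    """URL where the Lists web service would be expected for that site."""
    parts = urlsplit(browser_url)
    return f"{parts.scheme}://{parts.netloc}{site_path(browser_url)}/_vti_bin/Lists.asmx"


if __name__ == "__main__":
    url = "http://myserver:81/sites/somewhere/index.asp"
    print(site_path(url))
    print(lists_service_url(url))
```

With the documentation's example URL this yields the site path "/sites/somewhere", and for a root URL such as http://server:port/default.aspx it yields a blank site path rather than "/default.aspx".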