manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Getting a 401 Unauthorized on a SharePoint 2010 crawl request, with MCPermissions.asmx installed
Date Wed, 18 Sep 2013 15:27:37 GMT
Hi Dmitry,

It may be worth reviewing with that engineer what steps he took when he
installed the instance.  If he used the standard installer, IIRC there are
a number of ways you can mess this up - the primary way being if you try to
install IIS afterwards and then just try to patch things up.  The canned
install usually does best if IIS is installed first.

At any rate, I think that you have a probable case of "operator error"
here...

Karl



I can think of a few possibilities.


On Wed, Sep 18, 2013 at 11:16 AM, Dmitry Goldenberg
<dgoldenberg@kmwllc.com> wrote:

> SharePoint was not installed by a domain user (the Windows instance is not
> on a domain).
>
> This is not a canned AWS SharePoint installation; an engineer on the team
> installed it, using the standard installer program, I believe.
>
>
> On Wed, Sep 18, 2013 at 10:34 AM, Will Parkinson
> <parkinson.will@gmail.com> wrote:
>
>> Dmitry, do you know if SharePoint was installed by a domain user?  I have
>> heard of issues with SharePoint when it is not installed by a domain user
>> (e.g. DOMAIN\someuser).
>>
>>
>> On Thu, Sep 19, 2013 at 12:31 AM, Will Parkinson <
>> parkinson.will@gmail.com> wrote:
>>
>>> No, I didn't have that issue.  The issue I had was the // and ///
>>> sequences being added in the wrong places in the page URLs.
>>>
>>> I was getting things like
>>>
>>>  /Site Name/Lib///rary/test.aspx
>>>
>>> My first set up was an out of the box set up, the main site was on port
>>> 80, using classic authentication.  With the path modification in the
>>> mcf-sharepoint-connector.jar, it worked very well.
>>>
>>> I set up Active Directory on that same server to authenticate via NTLM.
>>>
>>> The second server had the site on https on port 443, and had claims-based
>>> authentication using ADFS and Kerberos.  I had to modify the
>>> mcf-sharepoint-connector.jar and MCPermissions.wsp to get this to work
>>> around the lack of SIDs returned from the permissions web service.
>>>
>>> In this case, Active Directory and ADFS were set up on separate AWS
>>> servers
>>>
>>>
>>>
>>>
>>> On Thu, Sep 19, 2013 at 12:23 AM, Karl Wright <daddywri@gmail.com> wrote:
>>>
>>>> Hi Will,
>>>>
>>>> The path stuff we're already dealing with - see the CONNECTORS-772
>>>> branch.  But what we are having trouble with is something much more
>>>> fundamental.  On Dmitry's AWS instance, when you talk to the web services
>>>> for a root site, it works fine.  But as soon as you add a subsite path into
>>>> the URL, it *seems* to work fine, but actually behaves as though you never
>>>> specified any subsite at all - it returns root site information only.  On
>>>> this system, this occurs for ALL web services, even Microsoft's.  The
>>>> reason is that the value of SPContext.Current.Web never points to the
>>>> subsite you specified.  The result is that you cannot use SharePoint
>>>> subsites with ManifoldCF without causing havoc.
>>>>
>>>> Does this sound completely unfamiliar to you?  If you never encountered
>>>> it, then we should compare how these instances were set up, unless you have
>>>> any further ideas.
>>>>
>>>> Thanks,
>>>> Karl
>>>>
>>>>
>>>>
>>>> On Wed, Sep 18, 2013 at 10:12 AM, Will Parkinson <
>>>> parkinson.will@gmail.com> wrote:
>>>>
>>>>> Hey Karl (and Dmitry)
>>>>>
>>>>> For AWS, I had to modify the way the relPath in the addFile
>>>>> function in the FileStream class (in SharepointRepository.java) calculated
>>>>> the modifiedPath.
>>>>>
>>>>> Essentially, I ensured that the relPath always contains the site as
>>>>> part of the path:
>>>>>
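>>>>>     // Compare string contents; != would compare object references.
>>>>>     if (siteName != null && !siteName.isEmpty()) {
>>>>>         int siteInd = relPath.indexOf(siteName);
>>>>>         // Prepend the site when it is missing or not near the start.
>>>>>         if (siteInd == -1 || siteInd > 3) {
>>>>>             relPath = siteName + relPath;
>>>>>         }
>>>>>     }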
>>>>>
>>>>>
>>>>> This fixed my pathing issue and the index-out-of-bounds errors.
>>>>>
>>>>> I have also made many other modifications to cope with AD and
>>>>> claims-based auth, and for compatibility with SharePoint 2013.
>>>>>
>>>>> Dmitry, I have uploaded my modified mcf-sharepoint-connector.jar and
>>>>> MCPermissions WSP if you would like to try them out:
>>>>>
>>>>> http://pngnetworks.com/sharepoint-2010-claims.zip
>>>>>
>>>>> Just make sure you back up your current ones as this is still very
>>>>> much in development :)
>>>>>
>>>>> Also, the logging is very verbose.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Will
>>>>>
>>>>>
>>>>> On Wed, Sep 18, 2013 at 11:41 PM, Karl Wright <daddywri@gmail.com> wrote:
>>>>>
>>>>>> Hi Will,
>>>>>> When you folks set up YOUR AWS instance, did it work with MCF out of
>>>>>> the box?  Or did you need to do something?  And, if so, what did you do?
>>>>>>
>>>>>> Karl
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Sep 18, 2013 at 9:28 AM, Will Parkinson <
>>>>>> parkinson.will@gmail.com> wrote:
>>>>>>
>>>>>>> Yes that's right, only really interested in the site that you are
>>>>>>> trying to crawl
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Sep 18, 2013 at 11:25 PM, Dmitry Goldenberg <
>>>>>>> dgoldenberg@kmwllc.com> wrote:
>>>>>>>
>>>>>>>> Will,
>>>>>>>>
>>>>>>>> For SharePoint - 80, the output is
>>>>>>>>
>>>>>>>> NTAuthenticationProviders       : (STRING) "NTLM"
>>>>>>>>
>>>>>>>> I assume we're not interested in the Default Web Site; for that,
>>>>>>>> the output is simply "The parameter NTAuthenticationProviders is not set at
>>>>>>>> this node."
>>>>>>>>
>>>>>>>> - Dmitry
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Sep 18, 2013 at 9:16 AM, Will Parkinson <
>>>>>>>> parkinson.will@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> If you open IIS manager and click on sites, it is displayed in the
>>>>>>>>> ID column (see screenshot attached)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Sep 18, 2013 at 10:55 PM, Dmitry Goldenberg <
>>>>>>>>> dgoldenberg@kmwllc.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Will,
>>>>>>>>>> Sorry, what is the "sharepoint website *number*" in that
>>>>>>>>>> invocation?
>>>>>>>>>> - Dmitry
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Sep 18, 2013 at 8:53 AM, Will Parkinson <
>>>>>>>>>> parkinson.will@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Dmitry
>>>>>>>>>>>
>>>>>>>>>>> Just out of interest, what does the following command output on
>>>>>>>>>>> your system
>>>>>>>>>>>
>>>>>>>>>>> cd to C:\inetpub\adminscripts
>>>>>>>>>>>
>>>>>>>>>>> *cscript adsutil.vbs get w3svc/<put your sharepoint website
>>>>>>>>>>> number here>/root/NTAuthenticationProviders*
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>>
>>>>>>>>>>> Will
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Sep 18, 2013 at 10:44 PM, Karl Wright <
>>>>>>>>>>> daddywri@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> "This is the second time I'm encountering the issue which leads
>>>>>>>>>>>> me to believe it's a quirk of IIS and/or SharePoint."
>>>>>>>>>>>>
>>>>>>>>>>>> It cannot be just a quirk of SharePoint because SharePoint's UI
>>>>>>>>>>>> etc could not create or work with subsites if that was true.  It may well
>>>>>>>>>>>> be a configuration issue with IIS, which is indeed what I suspect.  I have
>>>>>>>>>>>> pinged all the resources I know of to try and get some insight as to why
>>>>>>>>>>>> this is happening.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> "Perhaps this is something that can be worked into the 'fabric'
>>>>>>>>>>>> of ManifoldCF as a workaround for a known issue."
>>>>>>>>>>>>
>>>>>>>>>>>> Like I said before, this is a huge amount of work, tantamount
>>>>>>>>>>>> to rewriting most of the connector.  If this is what you want to request,
>>>>>>>>>>>> that is your option, but there is no way we'd complete any of this work
>>>>>>>>>>>> before December/January at the earliest.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> "Just to understand this a bit better, the main breakage here
>>>>>>>>>>>> is that the wildcards don't work properly, right? "
>>>>>>>>>>>>
>>>>>>>>>>>> No, it means that ManifoldCF cannot get at any data of any kind
>>>>>>>>>>>> associated with a SharePoint subsite.  Accessing root data works fine.  If
>>>>>>>>>>>> you try to crawl as things are now, you must disable all subsites and just
>>>>>>>>>>>> crawl the root site, or you will crawl the same things with longer and
>>>>>>>>>>>> longer paths indefinitely.
>>>>>>>>>>>>
>>>>>>>>>>>> Karl
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Sep 18, 2013 at 8:38 AM, Dmitry Goldenberg <
>>>>>>>>>>>> dgoldenberg@kmwllc.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Karl,
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is the second time I'm encountering the issue which leads
>>>>>>>>>>>>> me to believe it's a quirk of IIS and/or SharePoint. Perhaps this is
>>>>>>>>>>>>> something that can be worked into the 'fabric' of ManifoldCF as a
>>>>>>>>>>>>> workaround for a known issue. I understand that it may have far-reaching
>>>>>>>>>>>>> tentacles but I wonder if that's really the only option...
>>>>>>>>>>>>>
>>>>>>>>>>>>> Just to understand this a bit better, the main breakage here
>>>>>>>>>>>>> is that the wildcards don't work properly, right?  In theory if I have a
>>>>>>>>>>>>> repo connector config which lists specific library and list paths, things
>>>>>>>>>>>>> should work?  It's only when the /* types of wildcards are included, we're
>>>>>>>>>>>>> in trouble?
>>>>>>>>>>>>>
>>>>>>>>>>>>> - Dmitry
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Sep 18, 2013 at 8:07 AM, Karl Wright <
>>>>>>>>>>>>> daddywri@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Dmitry,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Someone else was having a similar problem. See
>>>>>>>>>>>>>> http://social.technet.microsoft.com/Forums/sharepoint/en-US/e4b53c63-b89a-4356-a7b0-6ca7bfd22826/getting-sharepoint-subsite-from-custom-webservice.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Apparently it does depend on how you get to the web service,
>>>>>>>>>>>>>> which does argue that it is an IIS issue.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Sep 17, 2013 at 5:44 PM, Karl Wright <
>>>>>>>>>>>>>> daddywri@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Dmitry,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> As discussed privately I had a look at your system.  What is
>>>>>>>>>>>>>>> happening is that the C# static SPContext.Current.Web is not reflecting the
>>>>>>>>>>>>>>> subsite in any url that contains a subsite.  In other words, the URL coming
>>>>>>>>>>>>>>> in might be "
>>>>>>>>>>>>>>> http://servername/subsite1/_vti_bin/MCPermissions.asmx",
>>>>>>>>>>>>>>> but the MCPermissions.asmx plugin will think it is being executed in the
>>>>>>>>>>>>>>> root context ("http://servername").  That's pretty broken
>>>>>>>>>>>>>>> behavior, so I'm guessing that either IIS or SharePoint is somehow
>>>>>>>>>>>>>>> misconfigured, and that once the misconfiguration is corrected the web
>>>>>>>>>>>>>>> services would begin to work right again.  But I have no idea how this
>>>>>>>>>>>>>>> should actually be fixed.
>>>>>>>>>>>>>>>
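The mismatch described above can be illustrated with a minimal Java sketch. It only derives the site path a caller would expect the service to run under, by taking everything before "/_vti_bin/" in the request URL. The class and method names are hypothetical and are not part of ManifoldCF or SharePoint; the URLs are the ones quoted in the message.

```java
import java.net.URI;

// Hypothetical helper for illustration only; not ManifoldCF or SharePoint code.
final class SubsitePathSketch {
    private static final String VTI_BIN = "/_vti_bin/";

    /** Returns the site path preceding "/_vti_bin/", or "" for the root site. */
    static String expectedSitePath(String serviceUrl) {
        String path = URI.create(serviceUrl).getPath();
        int idx = path.indexOf(VTI_BIN);
        if (idx < 0) {
            throw new IllegalArgumentException("Not a _vti_bin service URL: " + serviceUrl);
        }
        return path.substring(0, idx);
    }

    public static void main(String[] args) {
        // The subsite request should resolve to "/subsite1", the root request to "".
        System.out.println("[" + expectedSitePath(
            "http://servername/subsite1/_vti_bin/MCPermissions.asmx") + "]");
        System.out.println("[" + expectedSitePath(
            "http://servername/_vti_bin/MCPermissions.asmx") + "]");
    }
}
```

On the system discussed here, the service behaves as if the second (root) form had been requested even when the first form is used. Note that URI.create rejects unencoded spaces, so site names containing spaces would need to be percent-encoded first.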
>>>>>>>>>>>>>>> Will Parkinson, one of the subscribers of this list, may
>>>>>>>>>>>>>>> find the symptoms meaningful, since he set up an AWS SharePoint instance
>>>>>>>>>>>>>>> before.  I hope he will respond in a helpful way.  Until then, I think we
>>>>>>>>>>>>>>> are stuck.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Sep 17, 2013 at 9:49 AM, Dmitry Goldenberg <
>>>>>>>>>>>>>>> dgoldenberg@kmwllc.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Karl,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> It looks like I'll be able to get access for you to the
>>>>>>>>>>>>>>>> test system we're using. Would you be interested in working with the system
>>>>>>>>>>>>>>>> directly? I certainly don't mind doing some testing but I thought we'd
>>>>>>>>>>>>>>>> speed things up this way. If so, could you email me from a more private
>>>>>>>>>>>>>>>> account so we can set this up?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> - Dmitry
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, Sep 17, 2013 at 7:38 AM, Karl Wright <
>>>>>>>>>>>>>>>> daddywri@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Dmitry,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Another interesting bit from the log:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,799 (Worker thread '7') -
>>>>>>>>>>>>>>>>> SharePoint: Library list: '/_catalogs/lt/Forms/AllItems.aspx', 'List
>>>>>>>>>>>>>>>>> Template Gallery'
>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,799 (Worker thread '7') -
>>>>>>>>>>>>>>>>> SharePoint: Library list: '/_catalogs/masterpage/Forms/AllItems.aspx',
>>>>>>>>>>>>>>>>> 'Master Page Gallery'
>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,799 (Worker thread '7') -
>>>>>>>>>>>>>>>>> SharePoint: Library list: '/Shared Documents/Forms/AllItems.aspx', 'Shared
>>>>>>>>>>>>>>>>> Documents'
>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,799 (Worker thread '7') -
>>>>>>>>>>>>>>>>> SharePoint: Library list: '/SiteAssets/Forms/AllItems.aspx', 'Site Assets'
>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,799 (Worker thread '7') -
>>>>>>>>>>>>>>>>> SharePoint: Library list: '/SitePages/Forms/AllPages.aspx', 'Site Pages'
>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,799 (Worker thread '7') -
>>>>>>>>>>>>>>>>> SharePoint: Library list: '/_catalogs/solutions/Forms/AllItems.aspx',
>>>>>>>>>>>>>>>>> 'Solution Gallery'
>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,799 (Worker thread '7') -
>>>>>>>>>>>>>>>>> SharePoint: Library list: '/Style Library/Forms/AllItems.aspx', 'Style
>>>>>>>>>>>>>>>>> Library'
>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,799 (Worker thread '7') -
>>>>>>>>>>>>>>>>> SharePoint: Library list: '/Test Library 1/Forms/AllItems.aspx', 'Test
>>>>>>>>>>>>>>>>> Library 1'
>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,799 (Worker thread '7') -
>>>>>>>>>>>>>>>>> SharePoint: Library list: '/_catalogs/theme/Forms/AllItems.aspx', 'Theme
>>>>>>>>>>>>>>>>> Gallery'
>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,799 (Worker thread '7') -
>>>>>>>>>>>>>>>>> SharePoint: Library list: '/_catalogs/wp/Forms/AllItems.aspx', 'Web Part
>>>>>>>>>>>>>>>>> Gallery'
>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,799 (Worker thread '7') -
>>>>>>>>>>>>>>>>> SharePoint: Checking whether to include library
>>>>>>>>>>>>>>>>> '/Abcd/Klmnopqr/Klmnopqr/Defghij/Defghij/Shared Documents'
>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,799 (Worker thread '7') -
>>>>>>>>>>>>>>>>> SharePoint: Library '/Abcd/Klmnopqr/Klmnopqr/Defghij/Defghij/Shared
>>>>>>>>>>>>>>>>> Documents' exactly matched rule path '/*'
>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,799 (Worker thread '7') -
>>>>>>>>>>>>>>>>> SharePoint: Including library
>>>>>>>>>>>>>>>>> '/Abcd/Klmnopqr/Klmnopqr/Defghij/Defghij/Shared Documents'
>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,799 (Worker thread '7') -
>>>>>>>>>>>>>>>>> SharePoint: Checking whether to include library
>>>>>>>>>>>>>>>>> '/Abcd/Klmnopqr/Klmnopqr/Defghij/Defghij/SiteAssets'
>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,799 (Worker thread '7') -
>>>>>>>>>>>>>>>>> SharePoint: Library '/Abcd/Klmnopqr/Klmnopqr/Defghij/Defghij/SiteAssets'
>>>>>>>>>>>>>>>>> exactly matched rule path '/*'
>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,799 (Worker thread '7') -
>>>>>>>>>>>>>>>>> SharePoint: Including library
>>>>>>>>>>>>>>>>> '/Abcd/Klmnopqr/Klmnopqr/Defghij/Defghij/SiteAssets'
>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,799 (Worker thread '7') -
>>>>>>>>>>>>>>>>> SharePoint: Checking whether to include library
>>>>>>>>>>>>>>>>> '/Abcd/Klmnopqr/Klmnopqr/Defghij/Defghij/SitePages'
>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,799 (Worker thread '7') -
>>>>>>>>>>>>>>>>> SharePoint: Library '/Abcd/Klmnopqr/Klmnopqr/Defghij/Defghij/SitePages'
>>>>>>>>>>>>>>>>> exactly matched rule path '/*'
>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,799 (Worker thread '7') -
>>>>>>>>>>>>>>>>> SharePoint: Including library
>>>>>>>>>>>>>>>>> '/Abcd/Klmnopqr/Klmnopqr/Defghij/Defghij/SitePages'
>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,799 (Worker thread '7') -
>>>>>>>>>>>>>>>>> SharePoint: Checking whether to include library
>>>>>>>>>>>>>>>>> '/Abcd/Klmnopqr/Klmnopqr/Defghij/Defghij/Style Library'
>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,799 (Worker thread '7') -
>>>>>>>>>>>>>>>>> SharePoint: Library '/Abcd/Klmnopqr/Klmnopqr/Defghij/Defghij/Style Library'
>>>>>>>>>>>>>>>>> exactly matched rule path '/*'
>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,799 (Worker thread '7') -
>>>>>>>>>>>>>>>>> SharePoint: Including library
>>>>>>>>>>>>>>>>> '/Abcd/Klmnopqr/Klmnopqr/Defghij/Defghij/Style Library'
>>>>>>>>>>>>>>>>> <<<<<<
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This time it appears that it is the Lists service that is
>>>>>>>>>>>>>>>>> broken and does not recognize the parent site.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I haven't corrected this problem yet since now I am
>>>>>>>>>>>>>>>>> beginning to wonder if *any* of the web services under Amazon work at all
>>>>>>>>>>>>>>>>> for subsites.  We may be better off implementing everything we need in the
>>>>>>>>>>>>>>>>> MCPermissions service.  I will ponder this as I continue to research the
>>>>>>>>>>>>>>>>> logs.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> It's still valuable to check my getSites()
>>>>>>>>>>>>>>>>> implementation.  I'll be doing another round of work tonight on the plugin.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, Sep 16, 2013 at 8:45 PM, Karl Wright <
>>>>>>>>>>>>>>>>> daddywri@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The augmented plugin can be downloaded from
>>>>>>>>>>>>>>>>>> http://people.apache.org/~kwright/MetaCarta.SharePoint.MCPermissionsService.wsp.
>>>>>>>>>>>>>>>>>> The revised connector code is also ready, and should be checked out and
>>>>>>>>>>>>>>>>>> built from
>>>>>>>>>>>>>>>>>> https://svn.apache.org/repos/asf/manifoldcf/branches/CONNECTORS-772.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Once you set it all up, you can see if it is doing the
>>>>>>>>>>>>>>>>>> right thing by just trying to drill down through subsites in the UI.  You
>>>>>>>>>>>>>>>>>> should always see a list of subsites that is appropriate for the context
>>>>>>>>>>>>>>>>>> you are in; if this does not happen it is not working.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Mon, Sep 16, 2013 at 7:45 PM, Dmitry Goldenberg <
>>>>>>>>>>>>>>>>>> dgoldenberg@kmwllc.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Karl,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I can see how preloading the list of subsites may be
>>>>>>>>>>>>>>>>>>> less than optimal.  The advantage of doing it this way is that after one
>>>>>>>>>>>>>>>>>>> call you've got the structure in memory, which may be OK unless there are
>>>>>>>>>>>>>>>>>>> sites with a ton of subsites, which may stress memory.  The disadvantage
>>>>>>>>>>>>>>>>>>> is having to pass this structure around.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Yes, I'll certainly help test out your changes, just let
>>>>>>>>>>>>>>>>>>> me know when they're available.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>> - Dmitry
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Mon, Sep 16, 2013 at 7:19 PM, Karl Wright <
>>>>>>>>>>>>>>>>>>> daddywri@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi Dmitry,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks for the code snippet.  I'd prefer, though, to
>>>>>>>>>>>>>>>>>>>> not preload the entire site structure in memory.  Probably it would be
>>>>>>>>>>>>>>>>>>>> better to just add another method to the ManifoldCF SharePoint 2010
>>>>>>>>>>>>>>>>>>>> plugin.  More methods are going to be added anyway to support Claim Space
>>>>>>>>>>>>>>>>>>>> Authentication, so I guess this would be just one more.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> We honestly have never seen this problem before - so
>>>>>>>>>>>>>>>>>>>> it's not just flakiness, it has something to do with the installation, I'm
>>>>>>>>>>>>>>>>>>>> certain.  At any rate, I'll get going right away on a workaround - if you
>>>>>>>>>>>>>>>>>>>> are willing to test what I produce.  I'm also certain there is at least one
>>>>>>>>>>>>>>>>>>>> other issue, but hopefully that will become clearer once this one is
>>>>>>>>>>>>>>>>>>>> resolved.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Mon, Sep 16, 2013 at 6:49 PM, Dmitry Goldenberg <
>>>>>>>>>>>>>>>>>>>> dgoldenberg@kmwllc.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Karl,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> subsite discovery is effectively disabled except
>>>>>>>>>>>>>>>>>>>>> directly under the root site
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Yes. Come to think of it, I once came across this
>>>>>>>>>>>>>>>>>>>>> problem while implementing a SharePoint connector.  I'm not sure whether
>>>>>>>>>>>>>>>>>>>>> it's exactly what's happening with the issue we're discussing but looks
>>>>>>>>>>>>>>>>>>>>> like it.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I started off by using multiple getWebCollection calls
>>>>>>>>>>>>>>>>>>>>> to get child subsites of sites and trying to navigate down that way. The
>>>>>>>>>>>>>>>>>>>>> problem was that getWebCollection was always returning the immediate
>>>>>>>>>>>>>>>>>>>>> subsites of the root site no matter whether you're at the root or below, so
>>>>>>>>>>>>>>>>>>>>> I ended up generating infinite loops.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I switched over to using a single
>>>>>>>>>>>>>>>>>>>>> getAllSubWebCollection call and caching its results. That call returns the
>>>>>>>>>>>>>>>>>>>>> full list of all subsites as pairs of Title and Url.  I had a POJO similar
>>>>>>>>>>>>>>>>>>>>> to the one below which held the list of sites and contained logic for
>>>>>>>>>>>>>>>>>>>>> enumerating the child sites, given the URL of a (parent) site.  From what I
>>>>>>>>>>>>>>>>>>>>> recall, getWebCollection works inconsistently, either across SP versions or
>>>>>>>>>>>>>>>>>>>>> across installations, but the logic below should work in any case.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> *** public class SubSiteCollection -- holds a list of
>>>>>>>>>>>>>>>>>>>>> CrawledSite pojo's each of which is a { title, url }.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> *** SubSiteCollection has the following:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>  public List<CrawledSite> getImmediateSubSites(String siteUrl) {
>>>>>>>>>>>>>>>>>>>>>   List<CrawledSite> subSites = new ArrayList<CrawledSite>();
>>>>>>>>>>>>>>>>>>>>>   for (CrawledSite site : sites) {
>>>>>>>>>>>>>>>>>>>>>    if (isChildOf(siteUrl, site.getUrl().toString())) {
>>>>>>>>>>>>>>>>>>>>>     subSites.add(site);
>>>>>>>>>>>>>>>>>>>>>    }
>>>>>>>>>>>>>>>>>>>>>   }
>>>>>>>>>>>>>>>>>>>>>   return subSites;
>>>>>>>>>>>>>>>>>>>>>  }
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>  private static boolean isChildOf(String parentUrl, String urlToCheck) {
>>>>>>>>>>>>>>>>>>>>>   final String parent = normalizeUrl(parentUrl);
>>>>>>>>>>>>>>>>>>>>>   final String child = normalizeUrl(urlToCheck);
>>>>>>>>>>>>>>>>>>>>>   boolean ret = false;
>>>>>>>>>>>>>>>>>>>>>   if (child.startsWith(parent)) {
>>>>>>>>>>>>>>>>>>>>>    String remainder = child.substring(parent.length());
>>>>>>>>>>>>>>>>>>>>>    ret = StringUtils.countOccurrencesOf(remainder, SLASH) == 1;
>>>>>>>>>>>>>>>>>>>>>   }
>>>>>>>>>>>>>>>>>>>>>   return ret;
>>>>>>>>>>>>>>>>>>>>>  }
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>  private static String normalizeUrl(String url) {
>>>>>>>>>>>>>>>>>>>>>   return ((url.endsWith(SLASH)) ? url : url + SLASH).toLowerCase();
>>>>>>>>>>>>>>>>>>>>>  }
>>>>>>>>>>>>>>>>>>>>>
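For readers who want to try the approach described above, here is a self-contained Java sketch of the same idea: cache the flat list of all subsites once, then derive the immediate children of any site by URL prefix. It is an illustration under assumptions, not the original class: the CrawledSite fields, the constructor, and the countSlashes helper (standing in for the StringUtils call in the snippet) are all hypothetical, and the host name in the usage example is made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; names other than those quoted in the thread are assumptions.
final class SubSiteCollectionSketch {
    private static final String SLASH = "/";

    /** Minimal stand-in for the CrawledSite POJO: a title and a URL. */
    static final class CrawledSite {
        final String title;
        final String url;

        CrawledSite(String title, String url) {
            this.title = title;
            this.url = url;
        }
    }

    // Flat list of every subsite, as a single "all subsites" call would return it.
    private final List<CrawledSite> sites;

    SubSiteCollectionSketch(List<CrawledSite> sites) {
        this.sites = new ArrayList<>(sites);
    }

    /** Returns only the sites exactly one path segment below siteUrl. */
    List<CrawledSite> getImmediateSubSites(String siteUrl) {
        List<CrawledSite> subSites = new ArrayList<>();
        for (CrawledSite site : sites) {
            if (isChildOf(siteUrl, site.url)) {
                subSites.add(site);
            }
        }
        return subSites;
    }

    static boolean isChildOf(String parentUrl, String urlToCheck) {
        String parent = normalizeUrl(parentUrl);
        String child = normalizeUrl(urlToCheck);
        if (!child.startsWith(parent)) {
            return false;
        }
        // After normalization an immediate child leaves exactly "name/" behind.
        return countSlashes(child.substring(parent.length())) == 1;
    }

    private static String normalizeUrl(String url) {
        return (url.endsWith(SLASH) ? url : url + SLASH).toLowerCase();
    }

    // Plain-Java replacement for the library call used in the original snippet.
    private static int countSlashes(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == '/') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        SubSiteCollectionSketch all = new SubSiteCollectionSketch(Arrays.asList(
            new CrawledSite("A", "http://host/A"),
            new CrawledSite("B", "http://host/A/B"),
            new CrawledSite("C", "http://host/C")));
        for (CrawledSite s : all.getImmediateSubSites("http://host/A")) {
            System.out.println(s.title + " " + s.url);
        }
    }
}
```

Because both URLs are normalized with a trailing slash before the prefix test, a sibling such as "http://host/AB" is not mistaken for a child of "http://host/A".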
>>>>>>>>>>>>>>>>>>>>> - Dmitry
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 16, 2013 at 2:54 PM, Karl Wright <
>>>>>>>>>>>>>>>>>>>>> daddywri@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Hi Dmitry,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Have a look at this sequence also:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,817 (Worker thread '8') -
>>>>>>>>>>>>>>>>>>>>>> SharePoint: Subsite list: '
>>>>>>>>>>>>>>>>>>>>>> http://ec2-99-99-99-99.compute-1.amazonaws.com/Abcd',
>>>>>>>>>>>>>>>>>>>>>> 'Abcd'
>>>>>>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,817 (Worker thread '8') -
>>>>>>>>>>>>>>>>>>>>>> SharePoint: Subsite list: '
>>>>>>>>>>>>>>>>>>>>>> http://ec2-99-99-99-99.compute-1.amazonaws.com/Defghij',
>>>>>>>>>>>>>>>>>>>>>> 'Defghij'
>>>>>>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,817 (Worker thread '8') -
>>>>>>>>>>>>>>>>>>>>>> SharePoint: Subsite list: '
>>>>>>>>>>>>>>>>>>>>>> http://ec2-99-99-99-99.compute-1.amazonaws.com/Klmnopqr',
>>>>>>>>>>>>>>>>>>>>>> 'Klmnopqr'
>>>>>>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,818 (Worker thread '8') -
>>>>>>>>>>>>>>>>>>>>>> SharePoint: Checking whether to include site
>>>>>>>>>>>>>>>>>>>>>> '/Klmnopqr/Abcd/Abcd/Klmnopqr/Abcd'
>>>>>>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,818 (Worker thread '8') -
>>>>>>>>>>>>>>>>>>>>>> SharePoint: Site '/Klmnopqr/Abcd/Abcd/Klmnopqr/Abcd' exactly matched rule
>>>>>>>>>>>>>>>>>>>>>> path '/*'
>>>>>>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,818 (Worker thread '8') -
>>>>>>>>>>>>>>>>>>>>>> SharePoint: Including site '/Klmnopqr/Abcd/Abcd/Klmnopqr/Abcd'
>>>>>>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,818 (Worker thread '8') -
>>>>>>>>>>>>>>>>>>>>>> SharePoint: Checking whether to include site
>>>>>>>>>>>>>>>>>>>>>> '/Klmnopqr/Abcd/Abcd/Klmnopqr/Defghij'
>>>>>>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,818 (Worker thread '8') -
>>>>>>>>>>>>>>>>>>>>>> SharePoint: Site '/Klmnopqr/Abcd/Abcd/Klmnopqr/Defghij' exactly matched
>>>>>>>>>>>>>>>>>>>>>> rule path '/*'
>>>>>>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,818 (Worker thread '8') -
>>>>>>>>>>>>>>>>>>>>>> SharePoint: Including site '/Klmnopqr/Abcd/Abcd/Klmnopqr/Defghij'
>>>>>>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,818 (Worker thread '8') -
>>>>>>>>>>>>>>>>>>>>>> SharePoint: Checking whether to include site
>>>>>>>>>>>>>>>>>>>>>> '/Klmnopqr/Abcd/Abcd/Klmnopqr/Klmnopqr'
>>>>>>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,818 (Worker thread '8') -
>>>>>>>>>>>>>>>>>>>>>> SharePoint: Site '/Klmnopqr/Abcd/Abcd/Klmnopqr/Klmnopqr' exactly matched
>>>>>>>>>>>>>>>>>>>>>> rule path '/*'
>>>>>>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 11:43:56,818 (Worker thread '8') -
>>>>>>>>>>>>>>>>>>>>>> SharePoint: Including site '/Klmnopqr/Abcd/Abcd/Klmnopqr/Klmnopqr'
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> <<<<<<
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> This is using the GetSites(String parent) method with
>>>>>>>>>>>>>>>>>>>>>> a site name of "/Klmnopqr/Abcd/Abcd/Klmnopqr", and getting back three sites
>>>>>>>>>>>>>>>>>>>>>> (!!).  The parent path is not correct, obviously, but nevertheless this is one
>>>>>>>>>>>>>>>>>>>>>> way in which paths are getting completely messed up.  It *looks* like the
>>>>>>>>>>>>>>>>>>>>>> Webs web service is broken in such a way as to ignore the URL coming in,
>>>>>>>>>>>>>>>>>>>>>> except for the base part, which means that subsite discovery is effectively
>>>>>>>>>>>>>>>>>>>>>> disabled except directly under the root site.
>>>>>>>>>>>>>>>>>>>>>>
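To see why this behavior makes paths grow without bound, consider the following small Java simulation. It is a sketch under stated assumptions, not connector code: brokenGetSites is a hypothetical stand-in for a Webs service that ignores the parent path and always returns the root's three subsites (the names are taken from the log above), and the depth cap exists only so the demonstration terminates.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Simulation for illustration only; it does not call any real web service.
final class SubsiteLoopSketch {
    /** Pretends to be a service that ignores parentPath and lists the root's subsites. */
    static List<String> brokenGetSites(String parentPath) {
        return Arrays.asList("Abcd", "Defghij", "Klmnopqr");
    }

    /** Breadth-first subsite discovery, capped at maxDepth path segments. */
    static List<String> discover(int maxDepth) {
        List<String> found = new ArrayList<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(""); // "" stands for the root site
        while (!queue.isEmpty()) {
            String parent = queue.remove();
            if (depthOf(parent) >= maxDepth) {
                continue; // without this cap the loop would never end
            }
            for (String name : brokenGetSites(parent)) {
                String child = parent + "/" + name;
                found.add(child);
                queue.add(child);
            }
        }
        return found;
    }

    /** Number of path segments, counted as the number of '/' characters. */
    static int depthOf(String path) {
        int depth = 0;
        for (int i = 0; i < path.length(); i++) {
            if (path.charAt(i) == '/') {
                depth++;
            }
        }
        return depth;
    }

    public static void main(String[] args) {
        // Three real subsites turn into 3 + 9 + 27 apparent sites at depth 3.
        System.out.println(discover(3).size());
    }
}
```

Every discovered "site" yields the same three children again, so the number of apparent sites triples with each level, and nonsensical paths such as '/Klmnopqr/Abcd/Abcd/Klmnopqr/Abcd' appear once the depth reaches five.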
>>>>>>>>>>>>>>>>>>>>>> This might still be OK if it is not possible to
>>>>>>>>>>>>>>>>>>>>>> create subsites of subsites in this version of SharePoint.  Can you confirm
>>>>>>>>>>>>>>>>>>>>>> that this is or is not possible?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 16, 2013 at 2:42 PM, Karl Wright <
>>>>>>>>>>>>>>>>>>>>>> daddywri@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> "This is everything that got generated, from the
>>>>>>>>>>>>>>>>>>>>>>> very beginning"
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Well, something isn't right.  What I expect to see
>>>>>>>>>>>>>>>>>>>>>>> right up front, but don't, are:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> - A webs "getWebCollection" invocation for
>>>>>>>>>>>>>>>>>>>>>>> /_vti_bin/webs.asmx
>>>>>>>>>>>>>>>>>>>>>>> - Two lists "getListCollection" invocations for
>>>>>>>>>>>>>>>>>>>>>>> /_vti_bin/lists.asmx
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Instead the first transactions I see are from
>>>>>>>>>>>>>>>>>>>>>>> already busted URLs, which makes no sense, since there is no way they
>>>>>>>>>>>>>>>>>>>>>>> could have been queued yet.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> So there are a number of possibilities.  First,
>>>>>>>>>>>>>>>>>>>>>>> maybe the log isn't getting cleared out, and the session in question
>>>>>>>>>>>>>>>>>>>>>>> therefore starts somewhere in the middle of manifoldcf.log.1.  But no:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>> C:\logs>grep "POST /_vti_bin/webs" manifoldcf.log.1
>>>>>>>>>>>>>>>>>>>>>>> grep: input lines truncated - result questionable
>>>>>>>>>>>>>>>>>>>>>>> <<<<<<
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Nevertheless there are some interesting points
>>>>>>>>>>>>>>>>>>>>>>> here.  First, note the following response, which I've been able to
>>>>>>>>>>>>>>>>>>>>>>> determine is against "Test Library 1":
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 13:02:31,590 (Worker thread '23') -
>>>>>>>>>>>>>>>>>>>>>>> SharePoint: getListItems xml response: '<GetListItems xmlns="
>>>>>>>>>>>>>>>>>>>>>>> http://schemas.microsoft.com/sharepoint/soap/directory/"><GetListItemsResponse
>>>>>>>>>>>>>>>>>>>>>>> xmlns=""><GetListItemsResult
>>>>>>>>>>>>>>>>>>>>>>> FileRef="SitePages/Home.aspx"/></GetListItemsResponse></GetListItems>'
>>>>>>>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 13:02:31,590 (Worker thread '23') -
>>>>>>>>>>>>>>>>>>>>>>> SharePoint: Checking whether to include document '/SitePages/Home.aspx'
>>>>>>>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 13:02:31,590 (Worker thread '23') -
>>>>>>>>>>>>>>>>>>>>>>> SharePoint: File '/SitePages/Home.aspx' exactly matched rule path '/*'
>>>>>>>>>>>>>>>>>>>>>>> DEBUG 2013-09-16 13:02:31,590 (Worker thread '23') -
>>>>>>>>>>>>>>>>>>>>>>> SharePoint: Including file '/SitePages/Home.aspx'
>>>>>>>>>>>>>>>>>>>>>>>  WARN 2013-09-16 13:02:31,590 (Worker thread '23') -
>>>>>>>>>>>>>>>>>>>>>>> Sharepoint: Unexpected relPath structure; path is '/SitePages/Home.aspx',
>>>>>>>>>>>>>>>>>>>>>>> but expected <list/library> length of 26
>>>>>>>>>>>>>>>>>>>>>>> <<<<<<
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> The FileRef in this case is pointing at what,
>>>>>>>>>>>>>>>>>>>>>>> exactly?  Is there a SitePages/Home.aspx in the "Test Library 1" library?
>>>>>>>>>>>>>>>>>>>>>>> Or does it mean to refer back to the root site with this URL construction?
>>>>>>>>>>>>>>>>>>>>>>> And since this is supposedly at the root level, how come the combined site
>>>>>>>>>>>>>>>>>>>>>>> + library name comes out to 26??  I get 15, which leaves 11 characters
>>>>>>>>>>>>>>>>>>>>>>> unaccounted for.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I'm still looking at the logs to see if I can glean
>>>>>>>>>>>>>>>>>>>>>>> key information.  Later, if I could set up a crawl against the sharepoint
>>>>>>>>>>>>>>>>>>>>>>> instance in question, that would certainly help.  I can readily set up an
>>>>>>>>>>>>>>>>>>>>>>> ssh tunnel if that is what is required.  But I won't be able to do it until
>>>>>>>>>>>>>>>>>>>>>>> I get home tonight.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 16, 2013 at 1:58 PM, Dmitry Goldenberg <
>>>>>>>>>>>>>>>>>>>>>>> dgoldenberg@kmwllc.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Karl,
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> This is everything that got generated, from the
>>>>>>>>>>>>>>>>>>>>>>>> very beginning, meaning that I did a fresh build, new database, new
>>>>>>>>>>>>>>>>>>>>>>>> connection definitions, start. The log must have rolled but the .1 log is
>>>>>>>>>>>>>>>>>>>>>>>> included.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> If I were to get you access to the actual test
>>>>>>>>>>>>>>>>>>>>>>>> system, would you mind taking a look? It may be more efficient than sending
>>>>>>>>>>>>>>>>>>>>>>>> logs..
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> - Dmitry
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 16, 2013 at 1:48 PM, Karl Wright <
>>>>>>>>>>>>>>>>>>>>>>>> daddywri@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> These logs are different but have exactly the same
>>>>>>>>>>>>>>>>>>>>>>>>> problem; they start in the middle when the crawl is already well underway.
>>>>>>>>>>>>>>>>>>>>>>>>> I'm wondering if by chance you have more than one agents process running or
>>>>>>>>>>>>>>>>>>>>>>>>> something?  Or maybe the log is rolling and stuff is getting lost?  What's
>>>>>>>>>>>>>>>>>>>>>>>>> there is not what I would expect to see, at all.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> I *did* manage to find two transactions that look
>>>>>>>>>>>>>>>>>>>>>>>>> like they might be helpful, but because the *results* of those transactions
>>>>>>>>>>>>>>>>>>>>>>>>> are required by transactions that take place minutes *before* in the log, I
>>>>>>>>>>>>>>>>>>>>>>>>> have no confidence that I'm looking at anything meaningful.  But I'll get
>>>>>>>>>>>>>>>>>>>>>>>>> back to you on what I find nonetheless.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> If you decide to repeat this exercise, try watching
>>>>>>>>>>>>>>>>>>>>>>>>> the log with "tail -f" before starting the job.  You should not see any log
>>>>>>>>>>>>>>>>>>>>>>>>> contents at all until the job is started.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 16, 2013 at 1:11 PM, Dmitry Goldenberg
>>>>>>>>>>>>>>>>>>>>>>>>> <dgoldenberg@kmwllc.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Karl,
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Attached please find logs which start at the
>>>>>>>>>>>>>>>>>>>>>>>>>> beginning. I started from a fresh build (clean db etc.), the logs start at
>>>>>>>>>>>>>>>>>>>>>>>>>> server start, then I create the output connection and the repo connection,
>>>>>>>>>>>>>>>>>>>>>>>>>> then the job, and then I fire off the job. I aborted the execution about a
>>>>>>>>>>>>>>>>>>>>>>>>>> minute into it or so.  That's all that's in the logs with:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> org.apache.manifoldcf.connectors=DEBUG
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> log4j.logger.httpclient.wire.header=DEBUG
>>>>>>>>>>>>>>>>>>>>>>>>>> log4j.logger.org.apache.commons.httpclient=DEBUG
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> - Dmitry
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 16, 2013 at 12:39 PM, Karl Wright <
>>>>>>>>>>>>>>>>>>>>>>>>>> daddywri@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Dmitry,
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Are you sure these are the right logs?
>>>>>>>>>>>>>>>>>>>>>>>>>>> - They start right in the middle of a crawl
>>>>>>>>>>>>>>>>>>>>>>>>>>> - They are already in a broken state when they
>>>>>>>>>>>>>>>>>>>>>>>>>>> start, e.g. the kinds of things that are being looked up are already
>>>>>>>>>>>>>>>>>>>>>>>>>>> nonsense paths
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> I need to see logs from the BEGINNING of a fresh
>>>>>>>>>>>>>>>>>>>>>>>>>>> crawl to see how the nonsense paths happen.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 16, 2013 at 11:52 AM, Dmitry
>>>>>>>>>>>>>>>>>>>>>>>>>>> Goldenberg <dgoldenberg@kmwllc.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Karl,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> I've generated logs with details as we
>>>>>>>>>>>>>>>>>>>>>>>>>>>> discussed.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> The job was created afresh, as before:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Path rules:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> /* file include
>>>>>>>>>>>>>>>>>>>>>>>>>>>> /* library include
>>>>>>>>>>>>>>>>>>>>>>>>>>>> /* list include
>>>>>>>>>>>>>>>>>>>>>>>>>>>> /* site include
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Metadata:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> /* include true
>>>>>>>>>>>>>>>>>>>>>>>>>>>> The logs are attached.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> - Dmitry
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 16, 2013 at 11:20 AM, Karl Wright <
>>>>>>>>>>>>>>>>>>>>>>>>>>>> daddywri@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "Do you think that this issue is generic with
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> regard to any Amz instance?"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I presume so, since you apparently didn't do
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> anything special to set one of these up.  Unfortunately, such instances are
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> not part of the free tier, so I am still constrained from setting one up
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for myself because of household rules here.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "For now, I assume our only workaround is to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> list the paths of interest manually"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Depending on what is going wrong, that may not
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> even work.  It looks like several SharePoint web service calls may be
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> affected, and not in a cleanly predictable way, for this to happen.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "is identification and extraction of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> attachments supported in the SP connector?"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ManifoldCF in general leaves identification
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and extraction to the search engine.  Solr, for instance, uses Tika for
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> this, if so configured.  You can configure your Solr output connection to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> include or exclude specific mime types or extensions if you want to limit
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> what is attempted.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 16, 2013 at 11:09 AM, Dmitry
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Goldenberg <dgoldenberg@kmwllc.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks, Karl. Do you think that this issue is
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> generic with regard to any Amz instance? I'm just wondering how easily
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> reproducible this may be..
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> For now, I assume our only workaround is to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> list the paths of interest manually, i.e. add explicit rules for each
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> library and list.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> A related subject - is identification and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> extraction of attachments supported in the SP connector?  E.g. if I have a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Word doc attached to a Task list item, would that be extracted?  So far, I
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> see that library content gets crawled and I'm getting the list item data
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> but am not sure what happens to the attachments.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 16, 2013 at 10:48 AM, Karl Wright
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> <daddywri@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Dmitry,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the additional information.  It
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> does appear that the method that lists subsites is not working as expected
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> under AWS.  Nor are some number of other methods which supposedly just list
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the children of a subsite.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I've reopened CONNECTORS-772 to work on
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> addressing this issue.  Please stay tuned.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 16, 2013 at 10:08 AM, Dmitry
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Goldenberg <dgoldenberg@kmwllc.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Karl,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Most of the paths that get generated are
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> listed in the attached log; they match what shows up in the diag report. So
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'm not sure where they diverge; most of them just don't seem right.  There
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> are 3 subsites rooted in the main site: Abcd, Defghij, Klmnopqr.  It's
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> strange that the connector would try such paths as:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /*Klmnopqr*/*Defghij*/*Defghij*/Announcements///
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -- there are multiple repetitions of the same subsite on the path and to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> begin with, Defghij is not a subsite of Klmnopqr, so why would it try
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> this? The /// at the end doesn't seem correct either, unless I'm missing
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> something in how this pathing works.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /Test Library
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 1/Financia/lProjectionsTemplate.xl/Abcd/Announcements -- looks wrong. A
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> docname is mixed into the path, a subsite ends up after a docname?...
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /Shared
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Documents/Personal_Fina/ncial_Statement_1_1.xl/Defghij/ -- same types of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> issues plus now somehow the docname got split with a forward slash?..
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> There are also a bunch of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> StringIndexOutOfBoundsExceptions.  Perhaps this logic doesn't fit with the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> pathing we're seeing on this amz-based installation?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'd expect the logic to just know that root
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> contains 3 subsites, and work off that. Each subsite has a specific list of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> libraries and lists, etc. It seems odd that the connector gets into this
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> matching pattern, and tries what looks like thousands of variations (I
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> aborted the execution).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - Dmitry
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 16, 2013 at 7:56 AM, Karl
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Wright <daddywri@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Dmitry,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> To clarify, the way you would need to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> analyze this is to run a crawl with the wildcards as you have selected,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> abort if necessary after a while, and then use the Document Status report
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to list the document identifiers that had been generated.  Find a document
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> identifier that you believe represents a path that is illegal, and figure
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> out what SOAP getChild call caused the problem by returning incorrect
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> data.  In other words, find the point in the path where the path diverges
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> from what exists into what doesn't exist, and go back in the ManifoldCF
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> logs to find the particular SOAP request that led to the issue.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'd expect from your description that the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> problem lies with getting child sites given a site path, but that's just a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> guess at this point.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Sun, Sep 15, 2013 at 6:40 PM, Karl
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Wright <daddywri@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Dmitry,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I don't understand what you mean by "I've
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> tried the set of wildcards as below and I seem to be running into a lot of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> cycles, where various subsite folders are appended to each other and an
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> extraction of data at all of those locations is attempted".   If you are
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> seeing cycles it means that document discovery is still failing in some
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> way.  For each folder/library/site/subsite, only the children of that
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> folder/library/site/subsite should be appended to the path - ever.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> If you can give a specific example,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> preferably including the soap back-and-forth, that would be very helpful.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Sun, Sep 15, 2013 at 1:40 PM, Dmitry
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Goldenberg <dgoldenberg@kmwllc.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Karl,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Quick question. Is there an easy way to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> configure an SP repo connection for crawling of all content, from the root
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> site all the way down?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I've tried the set of wildcards as below
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and I seem to be running into a lot of cycles, where various subsite
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> folders are appended to each other and an extraction of data at all of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> those locations is attempted. Ideally I'd like to avoid having to construct
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> an exact set of paths because the set may change, especially with new
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> content being added.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Path rules:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /* file include
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /* library include
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /* list include
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /* site include
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Metadata:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /* include true
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'd also like to pull down any files
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> attached to list items. I'm hoping that some type of "/* file include"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> should do it, once I figure out how to safely include all content.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - Dmitry
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>
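
The analysis step described in the thread (list the generated document identifiers, then find the ones that represent illegal paths) could be partly automated. The following is only a rough sketch, not part of ManifoldCF: `path_problems` is a hypothetical helper, and it assumes the identifiers are plain slash-separated paths like those quoted in the thread (with the emphasis asterisks removed). It flags two of the symptoms reported above: runs of empty segments such as `///`, and the same segment repeated back to back.

```python
from typing import List


def path_problems(doc_path: str) -> List[str]:
    """Return reasons why a slash-separated document path looks malformed.

    Hypothetical helper for triaging Document Status output; the checks
    mirror the symptoms described in the thread, nothing more.
    """
    problems: List[str] = []

    # Two or more consecutive slashes mean an empty path segment.
    if "//" in doc_path:
        problems.append("empty segment")

    # Compare each non-empty segment with its successor to catch
    # the same site or folder name appended twice in a row.
    segments = [s for s in doc_path.split("/") if s]
    for prev, cur in zip(segments, segments[1:]):
        if prev == cur:
            problems.append(f"repeated segment {cur!r}")

    return problems


if __name__ == "__main__":
    for p in ["/Klmnopqr/Defghij/Defghij/Announcements///",
              "/SitePages/Home.aspx"]:
        print(p, "->", path_problems(p) or "looks fine")
```

A check like this cannot tell whether a subsite is nested under the wrong parent, or whether a document name has been split by a stray slash; those cases still need to be compared by hand against the actual site structure and the SOAP responses in the log.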
