manifoldcf-user mailing list archives

From Ahmet Arslan <iori...@yahoo.com>
Subject Re: metadata problem for subsite libraries
Date Wed, 12 Mar 2014 15:47:54 GMT
Hi,

Just in case you want to compare, here are the same parameters for a document library that is
not under a subsite and works correctly.

metadataDescription : [Author, CheckoutUser, ContentType, Created, DocIcon, ID]
encodePath(site) : 
documentLibID : {53737D86-3748-4366-B490-056FBF2CD7EF}
decodedDocumentPath : /Documents/Haklar/Kitabi.docx
decodedDocumentPath.substring(cutoff+1) : Documents/Haklar/Kitabi.docx
dspStsWorks : false
metadataValues {Created=2013-01-02 16:47:51, DocIcon=docx, ID=311, Author=xxxx, ContentType=Document}
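
To see at a glance which requested fields came back, a small check like the one below can help. It is a minimal sketch, not ManifoldCF code: the class `MissingMetadataFields` and its `missing` method are hypothetical helpers, and the sample values are copied from the output above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical debugging helper; not part of ManifoldCF.
final class MissingMetadataFields {

    /** Returns the requested field names that have no entry in the returned values. */
    static List<String> missing(List<String> requested, Map<String, String> returned) {
        List<String> result = new ArrayList<>();
        for (String field : requested) {
            // A null map is treated the same as an empty one.
            if (returned == null || !returned.containsKey(field)) {
                result.add(field);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> requested = Arrays.asList(
            "Author", "CheckoutUser", "ContentType", "Created", "DocIcon", "ID");

        // Values reported for the working library.
        Map<String, String> returned = new HashMap<>();
        returned.put("Created", "2013-01-02 16:47:51");
        returned.put("DocIcon", "docx");
        returned.put("ID", "311");
        returned.put("Author", "xxxx");
        returned.put("ContentType", "Document");

        System.out.println(missing(requested, returned));
        // prints [CheckoutUser]

        // An empty result map reports every requested field as missing.
        System.out.println(missing(requested, new HashMap<String, String>()));
        // prints [Author, CheckoutUser, ContentType, Created, DocIcon, ID]
    }
}
```

In the working case only CheckoutUser is absent, which is what one would expect for a document that is not checked out; that reading is an assumption, not something confirmed in this thread.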

Thanks,
Ahmet



On Wednesday, March 12, 2014 5:40 PM, Ahmet Arslan <iorixxx@yahoo.com> wrote:
 
Hi Karl,

metadataDescription : [ArticleByLine, ArticleStartDate, Audience, Author, ...]
encodePath(site) : /finans/ornekfinansoperasyon
documentLibID : {3434AB8D-5D1E-4987-A0C9-9FE30B00CDC0}
decodedDocumentPath : /finans/ornekfinansoperasyon/Sayfalar/AllNews.aspx
decodedDocumentPath.substring(cutoff+1) : Sayfalar/AllNews.aspx
dspStsWorks : false

Thanks,
Ahmet



On Wednesday, March 12, 2014 5:30 PM, Karl Wright <daddywri@gmail.com> wrote:
 
Ok, so the problem is that it is not successfully fetching the metadata values from SharePoint.

At line 1750:

>>>>>>
                int cutoff = decodedLibPath.lastIndexOf("/");
                metadataValues = proxy.getFieldValues( metadataDescription,
                  encodePath(site), documentLibID, decodedDocumentPath.substring(cutoff+1), dspStsWorks );
<<<<<<

... I'd want to know what all of the arguments are to proxy.getFieldValues().  Then if they
look good we'll have to dig deeper into the actual communication layer.
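
The path argument in the snippet above can be reproduced in isolation. The following is a minimal sketch, not connector code: the class and method names are hypothetical, and the `decodedLibPath` values are assumptions inferred from the reported output, since they were not printed in this thread.

```java
// Sketch only: reproduces the path arithmetic from the quoted snippet.
// The decodedLibPath values used in main() are assumptions inferred from
// the reported output; they were not printed in this thread.
final class LibraryRelativePath {

    /** Returns the part of decodedDocumentPath after the last "/" of decodedLibPath. */
    static String relativeToLibraryParent(String decodedLibPath, String decodedDocumentPath) {
        int cutoff = decodedLibPath.lastIndexOf("/");
        return decodedDocumentPath.substring(cutoff + 1);
    }

    public static void main(String[] args) {
        // Root-level library (the working case).
        System.out.println(relativeToLibraryParent(
            "/Documents",
            "/Documents/Haklar/Kitabi.docx"));
        // prints Documents/Haklar/Kitabi.docx

        // Library under a subsite (the failing case).
        System.out.println(relativeToLibraryParent(
            "/finans/ornekfinansoperasyon/Sayfalar",
            "/finans/ornekfinansoperasyon/Sayfalar/AllNews.aspx"));
        // prints Sayfalar/AllNews.aspx
    }
}
```

Both results match the values reported in the replies above, so under these assumptions the path argument looks consistent in both cases. The visible difference between the two calls is the site argument: empty for the working library, /finans/ornekfinansoperasyon for the failing one.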

Karl






On Wed, Mar 12, 2014 at 11:17 AM, Ahmet Arslan <iorixxx@yahoo.com> wrote:

Hi Karl,
>
>
>
>metadataDescription (right after the unpack) : [ArticleByLine, ArticleStartDate, Audience,
>Author, CampaignType, Charges, CheckoutUser, Comments, ContentType, Created, CustomFieldContent,
>CustomFieldName, CustomTabContent, CustomTabName, DisplayName, _UIVersionString]
>
>
>
>Thanks,
>Ahmet
>
>
>
>On Wednesday, March 12, 2014 5:07 PM, Karl Wright <daddywri@gmail.com> wrote:
> 
>Hi Ahmet,
>
>For sanity, please try printing out metadataDescription right after the unpack on line 1700:
>
>>>>>>>
>              ArrayList metadataDescription = new ArrayList();
>              int startPosition = unpackList(metadataDescription,version,0,'+');
><<<<<<
>
>Thanks,
>Karl
>
>
>
>
>
>On Wed, Mar 12, 2014 at 11:01 AM, Ahmet Arslan <iorixxx@yahoo.com> wrote:
>
>Hi Karl,
>>
>>
>>metadataValues just before the fetchAndIndexFile call is empty: {}
>>
>>
>>Thanks,
>>Ahmet
>>
>>
>>
>>On Wednesday, March 12, 2014 4:44 PM, Karl Wright <daddywri@gmail.com> wrote:
>> 
>>Hi Ahmet,
>>
>>The field names are unpacked at line 1699:
>>
>>>>>>>>
>>              ArrayList metadataDescription = new ArrayList();
>>              int startPosition = unpackList(metadataDescription,version,0,'+');
>><<<<<<
>>
>>Starting at 1729, the metadata values are fetched:
>>
>>>>>>>>
>>              Map<String,String> metadataValues = null;
>>              if (metadataDescription.size() > 0)
>>              {
>>                // Retrieve the library guid from carrydown data
>>                String[] libIDs = activities.retrieveParentData(documentIdentifier, "guids");
>>
>>...
>><<<<<<
>>
>>This gets the metadata from SharePoint at line 1750:
>>
>>>>>>>>
>>                int cutoff = decodedLibPath.lastIndexOf("/");
>>                metadataValues = proxy.getFieldValues( metadataDescription,
>>                  encodePath(site), documentLibID, decodedDocumentPath.substring(cutoff+1), dspStsWorks );
>><<<<<<
>>
>>The metadata values are indexed at line 1764:
>>
>>>>>>>>
>>              if (!fetchAndIndexFile(activities, documentIdentifier,
>>                version, fileUrl, serverUrl + encodedServerLocation + encodedDocumentPath,
>>                acls, denyAcls, createdDate, modifiedDate, metadataValues,
>>                guid, sDesc))
>><<<<<<
>>
>>What I think you want to do is to print out the metadataValues contents just before
>>the fetchAndIndexFile call.  If they look good there, then we'll take the next step.
>>
>>Karl
>>
>>
>>
>>
>>
>>
>>
>>On Wed, Mar 12, 2014 at 10:29 AM, Ahmet Arslan <iorixxx@yahoo.com> wrote:
>>
>>Hi Karl,
>>>
>>>
>>>sortedMetadataFields prints all the fields that I selected in the UI, e.g. [ArticleByLine,
>>>ArticleStartDate, Audience, Author, CampaignType… ]
>>>What should the next step be?
>>>
>>>
>>>Thanks,
>>>Ahmet
>>>
>>>
>>>
>>>On Wednesday, March 12, 2014 3:21 PM, Karl Wright <daddywri@gmail.com> wrote:
>>> 
>>>Hi Ahmet,
>>>
>>>I misspoke; the rules for metadata pay attention only to a path.
>>>
>>>The only way we can make progress here is to do some debugging.  In your trunk
>>>checkout, have a look at SharePointRepository.java starting at line 993:
>>>
>>>>>>>>>
>>>            // == Document path ==
>>>            // Convert the modified document path to an unmodified one, plus a library path.
>>>            String decodedLibPath = documentIdentifier.substring(0,dLibSeparatorIndex);
>>>            String decodedDocumentPath = decodedLibPath + documentIdentifier.substring(dLibSeparatorIndex+1);
>>>            if (checkIncludeFile(decodedDocumentPath,spec))
>>>            {
>>>              // This file is included, so calculate a version string.  This will include metadata info, so get that first.
>>>              MetadataInformation metadataInfo = getMetadataSpecification(decodedDocumentPath,spec);
>>>
>>><<<<<<
>>>
>>>The class MetadataInformation describes the metadata that will be included given
>>>the document path.  Later, at line 1023, the connector determines which of the specified
>>>fields are also part of the library the document is in:
>>>
>>>>>>>>>
>>>                String[] sortedMetadataFields = getInterestingFieldSetSorted(metadataInfo,libFields);
>>><<<<<<
>>>
>>>I suggest modifying the connector to print the contents of sortedMetadataFields
>>>for each document that comes along.  You will need to do whatever is necessary to force a
>>>recrawl of just those documents whose metadata you are not getting.  If sortedMetadataFields
>>>does not contain the fields you expect, then something is wrong either with how
>>>the rules are being interpreted or with how the fields for the library are being discovered.
>>>If it contains the right fields, then the problem must be in how the field names are
>>>packed into and unpacked from the version string.  Either way, please let me know.
>>>
>>>Karl
>>>
>>>
>>>
>>>
>>>
>>>On Wed, Mar 12, 2014 at 9:10 AM, Ahmet Arslan <iorixxx@yahoo.com> wrote:
>>>
>>>Hi Karl,
>>>>
>>>>I am sorry, but I don't follow. I assume that the Paths/PathRule in my config is correct,
>>>>since it fetches documents (with no metadata).
>>>>
>>>>In the metadata section, there is no place for 'entity type'.
>>>>
>>>>Can you please elaborate? 
>>>>
>>>>Thanks,
>>>>Ahmet
>>>>
>>>>
>>>>On Wednesday, March 12, 2014 2:57 PM, Karl Wright <daddywri@gmail.com> wrote:
>>>>
>>>>To clarify: Rules you define must match both the entity type (e.g. site, list,
>>>>lib, or document), as well as the path.  So the example you provided, since it does not specify
>>>>the entity type, is incomplete.
>>>>
>>>>Karl
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>On Wed, Mar 12, 2014 at 8:44 AM, Karl Wright <daddywri@gmail.com> wrote:
>>>>
>>>>Hi Ahmet,
>>>>>
>>>>>All I can remember about this coming up before is that people did not have
>>>>>appropriate metadata rules.  So if you include a screenshot of your metadata rules, that
>>>>>ought to help clarify what is happening.
>>>>>
>>>>>FWIW, metadata for a library will require you to have an explicit matching
>>>>>library rule on your metadata tab.  Since this is a subsite, you will also need a site rule.
>>>>>
>>>>>Thanks,
>>>>>Karl
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>On Wed, Mar 12, 2014 at 8:35 AM, Ahmet Arslan <iorixxx@yahoo.com> wrote:
>>>>>
>>>>>Hi,
>>>>>>
>>>>>>I am connecting to a SharePoint 2010 instance with both trunk and ManifoldCF
>>>>>>1.5.1.
>>>>>>
>>>>>>When I define a job to crawl a document library via "add site", no
>>>>>>metadata is sent to the output connector. I can see the list of metadata fields and select them, but only
>>>>>>GUID is sent (although I did not select GUID, nor is it in the list). Documents are indexed,
>>>>>>but without metadata.
>>>>>>
>>>>>>There is no metadata problem with Lists.
>>>>>>
>>>>>>
>>>>>>'Document Library' Example
>>>>>>/site1/site2/Documents/* does not honour the selected metadata.
>>>>>>/Documents/* honours the selected metadata.
>>>>>>
>>>>>>I think someone reported a similar problem (for a document library
>>>>>>under a subsite) in the past, but I couldn't find the e-mail or the Jira issue.
>>>>>>
>>>>>>Thanks,
>>>>>>Ahmet
>>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>
>>
>>
>
>
>