manifoldcf-user mailing list archives

From Salih Sen <sa...@dilisim.com>
Subject (Sharepoint 2010) Getting page language from its site
Date Wed, 27 May 2015 09:10:51 GMT
Hi everyone,

We have multiple sites in different languages in a SharePoint 2010
instance. We're trying to crawl them all and index them in a Solr
instance that supports multiple languages.

As far as I understand (I don't know much about SP), sites have a
"Language" field that identifies their language, but pages don't
inherit it out of the box.

Is there a way to pass the "Language" field of a site to its pages and/or documents?



Alternatively, we were thinking about running language detection on each
page's text, but the content fields hold a lot of junk and warnings about
secure connections. Any ideas on that issue would be appreciated as well.

Eg:
"stylesheet text/css
/Style%20Library/tr-TR/Themable/Core%20Styles/controls.css stylesheet
text/css /_layouts/1055/styles/Themable/corev4.css?rev=nuXy0rRtVdOASORuBej%2BwQ%3D%3D
text/xml alternate /finans/rus/_vti_bin/spsdisco.aspx stylesheet
/SiteAssets/css/default.css stylesheet
/SiteAssets/css/perfect-scrollbar.css stylesheet
/SiteAssets/css/main.css screen stylesheet
/SiteAssets/css/jquery.fancybox.css stylesheet
/SiteAssets/css/jquery-ui-1.10.4.css /SiteAssets/css/sector-finans.css
stylesheet screen shortcut icon /_layouts/images/favicon.ico
image/vnd.microsoft.icon GENERATOR Microsoft SharePoint progid
SharePoint.WebPartPage.Document stream_source_info errors.aspx
stream_content_type application/octet-stream stream_size 110283
Expires 0 X-UA-Compatible IE=Edge,chrome=1 Content-Encoding UTF-8
stream_name errors.aspx Content-Type text/html; charset=UTF-8
resourceName errors.aspx dc:title
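
A minimal sketch of one possible workaround, not something confirmed to work in general: the junk in the sample above happens to contain a culture name (`tr-TR` in the Style Library path) and a locale ID (`1055` in the `_layouts` path), so a post-processing step could try to read a locale hint from the extracted content. The function name `guess_locale` and the lookup table are hypothetical, and the table lists only a few LCIDs. Note that this hint probably reflects the site's UI locale, which may differ from the language of the page text (the sample path `/finans/rus/` suggests that can happen).

```python
import re
from urllib.parse import unquote

# Hypothetical, deliberately incomplete LCID -> culture name table.
LCID_TO_CULTURE = {1033: "en-US", 1049: "ru-RU", 1055: "tr-TR"}

# Culture name inside a Style Library path, e.g. /Style Library/tr-TR/
CULTURE_RE = re.compile(r"/Style Library/([a-z]{2}-[A-Z]{2})/")
# Four-digit locale ID inside a _layouts path, e.g. /_layouts/1055/
LCID_RE = re.compile(r"/_layouts/(\d{4})/")


def guess_locale(content):
    """Return a culture name such as 'tr-TR' found in the content, or None."""
    text = unquote(content)  # turn %20 back into a space, etc.
    match = CULTURE_RE.search(text)
    if match:
        return match.group(1)
    match = LCID_RE.search(text)
    if match:
        # Unknown LCIDs fall through to None rather than guessing.
        return LCID_TO_CULTURE.get(int(match.group(1)))
    return None
```

The returned value could then be written to a separate field at index time; how that field is mapped in Solr is left open here.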


Thanks.
-- 
Salih Şen

Dilişim Bilgi Bilgisayar ve İletişim Teknolojileri Sanayi ve Ticaret Ltd. Sti.

email: salih@dilisim.com

Tel: 0 222 330 20 21

GSM: 0 507 296 15 51
