manifoldcf-dev mailing list archives

From Alessandro Benedetti <benedetti.ale...@gmail.com>
Subject Re: [Windows Shares Connector] Un-expected removal of all documents
Date Tue, 31 Mar 2015 14:27:51 GMT
In addition, this should be quite simple: do not proceed with the
processDocuments method if the RepositoryConnector is not able to connect
(that is, if its check method does not return a successful message).

Right?
I am wondering where the proper place is to add this check :)
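
To make the idea concrete, here is a minimal sketch of such a guard. It is
not ManifoldCF code: the Repo interface, the GuardedCrawl class, the
exception type and the "Connection working" success prefix are all
assumptions made for illustration, and the real connector API may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a repository connector; NOT the ManifoldCF interface.
interface Repo {
    // Returns a status message; assumed to start with "Connection working" on success.
    String check();

    // Returns true if the document can still be fetched from the repository.
    boolean isReachable(String documentId);
}

final class GuardedCrawl {
    // Assumed success marker; verify against the real check() implementation.
    static final String OK_PREFIX = "Connection working";

    // Thrown to abort the whole job instead of deleting documents one by one.
    static final class ConnectionUnavailableException extends RuntimeException {
        ConnectionUnavailableException(String message) {
            super(message);
        }
    }

    // Returns the ids that should be removed from the index.
    // If the connection check fails, no document is scanned and the job aborts.
    static List<String> findDeletions(Repo repo, List<String> indexedIds) {
        String status = repo.check();
        if (status == null || !status.startsWith(OK_PREFIX)) {
            throw new ConnectionUnavailableException(
                "Aborting job, check() returned: " + status);
        }
        List<String> toDelete = new ArrayList<>();
        for (String id : indexedIds) {
            if (!repo.isReachable(id)) {
                toDelete.add(id);
            }
        }
        return toDelete;
    }
}
```

With a guard like this, a changed password surfaces as one job-level error
and the N indexed docs stay in the index. It does not cover the case where
the check passes but whole subtrees are unreadable.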

Cheers

2015-03-31 14:59 GMT+01:00 Alessandro Benedetti <benedetti.alex85@gmail.com>:

> Yes Karl,
> I was thinking exactly that: first check whether the credentials are valid,
> before scanning all the documents.
> This is because per-file permissions depend on users/groups, but the current
> scenario does not invalidate the user itself; it invalidates that user's
> access.
>
> An error must be thrown, but the docs must not be deleted (not even scanned).
>
> Furthermore, what will happen if the server is down?
> Are we safe in that scenario?
>
> Cheers
>
> 2015-03-31 14:42 GMT+01:00 Karl Wright <daddywri@gmail.com>:
>
>> This is actually pretty standard behavior across our connector family, and
>> has been true since Day One.  The behavior comes from the basic broad
>> requirement that the crawler should keep going and skip the document when
>> the permissions do not allow it to be fetched.  With the Windows Share
>> connector, it's sometimes the case (when DFS is used a lot) that whole
>> subtrees of documents are not fetchable using the credentials supplied.
>> So it is not so easy to just check for valid credentials at the beginning.
>>
>> For a solution, I'd be inclined to look for a way to figure out if the
>> credentials are actually *invalid*, and abort the job if so.  This is
>> distinct from the case where the credentials are valid but the connector
>> doesn't have permissions to read the document.  It will take some
>> experimentation to see if we get back different exception text in the two
>> situations.
>>
>> Karl
>>
>>
>> On Tue, Mar 31, 2015 at 9:30 AM, Alessandro Benedetti <abenedetti@apache.org> wrote:
>>
>> > Hi guys,
>> > playing with the Windows Shares Connector in ManifoldCF 1.8, I
>> > encountered this problem:
>> >
>> > *Scenario*
>> > *1)* Index a Windows Shares server
>> > *2)* Indexing finishes successfully with N docs indexed
>> > *3)* Offline, while no indexing is happening, the Administrator password
>> > is changed on the Shares server side
>> > *4)* The Repository Connector is no longer able to connect (of course,
>> > because the password has changed)
>> > *5)* On the next indexing cycle, ALL docs are removed from the index.
>> >
>> > *Expected Behaviour*
>> > As a user, I would like to see an error message that lets me understand
>> > the issue, without losing all my N indexed docs.
>> >
>> > *Reason*
>> > Looking at the code, the problem seems to be in
>> > org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector#getDocumentVersions
>> > which tries to access each document individually through Samba and
>> > removes them one by one if they are no longer reachable.
>> >
>> > *Solution*
>> > Before scanning each document, we have to be sure the connection is
>> > working.
>> > If it is not, the scan is only harmful.
>> >
>> > I will continue investigating, but I would like your opinion as well
>> >
>> > Cheers
>> >
>> >
>> >
>> >
>> >
>> >
>> > --
>> > --------------------------
>> >
>> > Benedetti Alessandro
>> > Visiting card : http://about.me/alessandro_benedetti
>> >
>> > "Tyger, tyger burning bright
>> > In the forests of the night,
>> > What immortal hand or eye
>> > Could frame thy fearful symmetry?"
>> >
>> > William Blake - Songs of Experience -1794 England
>> >
>>
>



