manifoldcf-dev mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: [Repository Connector] In Job Repository Config modification
Date Tue, 14 Apr 2015 10:20:50 GMT
Hi Alessandro,

Do not despair.  As I said before, even if all Box gives you is their user
interface, we can probably use that to do the job from ManifoldCF.

I am sure that the back-and-forth between the browser and their web page is
via HTTPS.  My first suggestion would be to install the Firefox add-on
called Live HTTP Headers, which you can find here:

https://addons.mozilla.org/en-us/firefox/addon/live-http-headers/

You will also need the curl utility.

What you want to do is obtain the contents of the web page whose form you
fill in when you interact with their site.  You can get that from Firefox,
or from curl, but you will want to understand the HTTP steps that you go
through to get to that page, most importantly, what cookies get set and
when.  You also want to record the HTML of the response page that includes
the token that you will need.  If they are really badass about this they
may present it in a gif or something, and then we'd really be screwed, but
if it is in normal text we should be able to do this.  You can check for
the latter situation by just viewing the result page html within the
browser.
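
To check a saved copy of the page for a token in normal text, something like
the following could be used.  This is only a minimal sketch: it assumes the
token is carried in a hidden <input> field, the class name and the field name
in the example are hypothetical, and a regular expression is a stand-in for a
real HTML parser.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: lists the hidden form fields found in a page's HTML.
final class HiddenFieldScan {
    private static final Pattern INPUT_TAG =
        Pattern.compile("<input\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    // Returns the quoted value of one attribute inside a tag, or null.
    // Naive: it would also match attributes such as data-name.
    private static String attr(String tag, String name) {
        Matcher m = Pattern.compile(
            "\\b" + name + "\\s*=\\s*[\"']([^\"']*)[\"']",
            Pattern.CASE_INSENSITIVE).matcher(tag);
        return m.find() ? m.group(1) : null;
    }

    // Maps field name -> value for every <input type="hidden"> in the HTML.
    static Map<String, String> hiddenFields(String html) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = INPUT_TAG.matcher(html);
        while (m.find()) {
            String tag = m.group();
            if ("hidden".equalsIgnoreCase(attr(tag, "type"))) {
                String name = attr(tag, "name");
                if (name != null) {
                    String value = attr(tag, "value");
                    fields.put(name, value == null ? "" : value);
                }
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // Made-up page fragment; the field name is an assumption.
        String page = "<form action='/authorize' method='post'>"
            + "<input type=\"hidden\" name=\"request_token\" value=\"abc123\">"
            + "<input type=\"text\" name=\"login\">"
            + "</form>";
        System.out.println(hiddenFields(page));
    }
}
```

If the map comes back empty for the real page, the token is probably rendered
some other way (script, image), and that is the hard case described above.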

Your goal is to find a way to drive their web site from code, by automated
means.  I would initially try constructing such a sequence outside of MCF
by writing a small Java test class built on HttpComponents HttpClient.  I
am happy to help you develop this by giving advice over the next couple of
days.
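
Before writing the real test class, it may help to replay the headers captured
in the browser and see which step sets which cookie.  A rough sketch, using
only the JDK's java.net.HttpCookie rather than HttpClient; the class name, the
step labels and the cookie names are all made up for illustration:

```java
import java.net.HttpCookie;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: replays captured response headers step by step and
// records which step set which cookie.
final class CookieTrace {
    private final Map<String, String> jar = new LinkedHashMap<>();
    private final List<String> log = new ArrayList<>();

    // step is a free-form label such as "GET /login".
    void record(String step, List<String> responseHeaderLines) {
        for (String line : responseHeaderLines) {
            // Only Set-Cookie lines matter here (11 = "Set-Cookie:".length()).
            if (!line.regionMatches(true, 0, "Set-Cookie:", 0, 11)) {
                continue;
            }
            for (HttpCookie c : HttpCookie.parse(line)) {
                jar.put(c.getName(), c.getValue());
                log.add(step + " set " + c.getName());
            }
        }
    }

    // Value to send in the Cookie request header on the next step.
    String cookieHeader() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : jar.entrySet()) {
            if (sb.length() > 0) sb.append("; ");
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    List<String> log() {
        return log;
    }

    public static void main(String[] args) {
        CookieTrace trace = new CookieTrace();
        trace.record("GET /login", List.of("Set-Cookie: sid=abc; Path=/"));
        trace.record("POST /login", List.of("Set-Cookie: auth=xyz; Path=/; HttpOnly"));
        System.out.println(trace.log());
        System.out.println(trace.cookieHeader());
    }
}
```

The sketch ignores cookie paths, domains and expiry, which a real client
would have to honor.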

Thanks,
Karl



On Tue, Apr 14, 2015 at 5:02 AM, Alessandro Benedetti <
benedetti.alex85@gmail.com> wrote:

> I have not been working on this during the last few days, as I was also
> waiting for some feedback from Box. Now I have an update on this:
>
> "Hi Alessandro,
>
> There isn't a good way to bypass this process, and not something that we
> support. I'd recommend going through the browser step once, and just
> maintain / renew your access/refresh tokens such that you won't have to
> access a browser to make API calls.
>
> Apologies if this causes any inconvenience. I'll close this case out, but
> let me know if there's anything else I may assist with.
>
> Regards,
>
> Audrey
> Box User Services"
>
> This complicates our situation. Let me try to get more information,
> because this is very bad news for our use case.
>
> Cheers
>
> 2015-04-09 13:11 GMT+01:00 Karl Wright <daddywri@gmail.com>:
>
> > Ok, I've looked briefly at this.
> >
> > I have a reference as well.  It might be good to compare and contrast:
> >
> > https://aaronparecki.com/articles/2012/07/29/1/oauth2-simplified
> >
> > But nevertheless, let me put down what I think the flow is:
> >
> > (1) You register ManifoldCF with Box and get back a client ID and client
> > secret.  Those are permanent.
> > (2) The next step is to get an authorization token.  This currently seems
> > to require interaction with a UI (at least, that's how it is described in
> > the oauth documentation you provided).  The authorization token is valid
> > for only 30 seconds.
> > (3) With the authorization token, you can talk to a Box API to get an
> > access token, which gives you access to the rest of the API.
> >
> >
> > Is this correct?
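
For step (3), the request body of a standard OAuth2 authorization-code
exchange could be built roughly as below.  The parameter names are the ones
from the generic OAuth2 flow; whether Box expects the client credentials in
the body is an assumption to check against their documentation, and the
endpoint URL is left out on purpose.  The class name is hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds an application/x-www-form-urlencoded body.
final class TokenRequestBody {
    static String formEncode(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) sb.append('&');
            sb.append(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8))
              .append('=')
              .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    // Body for exchanging the short-lived authorization code for tokens.
    // Assumption: client credentials are sent as form parameters.
    static String authorizationCodeGrant(String code, String clientId,
                                         String clientSecret) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("grant_type", "authorization_code");
        p.put("code", code);
        p.put("client_id", clientId);
        p.put("client_secret", clientSecret);
        return formEncode(p);
    }

    public static void main(String[] args) {
        // Placeholder values only.
        System.out.println(authorizationCodeGrant("CODE", "CLIENT_ID", "SECRET"));
    }
}
```

The body would then be POSTed to the provider's token endpoint within the
lifetime of the authorization token.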
> >
> > If it is correct, then as I understand it, what we want is a ManifoldCF
> > setup like this:
> >
> > - The connection stores: client ID, client secret, user name, and user
> > password.  These are all permanent parts of the configuration.
> > - The connector will need to be able to obtain an access token on demand,
> > given the above information, when it concludes that it doesn't have a
> > valid one already.
> > - Each connector instance will need to manage its own access token.  So if
> > there are 10 connections outstanding, there will be 10 independent access
> > tokens, each of which is obtained separately and expires separately.
> > That's the only way this connector is going to work properly across
> > cluster members etc.
> > - The process of obtaining the access token given all of the credentials
> > must be completely automated as part of the connector code.
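
The per-instance, on-demand token handling in the list above might look
something like this.  It is a sketch only: the class and its fetcher are
hypothetical, the fetcher stands for whatever automated process obtains a
token from the stored credentials, and nothing here is ManifoldCF API.

```java
import java.time.Instant;
import java.util.function.Supplier;

// Hypothetical holder: one instance per connector instance, each with its
// own token that is fetched lazily and replaced once it has expired.
final class OnDemandToken {
    static final class Token {
        final String value;
        final Instant expiresAt;

        Token(String value, Instant expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final Supplier<Token> fetcher; // obtains a fresh token
    private final Supplier<Instant> now;   // injectable clock, eases testing
    private Token current;

    OnDemandToken(Supplier<Token> fetcher, Supplier<Instant> now) {
        this.fetcher = fetcher;
        this.now = now;
    }

    // Returns a valid token, fetching a new one only when none is held or
    // the held one has reached its expiry time.
    synchronized String get() {
        if (current == null || !now.get().isBefore(current.expiresAt)) {
            current = fetcher.get();
        }
        return current.value;
    }

    public static void main(String[] args) {
        OnDemandToken token = new OnDemandToken(
            () -> new Token("example-token", Instant.now().plusSeconds(3600)),
            Instant::now);
        System.out.println(token.get());
    }
}
```

Because every instance keeps its own token, ten connections would make ten
independent fetches, which matches the behavior described in the list.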
> >
> > Since step (2) above seems to require UI interaction, which would make our
> > plan not work, we should figure out whether that's in fact the only way to
> > grant a user's permission.  My guess is that it is not; I'd put good money
> > on there being a programmatic way to do this.  Even if I am wrong about
> > that, with a little investigation of the UI interaction, I bet you can
> > find the URL the form posts to and figure out what you need to post there
> > to obtain the authorization token.  At the very worst, you can use a
> > technique similar to how the Web connector submits forms to fake out the
> > Box UI.  I can certainly help you with that; the HTML parser code is in
> > common and is available for all connectors to use.
> >
> > Thoughts?
> > Karl
> >
> >
> >
> >
> >
> > On Thu, Apr 9, 2015 at 7:31 AM, Alessandro Benedetti <
> > abenedetti@apache.org>
> > wrote:
> >
> > > Of course Karl!
> > > This is the problem:
> > > We are developing a Repository Connector for Box, similar to the Dropbox
> > > one.
> > > Authentication in Box is based on OAuth2.
> > > In detail, after a process to grant access to your application, you get 2
> > > parameters for your Repository Connector:
> > > Access Token and Refresh Token [1]
> > >
> > > To instantiate a BoxAPIConnection you need a Client_id and a Client_secret
> > > (2, not mutable) and an Access Token and a Refresh Token (2, mutable).
> > > The Access Token expires in 1 hour. The Refresh Token can be used to get
> > > a new Access Token; when this happens, a new Access Token is produced
> > > (valid for 1 hour), a new Refresh Token is created, and the old Refresh
> > > Token is invalidated.
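
The rotation described above, where each refresh invalidates the old Refresh
Token, suggests saving the new pair as soon as it is issued.  A minimal
sketch, assuming a hypothetical refresh call and a hypothetical persistence
hook; neither is Box SDK nor ManifoldCF API:

```java
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical holder for an access/refresh token pair that rotates.
final class RotatingTokens {
    static final class Pair {
        final String access;
        final String refresh;

        Pair(String access, String refresh) {
            this.access = access;
            this.refresh = refresh;
        }
    }

    private final Function<String, Pair> refreshCall; // refresh token -> new pair
    private final Consumer<Pair> persist;             // e.g. write to a store
    private Pair current;

    RotatingTokens(Pair initial, Function<String, Pair> refreshCall,
                   Consumer<Pair> persist) {
        this.current = initial;
        this.refreshCall = refreshCall;
        this.persist = persist;
    }

    synchronized String accessToken() {
        return current.access;
    }

    // Trades the current refresh token for a new pair.  The new pair is
    // persisted before it is used, because the old refresh token is no
    // longer valid once the call has succeeded.
    synchronized String refresh() {
        Pair next = refreshCall.apply(current.refresh);
        persist.accept(next);
        current = next;
        return next.access;
    }

    public static void main(String[] args) {
        RotatingTokens tokens = new RotatingTokens(
            new Pair("access-0", "refresh-0"),
            old -> new Pair("access-1", "refresh-1"),   // stand-in for the real call
            p -> System.out.println("saved " + p.refresh));
        System.out.println(tokens.refresh());
    }
}
```

The synchronized methods only protect one JVM; coordinating the saved pair
across threads and cluster members is a separate problem.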
> > >
> > > Assuming the BoxAPIConnection object manages the refresh properly,
> > > the Job will work as long as the BoxAPIConnection is alive.
> > > When a Job finishes (or ManifoldCF stops and restarts), a new Job will
> > > start with the old configured Access Token and Refresh Token (which are
> > > no longer valid).
> > >
> > > Unfortunately we cannot configure the connector with only the 2 immutable
> > > params, because user interaction is required to produce the tokens, so we
> > > need to configure all 4 values.
> > > We can consider the Access Token and the Refresh Token to be produced by a
> > > human user or an external application and sent to ManifoldCF.
> > > With the current approach, ManifoldCF should be able to update the values
> > > it holds so that they stay consistent with the updated values in
> > > BoxAPIConnection.
> > >
> > > A bigger problem comes when both a Repository Connector and an Authority
> > > Connector are in place, but for this more complicated scenario I will wait
> > > until I have a clear picture from Box itself regarding their approach.
> > >
> > > [1] https://developers.box.com/oauth/
> > >
> > >
> > >
> > > 2015-04-09 11:53 GMT+01:00 Karl Wright <daddywri@gmail.com>:
> > >
> > > > Hi Alessandro,
> > > >
> > > > It would be great if you could describe the customer problem from a bit
> > > > higher level, to see if there's a better design we could come up with.
> > > > What you have described is quite difficult to do with MCF due to the
> > > > multi-threaded and highly-cached nature of it.
> > > >
> > > > Thanks,
> > > > Karl
> > > >
> > > >
> > > > On Thu, Apr 9, 2015 at 5:55 AM, Alessandro Benedetti <
> > > > abenedetti@apache.org>
> > > > wrote:
> > > >
> > > > > Hi guys,
> > > > > I have one question :
> > > > > *ManifoldCF Version* : 1.8
> > > > >
> > > > > While developing a custom Repository Connector, I need to update
> > > > > the Repository Connector config based on a custom Listener for the
> > > > > events of a custom Publisher.
> > > > >
> > > > > This listener will react to the publisher events during a Job
> > > > > execution (i.e. this can happen during addSeeds or
> > > > > processDocuments).
> > > > > The listener will need to change the repository config accordingly
> > > > > and save it in the database.
> > > > > The main reason for this is that we need to store the status of the
> > > > > publisher in the DB, because a new Job will need to use the updated
> > > > > Repository Connector config (changed by other jobs).
> > > > > To simplify the problem, let's assume we do not have concurrency
> > > > > problems right now.
> > > > > In the future we will need to implement a solution that is thread
> > > > > safe.
> > > > >
> > > > > Cheers
> > > > >
> > > > >
> > > > > --
> > > > > --------------------------
> > > > >
> > > > > Benedetti Alessandro
> > > > > Visiting card : http://about.me/alessandro_benedetti
> > > > >
> > > > > "Tyger, tyger burning bright
> > > > > In the forests of the night,
> > > > > What immortal hand or eye
> > > > > Could frame thy fearful symmetry?"
> > > > >
> > > > > William Blake - Songs of Experience -1794 England
> > > > >
> > > >
> > >
> > >
> > >
> > >
> >
>
>
>
>
