incubator-allura-dev mailing list archives

From Dave Brondsema <d...@brondsema.net>
Subject Re: developing a bulk export / backup feature
Date Fri, 21 Jun 2013 15:27:31 GMT
I found an nginx module for custom authentication that we could play around with
to see if it works:
http://mdounin.ru/hg/ngx_http_auth_request_module/file/a29d74804ff1/README
I'm sure Apache has similar modules too.
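
To make that concrete, here is a rough sketch of the kind of endpoint the module
could call. As I read the README, nginx issues a subrequest and allows the
original request on a 2xx response, and denies it on 401 or 403. Everything else
here is an assumption for illustration: the ADMIN_SESSIONS lookup, the "session"
cookie name, the /exports/<project>/ URL layout, and the X-Original-URI header
(which nginx would have to be configured to pass along).

```python
from http.cookies import SimpleCookie

# Hypothetical stand-in for a real session lookup: session id -> set of
# project names that user administers.
ADMIN_SESSIONS = {"abc123": {"myproject"}}


def auth_app(environ, start_response):
    """WSGI app answering an auth subrequest: 2xx allows, 401/403 denies."""
    cookie = SimpleCookie(environ.get("HTTP_COOKIE", ""))
    session = cookie["session"].value if "session" in cookie else None

    # Assumption: nginx forwards the originally requested URI in this header.
    uri = environ.get("HTTP_X_ORIGINAL_URI", "")
    parts = uri.strip("/").split("/")
    # Assumed layout: /exports/<project>/<file>
    project = parts[1] if len(parts) >= 2 and parts[0] == "exports" else None

    if session not in ADMIN_SESSIONS:
        status = "401 Unauthorized"   # not logged in
    elif project in ADMIN_SESSIONS[session]:
        status = "204 No Content"     # admin of this project: allow
    else:
        status = "403 Forbidden"      # logged in, but not an admin here
    start_response(status, [("Content-Length", "0")])
    return [b""]
```

The file itself would still be served by nginx; the app stack only answers the
small yes/no question, which should sidestep the long-request timeout.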

If all we do is send an email, and don't show the status anywhere on the admin
page, a very long-running backup could cause the admin to think it got stuck
or died, and thus request another backup.  I suppose we should have
anti-dogpiling logic to avoid that.
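
A minimal sketch of the anti-dogpiling idea, just to show the shape of it. The
ExportTracker class and its method names are made up for illustration, and it
only works within a single process; a real version would need shared state
(e.g. an atomic update in the database) since tasks run in separate workers.

```python
import threading


class ExportTracker:
    """Tracks in-progress exports so a duplicate request can be refused."""

    def __init__(self):
        self._lock = threading.Lock()
        self._busy = set()

    def try_start(self, project_id):
        """Return True and mark the project busy, or False if already busy."""
        with self._lock:
            if project_id in self._busy:
                return False
            self._busy.add(project_id)
            return True

    def finish(self, project_id):
        """Mark the export done (call this on success and on failure)."""
        with self._lock:
            self._busy.discard(project_id)
```

The admin form would call try_start() before queuing the task and show an
"export already in progress" message when it returns False.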

On 6/21/13 10:43 AM, Cory Johns wrote:
> Have we even tested serving large files through the app stack?  I strongly
> suspect they'd hit the long-request timeout.  I know I've hit it before
> when testing uploading large-ish attachments.
> 
> And on the subject of attachments, the API end-points already (or will with
> the next push) include attachment metadata, including the URL to download
> them from.  I definitely think that's good enough for now, as the admin can
> parse the URLs out and download them, if needed.  If that proves to be too
> onerous for doing project exports, then we can address it at that time.
> 
> Going back to serving up the exports, is there any way we could serve them
> outside of the app stack but still with authentication?  Such as a
> standalone, light-weight service that just serves files with authentication
> (could be useful for the screenshots and icons for private projects), or
> via authenticated SFTP?  This is verging on an infrastructure question at
> this point.  I definitely agree that we should have some auth in front
> of it, though it's not going to be easy.
> 
> 
> On Tue, Jun 18, 2013 at 10:26 AM, Dave Brondsema <dave@brondsema.net> wrote:
> 
>> For us at SourceForge, we have a need to build a feature that lets project
>> admins download a backup/export of all their project data.  Since this is a
>> pretty big feature, I wanted to propose here how we might do it and get
>> feedback & ideas before we proceed.
>>
>> Add a bulk_export() method to Application which would be responsible for
>> generating json for all the artifacts in the tool.  The format should match
>> the API format for artifacts so that we're consistent.  Thus any tool that
>> implements bulk_export() would typically loop through all the artifacts for
>> this instance (matching app_config_id) and convert to json the same way the
>> API json is generated (e.g. call the __json__ method or RestController
>> method; some refactoring might be needed).  Multiple types of
>> artifacts/objects could be listed out in groups, e.g. Tracker app could have
>> a list of tickets, list of saved search bins, list of milestones, and the
>> tracker config data.  Discussion threads would need to be included too,
>> ideally inline with the artifact they go with.  No permission checks would be
>> done since this export would only be available to admins (makes it faster &
>> simpler).
>>
>> Provide a page on the Admin sidebar to generate a bulk export.  Project
>> admins could choose individual tool instances, or all tools in the project
>> (that support it).  That form would kick off a background task which goes
>> through the selected tools and runs their bulk_export() methods.  Save each
>> tool's data as mount_point.json and zip them all together.
>>
>> It'd be easiest to store & deliver the zip files similarly to the code
>> snapshots (static files not served through allura), but that won't be
>> secure.  We'll need to either serve it through allura with authentication,
>> or maybe name the zip file with a random name that can't be guessed (and
>> then serve it directly through apache or nginx).  Other ideas?
>>
>> When the task is complete, notify the user.  What way is best?  Send an
>> email?  Probably would be good to show a listing of available completed
>> extracts on the extract page, so if any older ones are still sitting around
>> they can be retrieved (would be up to server admins to have a cron to delete
>> old files).
>>
>> We could make this something that can be triggered automatically via the
>> API and check status through the API, but that seems like a good thing to
>> add on later.
>>
>> Should we include attachments?  These would be important in some cases but
>> not in others.  It could also increase the export size immensely in some
>> cases.  Maybe leave out for now, and add in later when needed, possibly as
>> an option.
>>
>> Further thoughts on implementation details:
>>
>> So that a giant json string doesn't have to be held in memory for each
>> tool, the export task should open a file handle for mount_point.json and
>> call bulk_export() with that open file handle, and each App can append to
>> its file incrementally.
>>
>> If mongo performance is slow, some refactoring may be needed to avoid lots
>> of individual mongo calls and be more batch oriented.  We can see how it
>> goes.
>>
>> Could parallelize bulk_export() later, to do multiple tools at once.
>>
>>
>> Sound reasonable?  Any suggestions or other ideas?
>>
>>
>> --
>> Dave Brondsema : dave@brondsema.net
>> http://www.brondsema.net : personal
>> http://www.splike.com : programming
>>               <><
>>
> 
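
For reference, the file-handle approach from the quoted proposal could be
sketched roughly like this. It is only an illustration under assumptions:
DemoTrackerApp and export_project() are invented names, the "tickets" key is
not a settled format, and a real tool would iterate over its own artifacts
rather than a hard-coded generator. The zip gets a random name along the lines
of the "name that can't be guessed" option.

```python
import json
import os
import secrets
import tempfile
import zipfile


class DemoTrackerApp:
    """Hypothetical tool; a real one would query its artifacts lazily."""
    mount_point = "tickets"

    def _tickets(self):
        for n in range(1, 4):
            yield {"ticket_num": n, "summary": "ticket %d" % n}

    def bulk_export(self, f):
        # Write one artifact at a time so the full JSON never sits in memory.
        f.write('{"tickets": [')
        for i, ticket in enumerate(self._tickets()):
            if i:
                f.write(", ")
            json.dump(ticket, f)
        f.write("]}")


def export_project(apps, out_dir):
    """Run each tool's bulk_export() into mount_point.json, then zip them."""
    # Random archive name, so the URL is hard to guess if served statically.
    zip_name = "export-%s.zip" % secrets.token_urlsafe(16)
    zip_path = os.path.join(out_dir, zip_name)
    with tempfile.TemporaryDirectory() as work:
        with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
            for app in apps:
                name = app.mount_point + ".json"
                path = os.path.join(work, name)
                with open(path, "w", encoding="utf-8") as f:
                    app.bulk_export(f)
                zf.write(path, arcname=name)
    return zip_path
```

Calling export_project([DemoTrackerApp()], "/some/dir") would produce a zip
containing tickets.json; the background task would do the same with the tool
instances the admin selected.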



-- 
Dave Brondsema : dave@brondsema.net
http://www.brondsema.net : personal
http://www.splike.com : programming
              <><
