incubator-allura-dev mailing list archives

From "Dave Brondsema" <brond...@users.sf.net>
Subject [allura:tickets] #6464 Create tracker importer for Google Code using CSV and scraping
Date Wed, 07 Aug 2013 19:56:48 GMT
Failure against https://code.google.com/p/google-code-feed-gadget/issues/detail?id=1 and http://code.google.com/p/modwsgi/issues/detail?id=11

~~~~
  File "/home/dbrondsema/dbrondsema-1019/forge/ForgeImporters/forgeimporters/google/tracker.py", line 53, in import_tool
    self.process_fields(ticket, issue)
  File "/home/dbrondsema/dbrondsema-1019/forge/ForgeImporters/forgeimporters/google/tracker.py", line 82, in process_fields
    owner=issue.get_issue_owner(),
  File "/home/dbrondsema/dbrondsema-1019/forge/ForgeImporters/forgeimporters/google/__init__.py", line 166, in get_issue_owner
    return UserLink(self.page.find(id='issuemeta').find('th', text=re.compile('Owner:')).findNext().a)
  File "/home/dbrondsema/dbrondsema-1019/forge/ForgeImporters/forgeimporters/google/__init__.py", line 185, in __init__
    self.name = tag.string.strip()
AttributeError: 'NoneType' object has no attribute 'string'
~~~~
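
The traceback suggests that `.a` came back as `None`, i.e. the Owner cell had no link in it (presumably an issue without an owner). A minimal sketch of a guard, assuming that is the cause. `FakeTag` and `user_name_or_none` are hypothetical stand-ins for illustration, not the importer's actual code:

```python
class FakeTag(object):
    """Hypothetical stand-in for a parsed <a> tag; only `.string` is modelled."""

    def __init__(self, string):
        self.string = string


def user_name_or_none(tag):
    """Return the stripped link text, or None when there is no link or no text."""
    if tag is None or tag.string is None:
        return None
    return tag.string.strip()
```

With something like this, `get_issue_owner()` could return no owner instead of raising when the cell is empty.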

Would we want to convert # of stars to # of upvotes?

Fields for type, priority, opsys, component (more possible?) should be added as custom fields
and converted.

Need to use `skip_mod_date` (grep for examples) to preserve the mod_date you set.

Need to disable notifications.  googlecodewikiimporter does this already, and for the Trac
importer I suggested looking at a way to make it happen for all importers.

Need to call g.post_event('project_updated')

Everything is done as the current user.  Would it be better to do it as *anonymous?  That's
what some of our other importers do.

Since GC tickets and comments are plain text, whitespace is significant and should be preserved.
Special markdown chars also need to be escaped, e.g. http://code.google.com/p/modwsgi/issues/detail?id=1
and http://code.google.com/p/modwsgi/issues/detail?id=4#c5.  To do so, use `forgeblog.command.rssfeeds.plain2markdown()`.
That needs html2text, which is GPL'd, so make sure you handle the lack of html2text gracefully.
(And if you refactor `plain2markdown` to a more generic place, make sure you update SF's
forge-classic code reference to it.)
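
A rough sketch of what "handle the lack of html2text gracefully" could look like: guard the import, and fall back to a simple escaper when the package is missing. `escape_markdown` and the character set are assumptions for illustration. This is not what `plain2markdown` does internally, and it does not deal with whitespace preservation:

```python
import re

try:
    import html2text  # GPL'd, so it may legitimately be absent
except ImportError:
    html2text = None

HAVE_HTML2TEXT = html2text is not None

# Characters Markdown may treat as formatting (illustrative, not exhaustive).
MD_SPECIAL = re.compile(r'([\\`*_\[\]<>#])')


def escape_markdown(text):
    """Fallback escaper: backslash-escape Markdown special characters."""
    return MD_SPECIAL.sub(r'\\\1', text)
```

The point is only that the importer keeps working, with cruder escaping, when `HAVE_HTML2TEXT` is false.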

Comments aren't posted on the Allura ticket in sequential order.  They seem random.
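
One possible approach, assuming each scraped comment carries its Google Code anchor (like the `c5` in the URL above): sort by the numeric part before posting. The helper names and the `(anchor, text)` pair shape are hypothetical, and this does not diagnose why the order is currently random:

```python
import re


def comment_number(anchor):
    """Extract N from an anchor such as 'c5'; anchors without a number sort first."""
    match = re.search(r'(\d+)$', anchor)
    return int(match.group(1)) if match else -1


def in_posting_order(comments):
    """Return (anchor, text) pairs sorted by comment number."""
    return sorted(comments, key=lambda pair: comment_number(pair[0]))
```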

Attachment on a comment didn't get imported (from modwsgi #1)


---

**[tickets:#6464] Create tracker importer for Google Code using CSV and scraping**

**Status:** in-progress
**Labels:** import google-code 
**Created:** Mon Jul 15, 2013 05:21 PM UTC by Cory Johns
**Last Updated:** Wed Aug 07, 2013 07:54 PM UTC
**Owner:** Cory Johns

Since the Google Data API for Issues is deprecated and was scheduled to be shut down already
(June 14th, 2013), we'll need to create an implementation using the CSV list and scraping
to ensure that the Google Code importer continues to work.

The importer should follow the framework discussed on the [mailing list](http://mail-archives.apache.org/mod_mbox/incubator-allura-dev/201307.mbox/%3CCAEMb8zUg7Kem2aDxVzAqF3U4aKEj7jL3UO=UpX=2+NfY_P8kXQ@mail.gmail.com%3E)
and integrate with the project importer from [#6456].

The list of tickets and their metadata can be retrieved via the CSV export list, e.g.,
https://code.google.com/p/modwsgi/issues/csv but the ticket body and comments will need to
be scraped from the web interface.  The description and comments can be retrieved from, e.g.,
https://code.google.com/p/modwsgi/issues/detail?id=22 by iterating over the items with `id="hc\d+"`
or `class="issuedescription|issuecomment"`.
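
The two steps above can be sketched with only the standard library (Python 3). The sample CSV and HTML are made up for illustration, the `ID` column name is an assumption about the export, and `list_issue_ids`, `HcCollector` and `scrape_items` are hypothetical names, not part of the importer:

```python
import csv
import io
import re
from html.parser import HTMLParser

# Made-up samples; the real export and page markup may differ.
SAMPLE_CSV = 'ID,Summary\n22,Example summary\n'
SAMPLE_HTML = (
    '<div class="issuedescription" id="hc0"><pre>Body text</pre></div>'
    '<div class="issuecomment" id="hc1"><pre>First comment</pre></div>'
)

HC_ID = re.compile(r'^hc\d+$')
VOID_TAGS = {'br', 'hr', 'img', 'input', 'link', 'meta'}


def list_issue_ids(csv_text):
    """Read issue ids from the CSV listing (assumes an 'ID' column)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [int(row['ID']) for row in reader]


class HcCollector(HTMLParser):
    """Collect the text inside elements whose id looks like hc0, hc1, ..."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.items = []       # (element id, text) pairs in document order
        self._current = None  # id of the element being collected, if any
        self._depth = 0       # nesting depth inside that element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return  # void tags have no end tag, so they must not change depth
        if self._current is not None:
            self._depth += 1
            return
        element_id = dict(attrs).get('id') or ''
        if HC_ID.match(element_id):
            self._current = element_id
            self._depth = 1
            self._chunks = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self._current is None:
            return
        self._depth -= 1
        if self._depth == 0:
            self.items.append((self._current, ''.join(self._chunks).strip()))
            self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self._chunks.append(data)


def scrape_items(html):
    """Return (id, text) pairs for the description and comments on a page."""
    parser = HcCollector()
    parser.feed(html)
    parser.close()
    return parser.items
```

In the importer, the ids from the CSV would drive one `detail?id=N` fetch per issue, with the fetched page passed to something like `scrape_items`.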

The description and comments on issues don't support wiki syntax or HTML, so we can just convert
them to text.  User mapping will have the same issues, so whatever we end up doing in [#6461]
will apply here.


---

Sent from sourceforge.net because allura-dev@incubator.apache.org is subscribed to https://sourceforge.net/p/allura/tickets/

To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/allura/admin/tickets/options.
 Or, if this is a mailing list, you can unsubscribe from the mailing list.