Changes between Version 2 and Version 3 of OpenData


Timestamp:
Sep 7, 2011, 11:44:04 AM (13 years ago)
Author:
Jacob
Comment:

--

  • OpenData

    v2 v3  
    1010.. contents:: What's available:
    1111
    12 Please take it, mash it up, and let show us the results!
     12Please take it, mash it up, and show us the results!
    1313
    1414If there's other data you'd like to see, please get in touch (``jacob -at- jacobian.org``) and let me know what you'd like to see. I'll do my best!
     
    1717===============
    1818
    19 Data dumps out of Trac, our ticket tracking software.
     19Data dumps out of Trac, our ticket tracking software. You could use this to get
     20information about our ticket workflow, patches, etc.
    2021
    2122There are two ways to access the data: `Trac's RPC interface`_ and the `daily
     
    2627
    2728These are direct data dumps of the Trac database, collected nightly, in various
    28 formats. They're sanitized to remove some tables with senstive info (session
     29formats. They're sanitized to remove some tables with sensitive info (session
    2930data, etc.) but are otherwise complete.
    3031
     
    8182.. _xmlrpclib: http://docs.python.org/library/xmlrpclib.html
    8283
     84Repository data/dumps
     85=====================
     86
     87Data and dumps from our source control repository. You could use this to mine
     88information about who's committing code, when, etc.
     89
     90There are a few ways of accessing this data: `Querying the SVN repo`_,
     91`the GitHub API`_, and `SVN data dumps`_ in a variety of formats.
     92
     93Querying the SVN repo
     94---------------------
     95
     96Django's SVN repository is at http://code.djangoproject.com/svn/django/;
     97you can use the ``svn`` client binary to interact with this as a sort of "API".
     98In particular, most ``svn`` commands take a ``--xml`` argument to return data
     99in XML. For example, to get information about a particular commit you might
     100do something like::
     101
     102    $ svn log http://code.djangoproject.com/svn/django/trunk -r1234 --xml
     103    <?xml version="1.0"?>
     104    <log>
     105    <logentry
     106       revision="1234">
     107    <author>jacob</author>
     108    <date>2005-11-14T18:50:13.298556Z</date>
     109    <msg>Added NOINDEX tag to debug 500 page (for robots)</msg>
     110    </logentry>
     111    </log>
     112
     113There are also a number of libraries in Python (and other languages) that can
     114access SVN directly. `pysvn`_ seems to be a popular choice.
     115
     116.. _pysvn: http://pysvn.tigris.org/
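
If you'd rather not pull in an SVN library, the XML shown above is easy to pick apart with Python's standard library. Here's a minimal sketch, assuming you've already captured the output of ``svn log --xml`` as a string; the ``parse_svn_log`` helper and ``SAMPLE`` constant are made-up names for illustration, not part of any Django or SVN tooling:

```python
import xml.etree.ElementTree as ET

# Sample copied from the `svn log ... --xml` output shown above.
SAMPLE = """<?xml version="1.0"?>
<log>
<logentry
   revision="1234">
<author>jacob</author>
<date>2005-11-14T18:50:13.298556Z</date>
<msg>Added NOINDEX tag to debug 500 page (for robots)</msg>
</logentry>
</log>
"""


def parse_svn_log(xml_text):
    """Return one dict per <logentry> element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    entries = []
    for entry in root.iter("logentry"):
        entries.append({
            "revision": int(entry.get("revision")),
            # findtext() returns None if the child element is missing.
            "author": entry.findtext("author"),
            "date": entry.findtext("date"),
            "msg": entry.findtext("msg"),
        })
    return entries


for commit in parse_svn_log(SAMPLE):
    print(commit["revision"], commit["author"], commit["msg"])
```

To run this against live data you could feed it the output of the ``svn`` command itself, for example by capturing it with ``subprocess``.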
     117
     118The GitHub API
     119--------------
     120
     121Django's repository is mirrored onto GitHub (http://github.com/django/django),
     122which means you can use `GitHub's API`_ to pull commit data. For example::
     123
     124    $ curl -i https://api.github.com/repos/django/django/git/commits/a0d59b49019d65b38c5612eb0b4fab0bb37271ae
     125    HTTP/1.1 200 OK
     126    Server: nginx/1.0.4
     127    Date: Wed, 07 Sep 2011 16:38:12 GMT
     128    Content-Type: application/json
     129    Connection: keep-alive
     130    Status: 200 OK
     131    X-RateLimit-Limit: 5000
     132    X-RateLimit-Remaining: 4994
     133    Content-Length: 995
     134   
     135    {
     136      "parents": [
     137        {
     138          "url": "https://api.github.com/repos/django/django/git/commits/6465e005fd564bd75ba64f2f09d5824ed2455c9c",
     139          "sha": "6465e005fd564bd75ba64f2f09d5824ed2455c9c"
     140        }
     141      ],
     142      "committer": {
     143        "date": "2005-11-14T10:50:13-08:00",
     144        "name": "jacob",
     145        "email": "jacob@bcc190cf-cafb-0310-a4f2-bffc1f526a37"
     146      },
     147      "author": {
     148        "date": "2005-11-14T10:50:13-08:00",
     149        "name": "jacob",
     150        "email": "jacob@bcc190cf-cafb-0310-a4f2-bffc1f526a37"
     151      },
     152      "message": "Added NOINDEX tag to debug 500 page (for robots)\n\ngit-svn-id: http://code.djangoproject.com/svn/django/trunk@1234 bcc190cf-cafb-0310-a4f2-bffc1f526a37\n",
     153      "url": "https://api.github.com/repos/django/django/git/commits/a0d59b49019d65b38c5612eb0b4fab0bb37271ae",
     154      "sha": "a0d59b49019d65b38c5612eb0b4fab0bb37271ae",
     155      "tree": {
     156        "url": "https://api.github.com/repos/django/django/git/trees/a5d296a396f5bbf70d074ce09fa947f95cd91523",
     157        "sha": "a5d296a396f5bbf70d074ce09fa947f95cd91523"
     158      }
     159    }
     160
     161.. _github's api: http://developer.github.com/v3/
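
Notice that the commit message in the response above carries a ``git-svn-id`` line, which ties the Git commit back to its SVN revision. Here's a rough sketch of pulling that revision number out of a commit object with Python's standard library. The ``svn_revision`` helper is a made-up name, the sample is trimmed down from the response above, and the regular expression simply assumes the ``git-svn-id: <url>@<revision> <uuid>`` layout seen there:

```python
import json
import re

# Assumes the "git-svn-id: <url>@<revision> <uuid>" layout shown above.
SVN_ID_RE = re.compile(r"git-svn-id: \S+@(\d+) ")

# Trimmed-down copy of the API response shown above.
SAMPLE = json.loads(r'''
{
  "sha": "a0d59b49019d65b38c5612eb0b4fab0bb37271ae",
  "author": {
    "date": "2005-11-14T10:50:13-08:00",
    "name": "jacob"
  },
  "message": "Added NOINDEX tag to debug 500 page (for robots)\n\ngit-svn-id: http://code.djangoproject.com/svn/django/trunk@1234 bcc190cf-cafb-0310-a4f2-bffc1f526a37\n"
}
''')


def svn_revision(commit):
    """Return the SVN revision recorded in a commit message, or None."""
    match = SVN_ID_RE.search(commit["message"])
    return int(match.group(1)) if match else None


print(SAMPLE["sha"][:7], "is SVN revision", svn_revision(SAMPLE))
```

Commits that never passed through ``git-svn`` won't have this line, which is why the helper falls back to ``None``.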
     162
     163SVN data dumps
     164--------------
     165
     166Finally, for convenience, we provide a couple of full dumps of repository data
     167for off-line processing:
     168
     169    * `Complete SVN log`_ (bzipped XML; ~1 MB). This is the complete output of
     170      ``svn log --xml``.
     171
     172    * `Full SVN dump`_ (bzipped SVN dump; ~200 MB, expands to ~1.8 GB). This
     173      is the result of a ``svnadmin dump``.
     174
     175Each dump is updated nightly.
     176
     177.. _complete svn log: https://www.djangoproject.com/m/data/django-svn-log.xml.bz2
     178.. _full svn dump: https://www.djangoproject.com/m/data/django-svn.svndump.bz2
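
As an example of off-line processing, here's a minimal sketch that counts commits per author from the bzipped SVN log, reading it as a stream so the whole file never has to sit in memory. It assumes the file has the same ``<log>``/``<logentry>`` structure as the ``svn log --xml`` output shown earlier; ``commits_per_author`` is a made-up helper name:

```python
import bz2
from collections import Counter
import xml.etree.ElementTree as ET


def commits_per_author(fileobj):
    """Count <logentry> elements per author in an `svn log --xml` stream."""
    counts = Counter()
    # iterparse() yields each element as soon as its closing tag is read.
    for _event, elem in ET.iterparse(fileobj, events=("end",)):
        if elem.tag == "logentry":
            # Some entries may lack an <author>; bucket them together.
            counts[elem.findtext("author") or "(no author)"] += 1
            # Drop the entry's children to keep memory use down.
            elem.clear()
    return counts
```

Assuming you've downloaded the log to the current directory, you could open it with ``bz2.open("django-svn-log.xml.bz2", "rb")``, pass the resulting file object to ``commits_per_author``, and call ``most_common(10)`` on the result to see the ten most active committers.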
     179
    83180Mashups
    84181=======