To avoid low rate limits for anonymous API access, we should use an oauth app. http://developer.github.com/v3/#rate-limiting
As best I can tell, https://pypi.python.org/pypi/requests-oauthlib is the best OAuth v2 library to use. (The "oauth2" library we already use, despite its name, only supports OAuth v1.) Its license is BSD/MIT style, it is based on the very good 'requests' library, it has good docs, and it has an active git repo.
I am not super familiar with OAuth v2 and GitHub's setup, but based on what I know, here's how I think it should work. Each Allura instance (e.g. your development host, SourceForge, etc.) will need to set up its own GitHub OAuth App. Those keys can then be placed in the ini file. Our GitHub importer code will then do the OAuth flow to authorize the user requesting an import. No scope is necessary since we're just doing public read-only fetching. We should store the appropriate user tokens (via user.set_tool_data) so that they are available for the background task and can be reused if the user wants to run another import.
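A rough sketch of the first half of that flow, assuming GitHub's web application flow and its authorize endpoint at https://github.com/login/oauth/authorize. It uses only the standard library to stay self-contained; requests-oauthlib would wrap these same steps. The function names, the example client id, and the callback URL are hypothetical, not existing Allura code.

```python
import secrets
from urllib.parse import urlencode, urlparse, parse_qs

# GitHub's authorization endpoint for the web application flow.
AUTHORIZE_URL = 'https://github.com/login/oauth/authorize'


def build_authorize_url(client_id, redirect_uri, state=None):
    """Return (url, state) for redirecting the user to GitHub.

    client_id would come from the ini file (hypothetical wiring).
    No 'scope' parameter is sent, since only public read-only
    access is needed.
    """
    state = state or secrets.token_urlsafe(16)
    params = {
        'client_id': client_id,
        'redirect_uri': redirect_uri,
        'state': state,
    }
    return AUTHORIZE_URL + '?' + urlencode(params), state


def check_callback(query_string, expected_state):
    """Validate the callback query string and return the 'code' value.

    Raises ValueError if the state does not match the one we issued,
    which guards against forged callbacks.
    """
    params = parse_qs(query_string)
    if params.get('state', [None])[0] != expected_state:
        raise ValueError('state mismatch in OAuth callback')
    return params['code'][0]
```

The returned code would then be exchanged for an access token, and that token is what would be stored with user.set_tool_data for later imports.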
This should all go through a shared mechanism (e.g. override the base ProjectExtractor.urlopen in GitHubProjectExtractor) so that it's used for all GitHub-related API access. This code should also check the rate limit values and, when it reaches the limit, log a warning and sleep for the amount of time needed until the limit resets.
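The rate limit check could look something like the sketch below. It assumes the X-RateLimit-Remaining and X-RateLimit-Reset response headers that GitHub's API returns (the reset value is a UTC epoch timestamp in seconds). The helper names are hypothetical, and the `sleep` and `now` parameters exist only to make the logic testable; in the importer this would be called from the overridden urlopen.

```python
import logging
import time

log = logging.getLogger(__name__)


def seconds_until_reset(headers, now=None):
    """Return how long to wait before the next request, in seconds.

    Returns 0 while requests remain, or if the reset time has passed.
    """
    now = time.time() if now is None else now
    remaining = int(headers.get('X-RateLimit-Remaining', 1))
    if remaining > 0:
        return 0
    reset = int(headers.get('X-RateLimit-Reset', now))
    return max(0, reset - now)


def wait_for_rate_limit(headers, sleep=time.sleep, now=None):
    """Log a warning and sleep if the rate limit has been reached."""
    delay = seconds_until_reset(headers, now)
    if delay > 0:
        log.warning('Rate limit reached, sleeping for %s seconds', delay)
        sleep(delay)
    return delay
```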
Of course, we can modify this as needed if my understanding of github oauth isn't correct.
Created #437: [#6656] Github oauth application (4cp)
Related Tickets: #6656
The workflow you described is good and looks like GitHub's web application flow: http://developer.github.com/v3/oauth/
But implementing it will require additional interaction with the user, such as redirecting to GitHub to authenticate and allow access when the user opens the import page and has no OAuth token yet. Is that okay with you?
I personally prefer the above, but there are also a couple of options for server-to-server auth that don't require interaction with the user:
(1) is very simple to implement and increases the rate limit, but the downside is that the rate limit applies to the entire app rather than per user, as it does with tokens.
(2) requires basic auth, so we would either have to ask the user for their GitHub credentials, which is bad, or create a GitHub user for the Allura instance and put its credentials in the ini file, which also seems pretty bad.
Yep, I think having the user authorize our app on GitHub is the right way to do it, so we don't have to deal with the limitations you describe in the alternate options.
Closed #437.
je/42cc_6656
redirect_uri passed from authorization calls to it.
Ini options:
If the app isn't set up (no app keys in the config), it will still work, but it will use unauthenticated, low-rate-limit access.
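The fallback might be sketched as follows. This is an illustration only: the function name is hypothetical, the app keys are passed in as plain arguments rather than read from any particular ini option, and it assumes the token is sent in an `Authorization: token ...` header, which GitHub's API accepts.

```python
def github_request_headers(client_id, client_secret, token):
    """Return extra HTTP headers for a GitHub API request.

    If the OAuth app is not configured (missing keys) or the user has
    no stored token, return no headers so the request goes out
    unauthenticated and is subject to the lower rate limit.
    """
    app_configured = bool(client_id and client_secret)
    if app_configured and token:
        return {'Authorization': 'token ' + token}
    return {}
```

With this shape the calling code does not need a separate branch for unconfigured instances; it always merges the returned headers into the request.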