Per [#1861], chunk size will be 1. This provides best protection against protocol request limits and timeouts, and appears to streamline import too (by properly scheduling background tasks).
$ time python allura_import_test.py data/sf100.json -u https://sf-psokolovsky-3025.sb.sf.net -a tckff494ed3e68e8dafdf09 -s bc5799e658b48433ee0461da88921c9c62f3871cc6a2fa7ab612228a36fc4e062bf524bc88d7af00
real 4m55.725s
$time python allura_import_test.py data/sf1000.json -u https://sf-psokolovsky-3025.sb.sf.net -a tckff494ed3e68e8dafdf09 -s bc5799e658b48433ee0461da88921c9c62f3871cc6a2fa7ab612228a36fc4e062bf524bc88d7af00
real 51m29.992s
So, when run against sandbox, the import time is back to originally seen 1K/hr, unlike localdev improvement as reported in [#1861]. I wonder, if this has to do with amqp processing (which is now off on localdev, but as before in use on sandboxes).
Ok, this is done now, in ps/1767. Expected result is that import (e.g. allura_import_test.py) with hundreds and thousands of tickets should work now, whereas previously it timed out (against sandbox).
status: open --> code-review
assigned_to: Paul Sokolovsky --> Dave Brondsema
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
+ cnt = 0
+ for ticket_in in tickets_in:
+ cnt += 1
Is scripts/allura_api.py for just the import API? If so, let's rename the file and the class to reflect that. So it is not confused with the "regular" oauth API. Hmm... or is it using our old pre-oauth signing? That is planned to be removed.
ea52be296656a226bbee4f93b08c2443e24c790f could be improved by logging the exception when a retry happens. We've had trouble before with not knowing about exceptions because they were silenced with code similar to this.
Lastly, what would an appropriate test be? Running allura_import_test.py I assume. What about allura_import.py? Anything I need to know to run that? BTW, we could put allura_import_test.py into the forge-classic integration tests, which run against full operational sandboxes.
status: code-review --> in-progress
assigned_to: Dave Brondsema --> Paul Sokolovsky
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
allura_api.py - well, as I told previously, I was considering to make a single class to access all types of APIs we provide. So, I wonder to factor out ticket API access, remember that idea, and named if allura_api.py instead of allura_import_api.py . But looking at the evolution of API access, I guess it doesn't make much sense to have such single - it's ok to have general OAuth API client and special-purpose client(s) for restricted API(s). So, renamed to allura_import_api.py , and I made changes to [#1650].
Auto-retry on network error - added logging
For the tests, as we discussed, I'll move allura_import_test.py test (and corpus for it) to forge-classic, and look into make it fully automatic.
Test moved to forge-classic, made automatic. I also did some steps towards integration tests to be runnable on standalone Allura (e.g. localdev). The main culprit there is different set of users in Allura and small dataset, we should either bring them together, or use get_site_admin() and friends like proposed in my code.
ps/1767 in both forge and forge-classic
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Per [#1861], chunk size will be 1. This provides best protection against protocol request limits and timeouts, and appears to streamline import too (by properly scheduling background tasks).
Related
Tickets:
#1861Some stats:
$ time python allura_import_test.py data/sf100.json -u https://sf-psokolovsky-3025.sb.sf.net -a tckff494ed3e68e8dafdf09 -s bc5799e658b48433ee0461da88921c9c62f3871cc6a2fa7ab612228a36fc4e062bf524bc88d7af00
real 4m55.725s
$time python allura_import_test.py data/sf1000.json -u https://sf-psokolovsky-3025.sb.sf.net -a tckff494ed3e68e8dafdf09 -s bc5799e658b48433ee0461da88921c9c62f3871cc6a2fa7ab612228a36fc4e062bf524bc88d7af00
real 51m29.992s
So, when run against sandbox, the import time is back to originally seen 1K/hr, unlike localdev improvement as reported in [#1861]. I wonder, if this has to do with amqp processing (which is now off on localdev, but as before in use on sandboxes).
Related
Tickets:
#1861Ok, this is done now, in ps/1767. Expected result is that import (e.g. allura_import_test.py) with hundreds and thousands of tickets should work now, whereas previously it timed out (against sandbox).
The enumerate() function would be cleaner here:
Is scripts/allura_api.py for just the import API? If so, let's rename the file and the class to reflect that. So it is not confused with the "regular" oauth API. Hmm... or is it using our old pre-oauth signing? That is planned to be removed.
ea52be296656a226bbee4f93b08c2443e24c790f could be improved by logging the exception when a retry happens. We've had trouble before with not knowing about exceptions because they were silenced with code similar to this.
Lastly, what would an appropriate test be? Running allura_import_test.py I assume. What about allura_import.py? Anything I need to know to run that? BTW, we could put allura_import_test.py into the forge-classic integration tests, which run against full operational sandboxes.
enumerate() - done
allura_api.py - well, as I told previously, I was considering to make a single class to access all types of APIs we provide. So, I wonder to factor out ticket API access, remember that idea, and named if allura_api.py instead of allura_import_api.py . But looking at the evolution of API access, I guess it doesn't make much sense to have such single - it's ok to have general OAuth API client and special-purpose client(s) for restricted API(s). So, renamed to allura_import_api.py , and I made changes to [#1650].
For the tests, as we discussed, I'll move allura_import_test.py test (and corpus for it) to forge-classic, and look into make it fully automatic.
Related
Tickets: #1650
Test moved to forge-classic, made automatic. I also did some steps towards integration tests to be runnable on standalone Allura (e.g. localdev). The main culprit there is different set of users in Allura and small dataset, we should either bring them together, or use get_site_admin() and friends like proposed in my code.
ps/1767 in both forge and forge-classic
Btw, as inter-repo move was required, I merged some changes against #1768 into ps/1767 too.