https://sourceforge.net/p/allura/tickets/milestones in particular is getting really slow. I suspect it's due to the queries for the "Progress" column for every milestone. Perhaps those can be run in a single mongo query somehow, or parallelized with threads. Or just skipped for closed milestones.
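One way the "single mongo query" idea might look is an aggregation that groups tickets by milestone and status, with the per-milestone progress folded together in Python afterwards. This is only a sketch: the field names (`app_config_id`, `custom_fields._milestone`, `status`) and both helper functions are assumptions for illustration, not taken from the Allura code, and it ignores ACL checks entirely.

```python
from collections import defaultdict


def build_progress_pipeline(app_config_id):
    """Build an aggregation pipeline that counts tickets per (milestone, status).

    Hypothetical helper; the field names are assumed, not confirmed.
    The result would be passed to something like collection.aggregate(...).
    """
    return [
        {"$match": {"app_config_id": app_config_id}},
        {"$group": {
            "_id": {"milestone": "$custom_fields._milestone",
                    "status": "$status"},
            "count": {"$sum": 1},
        }},
    ]


def fold_progress(rows, closed_statuses):
    """Fold aggregation rows into {milestone: {'closed': n, 'total': m}}."""
    progress = defaultdict(lambda: {"closed": 0, "total": 0})
    for row in rows:
        key = row["_id"]
        entry = progress[key.get("milestone")]
        entry["total"] += row["count"]
        if key.get("status") in closed_statuses:
            entry["closed"] += row["count"]
    return dict(progress)
```

With this shape, the page would issue one query for all milestones instead of one or more per milestone, and the "Progress" column would be filled in from the folded dictionary.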
al/7566,
Search used BTreeCursor. To optimize the performance, I've replaced the find statement with count, which is much faster. For those tickets where an acl is specified, I still 'find' them. The performance was slow because all the ticket objects had to be loaded into memory and every single one of them checked for acl compatibility. I am not sure that these acl checks are really necessary there, but to keep it backward compatible I still check permissions for tickets with an acl specified.
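The split described above can be illustrated roughly as follows. This is an in-memory sketch, not the code from the branch: the function name, the ticket dict shape, and the `can_read` callback are all hypothetical, and the list comprehensions stand in for what would be a count query (for example a filter like `{"acl": []}`) and a find query (for example `{"acl": {"$ne": []}}`).

```python
def count_with_partial_acl_checks(tickets, can_read, closed_statuses=("closed",)):
    """Count tickets, running permission checks only where an acl is set.

    tickets: list of dicts with 'acl' (list) and 'status' (str) keys.
    can_read: callable(ticket) -> bool, standing in for a permission check.
    """
    # Tickets without their own acl: counted directly, never loaded as objects.
    no_acl = [t for t in tickets if not t["acl"]]
    hits = len(no_acl)
    closed = sum(1 for t in no_acl if t["status"] in closed_statuses)

    # Tickets with an acl: still fetched and checked one by one.
    for ticket in (t for t in tickets if t["acl"]):
        if can_read(ticket):
            hits += 1
            if ticket["status"] in closed_statuses:
                closed += 1
    return {"hits": hits, "closed": closed}
```

The saving comes from the first half: most tickets usually have no acl of their own, so they only cost a count rather than a full object load.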
Hm, where did you put your branch? Can't see it in the apache nor the sf repo
I think it's probably in his fork at https://forge-allura.apache.org/u/alexluberg/allura/ci/master/tree/
https://forge-allura.apache.org/u/alexluberg/allura/ci/al/7566/~/tree/
With the current approach I see about a 40% boost in overall page load on the test environment, so it's pretty good now (I generated 5000 tickets with a script to see the difference).
At first I had a concern about skipping the permissions check for tickets with `acl=[]`, since if there's no acl for a ticket, the permission checking logic checks the acl in a parent security context (app_config in this example), and I thought there would be problems with this. Experimentally, though, it seems like it doesn't affect the milestones page.
Second concern is that there's no index on acl, so those queries might not be so fast. So, if we stick with this approach, maybe we should add an index on acl and see if it affects those queries.
But, I think that removing permissions checks entirely for this method would be great, since we're presenting only counts here, not tickets. Currently it's used only in two places:
`/p/test/tickets/milestone/2.0/` - I see 2 counts here (one from the pagination widget and a second one from `milestone_count()`) and they do not match if you're not logged in as admin (both on master and on al/7566).
Also, I don't see a problem with those values mismatching, because `milestone_count()` is used to represent the progress, and I believe anyone should be able to see the real counts in milestone progress, even if they don't have permissions to see some of the actual tickets.
I think we also need Dave's opinion on this :)
Did either or both of you test with a large number of milestones? (Not just a large number of tickets.) I think the problem with e.g. https://sourceforge.net/p/allura/tickets/milestones is that there are over 200 milestones. Here is a timing/call-count capture: https://sourceforge.net/p/allura/pastebin/53ee56e4b9363c5fda828201 Doing everything sequentially is what adds up, particularly for IO operations to mongo. Each is fast on its own, but doing so many is the problem.
I'm concerned that the new implementation has 3 mongo queries instead of 1 for every single milestone. But there are also fewer ACL-related queries, which might help. I'd suggest mocking up data with lots of milestones and a good number of tickets too. Then enable `stats.sample_rate` in the `.ini` file and see what the mongo call counts are in stats.log. Compare master to this branch. Also compare to having no `security.has_access` call at all (that might actually not decrease mongo calls too much since ACL info is heavily cached). See what you can get with the smallest "mongo" call count.
That all said, I think it would be safe to skip the ACL checks altogether for this particular page, for reasons Igor gave. I'd rather not make the milestone sidebar any more incorrect. Perhaps add a `skip_acl_checks=False` default param to the `milestone_count` method and pass in `True` when called by the milestone admin page.
I have tested before/after with 5k+ tickets & about 40 milestones (I used Igor's script to generate fake data).
My results are: 24.93 s -> 8.7 s.
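For reference, the `skip_acl_checks=False` default param suggested earlier in the thread might look something like the sketch below. It is a simplified, in-memory stand-in rather than the real method: the `tickets` list, the `has_access` callback, and the returned `hits`/`closed` keys are assumptions made for illustration.

```python
def milestone_count(tickets, has_access, closed_statuses=("closed",),
                    skip_acl_checks=False):
    """Count total and closed tickets for one milestone.

    tickets: list of dicts with a 'status' key (hypothetical shape).
    has_access: callable(ticket) -> bool, standing in for the ACL check.
    skip_acl_checks: when True, count every ticket without checking access.
    """
    hits = closed = 0
    for ticket in tickets:
        # Default behaviour is unchanged: inaccessible tickets are not counted.
        if not skip_acl_checks and not has_access(ticket):
            continue
        hits += 1
        if ticket["status"] in closed_statuses:
            closed += 1
    return {"hits": hits, "closed": closed}
```

Under this sketch, only the milestone admin page would call it with `skip_acl_checks=True`. The sidebar and other callers keep the default and so keep their current counts.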