Apache Allura™ / Tickets / #6324 Slow project_role queries due to many unnamed roles per project

#6324 Slow project_role queries due to many unnamed roles per project

Milestone: v1.0.0

Status: closed

Owner: Cory Johns

Labels: performance (126) sf-1 (616)

Component: General

Reviewer: Tim Van Steenburgh

Updated: 2015-08-20

Created: 2013-06-03

Creator: Dave Brondsema

Private: No

The "forge" project (where we have support tickets) had thousands of unnamed roles:

> db.getSiblingDB('pyforge').project_role.find({project_id:ObjectId('4bfaab391be1ce08a2000026'), name:null}).count()
4299

Which causes slow lookups for anonymous browsing of the project (which is common, of course).

Mon Jun  3 18:31:43 [conn6267309] query pyforge.project_role query: { project_id: { $in: [ ObjectId('50a2cd1e2718461589e3795e') ] }, name: "*anonymous" } ntoreturn:0 ntoskip:0 nscanned:4645 keyUpdates:0 numYields: 24 locks(micros) r:42538 nreturned:1 reslen:108 1238ms

Are all the unnamed roles really necessary? If they actually are, maybe the best we can do is change the (project_id) index to (project_id, name). Or make use of the existing (user_id, project_id, name) index by adding user_id:null to the *anonymous query.

Discussion

Cory Johns - 2013-06-19

status: open --> in-progress

assigned_to: Cory Johns
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Cory Johns - 2013-06-20

status: in-progress --> code-review

Size: --> 1
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Cory Johns - 2013-06-20

allura:cj/6324

A few small changes, mostly just adding explicit user_id: None or name: None to force ProjectRole queries to use the index.

Unfortunately, the Credentials cache is cleared on each request due to the allura_globals_middleware and it's not at all obvious how to persist it longer due to concerns about how large it could grow and when / how to expire entries (there is no cache clearing when permissions are updated, for example).

Regarding the cache size concern, User.project_role() will create a ProjectRole for the user ID / project ID pair whenever a permissions check is done for that user against that project. This means there are potentially project.count() * user.count() number of unnamed ProjectRoles that could be created in mongo and the memory cache. If every current user visited every current project, that would create 1,980,464,408,910 unnamed ProjectRoles.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Cory Johns - 2013-06-20

Milestone: forge-backlog --> forge-jun-28
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Tim Van Steenburgh - 2013-06-25

QA: Tim Van Steenburgh
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Tim Van Steenburgh - 2013-06-25

status: code-review --> closed
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Apache Allura™

Forge software for hosting software projects

Milestone

Searches

Help

#6324 Slow project_role queries due to many unnamed roles per project

Discussion