Git Merge Request #358: Filter TaskCommands by age of task (merged)

Merging...

Merged

Something went wrong. Please, merge manually

Checking if merge is possible...

Something went wrong. Please, merge manually

Dillon Walls wants to merge 1 commit from /u/dill0wn/allura/ to master, 2021-05-24

This originally started out with a purge-specific age filter, but made it generic for all TaskCommand operations.

I updated and augmented the tests in test_commands to cleanup MonqTask records, and to rigorously test purges based on age.

Commit Date  
[bb4936] (dw/taskcmd_filter_age) by Dillon Walls Dillon Walls

add filter by age of task to TaskCommands

2021-05-17 22:00:18 Tree

Discussion

  • Dave Brondsema

    Dave Brondsema - 2021-05-19

    Querying by state + result_type + time_queue won't quite be able to use the index on (state, priority, time_queue). Priority is always 10 (at least in core allura), so maybe could add that into the query, but that would add complexity and maybe its fine if it runs slow? Another option is to query based on _id instead of time_queue, it'd be accurate most of the time but not technically exactly the same

    With a goal of removing old historical records, we should be able to delete all states, not just complete. I'm thinking the --state option should be expanded to work with all task commands, but it'll have to have different defaults for different cmds? And "* means all" is a nice way of handling it, although I'd maybe change the implementation (currently omits querying on state) to query state={'$in': MonQTask.states} that way there's a chance of using indexes (I think mongo can use multiple indexes at once)

     
  • Dillon Walls - 2021-05-21

    After experimenting with mongodb.cursor.explain(), performance still seems pretty fast on state, time_queue queries. Even though we are skipping the middle prefix, the query still uses the state_1_priority_-1_time_queue_1 index. Because of this, Dave and I agreed that the current query is fine rather than relying on a query that assumes priority values or uses the less-clear _id.

    I implemented Dave's suggested optimization to the TaskCommands that use the --state='*' option. If we omit state from the query when state=*, then mongo resorts to a (very slow) COLLSCAN plan. However, if we use @daves suggestion of state={'$in': MonQTask.states}, mongo leverages the index and the speed seems much better.

    Additionally this branch also:
    Adds a new count task command. It is similar to the list command except it counts the number of matching tasks instead printing them out.
    Updates nearly all docstrings in TaskCommand
    Adds a new TaskCommand._print_query method that prints the mongodb query being run by the chosen TaskCommand.
    Modifies each TaskCommand to call the aforementioned TaskCommand._print_query for easier debugging.
    Implements a system where some commands have enforced states they query on (i.e. commit and retry) but others merely have defaults that can be overriden by the --state flag.
    Updates the purge command to default to state: '*' instead of state: 'complete'.
    * Updates the purge command to respect the --state flag.

     
  • Dave Brondsema

    Dave Brondsema - 2021-05-24
    • Status: open --> merged
     

Log in to post a comment.