#2013 File browse server-side performance

v1.0.0
closed
sf-2 (994)
General
nobody
2015-08-20
2011-04-26
No

http://sourceforge.net/projects/boost/files/boost-binaries/1.46.1/ with 874 files.

PFS list response is quick, as is the PFS readme search. The instrumentation stats (pssh sfs-consume "grep boost/files/boost/1.46.1/: /var/log/sfpy/2011/04/*/stats.log") aren't very clear: they show some time spent in mongo and some uncaptured time.

Aggregating stats for subfolders is expensive, but this folder has no subfolders. Ideas for improvement:

  • Remove the naturallysorted() call; it's only a secondary sort for files with the same date.
  • Change dist.FileStats.m.get to a single query with $in instead of one query per file.
  • We could also try to parallelize a few things (menu fetch, readme fetch, files fetch & loop), but that could be hard and even counterproductive.
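
As a rough illustration of the second idea, here is a minimal sketch of replacing per-file lookups with one $in query. FakeStatsCollection, the "path" field, and both helper functions are hypothetical stand-ins, not the actual dist.FileStats code; the fake collection only mimics the find() call shape so the example runs without a database.

```python
class FakeStatsCollection:
    """Hypothetical in-memory stand-in for a MongoDB collection; counts queries."""

    def __init__(self, docs):
        self.docs = list(docs)
        self.query_count = 0

    def find(self, query):
        # Supports only {"path": value} and {"path": {"$in": [...]}}.
        self.query_count += 1
        cond = query["path"]
        if isinstance(cond, dict):
            wanted = set(cond["$in"])
        else:
            wanted = {cond}
        return [d for d in self.docs if d["path"] in wanted]


def stats_per_file(coll, paths):
    """One query per file: N round trips for N files."""
    result = {}
    for path in paths:
        for doc in coll.find({"path": path}):
            result[path] = doc
    return result


def stats_batched(coll, paths):
    """A single $in query, then an in-memory lookup keyed by path."""
    return {doc["path"]: doc for doc in coll.find({"path": {"$in": list(paths)}})}


# Assumed data shape: one stats document per file in an 874-file folder.
paths = ["file%03d.zip" % i for i in range(874)]
docs = [{"path": p, "downloads": i} for i, p in enumerate(paths)]

slow = FakeStatsCollection(docs)
fast = FakeStatsCollection(docs)
assert stats_per_file(slow, paths) == stats_batched(fast, paths)
print(slow.query_count, fast.query_count)  # 874 1
```

Both versions return the same mapping; the difference is the number of round trips, which is what matters once each query carries network latency.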

Discussion

  • David Burley - 2011-04-26

    Worth adding is that we should probably figure out and define what a reasonable file count per dir is if we don't limit it, even if the definition is internal. That way, when the next project puts 80,000 files in a dir, support and/or product can say "hey, that's abuse, so we have no intent to fix" -- and then we can also properly benchmark performance against a reasonable standard.

     
  • Dave Brondsema - 2011-04-27
    • milestone: limbo --> may-05
     
  • Dave Brondsema - 2011-04-28
    • size: --> 2
     
    • status: open --> in-progress
    • assigned_to: Tim Van Steenburgh
     
  • sfpy:tv/2013

    After profiling, I found the biggest culprit to be the property access in the set_downloadability generator. Moving it out of the loop cut about 8 million function calls in my tests browsing 874 files. I saw a speedup of between 45% and 80% after this one change.
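
The kind of change described above can be sketched roughly as follows. This is not the actual sfpy code: the Project class, its downloads_enabled property, and both generator bodies are hypothetical, chosen only to show a property being read once before the loop instead of once per file. It assumes the value cannot change while the loop runs.

```python
class Project:
    """Hypothetical object whose property does non-trivial work on each access."""

    def __init__(self):
        self.lookups = 0

    @property
    def downloads_enabled(self):
        self.lookups += 1  # stands in for the expensive work behind the property
        return True


def set_downloadability_slow(project, files):
    for f in files:
        # Property re-evaluated on every iteration.
        f["downloadable"] = project.downloads_enabled
        yield f


def set_downloadability_fast(project, files):
    enabled = project.downloads_enabled  # evaluated once, outside the loop
    for f in files:
        f["downloadable"] = enabled
        yield f


files = [{"name": "file%03d.zip" % i} for i in range(874)]

slow_project, fast_project = Project(), Project()
slow = list(set_downloadability_slow(slow_project, [dict(f) for f in files]))
fast = list(set_downloadability_fast(fast_project, [dict(f) for f in files]))
assert slow == fast
print(slow_project.lookups, fast_project.lookups)  # 874 1
```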

    To test, just make sure the file browser still works. I recommend we set this to 'validation' until we can test it on the boost binaries folder in production.

    • status: in-progress --> code-review
    • assigned_to: Tim Van Steenburgh --> Jenny Steele
     
  • Jenny Steele - 2011-05-27

    Looks good to me.

     
  • Jenny Steele - 2011-05-27
    • status: code-review --> validation
    • assigned_to: Jenny Steele --> Tim Van Steenburgh
     
    • status: validation --> closed
     
