#6734 Research using re2 as a replacement for re module

v1.0.1
closed
sf-2 (994)
General
nobody
2015-08-20
2013-10-03
No

In researching slow-rendering discussion threads, it was discovered that some individual posts are taking a long time to render. The posts in question are sizable (23k) chunks of plain HTML. See if re2 can be used to speed up the rendering.

Discussion

  • Tim Van Steenburgh

    • status: in-progress --> closed
     
  • Tim Van Steenburgh

    allura:tv/6734

    TL;DR: re2 turns out to be 20x slower than re for our use cases.

    Added md_perf.py script for timing/profiling discussion thread rendering.

    re2 can't be used as a drop-in replacement in Markdown because regexes with back-references are prevalent in Markdown, and re2 doesn't support them.
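To illustrate the kind of pattern involved, here is a minimal sketch of a back-referencing regex using the standard `re` module. The `EMPHASIS` pattern and the `find_emphasis` helper are made up for illustration and are not copied from the Markdown library; the point is only that `\1` makes the closing delimiter depend on what the first group captured, which is the feature re2 lacks.

```python
import re

# A back-reference (\1) requires the closing delimiter to be identical to
# whatever the first group captured. Illustrative pattern only; it is an
# assumption, not the pattern Markdown actually uses.
EMPHASIS = re.compile(r"(\*{1,2}|_{1,2})(.+?)\1")


def find_emphasis(text):
    """Return (delimiter, content) pairs for each emphasized span."""
    return [(m.group(1), m.group(2)) for m in EMPHASIS.finditer(text)]
```

A pattern like this matches `**bold**` and `_em_` but not `**bold_`, because the opening and closing delimiters must agree.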

    Profiling showed that for the slow-to-render posts, all the time was being spent on regex matching in the HtmlPattern class in markdown.inlinepatterns. I tried to use re2 just in this class, but the resulting performance was 20x slower than with re.
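For readers who want to reproduce this kind of measurement, the following is a rough sketch of profiling a single render call with the standard `cProfile` module. It is not the `md_perf.py` script from the branch; `render` and its `TAG` pattern are hypothetical stand-ins that should be replaced with the real Markdown conversion call.

```python
import cProfile
import io
import pstats
import re

# Hypothetical stand-in for the real renderer: it only strips tags so the
# example stays self-contained. Swap in the actual conversion call.
TAG = re.compile(r"<[^>]+>")


def render(text):
    return TAG.sub("", text)


def profile_render(text, top=5):
    """Profile one render() call and return the stats report as a string."""
    profiler = cProfile.Profile()
    profiler.runcall(render, text)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(top)
    return out.getvalue()
```

Calling `print(profile_render(post_text))` on a slow post should show which functions dominate the cumulative time.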

    Conclusion: Speeding up our MD-rendering won't be as easy as just dropping in re2. Rendering with ForgeExtensions is about 50% slower than with a vanilla Markdown() instance, so perf gains there may be possible. Best option may be to simply cache the MD-converted values instead of rendering them on-the-fly.
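The caching idea could look roughly like the sketch below. `RenderCache` is a hypothetical name, the store is a plain in-memory dict, and the renderer is passed in as any callable; none of this reflects how Allura actually stores posts. Keying on a hash of the source text is one assumption about how to invalidate the cache when a post is edited.

```python
import hashlib


class RenderCache:
    """Cache rendered output keyed by a hash of the source text."""

    def __init__(self, render):
        self._render = render  # any callable: source text -> rendered HTML
        self._store = {}       # in-memory only; a real app would persist this
        self.misses = 0        # number of times the renderer actually ran

    def get(self, text):
        # Edited text hashes to a new key, so stale output is never served.
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._render(text)
        return self._store[key]
```

With this shape, an expensive post is rendered once and later requests for the same text are served from the store.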

     
