burley
/p/lauge/
so -- all the wsgi instances are in use and occupied by git stuff for those 2 projects
let me strace a few of the processes and see if they are all stuck on a poll on a pipe
(why can't it be a pole on a pipe)
12:11
Dave
lauge looks like a typical size, perhaps a hundred commits
12:11
burley
all are stuck on a poll on a pipe
trying to think of what else we can learn before kicking it
...
burley
unless anyone else has other ideas -- I think we just keep an eye out for more of this -- and I request that you guys add some debugging code to the git module to help us figure out where it's failing if that's OK?
12:26
Wolf
sounds right to me
12:27
burley
any chance the debug foo could be added soon ( like by early next week )?
I'll ponder ways to perhaps actively monitor for this
12:28
Wolf
yes
12:28
burley
thx -- I'll leave that in your hands then and just inform SOG of what to look for
and hopefully we can go to the next step when we get some debug info
ticketing this up SOG side and then gonna kick this head so it works again
so a few more minutes for anyone to request more poking till I unwedge us :)
so the real problem is probably in gitdb, but we need debugging foo locally to find it
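One possible shape for the debugging code, as a rough sketch only: Python's stdlib faulthandler can dump every thread's stack if a block runs longer than N seconds, which would show where a wedged worker is sitting. The helper name dump_stack_if_slow and the choice of log file are assumptions for illustration, not anything in the git module today.

```python
import contextlib
import faulthandler


@contextlib.contextmanager
def dump_stack_if_slow(seconds, log_file):
    """Dump all thread stacks to log_file if the block runs longer than `seconds`.

    Hypothetical helper. log_file must be a real file object with a
    fileno() and must stay open until the block exits.
    """
    faulthandler.dump_traceback_later(seconds, file=log_file)
    try:
        yield
    finally:
        # Disarm the watchdog once the block finishes in time (or raises).
        faulthandler.cancel_dump_traceback_later()
```

Usage would be wrapping the suspect git call, e.g. `with dump_stack_if_slow(30, open("/tmp/git-hang.log", "ab")): ...` (the path is made up). It only reports; it does not kill or unwedge anything.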
see https://control.sog.geek.net/sog/trac/ticket/18371
Idea from rick: tell mod_wsgi to kill the proc after N seconds. nginx times out after 30, iirc
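Same idea at the application level, as a sketch rather than the mod_wsgi setting itself: if the blocking work is a child process on a pipe, Python 3's subprocess.run can enforce the deadline and kill the child. The function name run_with_timeout and its return convention are assumptions for illustration.

```python
import subprocess


def run_with_timeout(cmd, seconds):
    """Run cmd and return (returncode, stdout).

    Hypothetical helper. On timeout, subprocess.run kills the child and
    raises TimeoutExpired; that is reported here as returncode None.
    """
    try:
        done = subprocess.run(cmd, stdout=subprocess.PIPE, timeout=seconds)
    except subprocess.TimeoutExpired:
        return None, b""
    return done.returncode, done.stdout
```

This only covers calls that spawn a fresh child per request. It would not help with a long-lived child that the worker keeps talking to over a pipe, so the server-side kill is still worth having.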
last insanehumaninv request was several hours before the outage:
lauge requests went right up until the outage. last successful one was
first failures were:
And nothing seems unusual about those 3 lauge files.
Timeout requested: https://control.sog.geek.net/sog/trac/ticket/18396
_OpenedGitBlob.__iter__ infinite loop
Found on https://sf-dbrondsema-2020.sb.sf.net/p/py27/code/ci/42cf02ea886a611f0b443295603b578e95687047/tree/datadiff/tools.py?diff=1f9b4a2dd8039230cf52b0e6ca36d879f17eef43
(repo imported from https://sourceforge.net/p/datadiff/code/)
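These notes don't say what made the iterator spin, so the following is only a generic sketch of a chunked line iterator in which every exit path is explicit. It is not the actual _OpenedGitBlob code or the fix; iter_lines and chunk_size are made-up names.

```python
def iter_lines(stream, chunk_size=4096):
    """Yield lines (with their trailing newline) from a binary stream.

    Hypothetical sketch. The outer loop ends on the first empty read,
    and the inner loop ends as soon as the buffer has no newline left.
    """
    buf = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # EOF: stop reading
        buf += chunk
        while True:
            eol = buf.find(b"\n")
            if eol == -1:
                break  # no complete line buffered yet
            yield buf[:eol + 1]
            buf = buf[eol + 1:]
    if buf:
        yield buf  # final line without a trailing newline
```

The cases worth testing against the real iterator are probably the same ones this handles: an empty blob, a last line with no newline, and a line longer than one chunk.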
forge:db/2040
Looks good. Merged to dev.