Originally created by: jwb1980
https://sourceforge.net/p/forge/site-support/8700/
[forge:site-support:#8700]
From IRC #sourceForge
download the source code of this project https://sourceforge.net/p/nhunspell/code/ci/default/tree/
3:55 When I try the snapshot Sourceforge says "We're having trouble finding that snapshot. Would you like to resubmit?"
3:55 TortoiseSVN gives me error 500 in my fork repository
    Traceback (most recent call last):
      File "/var/local/allura/Allura/allura/model/monq_model.py", line 265, in __call__
        self.result = func(*self.args, **self.kwargs)
      File "/var/local/allura/Allura/allura/tasks/repo_tasks.py", line 145, in tarball
        repo.tarball(revision, path)
      File "/var/local/allura/Allura/allura/model/repository.py", line 666, in tarball
        self._impl.tarball(revision, path)
      File "/var/local/env-allura/lib/python2.7/site-packages/TimerMiddleware-0.4.4-py2.7.egg/timermiddleware/__init__.py", line 117, in wrapper
        return self.run_and_log(func, inst, *args, **kwargs)
      File "/var/local/env-allura/lib/python2.7/site-packages/TimerMiddleware-0.4.4-py2.7.egg/timermiddleware/__init__.py", line 126, in run_and_log
        return func(*args, **kwargs)
      File "/var/local/env-allura/lib/python2.7/site-packages/ForgeHg-0.2.0-py2.7.egg/forgehg/model/hg.py", line 351, in tarball
        commands.archive(HgUI(), self._hg, path, rev=commit, prefix='')
      File "/var/local/env-allura/lib/python2.7/site-packages/mercurial-3.0-py2.7-linux-x86_64.egg/mercurial/commands.py", line 382, in archive
        matchfn, prefix, subrepos=opts.get('subrepos'))
      File "/var/local/env-allura/lib/python2.7/site-packages/mercurial-3.0-py2.7-linux-x86_64.egg/mercurial/archival.py", line 298, in archive
        write(f, 'x' in ff and 0755 or 0644, 'l' in ff, ctx[f].data)
      File "/var/local/env-allura/lib/python2.7/site-packages/mercurial-3.0-py2.7-linux-x86_64.egg/mercurial/archival.py", line 258, in write
        archiver.addfile(prefix + name, mode, islink, data)
      File "/var/local/env-allura/lib/python2.7/site-packages/mercurial-3.0-py2.7-linux-x86_64.egg/mercurial/archival.py", line 214, in addfile
        f = self.opener(name, "w", atomictemp=True)
      File "/var/local/env-allura/lib/python2.7/site-packages/mercurial-3.0-py2.7-linux-x86_64.egg/mercurial/scmutil.py", line 276, in __call__
        f = self.join(path)
      File "/var/local/env-allura/lib/python2.7/site-packages/mercurial-3.0-py2.7-linux-x86_64.egg/mercurial/scmutil.py", line 338, in join
        return os.path.join(self.base, path)
      File "/var/local/env-allura/lib64/python2.7/posixpath.py", line 71, in join
        path += '/' + b
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xf6 in position 27: ordinal not in range(128)
This seems like a problem in mercurial-py, though I am still investigating it. The trigger for the error is this directory, which contains some files with unicode symbols in their names: https://sourceforge.net/p/nhunspell/code/ci/default/tree/NHunspell/UnitTests/
Interestingly, if you click on one of these files, you get a 500 error: https://sourceforge.net/p/nhunspell/code/ci/default/tree/NHunspell/UnitTests/de_DE_%C3%B6_frami.aff
So I made an interesting finding. The file in question is named de_DE_ö_frami.aff.
I cloned the repo and ran some experiments in the same dir as the file. First I tried the usual tricks with .encode('utf-8') and the like, but they didn't help. But then it struck me:
    In [6]: path = os.listdir('.')[-11]
    In [7]: path
    Out[7]: 'de_DE_\xf6_frami.aff'
    In [8]: print path
    de_DE_�_frami.aff
    In [12]: "de_DE_ö_frami.aff" == 'de_DE_\xf6_frami.aff'
    Out[12]: False
    In [13]: os.listdir(u'.')[-11]
    Out[13]: 'de_DE_\xf6_frami.aff'
    In [14]: "de_DE_ö_frami.aff"
    Out[14]: 'de_DE_\xc3\xb6_frami.aff'

So it seems like python's os.listdir reports the filename incorrectly! I experimented with cyrillic file names and found no problems:

    In [8]: os.listdir('.')[-8]
    Out[8]: '\xd0\xbf\xd1\x80\xd0\xb8\xd0\xb2\xd0\xb5\xd1\x82'
    In [9]: print '\xd0\xbf\xd1\x80\xd0\xb8\xd0\xb2\xd0\xb5\xd1\x82'
    привет
    In [10]: os.listdir(u'.')[-8]
    Out[10]: u'\u043f\u0440\u0438\u0432\u0435\u0442'
    In [12]: print os.listdir(u'.')[-8]
    привет

Other random unicode file names like "ᕕ┌◕ᗜ◕┐ᕗ" don't have this problem either:

    In [6]: os.listdir('.')[0]
    Out[6]: '(\xe2\x95\xaf\xc2\xb0\xe2\x96\xa1\xc2\xb0\xef\xbc\x89\xe2\x95\xaf\xef\xb8\xb5 \xe2\x94\xbb\xe2\x94\x81\xe2\x94\xbb'
    In [7]: print os.listdir('.')[0]
    (╯°□°)╯︵ ┻━┻
    In [8]: os.listdir('.')[1]
    Out[8]: '\xe1\x95\x95\xe2\x94\x8c\xe2\x97\x95\xe1\x97\x9c\xe2\x97\x95\xe2\x94\x90\xe1\x95\x97'
    In [9]: print os.listdir('.')[1]
    ᕕ┌◕ᗜ◕┐ᕗ

Conclusion: we have a strange rare bug with python's os module scrambling unicode filenames.
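One possible alternative reading of the session above, offered as a hypothesis rather than a confirmed diagnosis: the name may be stored on disk as Latin-1 rather than UTF-8. A single 0xf6 byte is "ö" in Latin-1, whereas UTF-8 encodes "ö" as 0xc3 0xb6, and Python 2's os.listdir(u'.') returns a plain str for names it cannot decode, which would match Out[13]. A minimal sketch of that check, written for Python 3 (the thread itself uses Python 2), where try_decode is a helper made up for illustration:

```python
# Byte strings copied from the os.listdir() output in the session above.
raw_name = b'de_DE_\xf6_frami.aff'
cyrillic = b'\xd0\xbf\xd1\x80\xd0\xb8\xd0\xb2\xd0\xb5\xd1\x82'


def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# A lone 0xf6 byte is not valid UTF-8, so this decode fails.
as_utf8 = try_decode(raw_name, 'utf-8')
# Interpreted as Latin-1, the same bytes give the expected name.
as_latin1 = try_decode(raw_name, 'latin-1')
# The Cyrillic name is valid UTF-8, which would explain why it behaves.
cyrillic_utf8 = try_decode(cyrillic, 'utf-8')

print(as_utf8, as_latin1, cyrillic_utf8)
```

If this reading is right, os.listdir is reporting the stored bytes faithfully, and the mismatch comes from the name having been committed in a non-UTF-8 encoding.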
Last edit: LXj 2015-06-30
I saw this error while working on a docker ticket, on all repos with unicode filenames. Generating a UTF-8 locale and setting it as the default fixed that for docker:
It seems like a deployment-specific thing to me. The server might be missing some locale that is needed to properly decode filenames. We'll investigate it further to confirm.
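To check the locale theory on a given server, something like the following could be run under the same environment as the web app. This is only a sketch for Python 3 (the deployment in this thread runs Python 2.7); describe_encodings and is_utf8 are illustrative helpers, not part of Allura:

```python
import locale
import sys


def describe_encodings():
    """Collect the encodings Python will use for file names and text I/O."""
    return {
        'filesystem': sys.getfilesystemencoding(),
        'preferred': locale.getpreferredencoding(False),
    }


def is_utf8(encoding_name):
    """True if the encoding name refers to UTF-8 (e.g. 'UTF-8', 'utf8')."""
    return encoding_name.lower().replace('-', '').replace('_', '') == 'utf8'


info = describe_encodings()
for key, value in sorted(info.items()):
    print('%s: %s (UTF-8: %s)' % (key, value, is_utf8(value)))
```

A value such as ANSI_X3.4-1968 (plain ASCII) here would suggest the process is running without a UTF-8 locale.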
Closed #811.
forgehg:ib/7757

The problem was that we had the path to the archive directory as unicode, and concatenating it with the file name, which is an encoded plain string, not unicode, triggered an implicit ASCII decode of the file name. I've fixed it by encoding the path to the archive directory as a utf-8 plain string.
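For anyone following along, here is a rough reproduction of that failure mode and of the fix. It is a sketch in Python 3, which refuses to mix the two types, so the Python 2 implicit decode is emulated by hand. The base path /tmp/archive and the helper py2_style_join are assumptions made up for illustration; the file name comes from the thread:

```python
import posixpath

# Byte-string file name, as seen in the thread.
name = b'NHunspell/UnitTests/de_DE_\xf6_frami.aff'
# Hypothetical archive directory; the real path is not shown in the thread.
base = u'/tmp/archive'


def py2_style_join(base, name):
    """Emulate Python 2's `unicode += '/' + str`, which decodes the str as ASCII."""
    return base + (b'/' + name).decode('ascii')


try:
    py2_style_join(base, name)
    error = None
except UnicodeDecodeError as exc:
    error = exc

# The fix described above: make the base a byte string too, so nothing gets decoded.
fixed = posixpath.join(base.encode('utf-8'), name)

print(error)
print(fixed)
```

The failing byte sits at position 27 of '/' + name, which lines up with the "byte 0xf6 in position 27" message in the traceback.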
I could not fix the issue with browsing, though: https://sourceforge.net/p/nhunspell/code/ci/default/tree/NHunspell/UnitTests/de_DE_%C3%B6_frami.aff
The error is:
I did some digging:
1. u'/NHunspell/UnitTests/de_DE_\xf6_frami.aff'
2. 'de_DE_\xc3\xb6_frami.aff'
3. 'NHunspell/UnitTests/de_DE_\xf6_frami.aff' (looks like (1), but str, not unicode)

I've tried several places to fix it, but didn't succeed.
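A guess at how these representations relate, sketched in Python 3: percent-decoding the URL yields the text "ö", and which bytes you get afterwards depends on the encoding chosen. The manifest dict below is a made-up stand-in for a byte-keyed file listing, not Mercurial's actual manifest API, and it assumes the stored key uses the single 0xf6 byte:

```python
from urllib.parse import unquote

url_component = 'de_DE_%C3%B6_frami.aff'  # last segment of the browse URL
text = unquote(url_component)             # percent-decoding assumes UTF-8 by default
utf8_key = text.encode('utf-8')           # two bytes for the umlaut: 0xc3 0xb6
latin1_key = text.encode('latin-1')       # one byte for the umlaut: 0xf6

# Hypothetical byte-keyed listing, assuming the stored name uses the 0xf6 byte.
directory = b'NHunspell/UnitTests/'
manifest = {directory + latin1_key: 'file contents'}


def lookup(manifest, directory, key):
    """Return the entry for directory + key, or None if there is no such key."""
    return manifest.get(directory + key)


print(lookup(manifest, directory, utf8_key))    # no match
print(lookup(manifest, directory, latin1_key))  # match
```

Under that assumption, a lookup that re-encodes the URL path as UTF-8 would miss the stored key, which is one way a lookup error could arise while browsing.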
Looks good, one step forward for the code snapshot. We can leave the ManifestLookupError issue when browsing for another day, I guess.