allura:cj/7124
tracwikiimporter:cj/7124
sfx:cj/7124

Can we de-dupe the block of code following # remove extraneous /wiki/[PageName], which is in both repos?
It would be nice to validate more than a 200 response, since I could try to import from http://google.com and it'll let me. I don't see any HTML or HTTP header that would be useful to check though.
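For reference, a shared helper could look something like this; just a sketch, and the helper name and regex are my guesses at what the duplicated block actually does:

    import re

    def strip_wiki_page(url):
        # Assumed behavior of the duplicated block: drop an extraneous
        # trailing /wiki or /wiki/SomePage segment, leaving just the
        # Trac base URL (with a trailing slash).
        url = url.rstrip('/')
        url = re.sub(r'/wiki(/[^/]+)?$', '', url)
        return url + '/'
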
The URL checks for all the other importer types just do a 200 check. I agree something more robust would be nice, but this will catch all the errors we've seen so far.
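For context, that check amounts to something like this (a sketch; whether the importers use requests or urllib2 for it is an assumption on my part):

    import requests

    def url_responds(url):
        # The bare 200 check the other importers do. It confirms the URL
        # answers, but not that it actually points at a Trac wiki, which
        # is why http://google.com passes.
        resp = requests.head(url, allow_redirects=True, timeout=10)
        return resp.status_code == 200
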
The URL normalization in tracwikiimporter.scripts.wiki_from_trac.extractors.WikiExporter should be redundant now, since the normalization happens in the controller, so we could just remove it. But I wasn't sure why it was put so deep in the first place, so I left it. We could also switch the WikiExporter code to use the validator to do the normalization, which would at least reduce the duplication.

I put the URL normalization in the WikiExporter because that class can be used elsewhere besides the form controller, for example the command-line script tracwikiimporter/scripts/wiki_from_trac/wiki_from_trac.py. Having it use the validator's normalization sounds good to me.
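Something like this is all that switch would take (a sketch; TracURLValidator and its normalize() method are hypothetical stand-ins for whatever the form validator actually exposes):

    from tracwikiimporter.validators import TracURLValidator  # hypothetical name/module

    class WikiExporter(object):
        def __init__(self, base_url, options):
            # Delegate to the validator's normalization instead of
            # duplicating the /wiki/[PageName]-stripping logic here, so
            # the command-line script gets the same behavior as the form.
            self.base_url = TracURLValidator().normalize(base_url)
            self.options = options
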
Change pushed to
tracwikiimporter:cj/7124

Test fixes pushed to:
tracwikiimporter:cj/7124
Unchanged:
allura:cj/7124
sfx:cj/7124

Got this when trying to run a trac wiki import:
21:43:56,170 ERROR [taskd:forgeimporters.base.import_tool:52f2b099d8be352dc973da38:allura.model.monq_model] Error "global name 'urlopen' is not defined" on job <MonQTask 52f2b099d8be352dc973da38 (busy) P:10 forgeimporters.base.import_tool h1v1019.sb.sf.net pid 12628 project:/p/testit/ app:admin user:admin1>
Traceback (most recent call last):
  File "/home/dbrondsema/dbrondsema-1019/forge/Allura/allura/model/monq_model.py", line 267, in __call__
    self.result = func(*self.args, **self.kwargs)
  File "/home/dbrondsema/dbrondsema-1019/forge/ForgeImporters/forgeimporters/base.py", line 131, in import_tool
    mount_point=mount_point, mount_label=mount_label, **kw)
  File "/nfs/home/dbrondsema/tracwikiimporter/tracwikiimporter/importer.py", line 116, in import_tool
    WikiExporter(trac_url, options).export(f)
  File "/nfs/home/dbrondsema/tracwikiimporter/tracwikiimporter/scripts/wiki_from_trac/extractors.py", line 103, in export
    for title in self.page_list():
  File "/nfs/home/dbrondsema/tracwikiimporter/tracwikiimporter/scripts/wiki_from_trac/extractors.py", line 131, in page_list
    r = self.fetch(url)
  File "/nfs/home/dbrondsema/tracwikiimporter/tracwikiimporter/scripts/wiki_from_trac/extractors.py", line 126, in fetch
    return urlopen(url)
NameError: global name 'urlopen' is not defined
Forgot the aliasing of that in:
tracwikiimporter:cj/7124
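For the record, the missing piece is presumably just an import/alias at the top of extractors.py; one plausible form, matching the Python 2 traceback (the actual change may differ):

    # extractors.py: fetch() calls urlopen(), which was never imported/aliased
    from urllib2 import urlopen
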
Found one other issue: unhandled errors from requests if an invalid domain is entered, for example. This should handle it: https://sourceforge.net/p/allura/pastebin/52f4fee43e5e8346ccf15780 If that's good by you, I can commit & merge the branches (there's some rebasing and conflicts that I've already resolved locally).
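For anyone reading along without access to the pastebin, the change is presumably along these lines (a sketch; the validator class name and message text are my assumptions, though formencode's FancyValidator/_to_python/Invalid and requests' RequestException are real APIs):

    import formencode
    import requests

    class TracURLValidator(formencode.FancyValidator):  # hypothetical name
        def _to_python(self, value, state=None):
            # Turn connection-level failures (invalid domain, refused
            # connection, timeout) into validation errors instead of
            # letting them escape as unhandled exceptions.
            try:
                resp = requests.head(value, allow_redirects=True, timeout=10)
            except requests.exceptions.RequestException:
                raise formencode.Invalid('Unable to connect to %s' % value,
                                         value, state)
            if resp.status_code != 200:
                raise formencode.Invalid('URL must respond with 200 OK',
                                         value, state)
            return value
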
+1