#4254 Issues with RSS feed blog import

v1.0.0
closed
nobody
None
General
Cory Johns
2015-08-20
2012-05-21
Cory Johns
No

There are some issues with the RSS feed feature for the ForgeBlog. The interface to add a feed has been temporarily disabled (commented out) until these are resolved.

  • The date of the post is not set to the date from the RSS feed item
  • The status is set to Draft and left there, so the imported posts are not viewable
  • There is no check to see if the entry is already in the blog, so duplicates will be made
  • Markdown-like text is not escaped, so erroneous formatting is applied (test with this feed https://twitter.com/statuses/user_timeline/openatadobe.atom and see the posts with hashtags in them, such as this one https://twitter.com/#!/OpenatAdobe/statuses/181788972010319872). Implement a <plaintext> tag to escape the Markdown formatting. This could be implemented either as a Markdown extension or by changing the forge_markdown() method to check whether the content starts with <plaintext>, since everything should go through forge_markdown().
  • The session is not being flushed; it should be explicitly flushed at the end of each feed process
  • The pull-rss-feeds command should be in the ForgeBlog tool, not in Allura
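
The first three items and the flush item above can be sketched roughly as follows. This is only an illustration under assumptions, not the ForgeBlog code: `FeedItem`, `BlogPost`, `import_feed`, the `posts_by_link` lookup, and the `flush` callback are all hypothetical names standing in for the parsed feed entry, the blog post model, and the ORM session.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Dict, Iterable, List


@dataclass
class FeedItem:
    """Hypothetical stand-in for one parsed RSS/Atom entry."""
    link: str
    title: str
    text: str
    published: datetime


@dataclass
class BlogPost:
    """Hypothetical stand-in for a blog post record."""
    link: str
    title: str
    text: str
    timestamp: datetime
    state: str = "draft"


def import_feed(items: Iterable[FeedItem],
                posts_by_link: Dict[str, BlogPost],
                flush: Callable[[], None]) -> List[BlogPost]:
    """Import feed items, skipping any whose link is already in the blog."""
    created = []
    for item in items:
        if item.link in posts_by_link:
            continue  # duplicate check: this entry was imported earlier
        post = BlogPost(
            link=item.link,
            title=item.title,
            text=item.text,
            timestamp=item.published,  # date taken from the feed item
            state="published",         # not left at draft, so it is viewable
        )
        posts_by_link[item.link] = post
        created.append(post)
    flush()  # explicit flush, once, at the end of processing this feed
    return created
```

Keying the duplicate check on the entry link is one possible choice; a feed-supplied GUID would work the same way if the feed provides one.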

Related

Tickets: #4254

Discussion

  • Cory Johns

    Cory Johns - 2012-05-21
    • Description has changed:

    Diff:

    --- old 
    +++ new 
    @@ -3,5 +3,5 @@
     * The date of the post is not set to the date from the RSS feed item
     * The status is set to and left at Draft, so they are not viewable
     * There is no check to see if the entry is already in the blog, so duplicates will be made
    -* Markdown-like text is not escaped, so erroneous formatting is applied (test with this feed https://twitter.com/statuses/user_timeline/openatadobe.atom and see the posts with hashtags in them such as this one https://twitter.com/#!/OpenatAdobe/statuses/181788972010319872)
    +* Markdown-like text is not escaped, so erroneous formatting is applied (test with this feed https://twitter.com/statuses/user_timeline/openatadobe.atom and see the posts with hashtags in them such as this one [https://twitter.com/#!/OpenatAdobe/statuses/181788972010319872](https://twitter.com/#!/OpenatAdobe/statuses/181788972010319872)
     * The session is not being flushed; it should be explicitly flushed at the end of each feed process
    
    • Description has changed:

    Diff:

    --- old 
    +++ new 
    @@ -3,5 +3,5 @@
     * The date of the post is not set to the date from the RSS feed item
     * The status is set to and left at Draft, so they are not viewable
     * There is no check to see if the entry is already in the blog, so duplicates will be made
    -* Markdown-like text is not escaped, so erroneous formatting is applied (test with this feed https://twitter.com/statuses/user_timeline/openatadobe.atom and see the posts with hashtags in them such as this one [https://twitter.com/#!/OpenatAdobe/statuses/181788972010319872](https://twitter.com/#!/OpenatAdobe/statuses/181788972010319872)
    +* Markdown-like text is not escaped, so erroneous formatting is applied (test with this feed https://twitter.com/statuses/user_timeline/openatadobe.atom and see the posts with hashtags in them such as this one https://twitter.com/#!/OpenatAdobe/statuses/181788972010319872).  Implement a <nomarkdown> tag to escape the markdown formatting.
     * The session is not being flushed; it should be explicitly flushed at the end of each feed process
    
     
  • Cory Johns

    Cory Johns - 2012-05-21
    • Description has changed:

    Diff:

    --- old 
    +++ new 
    @@ -3,5 +3,6 @@
     * The date of the post is not set to the date from the RSS feed item
     * The status is set to and left at Draft, so they are not viewable
     * There is no check to see if the entry is already in the blog, so duplicates will be made
    -* Markdown-like text is not escaped, so erroneous formatting is applied (test with this feed https://twitter.com/statuses/user_timeline/openatadobe.atom and see the posts with hashtags in them such as this one https://twitter.com/#!/OpenatAdobe/statuses/181788972010319872).  Implement a <nomarkdown> tag to escape the markdown formatting.
    +* Markdown-like text is not escaped, so erroneous formatting is applied (test with this feed https://twitter.com/statuses/user_timeline/openatadobe.atom and see the posts with hashtags in them such as this one <https://twitter.com/#!/OpenatAdobe/statuses/181788972010319872>).  Implement a &lt;plaintext&gt; tag to escape the markdown formatting.
     * The session is not being flushed; it should be explicitly flushed at the end of each feed process
    +* The pull-rss-feeds command should be in the ForgeBlog tool, not in Alluraa
    
     
  • Dave Brondsema

    Dave Brondsema - 2012-05-21
    • Description has changed:

    Diff:

    --- old 
    +++ new 
    @@ -3,6 +3,6 @@
     * The date of the post is not set to the date from the RSS feed item
     * The status is set to and left at Draft, so they are not viewable
     * There is no check to see if the entry is already in the blog, so duplicates will be made
    -* Markdown-like text is not escaped, so erroneous formatting is applied (test with this feed https://twitter.com/statuses/user_timeline/openatadobe.atom and see the posts with hashtags in them such as this one <https://twitter.com/#!/OpenatAdobe/statuses/181788972010319872>).  Implement a &lt;plaintext&gt; tag to escape the markdown formatting.
    +* Markdown-like text is not escaped, so erroneous formatting is applied (test with this feed https://twitter.com/statuses/user_timeline/openatadobe.atom and see the posts with hashtags in them such as this one <https://twitter.com/#!/OpenatAdobe/statuses/181788972010319872>).  Implement a &lt;plaintext&gt; tag to escape the markdown formatting.  This could be implemented either as a markdown extension, or change the `forge_markdown()` method to check to see if the content starts with `<plaintext>` since everything should go through `forge_markdown()`)
     * The session is not being flushed; it should be explicitly flushed at the end of each feed process
    -* The pull-rss-feeds command should be in the ForgeBlog tool, not in Alluraa
    +* The pull-rss-feeds command should be in the ForgeBlog tool, not in Allura
    
     
  • Yaroslav Luzin

    Yaroslav Luzin - 2012-05-22

    Created #64: [#4254] Issues with RSS feed blog import (2cp)

     

    Related

    Tickets: #4254

  • Yaroslav Luzin

    Yaroslav Luzin - 2012-05-28

    closed #64 and pushed changes into 42cc_4524

     
  • Dave Brondsema

    Dave Brondsema - 2012-05-29
    • status: open --> code-review
    • qa: Cory Johns
     
  • Cory Johns

    Cory Johns - 2012-06-01
    • status: code-review --> closed
     
  • Cory Johns

    Cory Johns - 2012-06-04
    • status: closed --> open
     
  • Cory Johns

    Cory Johns - 2012-06-04

    The MDHTMLParser class has no test coverage (other than the high-level functional test of the entire feature). There should be tests that cover each of the implemented methods (handle_*)

     
  • Cory Johns

    Cory Johns - 2012-06-04

    Also, some comments documenting the reason why we need a custom HTMLParser implementation (i.e., so that the "text" part of the HTML doesn't have any Markdown syntax interpreted, while still allowing the HTML tags to be converted to Markdown and interpreted properly) should be added.
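
The idea described in this comment can be illustrated with a minimal sketch. It is not the MDHTMLParser from the branch: the class name `MarkdownSafeParser`, the helper `html_to_markdown`, the handled tag set, and the list of escaped characters are assumptions chosen for the example. Text nodes get their Markdown-significant characters backslash-escaped, while a few HTML tags are translated into Markdown syntax.

```python
import re
from html.parser import HTMLParser

# Characters that Markdown may treat as syntax; an illustrative set only.
_MD_SPECIAL = re.compile(r"([\\`*_{}\[\]()#+\-.!])")


class MarkdownSafeParser(HTMLParser):
    """Hypothetical parser: escape text, convert a few tags to Markdown."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []      # output fragments, joined in result()
        self._hrefs = []   # stack of hrefs for currently open <a> tags

    def handle_starttag(self, tag, attrs):
        if tag in ("b", "strong"):
            self.out.append("**")
        elif tag in ("i", "em"):
            self.out.append("*")
        elif tag == "a":
            self._hrefs.append(dict(attrs).get("href") or "")
            self.out.append("[")

    def handle_endtag(self, tag):
        if tag in ("b", "strong"):
            self.out.append("**")
        elif tag in ("i", "em"):
            self.out.append("*")
        elif tag == "a" and self._hrefs:
            self.out.append("](%s)" % self._hrefs.pop())

    def handle_data(self, data):
        # Only the text part is escaped, so "#tag" or "*x*" stays literal.
        self.out.append(_MD_SPECIAL.sub(r"\\\1", data))

    def result(self):
        return "".join(self.out)


def html_to_markdown(html):
    parser = MarkdownSafeParser()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return parser.result()
```

With this sketch, a hashtag in the text is emitted as `\#` so it is not rendered as a heading, while `<b>` still becomes `**`. Each `handle_*` method can be tested separately by feeding a small fragment and checking `result()`.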

     
  • Yaroslav Luzin

    Yaroslav Luzin - 2012-06-05

    created a technical debt ticket #77: [#4254] TD: Tests for MDHTMLParser (1cp)

     

    Related

    Tickets: #4254

  • Yaroslav Luzin

    Yaroslav Luzin - 2012-06-07

    closed #77, changes in 42cc_4254

    • status: open --> code-review
     
  • Cory Johns

    Cory Johns - 2012-06-08

    Tests and doc string look good, but handle_data is kind of dense and could use some comments, plus I'm fairly sure that lines 81-83 will never be executed. If I'm wrong, can you please adjust the test case to cover those lines as well?

     
  • Yaroslav Luzin

    Yaroslav Luzin - 2012-06-11

    Yes, you're right. Seems like that code will never be executed. We'll make a small refactoring and change handle_data in #78: [#4254] Refactoring (1cp)

     

    Related

    Tickets: #4254

  • Dave Brondsema

    Dave Brondsema - 2012-06-11

    I've made 3 fixes on dev today. Please review them so you understand what happened, and update your branch to include them before conflicts arise.

    • Commit c23e328 - plain tags were causing more problems than they solved. I did a little cleanup and then disabled them, creating [#4345] for followup on the plain tag.
    • Commit c3595fc - view & edit links were broken, so I removed the hash which wasn't really necessary
    • Commit 3ef4894 - minor fix for same feed used on multiple tools
     

    Related

    Tickets: #4345

  • Cory Johns

    Cory Johns - 2012-06-12
    • status: code-review --> in-progress
     
  • Cory Johns

    Cory Johns - 2012-06-12

    Changing to in-progress pending #78.

     
  • Cory Johns

    Cory Johns - 2012-06-22

    Closing, as these issues have been resolved in [#4345].

     

    Related

    Tickets: #4345

  • Cory Johns

    Cory Johns - 2012-06-22
    • status: in-progress --> closed
     
  • Cory Johns

    Cory Johns - 2012-06-22
    • milestone: forge-backlog --> forge-jun-29