TVDB char entity fix (Update 'webif/lib/xml.class') #2

Merged
af123 merged 2 commits from :tvdb-entity-decode into master 2020-06-03 11:01:14 +00:00
Owner

Attempt to fix the issue reported at https://hummy.tv/forum/threads/special-characters-sweeper-rename-39-instead-of.9638/#post-140275

Decode numeric character entities and a subset of named ones (without reference to a DTD) in TXT nodes that aren't CDATA.

Add parser test harness.

Add attribution of source and include more of it for the above.

Works for me but not tested with TVDB.
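
For illustration only, the numeric part of the decoding described above might look roughly like the following. This is a sketch under assumptions, not the code in 'webif/lib/xml.class': the proc name `decode_numeric_entities` is hypothetical, it assumes `regexp -indices -start` reports indices relative to the whole string (as standard Tcl does), and it ignores named entities entirely.

```tcl
# Hypothetical helper, not part of xml.class: replace &#NNN; and &#xHHH;
# references in $s with the characters they denote.
proc decode_numeric_entities {s} {
    set out ""
    set pos 0
    # Find the next numeric reference at or after $pos.
    while {[regexp -indices -start $pos {&#([0-9]+|[xX][0-9A-Fa-f]+);} $s match num]} {
        lassign $match mstart mend
        lassign $num nstart nend
        # Copy the literal text that precedes the reference.
        append out [string range $s $pos [expr {$mstart - 1}]]
        set code [string range $s $nstart $nend]
        if {[string match {[xX]*} $code]} {
            # Hexadecimal form: skip the leading x.
            scan [string range $code 1 end] %x cp
        } else {
            scan $code %d cp
        }
        append out [format %c $cp]
        set pos [expr {$mend + 1}]
    }
    # Copy whatever follows the last reference.
    append out [string range $s $pos end]
    return $out
}

puts [decode_numeric_entities {Grey&#39;s Anatomy}]   ;# prints: Grey's Anatomy
```

Text such as `&amp;#39;` is left alone by this sketch, so a later `&amp;` substitution yields the literal `&#39;` rather than decoding it twice.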

df changed title from WIP: TVDB char entity fix (Update 'webif/lib/xml.class') to TVDB char entity fix (Update 'webif/lib/xml.class') 2020-05-05 15:22:53 +00:00
Author
Owner

This ought to deal with the issue in TVDB.

Sweeper has two roll-your-own versions of the entity decoding; it would be better to add a routine to cgi.tcl, like this:

...
# return string but with html-special characters escaped,
# necessary if you want to send unknown text to an html-formatted page.
proc cgi_quote_html {s} {
    # substitute & first, before the other encodings that use &
    return [string map {
        {"} {&quot;}
        {<} {&lt;}
        {>} {&gt;}
    } [string map {{&} {&amp;}} $s]]
}

# return string but with html-special characters unescaped, reversing above
proc cgi_unquote_html {s} {
    # substitute &amp; last in case of eg "xxx&amp;quot;yyy"
    return [string map -nocase {
        {&amp;} {&}
    } [string map -nocase {
        {&quot;} {"}
        {&lt;}   {<}
        {&gt;}   {>}
    } $s]]
}
...

(assuming that, in the revised cgi_quote_html above, one [string map] is faster than several [regsub] calls)
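
As a small illustration of why the unquoting order matters, the following sketch applies the same substitutions in both orders to the example string from the comment. It uses plain [string map] calls rather than the proposed procs, and the variable names are only for this example.

```tcl
# Input that was quoted from the literal text: xxx&quot;yyy
set s {xxx&amp;quot;yyy}

# &amp; decoded last: one level of quoting is removed.
set right [string map {{&amp;} {&}} \
    [string map {{&quot;} {"} {&lt;} {<} {&gt;} {>}} $s]]

# &amp; decoded first: the freshly produced &quot; is decoded again.
set wrong [string map {{&quot;} {"} {&lt;} {<} {&gt;} {>}} \
    [string map {{&amp;} {&}} $s]]

puts $right   ;# prints: xxx&quot;yyy
puts $wrong   ;# prints: xxx"yyy
```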

af123 closed this pull request 2020-06-03 11:01:14 +00:00
Reference: hummypkg/webif#2