df-xmldecode-patch #1

Merged
af123 merged 3 commits from df/tvdb:df-xmldecode-patch into master 2021-01-18 22:57:11 +00:00
Owner

theTVDB.com started sending single quotes in XML encoded as a numeric character reference (`&#39;`) rather than as a named entity, as observed in [this thread](https://hummy.tv/forum/threads/special-characters-sweeper-rename-39-instead-of.9638).

One fix was to `webif/lib/xml.class`; however, this program replicates the decoding done in the Jim class (which is a lot slower) and needs the same fix.

The patch uses `sscanf()` to detect and decode numerically coded XML character entities within the original framework of the `unescape()` function.

Modified compilation options:

* `-std=c99` for the `hh` format size prefix
* `-D_XOPEN_SOURCE=700` for correct declaration of `strdup()` in the on-box build environment.
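For readers without the diff to hand, the approach described above can be sketched roughly as follows. This is a minimal illustration and not the patch's `unescape()`: the helper name `decode_numeric_entities()` is hypothetical, only single-byte codes are handled, and named entities such as `&amp;` are deliberately left untouched.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not the patch's unescape()): decodes &#NNN; and
 * &#xHH; references in place and copies everything else unchanged.
 * Assumption: codes fit in one byte; larger values are not handled. */
static void decode_numeric_entities(char *s)
{
    char *out = s;  /* write position, never ahead of the read position */

    while (*s) {
        unsigned char code;
        int len = 0;  /* stays 0 unless the closing ';' was matched */

        /* Only call sscanf() where a numeric reference can start. */
        if (s[0] == '&' && s[1] == '#' &&
            (sscanf(s, "&#%hhu;%n", &code, &len) == 1 ||
             sscanf(s, "&#%*[xX]%hhx;%n", &code, &len) == 1) &&
            len > 0) {
            *out++ = (char)code;  /* emit the decoded character */
            s += len;             /* skip the whole reference */
        } else {
            *out++ = *s++;        /* ordinary character: copy as is */
        }
    }
    *out = '\0';
}
```

Because the decoded text is never longer than the input, the sketch can rewrite the buffer in place; a reference with no terminating `;` is copied through literally.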
df added 3 commits 2021-01-15 01:20:49 +00:00
df changed title from WIP: df-xmldecode-patch to df-xmldecode-patch 2021-01-16 11:28:56 +00:00
Author
Owner

Apparently does the job but the build needs to be tested in the usual environment.

af123 reviewed 2021-01-16 21:27:41 +00:00
@@ -61,0 +58,4 @@
/* sscanf -> 1: the code was read; ll>0: ';' came next */
if ((1 == sscanf( p, "&#%hhu;%n", &icode, &ll) ||
1 == sscanf( p, "&#%*[xX]%hhx;%n", &icode, &ll)) &&
ll > 0) {
Owner

sscanf() will return 2 if two input items were converted, won't it?

af123 marked this conversation as resolved
Author
Owner

> sscanf() will return 2 if two input items were converted, no?

You'd think so, but free30 and I have both run the version built from that code, so apparently not, and I am more confident of that based on the following.

POSIX (admittedly a racily anachronistic 2017 version) says:

"Upon successful completion, these functions shall return the number of
successfully matched and assigned input items;"

Arguably, as %n doesn't involve any matching, it doesn't count towards the result.

Also, as %*[xX] doesn't involve any assigning, it also doesn't count.

In the library source I see this (libc/stdio/_scanf.c l.1331ff):

    if (psfs.conv_num == CONV_n) {
    #ifdef __UCLIBC_MJN3_ONLY__
    #warning CONSIDER: Should %n count as a conversion as far as EOF return value?
    #endif
    /*      zero_conversions = 0; */
            if (psfs.store) {
                    _store_inttype(psfs.cur_ptr, psfs.dataargtype,
                                   (uintmax_t) sc.nread);
            }
            goto NEXT_FMT;
    }

So MJN3 decided that %n should not clear zero_conversions, whereas conversions such as %s and %d do clear it.

Also, the 0/1 flag member (unsigned char store), which is cleared or set according to whether the format contains * to suppress assignment, is what is used to increment the return count.
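The counting argument above can be checked directly with a short sketch. The helper names `scan_decimal()` and `scan_hex()` are hypothetical and only wrap the two format strings from the patch; the expectation that `%n` is not counted also matches the C standard's wording that executing a `%n` directive does not increment the assignment count.

```c
#include <stdio.h>

/* Hypothetical helpers for illustration; they are not part of the patch. */

/* Decimal form: one assigning conversion (%hhu) followed by %n. */
static int scan_decimal(const char *p, unsigned char *code, int *consumed)
{
    *consumed = 0;  /* stays 0 if scanning stops before reaching %n */
    return sscanf(p, "&#%hhu;%n", code, consumed);
}

/* Hex form: %*[xX] is matched but not assigned, so it is not counted. */
static int scan_hex(const char *p, unsigned char *code, int *consumed)
{
    *consumed = 0;
    return sscanf(p, "&#%*[xX]%hhx;%n", code, consumed);
}
```

On a conforming library, both helpers should return 1 for a well-formed reference, with `consumed` set to the length of the reference. For input such as `&#39` with no closing `;`, `scan_decimal()` should still return 1 but leave `consumed` at 0, which is presumably why the patch tests `ll > 0` as well as the return value.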

af123 merged commit 4d47fea182 into master 2021-01-18 22:57:11 +00:00
Reference: hummypkg/tvdb#1