Daniel
2013-02-05 04:18:37 UTC
I started a new topic (should have done this before),
because this has diverged from N / N+1 topic. You can
see previous posts under N / N+1 topic. I know this
"mangling" is not directly about sed. But I think it
matters that everyone sees syntax as a poster intended.
Thanks for sending screenshot of your MUA. I repeated
the test from a test yahoo group I made. Email sent to
Thunderbird email client displays the syntax correctly.
So not a total failure. MUA works. But here's the
apparent score so far, for three environments:
A) yahoo groups archive mangles some syntax.
B) yahoo email reader mangles some syntax.
C) MUA (Thunderbird and at least one other) works.
Two out of three tested environments seem subject
to failure. Kind of a problem... Especially because
A) should be the archive, the common reference.
The basic cause seems to be that yahoo "knows about"
HTML tags and entities; it is not a text-only thing.
But our discussions are really text-only (I think).
I have no idea how often this arises in practice.
Certainly caused me confusion in viewing (on web
site) N / N+1 replies. The replies seemed way off.
Then I looked back at mine, saw the mangling...
A partial workaround might be "don't use LT and GT
as word delimiters when posting", use \b instead.
Here are some tests I devised. If anyone wants, you
can see what gets mangled in your environment. In
my tests, 2, 4, 6, 7, 16, 17, 18, 19 don't display
properly for environments A) and B). Dan
#1 - Using \bfoo\b syntax:
s:(.*)\bfoo\b(.*)\n(.*)\bbar\b(.*):\1FOO\2\n\3BAR\4:
#2 - Using \< \> (BackSlash-LT BackSlash-GT) syntax:
s:(.*)\<foo\>(.*)\n(.*)\<bar\>(.*):\1FOO\2\n\3BAR\4:
#3 Using HTML entities for GT and LT:
s:(.*)\&lt;foo\&gt;(.*)\n(.*)\&lt;bar\&gt;(.*):\1FOO\2\n\3BAR\4:
#4 \<foo\> is foo surrounded by BackSlash-LT and BackSlash-GT.
#5 \&lt;foo\&gt; is same as #4, with HTML entities for LT and GT.
#6 &bull; is HTML bull entity. Displays as bullet or as literal?
#7 &nbsp; is HTML nbsp entity. Shows space or literal?
#8 < is LT all by its lonesome.
#9 &lt; is also LT, but as HTML entity.
#10 > is GT all by its lonesome.
#11 &gt; is also GT, but as HTML entity.
#12 \< is BackSlash-LT all by its lonesome.
#13 \&lt; is also BackSlash-LT, but LT as HTML entity.
#14 \> is BackSlash-GT all by its lonesome.
#15 \&gt; is also BackSlash-GT, but GT as HTML entity.
#16 <strong>Hi</strong> is Hi surrounded by "strong" tags.
#17 <i>Hello</i> is Hello surrounded by "italic" tags.
#18 <br> is HTML br tag. Shows as literal?
#19 <p> is HTML p tag. Shows as literal?
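For anyone who wants to confirm that tests #1 and #2 are interchangeable, so that the \b workaround loses nothing, here is a minimal sketch. It assumes GNU sed (\b, \< and \> are GNU extensions, and -E, or -r on older versions, enables the unescaped groups). The N command that joins the two lines and the sample input are my own additions for illustration; they are not part of the tests above.

```shell
# Hypothetical two-line sample input, made up for this check.
input='a foo b
c bar d'

# Test #1 style: \b on both sides of each word.
with_b=$(printf '%s\n' "$input" |
  sed -E 'N;s:(.*)\bfoo\b(.*)\n(.*)\bbar\b(.*):\1FOO\2\n\3BAR\4:')

# Test #2 style: \< and \> as word delimiters.
with_ltgt=$(printf '%s\n' "$input" |
  sed -E 'N;s:(.*)\<foo\>(.*)\n(.*)\<bar\>(.*):\1FOO\2\n\3BAR\4:')

# Show one result, then report whether both forms agree.
printf '%s\n' "$with_b"
[ "$with_b" = "$with_ltgt" ] && echo "same result"
```

If both forms give the same output on your sed, posting the \b form avoids the LT and GT characters altogether.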
Well, I viewed it with a MUA, not using webmail.
Screenshot here: Loading Image...
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Hope this helps.
--
D.