Sven Guckes maillists-yahoo@guckes.net [sed-users]
2016-01-21 04:23:42 UTC
yet another problem.. "comb-quoted" text.
this happens on replies to messages where the lines exceed 72 chars.
the mailer cuts off the words which exceed that limit and then puts
them onto the next line. usually this results in short lines.
(besides, this happens a lot with gmail users using the browser.)
I ****'* ****** ********* *** ****'* ******* *** ******* ** *** ******
*******, *** ** ** ****'* *** *** ******** ** ******* *** **** **** ***
**
***** ** ** **** ******* *** ***** ****** *** ** ** *** *******, **
*****
*** **** *****'** **** **** *********.
* ***** ** ***** ** ****** ** ***** *** * ************ ****** ******
******** *** ********* ************* (**** * "***** *** ** *****"
********,
********* ** *********; *** * ******* ** ******* ** ** ****).
as you can see: long and short lines alternate. looks like a comb. ;)
so - how to fix that?
basically, a simple algorithm could work like this:
if the current line is quoted and is shorter than M chars
and the previous line is quoted and is longer than N chars
then strip the quote prefix and join the line onto the previous one.
do you see an easy way to do this within sed?
or is this already a better job for awk
(easier for branches and comparisons)?
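
just to pin down the rule before porting it to sed or awk, here is a
rough sketch in python. everything specific in it is an assumption:
the name uncomb is made up, short_max and long_min stand in for M and N
with guessed defaults, and a quote prefix is taken to be one or more ">"
markers, each optionally followed by a space. lengths are measured on
the original input lines, so a joined line never counts as "long".

```python
import re

# assumption: a quoted line starts with one or more ">" markers,
# each optionally followed by a space or tab
QUOTE = re.compile(r"^((?:>[ \t]?)+)(.*)$")


def uncomb(lines, short_max=20, long_min=60):
    """Join short quoted lines onto the long quoted line before them.

    short_max plays the role of M, long_min the role of N.
    Lines are expected without trailing newlines.
    """
    out = []
    prev_len = 0    # length of the previous input line, 0 if it was unquoted
    prev_depth = 0  # quote depth of the previous input line, 0 if unquoted
    for line in lines:
        m = QUOTE.match(line)
        if not m:
            # unquoted line: pass through and reset the state
            out.append(line)
            prev_len, prev_depth = 0, 0
            continue
        depth = m.group(1).count(">")
        text = m.group(2).strip()
        is_tail = (
            text != ""                   # keep empty quoted lines as paragraph breaks
            and depth == prev_depth      # same quote level as the line before
            and len(line) < short_max    # current line is short
            and prev_len > long_min      # previous line was long
        )
        if is_tail:
            # drop the quote prefix and append the text to the previous line
            out[-1] = out[-1].rstrip() + " " + text
        else:
            out.append(line)
        prev_len, prev_depth = len(line), depth
    return out
```

feeding it a 72-char quoted line followed by "> tail" should give back a
single quoted line ending in " tail". the thresholds probably need
tuning per mailer.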
ps: yes, i am aware of the t-prot tool
http://www.escape.de/~tolot/mutt/
but this is sed country, right?
Sven