Discussion:
to remove leading and trailing blank lines
contact2vikas
2013-07-01 10:24:43 UTC
Hello,
I want to remove/delete leading and trailing blank lines from a file.

Vikas
Davide Brini
2013-07-01 10:38:39 UTC
On Mon, 01 Jul 2013 10:24:43 -0000, "contact2vikas"
Post by contact2vikas
Hello,
I want to remove/delete leading and trailing blank lines from a file.
Try this:

sed '

/./,$b ok

d

:ok

/^\n*$/{
$d
N
b ok
}

' file
--
D.
Davide Brini
2013-07-01 11:12:25 UTC
On Mon, 01 Jul 2013 10:53:37 -0000, "contact2vikas"
Post by contact2vikas
Thanks Dave,
It's working fine now.
I want to avoid creating a sed file. Is it possible to give me one or two
commands that I can run from the command line?
Please keep replies on the list. The command I used can be directly used on
the command line, there's no need to create a file.
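
For readers who want it on a single command line, here is a minimal sketch of the same script split into -e fragments. It assumes a POSIX-style shell and sed; the function name trim_blank_edges is made up for illustration and is not from the thread.

```shell
# The script from the earlier post, one -e fragment per sed command.
# trim_blank_edges is a hypothetical wrapper name.
trim_blank_edges() {
    sed -e '/./,$b ok' -e 'd' -e ':ok' \
        -e '/^\n*$/{' -e '$d' -e 'N' -e 'b ok' -e '}' "$@"
}

# Example: two leading and two trailing empty lines around the text.
printf '\n\nnon-blank 1\n\nnon-blank 2\n\n\n' | trim_blank_edges
```

The empty line between the two text lines is kept; only the empty lines at the edges are dropped.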
--
D.
Daniel
2013-07-02 21:29:20 UTC
Here is another way to remove leading and trailing blank lines:

--------------------------------
$ cat test.txt


non-blank 1
non-blank 2

non-blank 3


--------------------------------


--------------------------------
$ sed -n '/[^ ]/,$ p' test.txt | tac | sed -n '/[^ ]/,$ p' | tac
non-blank 1
non-blank 2

non-blank 3
--------------------------------

This treats a line as non-blank if it contains any character other than a space; empty lines and lines made up only of spaces count as blank.

Daniel

PS - Unfortunately, the yahoo groups web interface defaults to send the message to the previous poster, not to sed-users group. At least that's how it behaves when I use it. That's probably what happened.
prem
2013-07-01 17:29:58 UTC
Try this:
echo -e " cccc\n dddddd \neeeeeee \nfffff" | sed -e '/^ /d' -e '/ $/d'
 
Alexander Krasnukhin
2013-07-01 17:36:42 UTC
It won't remove leading and trailing blank lines.

$ echo -e "\n\n\ncafebabe\n\n\n" | sed -e '/^ /d' -e '/ $/d'
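
A quick way to check this, sketched with printf instead of echo -e (whose behaviour differs between shells). The helper name sample is made up for illustration. Both counts come out as 7, because the two patterns only match lines that start or end with a space, and empty lines do neither.

```shell
# Same data as the echo above: 3 empty lines, one text line, 3 empty lines.
sample() { printf '\n\n\ncafebabe\n\n\n\n'; }

sample | grep -c ''                               # line count before
sample | sed -e '/^ /d' -e '/ $/d' | grep -c ''   # line count after
```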

--
Regards,
Alexander
prem
2013-07-02 07:42:05 UTC
The requirement is as follows:
=====================================================================
Hello,
I want to remove/delete leading and trailing blank lines from a file.
Vikas
=====================================================================
"leading and trailing blank lines", not blank lines.
If that is what you want, this should resolve it:
echo -e "  ccc\n  dddd  \neeeee  \nfff\n \n\n\n\n" | sed -e '/^ /d' -e '/ $/d' -e '/^ *$/d'
 
prem
2013-07-08 04:48:44 UTC
Please read the requirement.

"I want to remove/delete leading and trailing blank lines from a file."
The LINES need to be removed, not the space.
In your example, it does not have any leading or trailing space.
 
Also, see my recent email, which deletes blank lines as well, as requested later.
RAKESH
2013-07-05 11:37:25 UTC
Note: blank line => /^$/ i.e., one which has no characters, not even
spaces &/or TABs in it.

sed -e '
# delete all leading blank lines
/./,$!d
/./b
# accumulate a bunch of consecutive blank lines,
# then delete that bunch only if you run out of lines.
:a
$d
N
/\n\n*$/ba
' yourfile

HTH
Sven Guckes
2013-07-05 13:19:35 UTC
Post by RAKESH
Note: blank line => /^$/ i.e.,
one which has no characters,
not even spaces &/or TABs in it.
well, a line with *no* characters in it i'd call "empty".

so - here's my definition of "blank lines":
a "blank line" contains only "blank characters";
"blank characters" are characters
which do not show up with any dots
(unless a font gives them any dots).

an empty line is a special case of a blank line
because all characters contained are "blank".
it is certainly valid for every contained element
as there are none for which it requires such quality.
(okay.. a little bit of set logic here..)

you may define a subset of these blank characters
to be valid only, eg spaces and tabs.

in the editor vim you can use the pattern "\s" for these.
(i think the notation of '\s' has been taken from
"perl compatible regular expressions" aka PCRE.)

so "blank lines" match the pattern "^\s*$"
which also matches with empty lines.

too bad we don't have "anchors" for
"start of data" and "end of data".
these would probably make the
given problem a breeze.
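
A sketch of how this definition could be applied in sed without \s, assuming a sed that supports POSIX bracket classes such as [[:blank:]] (space and tab). It adapts the loop RAKESH posted earlier in the thread; the helper name trim_blank_lines is hypothetical.

```shell
# Trims leading and trailing lines that are empty or hold only spaces/tabs.
# trim_blank_lines is a made-up name for this illustration.
trim_blank_lines() {
    sed -e '/[^[:blank:]]/,$!d' \
        -e '/[^[:blank:]]/b' \
        -e ':a' -e '$d' -e 'N' \
        -e '/\n[[:blank:]]*$/ba' "$@"
}

# Example: space-only and tab-only lines around and between two text lines.
printf ' \n\t\n  a\n\t\nb \n \n\t\n' | trim_blank_lines
```

The tab-only line between the two text lines survives, and so does the leading whitespace of the first text line.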

Sven
--
$ man 7 regex
Standard character class names are:
alnum alpha blank cntrl digit graph
lower print punct space upper xdigit
RAKESH
2013-07-06 04:45:24 UTC
Thanks for the detailed distinction between blank lines and empty ones. I was aware of the difference (hence I stated my assumption up front) and chose the /^$/ variety because it doesn't get in the way of the sed code's legibility.

Recapitulating (i.e., the /^$/-based version):
#-----------------------------
/./,$!d
/./b
:a
$d
N
/\n\n*$/ba
#-----------------------------

Now this is how the above would appear using the /^\s*$/ approach.
Note: replace \t with a literal TAB, since not every sed understands \t
(GNU sed does, but POSIX does not require it), and there's no other way to show a TAB in here.
#------------------------------------------
/[^ \t]/,$!d
/[^ \t/b
:a
$d
N
h
s/.*\n//
/^[ \t]*$/{
g;ba
}
g
#------------------------------------------

Notice the increase in complexity in this case, which is due to the fact that "sed" doesn't come armed with the \s and \S regexes; otherwise this would have been a breeze, as you yourself state.

I chose the code I gave just so that the OP gets an idea of a sed-based flow. Once the OP has the hang of that, they can upgrade to the /^\s*$/ approach if needed.

Rakesh
RAKESH
2013-07-06 14:00:40 UTC
After posting the reply I saw the typo (line 2), and while fixing it I rewrote the regex.
sed -e '
/[^ \t]/,$!d
/[^ \t]/b
:a
$d
N
/\n[ \t]*$/ba
' yourfile
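
Since \t inside a bracket expression is not portable across sed implementations, one option is to let the shell supply the literal TAB. A minimal sketch, assuming a POSIX shell; the variable TAB and the function name trim_space_tab_lines are illustrative only and not from the thread.

```shell
# TAB holds one literal tab character (command substitution only strips
# trailing newlines, so the tab survives).
TAB=$(printf '\t')

# Same script as above, with each \t replaced by the shell-supplied TAB.
trim_space_tab_lines() {
    sed -e '/[^ '"$TAB"']/,$!d' \
        -e '/[^ '"$TAB"']/b' \
        -e ':a' -e '$d' -e 'N' \
        -e '/\n[ '"$TAB"']*$/ba' "$@"
}

printf ' \n\t\n  a\n\t\nb \n \n\t\n' | trim_space_tab_lines
```

The sed fragments stay in single quotes and only the TAB is spliced in with double quotes, so the shell never touches the $ and ! characters in the script.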
Daniel
2013-07-06 18:13:50 UTC
That is really great.

The first part (remove leading blank lines) is clear to me: "Find the range from the first non-blank line to the end of the stream. Delete the opposite range (any leading blank lines)".

The second part (removes trailing blank lines) is less clear, even after using sedsed some. It works in all the test cases I tried. But can you explain a little how / why it works?

Thanks,
Daniel
Logan Palanisamy
2013-07-07 01:44:00 UTC
Another way to solve this problem is to use "tac" twice. tac lists the contents in reverse line order (the opposite of cat).


$ sed -e '/[^ \t]/,$!d' input_file.txt | tac | sed '/[^ \t]/,$!d' | tac

Steps: Trim the leading blank lines, pipe the output to tac to reverse the lines, trim the leading blank lines which used to be the trailing blank lines, pipe the output to tac again to get the original order.
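
The steps above can be sketched as a small shell function. This assumes GNU tac is available (it is part of coreutils; BSD systems often offer tail -r instead) and uses [[:blank:]] in place of [ \t] so that no literal TAB is needed. The name trim_with_tac is hypothetical.

```shell
# Trim leading blank lines, reverse, trim again, reverse back.
# Assumes GNU tac; trim_with_tac is a made-up name for this illustration.
trim_with_tac() {
    sed -e '/[^[:blank:]]/,$!d' "$@" | tac | sed -e '/[^[:blank:]]/,$!d' | tac
}

printf ' \n\t\n  a\n\t\nb \n \n\t\n' | trim_with_tac
```

The price of the simplicity is that tac has to read all of its input before it can print anything, whereas the sed-only loop only holds one run of blank lines at a time.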


Daniel
2013-07-07 04:24:47 UTC
I agree, that's simpler. Actually, see my previous post two posts back, which uses basically the same "tacky" solution. :) But I think it's worthwhile learning how the sed-only solution works, or how the author dreamed it up.
RAKESH
2013-07-07 07:38:56 UTC
We could also do it using perl's -0777 option, which slurps in
the whole file; we then proceed to
chop off the leading & trailing blank lines, as if chopping the top and tail off a carrot.

perl -0777wple 's/\A\s+//ms;s/\s+\z//ms' yourfile

But since it's a sed forum ... sed rules, at least here ;-)

-Rakesh
Ruud H.G. van Tol
2013-07-07 07:59:42 UTC
Post by RAKESH
We could also do it making use of the -0 777 option of perl where
the whole file is slurped in, & then we proceed to
chop off the leading & trailing blank lines, like as if chopping the front/tails of a carrot.
perl -0777wple 's/\A\s+//ms;s/\s+\z//ms' yourfile
That has issues:

- the 'ms' modifiers are superfluous
- the starting whitespace of the first non-whitespace line is removed
- the newline after the last non-whitespace line is removed
--
Ruud
RAKESH
2013-07-08 10:46:49 UTC
Post by Ruud H.G. van Tol
Post by RAKESH
We could also do it making use of the -0 777 option of perl where
the whole file is slurped in, & then we proceed to
chop off the leading & trailing blank lines, like as if chopping the front/tails of a carrot.
perl -0777wple 's/\A\s+//ms;s/\s+\z//ms' yourfile
- the 'ms' modifiers are superfluous
- the starting whitespace of the first non-whitespace line is removed
- the newline after the last non-whitespace line is removed
That was a very good catch (especially point #2);
I was not checking the code before posting.

perl -0777wple 's/\A\s*\n(?:\s*$)*//m;s/(?:^\s*$)*\z//m' yourfile

Or this:

perl -0777wple '1 while s/\A\s*\n//m;1 while s/^\s*\z//m' yourfile

As for the 'ms' modifiers, I have a habit of putting them at the end of every regex I write so that I have to remember less about what each one does. Also, the "-l" option is redundant here, as it's being set to the 0777 value in this case.

-Rakesh
RAKESH
2013-07-07 07:26:11 UTC
I imagine the non-blank lines to be like islands in a sea of blank
lines. Then how we deal with going from island to island is what the sed code does. Here is the "verbo-sed" version of the sed code presented earlier.

Notes:
blank line => /^[ \t]*$/
non-blank line => /[^ \t]/

sed -e '

# delete all leading blank lines
/[^ \t]/,$!d

# non-blanks to be displayed
/[^ \t]/b

# as soon as you hit a blank line which is not leading
# (since all the leading blanks have already been taken care of above)
# start accumulating them. Then in that process one of 2 things can
# happen: i) we either run out of lines => it was the bunch of
# trailing blank lines, which need to be deleted per specs. Or,
# ii) we hit a non-blank line, meaning we need to output that bunch
# and need to restart. Now, it's important to realize that when this
# happens the first line of code will NOT delete anything, since its
# range (first non-blank line through $) is already active, so the
# negated !d no longer applies to any later line.
# the command duo: N; /\n[ \t]*$/bloop => after adding the next line
# into the pattern space, check whether the just added line was
# blank. If it was, then just loop back to accumulate more,
# otherwise, display the pattern space & start all over.
# the below can be looked upon as the sed version of a do-while loop.
:loop
$d
N
/\n[ \t]*$/bloop

' yourfile
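
The same flow can also be written out as a plain shell loop, which may make the role of the accumulated blank lines easier to see. This is only an illustration of the logic under the same definition of blank (nothing but spaces and tabs), not a replacement for the sed script; the function and variable names are made up here.

```shell
# Shell rendering of the "islands" idea: hold back blank lines until we know
# whether more text follows. All names are hypothetical.
trim_islands() {
    seen_text=0   # becomes 1 once the first non-blank line has been printed
    pending=''    # blank lines collected since the last non-blank line
    nl='
'
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            *[![:blank:]]*)
                # reached the next island: release the held-back blank lines
                printf '%s' "$pending"
                printf '%s\n' "$line"
                pending=''
                seen_text=1
                ;;
            *)
                # blank line: ignore it before the first island, else hold it
                if [ "$seen_text" -eq 1 ]; then
                    pending=$pending$line$nl
                fi
                ;;
        esac
    done
    # anything still in $pending was trailing, so it is never printed
}

printf ' \n\t\n  a\n\t\nb \n \n\t\n' | trim_islands
```

In the sed version the pattern space plays the part of pending, and $d is the step that throws the held-back lines away at end of input.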

HTH

Rakesh
Daniel
2013-07-08 21:13:46 UTC
Thank you for the detailed "verbo-sed" explanation. It's very helpful.
Post by RAKESH
I imagine the non-blank lines to be like islands in a sea of blank lines. How we deal with going from island to island is what the sed code does. Here is the "verbo-sed" version of the sed code presented earlier.
blank line => /^[ \t]*$/
non-blank line => /[^ \t]/
sed -e '
# delete all leading blank lines
/[^ \t]/,$!d
# non-blanks to be displayed
/[^ \t]/b
# as soon as you hit a blank line which is not leading
# (since all the leading blanks have already been taken care of above)
# start accumulating them. Then in that process one of 2 things can
# happen: i) we either run out of lines => it was the bunch of
# trailing blank lines , which need to be deleted per specs. Or,
# ii) we hit a non-blank line, meaning we need to output that bunch
# and need to restart. Now, it is important to realize that when this
# happens sed will NOT use the first line of code, since it's a range
# operator which has already been turned OFF after all the leading
# blanks have been deleted.
# the command duo: N; /\n[ \t]*$/ba => after adding the next line
# into the pattern space, check whether the just added line was
# blank. If it was, then just loop back to accumulate more,
# otherwise, display the pattern space & start all over.
# the below can be looked upon as the sed version of a do-while loop.
:loop
$d
N
/\n[ \t]*$/bloop
' yourfile
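To watch the island-to-island loop on a concrete input, here is a small sketch. It assumes a sed that supports the POSIX [[:blank:]] class, which stands in for [ \t] so that no literal TAB is needed; the function name strip_outer_blanks and the sample data are hypothetical.

```shell
# strip_outer_blanks is a hypothetical name; [[:blank:]] stands in for [ \t].
strip_outer_blanks() {
  sed -e '
/[^[:blank:]]/,$!d
/[^[:blank:]]/b
:loop
$d
N
/\n[[:blank:]]*$/bloop
'
}

# two islands separated by two blank lines, with blank lines on both ends
printf '\n \nisland 1\n\n\t\nisland 2\n \n\n' | strip_outer_blanks
```

The two blank lines between the islands are accumulated and then printed together with "island 2", while the trailing pair is accumulated until the last line is reached and then deleted by $d.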
HTH
Rakesh
Post by Daniel
That is really great.
The first part (remove leading blank lines) is clear to me: "Find the range from the first non-blank line to the end of the stream. Delete the opposite range (any leading blank lines)".
The second part (removes trailing blank lines) is less clear, even after using sedsed some. It works in all the test cases I tried. But can you explain a little how / why it works?
Thanks,
Daniel
Post by RAKESH
After posting the reply I saw the typo (line 2), and then while fixing it rewrote the regex.
sed -e '
/[^ \t]/,$!d
/[^ \t]/b
:a
$d
N
/\n[ \t]*$/ba
' yourfile
Post by RAKESH
Thanks for the detailed distinctions between blank lines and empty ones. I was aware of the differences (hence I mentioned my assumptions upfront), and chose to use the /^$/ variety since it doesn't get in the way of the sed code in terms of legibility.
Recapitulating (i.e., /^$/ based):
#-----------------------------
/./,$!d
/./b
:a
$d
N
/\n\n*$/ba
#-----------------------------
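A quick sketch of what the /^$/ based version does and does not remove. The function name trim_empty_lines and the sample input are made up for illustration; the sed commands are the six lines above.

```shell
# trim_empty_lines is a hypothetical wrapper around the /^$/ based script.
trim_empty_lines() {
  sed -e '
/./,$!d
/./b
:a
$d
N
/\n\n*$/ba
'
}

# sample: 2 empty lines, foo, 2 empty lines, bar, a space-only line, 1 empty line
printf '\n\nfoo\n\n\nbar\n \n\n' | trim_empty_lines
```

Here the leading empty lines and the final empty line are dropped, but the space-only line after bar is kept, because under the /^$/ assumption a line holding a space counts as non-blank.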