Discussion:
A Simple Text-Region Deletion Question . . .
'Marcus L. Thompson' marcus@colisp.com [sed-users]
2014-10-04 00:44:36 UTC
Greetings to all.


I have come across what seems to be a quite simple problem involving an
application of sed in the context of text-block deletion. Here's the
hypothetical text block in question:


<property name="name">
...
</property>
</property>




Now, deleting all from <property name="name"> to the _/first/_
</property> tag is understandable enough:


Code:


sed -i '/<property name="name">*$/,/<\/property>*$/d' /file.xml




However, incorporating the 2nd </property> tag in the deletion sweep is
a bit tougher; and just that much outside my new skillset range ;o)
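
To make the gap concrete, here is a minimal sketch of what the range form above does to the sample block. It drops -i and feeds the lines through a pipe so no file is touched; the function name delete_to_first_close and the "before"/"after" filler lines are made up for illustration.

```shell
# Hypothetical helper: the same range address as in the one-liner above,
# but reading stdin instead of editing /file.xml in place.
delete_to_first_close() {
  sed '/<property name="name">*$/,/<\/property>*$/d'
}

out=$(printf '%s\n' \
  'before' \
  '<property name="name">' \
  '...' \
  '</property>' \
  '</property>' \
  'after' | delete_to_first_close)

# The range closes at the first end tag, so the second one is left behind.
printf '%s\n' "$out"
```

This should print "before", a lone </property>, and "after": the range address has no way to count end tags.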


Any ideas as to how one might amend this "one-liner" to make it all work?


Thanks!




Ronaldo Ferreira de Lima jimmy.tty@gmail.com [sed-users]
2014-10-04 22:13:49 UTC
Greetings Marcus,
You might try:


$ sed '/<property name="name">/{:loop;/<\/property>.*<\/property>/{s/.*//;b;};N;b loop;};' file.xml


But this is not a good approach to editing XML files. Have a look at
XML-aware command-line tools like xmllint or xsh.
'Marcus L. Thompson' marcus@colisp.com [sed-users]
2014-10-05 22:12:09 UTC
Thank you, Ronaldo, for your help in this small matter.

I've been struggling with just how to expand my usage of sed beyond the
rudimentary s/a/b/ type implementation; and this was a great excuse for
a question ;)

Syntax formation seems to be the biggest obstacle for me with regard to
sed. Thank you again for the working snippet to work through & learn
from...

Cheers!


-Marcus
Daniel Goldman dgoldman@ehdp.com [sed-users]
2014-10-06 06:24:19 UTC
I don't know why one would want to delete that kind of block, since it's
not symmetrical. But you must have your reasons, and it's better to just
answer the question, as Ronaldo did. And as you say, it led to a great
example to work through. sed can do an incredible amount in one line.
I'd be interested in how Ronaldo came up with the solution, his
thinking process, if he cares to explain.

FWIW, here is a variation of Ronaldo's solution that is perhaps a little
better and simpler:

$ cat file.xml
Stuff that comes before
More stuff that comes before
<property name="name">
...
</property>
</property>
Stuff that comes after
More stuff that comes after
<property name="name">
...
</property>
</property>
Stuff that comes after
More stuff that comes after

$ sed '/<property name="name">/{:loop /<\/property>.*<\/property>/d; N; b loop}' file.xml
Stuff that comes before
More stuff that comes before
Stuff that comes after
More stuff that comes after
Stuff that comes after
More stuff that comes after

BTW, in the orig post, I think you meant to say #1, not #2?
#1 <property name="name">.*$
#2 <property name="name">*$
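
A quick way to see the difference between the two patterns, as a sketch: >*$ means "zero or more > characters, then end of line", while >.*$ allows anything after the tag. The test line with a trailing comment is invented for the demonstration.

```shell
# Invented test line: a start tag followed by more text on the same line.
line='<property name="name"> <!-- trailing comment -->'

# Pattern #1: ">" followed by anything up to the end of the line.
with_dot=$(printf '%s\n' "$line" | sed -n '/<property name="name">.*$/p')

# Pattern #2: zero or more ">" and then the end of the line must follow directly.
without_dot=$(printf '%s\n' "$line" | sed -n '/<property name="name">*$/p')

printf 'pattern 1 matched: [%s]\n' "$with_dot"
printf 'pattern 2 matched: [%s]\n' "$without_dot"
```

On the sample from the original post both patterns behave the same, since nothing follows the > on that line; they only diverge once the tag is followed by other text.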

Daniel
'Marcus L. Thompson' marcus@colisp.com [sed-users]
2014-10-06 22:03:50 UTC
Thank you, Daniel. And, as you mentioned, sed really can do an amazing
amount of work as a one-liner; a sharp tool for the job...

Of note, the example was nothing but a strawman; and perhaps not even a
particularly good one. Just thought folks could relate to something
like this; hence the example's form.

Hope the group wouldn't mind if I came back again sometime with another
question. In any event, I'd rather take first recourse to sed for quick
one-line automated filework than implementing something significantly
more massive on balance.

Thanks again --


Marcus
Ronaldo Ferreira de Lima jimmy.tty@gmail.com [sed-users]
2014-10-06 22:53:39 UTC
Greetings Daniel,
Post by Daniel Goldman ***@ehdp.com [sed-users]
> I don't know why one would want to delete that kind of block, since it's
> not symmetrical. But you must have your reasons, it's better to just
> answer the question, as Ronaldo did. And as you say, it led to a great
> example to work through. sed can do an incredible amount in one line.
> I'd be interested in how Ronaldo came up with the solution, kind of his
> thinking process, if he cares to explain.
(Maybe I can't explain very well in English yet...)

I started by thinking about how to get the entire block into the pattern
space, to simplify tests and other manipulations:

/<property name="name">/ {            # on a line matching the start tag, open a block
    :loop;
    /<\/property>.*<\/property>/ {    # once the pattern space holds two end tags
        s/.*//;                       # replace it with nothing (my mistake: this
                                      # still prints an empty line; the d in
                                      # your code fixes that)
        b;                            # skip the rest of the script (not needed
                                      # with d, which ends the cycle itself)
    };
    N;                                # append the next line
    b loop;                           # jump back to the label ":loop"
};
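
As a sketch of the difference between the two endings, the snippet below runs both variants on a shortened sample. The script is split into -e fragments, which sed joins with newlines, on the assumption that this is the most portable way to end a label; the sample function and the variable names are made up for illustration.

```shell
# Hypothetical shortened sample: one block with two end tags.
sample() {
  printf '%s\n' 'before' '<property name="name">' '...' \
    '</property>' '</property>' 'after'
}

# Variant ending in "s/.*//; b": the emptied pattern space is still
# auto-printed at the end of the cycle, so a blank line remains.
with_s=$(sample | sed -e '/<property name="name">/{' -e ':loop' \
  -e '/<\/property>.*<\/property>/{' -e 's/.*//' -e 'b' -e '}' \
  -e 'N' -e 'b loop' -e '}')

# Variant ending in "d": the pattern space is discarded without printing.
with_d=$(sample | sed -e '/<property name="name">/{' -e ':loop' \
  -e '/<\/property>.*<\/property>/d' \
  -e 'N' -e 'b loop' -e '}')

printf 'with s///:\n%s\n\nwith d:\n%s\n' "$with_s" "$with_d"
```

The first variant should leave an empty line between "before" and "after"; the second should not.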
Post by Daniel Goldman ***@ehdp.com [sed-users]
> FWIW, here is a variation of Ronaldo's solution that is perhaps a little
> $ cat file.xml
> Stuff that comes before
> More stuff that comes before
> <property name="name">
> ...
> </property>
> </property>
> Stuff that comes after
> More stuff that comes after
> <property name="name">
> ...
> </property>
> </property>
> Stuff that comes after
> More stuff that comes after
Good sample.
Post by Daniel Goldman ***@ehdp.com [sed-users]
> $ sed '/<property name="name">/{:loop /<\/property>.*<\/property>/d; N;
> b loop}' file.xml
> [...]
A hypothetical problem with both my solution and yours shows up in this case:

Stuff that comes before
More stuff that comes before
<property name="name">
...
</property>
1
2
<property name="name">
</property>
Stuff that comes after
More stuff that comes after
<property name="name">
...
</property>
</property>
Stuff that comes after
More stuff that comes after

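Our code removes everything from the first start tag through the second
end tag it sees, so the lines "1" and "2" and the second, "symmetric"
block are deleted together with the first one. This kind of thing needs
attention from Marcus. Maybe it isn't a problem.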
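
A minimal sketch of that case, using a shortened version of the input above. The function name delete_block is made up; the script itself is the loop-and-d variant, split into -e fragments for portability.

```shell
# Hypothetical wrapper around the loop-and-d variant.
delete_block() {
  sed -e '/<property name="name">/{' \
      -e ':loop' \
      -e '/<\/property>.*<\/property>/d' \
      -e 'N' \
      -e 'b loop' \
      -e '}'
}

# Two single-end-tag blocks with unrelated lines "1" and "2" between them.
out=$(printf '%s\n' \
  'before' \
  '<property name="name">' \
  '...' \
  '</property>' \
  '1' \
  '2' \
  '<property name="name">' \
  '</property>' \
  'after' | delete_block)

# The loop keeps appending lines until it has seen two end tags,
# so "1" and "2" are swallowed along with both blocks.
printf '%s\n' "$out"
```

Only "before" and "after" should survive, which shows that the script counts end tags and knows nothing about nesting.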

A hypothetical problem with Marcus's solution shows up in this case:

Stuff that comes before
More stuff that comes before
<property name="name">
...
Stuff that comes after
More stuff that comes after

Only the first two lines of this example will be printed, because sed
deletes from the line matching the address regexp
'/<property name="name">*$/' until it finds '/<\/property>*$/' OR
reaches the end of the file. Perhaps this is undesirable behavior.
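
The open-ended range can be reproduced with a short sketch that pipes the example above through the range form (without -i, so nothing on disk is changed):

```shell
# Input with a start tag but no end tag anywhere after it.
out=$(printf '%s\n' \
  'Stuff that comes before' \
  'More stuff that comes before' \
  '<property name="name">' \
  '...' \
  'Stuff that comes after' \
  'More stuff that comes after' |
  sed '/<property name="name">*$/,/<\/property>*$/d')

# The range never finds its closing address, so it runs to the last line.
printf '%s\n' "$out"
```

Everything from the start tag onward is deleted, so it may be worth checking that every start tag has a matching end tag before running the range form with -i.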

[]'s
--
"I don't handle words well
But I manipulate strings well."
------------------------------
http://tecnoveneno.blogspot.com
