Discussion:
Suggest new fixed flag for s command?
Daniel
2013-05-08 09:27:41 UTC
I sometimes run into a situation where the literal strings I want to replace contain metacharacters, and I want to replace them with some other similarly odd string.

For example, suppose I want to replace the literal string "jj*hh" with the literal string "foo\1bar".

To make it work, I would have to use escape characters:

$ echo 'jj*hh' | sed 's/jj\*hh/foo\\1bar/'
foo\1bar

I don't mind doing this for one or two strings, but sometimes I have a lot of strings, and don't want to escape them all by hand or write some complex script procedure.

What if the s command could have an f ("fixed") flag, similar to grep's -F option, so that the following would work:

$ echo 'jj*hh' | sed 's/jj*hh/foo\1bar/f'
foo\1bar

The f flag would tell s to ignore all metacharacters, to just interpret A and B in s/A/B/f as fixed strings.

Maybe there is some simpler way to do this I am overlooking? Anyway, just a suggestion.
Aurelio Jargas
2013-05-08 16:22:13 UTC
I like that suggestion. I have also needed that behavior many times. And
not just in s///, but in addressing in general.

It would be nice to have a generic way of specifying a fixed string in the
address. Borrowing your f flag idea, it could be something like:

/foo/f, /bar/f { ... }

But this can be confused with a possible f command.

Anyway, regardless of the chosen syntax, I would like to have a fixed
string flag to use in sed addresses and s///. Not a global command line
flag as grep, but an individual flag to use just in the required places.
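
Until something like that exists, one workaround for addresses is to escape the literal strings before building the script. A minimal sketch, assuming POSIX basic regular expressions (sed without -E/-r), `/` as the address delimiter, and strings without embedded newlines; `esc_pat` and `print_between` are hypothetical helper names, not sed features:

```shell
#!/bin/sh
# Hypothetical helper: escape the BRE metacharacters and the "/" delimiter.
esc_pat() { printf '%s\n' "$1" | sed 's|[[\\.*^$/]|\\&|g'; }

# print_between START END: print each range of lines that begins at a line
# containing the literal START and ends at the next line containing literal END.
print_between() { sed -n "/$(esc_pat "$1")/,/$(esc_pat "$2")/p"; }

printf '%s\n' 'x' 'a.b*' 'mid' '[end]' 'y' | print_between 'a.b*' '[end]'
```

With the sample input this should print the three lines a.b*, mid and [end].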
Aurelio | www.aurelio.net | @oreio


Thierry Blanc
2013-05-08 18:20:28 UTC
Very good idea.

To avoid escaping strings by hand, you can preprocess them, something like:

echo 'jj*hh.?' | sed 's|[.?*]|\\&|g'
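
Extending the preprocessing idea to both sides of s/A/B/, here is a minimal sketch. It assumes POSIX basic regular expressions (sed without -E/-r), `/` as the delimiter of the final command, and strings without embedded newlines. The helper names `esc_pat`, `esc_rep` and `fixed_sub` are made up for illustration. The bracket set is wider than `[.?*]` because in a BRE the characters `[`, `\`, `^` and `$` can also be special, while `?` is an ordinary character there and is left alone.

```shell
#!/bin/sh
# Hypothetical helpers (not part of sed): make arbitrary text safe for s/A/B/.

# Escape the characters that can be special in a POSIX BRE, plus the "/" delimiter.
esc_pat() { printf '%s\n' "$1" | sed 's|[[\\.*^$/]|\\&|g'; }

# Escape the characters that are special in the replacement: "\", "&" and "/".
esc_rep() { printf '%s\n' "$1" | sed 's|[\\&/]|\\&|g'; }

# fixed_sub FROM TO: replace the first literal FROM on each line with literal TO.
fixed_sub() { sed "s/$(esc_pat "$1")/$(esc_rep "$2")/"; }

echo 'jj*hh' | fixed_sub 'jj*hh' 'foo\1bar'
```

For the example from the original post this should print foo\1bar, the same result as the hand-escaped command.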
Pedro Izecksohn
2013-05-09 06:48:37 UTC
The character f may mean fixed, but it may also mean flag. I thought about a command that would clear the state of the substitution flag, but as I'm a minimalist, I'm against any new command that is not strictly necessary, where necessary means not replaceable. Why? Every new command makes the switch loop longer; IOW, it makes sed run slower.

A much more useful command would take the hold space and compile it as a regular expression, to be used by the next empty regular expression, that is, //. The character for such a command: I'd use C, for Compile.

Brandon W Maister
2013-05-09 13:56:38 UTC
FWIW in every language that I know of that implements this feature it's
implemented as [raw strings] with an 'r' (sometimes 'R') flag. "Raw regular
expression" doesn't really make sense, but the term and flag will be
recognized by programmers.

I, too, have occasionally wanted this.

[raw strings] https://www.google.com/search?q=raw+string

brandon
Jim Hill
2013-05-08 19:37:26 UTC
Would vim's \v and \V flags be worth implementing? I've been a fan of those
for a very long time.


Thierry Blanc
2013-05-09 08:40:02 UTC
Can sed negate matching an atom, i.e. multiple letters?

Example: match any line starting with foo, but NOT ending with a dot
followed by zero or more spaces

(end of line marked with $)

foo rhabarber etc .$
foo rhabarber etc . $
foo rhabarber etc$
foo rhabarber etc $


only match the third and fourth lines.

atom is:
(\. *)

negate atom???
sed '/foo.*^(\. *)$/'   # does not work, of course: ^ matches the beginning of the line
Davide Brini
2013-05-09 08:52:15 UTC
Why not do

sed '/^foo/{ /\. *$/b; process matching lines here; }'
--
D.
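
A runnable version of this pattern, as a sketch: the `s/$/  <= matched/` command is only a stand-in for "process matching lines here", the function name `mark_lines` is made up, and the sample input is the four lines from the question.

```shell
#!/bin/sh
# Tag lines that start with "foo" but do not end in a dot plus optional spaces.
mark_lines() {
    sed -e '/^foo/{' \
        -e '/\. *$/b' \
        -e 's/$/  <= matched/' \
        -e '}'
}

printf '%s\n' 'foo rhabarber etc .' 'foo rhabarber etc . ' \
              'foo rhabarber etc' 'foo rhabarber etc ' | mark_lines
```

Only the third and fourth lines should come out tagged; the `b` with no label skips the rest of the script for lines that end in a dot followed by spaces.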
Thierry Blanc
2013-05-09 08:55:51 UTC
yes, this is what I just found out, but it is far less elegant ...
Jim Hill
2013-05-09 09:15:06 UTC
Yes - /\. *$/! { /^foo/ d; }


Thierry Blanc
2013-05-09 11:42:58 UTC
Well, the real case was HTML parsing, and the match was not at the end of the line.

here is the line where I wanted the do-not-match-dot-followed-by-spaces
before the '<o:p>'

s/.*<b style="mso-bidi-font-weight: normal.*>([^<>][^<>]*)*[HERE NO DOT FOLLOWED BY ZERO OR MORE SPACES]*<o:p>.*/\\subsection*{\1}/

a not-the-following-atom syntax would be very handy ...


here is the solution I eventually used:

# subsection insert
/<b style="mso-bidi-font-weight: normal/{
/><o:p>/bsubsection
/\. *<o:p>/bsubsection
s/.*<b style="mso-bidi-font-weight: normal.*>([^<>][^<>]*)<o:p>.*/\\subsection*{\1}/
}
:subsection
Jim Hill
2013-05-09 17:30:52 UTC
To handle patterns like that I sub in newlines or nulls and character-set
them:

h;s/\. */\0/g
/first[^\0]second/{d}
g
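
The same idea applied to the `<o:p>` case from earlier in the thread, as a sketch only. It uses a newline as the single-character marker, since a line read normally never contains one, and it assumes GNU sed, where `\n` is accepted both in the replacement and inside a bracket expression. The function name `no_dot_before_tag` and the sample lines are made up.

```shell
#!/bin/sh
# Sketch for GNU sed: "\n" in the replacement and inside [...] is GNU behavior.
# Print lines where the text right before "<o:p>" is not a dot plus optional spaces.
no_dot_before_tag() {
    # h: keep the original line; s: turn every "dot + spaces" into one marker;
    # then test that the character before <o:p> is neither the marker nor ">";
    # g: restore the original line before printing it.
    sed -n 'h; s/\. */\n/g; /[^\n>]<o:p>/{ g; p; }'
}

printf '%s\n' '<b>Title one<o:p>' '<b>A sentence. <o:p>' '<b>Other.<o:p>' |
    no_dot_before_tag
```

Of the three sample lines, only the first should be printed. Once the multi-character atom has been collapsed into one marker character, an ordinary negated bracket expression does the job of "not the following atom".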