Discussion:
sed file processing?
Jude DaShiell
2014-02-14 11:05:12 UTC
Is it within sed's logical capacity to check a file and only do a command
on that file if it has at least one line of text in it? For what I'm
trying to do with a script, if the file has zero lines it should not have
anything done to it.



jude <***@shellworld.net>
Davide Brini
2014-02-14 11:13:43 UTC
On Fri, 14 Feb 2014 06:05:12 -0500 (EST), Jude DaShiell
Post by Jude DaShiell
Is it within sed's logical capacity to check a file and only do a command
on that file if it has at least one line of text in it? For what I'm
trying to do with a script, if the file has zero lines it should not have
anything done to it.
If a file has zero lines sed can't do anything on it anyway, since by
definition sed works on input lines.

$ touch a
$ sed 's/^/foobar/' a
$

Please note that a file that is NOT 0 bytes in size is neither an empty file
nor one with no lines of text.
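
If an explicit guard is still wanted (say, because other commands in the script should also be skipped), a minimal sketch could look like this. It relies on the standard `[ -s FILE ]` test, which is true when the file exists and is larger than 0 bytes; the function name `prefix_if_nonempty` and the demo file names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: run the sed command only if the file is non-empty.
# [ -s FILE ] is true when FILE exists and its size is greater than zero.
prefix_if_nonempty() {
    if [ -s "$1" ]; then
        sed 's/^/foobar/' "$1"
    else
        echo "skipping $1: empty or missing" >&2
    fi
}

# Demo in a scratch directory
dir=$(mktemp -d)
: > "$dir/empty"                   # zero bytes, like "touch a"
printf 'one\ntwo\n' > "$dir/full"

prefix_if_nonempty "$dir/empty"    # nothing on stdout, a note on stderr
prefix_if_nonempty "$dir/full"     # every line gets the foobar prefix
```

Note that `-s` looks at the size in bytes, so a file holding only a newline counts as non-empty.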
--
D.
d***@ehdp.com
2014-02-14 17:48:57 UTC
That was a great question.

As the previous response says, if the file is totally empty, sed tries to read the first line, fails, and exits.

If the file has one line, but no EOL marker, sed reads the line, but somehow figures out NOT to add an EOL marker when it outputs the pattern space.

Here is the normal case with EOL marker:

$ echo "X" | sed "s/X/Y/"
Y
$

Here is the case with EOL marker missing:

$ echo -n "X" | sed "s/X/Y/"
Y$

Do you want sed to take no action if the single line has no EOL marker?
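
If that is the goal, one way to test for a missing EOL marker before calling sed might be the sketch below. The helper name `ends_with_newline` is hypothetical. It uses `printf` rather than `echo -n` for the demo because `echo -n` is not portable across shells.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file's last byte is a newline.
# Command substitution strips trailing newlines, so the captured text
# is empty exactly when that last byte is "\n".
ends_with_newline() {
    [ -s "$1" ] && [ -z "$(tail -c 1 "$1")" ]
}

f=$(mktemp)
printf 'X\n' > "$f"
ends_with_newline "$f" && echo "terminated"
printf 'X' > "$f"
ends_with_newline "$f" || echo "not terminated"
rm -f "$f"
```

An empty file fails the check too, because of the `[ -s ]` test in front.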

Daniel

Jude DaShiell
2014-02-14 19:15:00 UTC
This is happening on a FreeBSD system, and I don't know if EOL
markers are put on the ends of lines in its native format. I am using
spamassassin and sending spam to probably-spam for further action.
Aside from what the program learns or fails to learn, the script
searches down through probably-spam and finds all of the From: lines and
saves those to a file called rf. What I'm going to do with sed is run
a command like sed -e 's/^/blacklist-/' rf that should tack
blacklist- onto the beginning of every line in rf. Once that's done,
~/.spamassassin/user_prefs needs a little concatenation of the file
called rf. Once that's done I delete the file called rf and a .logout
script runs sa-learn --spam then deletes all email in probably-spam.
Last time I checked, there were 794 unique blacklisted entries in
user_prefs. This started in October 2013, so that will give you an
idea as to the spam volume around here. I figure the more I can
automate the fewer mistakes happen and sed is golden in terms of these
capabilities! I could probably have automated more in this work flow
with sed than was done, but I figured at least this operation might have
been something sed could handle well and it appears sed will do it
better than I expected.
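
The text-processing part of the workflow above could also be sketched as a single pipeline, with no intermediate rf file to create and delete. The function name `append_blacklist` and the demo file names are hypothetical, and the sketch does not check whether the resulting lines are valid SpamAssassin directives.

```shell
#!/bin/sh
# Hypothetical helper: append "blacklist-"-prefixed From: lines found in
# a mailbox ($1) to a preferences file ($2).
append_blacklist() {
    # With no matching lines, sed receives no input and nothing is appended.
    grep '^From:' "$1" | sed 's/^/blacklist-/' >> "$2"
}

# Demo with placeholder files in a scratch directory
dir=$(mktemp -d)
printf 'From: a@example.com\nSubject: hi\n\nbody text\n' > "$dir/probably-spam"
: > "$dir/user_prefs"
append_blacklist "$dir/probably-spam" "$dir/user_prefs"
cat "$dir/user_prefs"
```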
jude <***@shellworld.net>
Jude DaShiell
2014-02-14 19:45:11 UTC
cut here.
#!/usr/local/bin/bash
# file: sp.sh
grep "^From:" /home/jdashiel/mail/probably-spam | cut -d ' ' -f 1-8 - >/home/jdashiel/.spamassassin/rf
cd /home/jdashiel/.spamassassin
sed -e 's/^/blacklist-/' rf
cut here.
When I run this script, the output sounds exactly as I'd like; however, the
rf file remains untouched. Could that be because the file is in a hidden
directory? If so, I can fix that problem.
jude <***@shellworld.net>
d***@ehdp.com
2014-02-14 21:07:33 UTC
Thank you for posting the script. Those "From" lines do have an EOL marker. If I understand your script right, the answer is that sed does not normally modify its input file. It just writes the result to stdout (so the rf file is unchanged), and for good reason: this gives you all the flexibility to automate things as you discuss.

So you have two choices:

#1 - Save the output and then overwrite original file with edited version:

$ sed -e 's/^/blacklist-/' rf > edited-file.txt
$ mv edited-file.txt rf

#2 - Use -i option (if available) to edit the file "in-place" (it actually creates an intermediate temporary file behind the scenes).

$ sed -i -e 's/^/blacklist-/' rf
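
For use inside a script, choice #1 can be made a little safer by moving the edited copy over the original only when sed succeeded. This is a minimal sketch; the function name `sed_replace` is hypothetical and `mktemp` is assumed to be available.

```shell
#!/bin/sh
# Hypothetical helper: apply a sed script ($1) to a file ($2) through a
# temporary file, replacing the original only if sed exited successfully.
sed_replace() {
    tmp=$(mktemp "$2.XXXXXX") || return 1
    if sed -e "$1" "$2" > "$tmp"; then
        mv "$tmp" "$2"
    else
        rm -f "$tmp"      # sed failed: leave the original untouched
        return 1
    fi
}

# Example call: sed_replace 's/^/blacklist-/' rf
```

Two caveats. The replaced file takes on the temporary file's permissions, and mktemp creates that file readable and writable only by its owner. And regarding choice #2 on FreeBSD: as far as I know, BSD sed's -i takes a backup suffix as its argument, so "-i -e" may be read there as a suffix of "-e"; it is worth checking the local sed man page.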

Daniel
Jude DaShiell
2014-02-14 21:55:28 UTC
Yes, the -i option is available; thanks much for this assist.
jude <***@shellworld.net>
Jude DaShiell
2014-02-14 22:16:50 UTC
That's very useful having restrictions on input files being overwritten,
that way we get to see if we like the results before doing anything
permanent.



jude <***@shellworld.net>
Cameron Simpson
2014-02-15 00:52:08 UTC
Post by Jude DaShiell
That's very useful having restrictions on input files being overwritten,
that way we get to see if we like the results before doing anything
permanent.
I'm not sure if it was mentioned, but the "-i" flag has two modes.

The plain "-i" mentioned lets you overwrite in place. But adding a
suffix, eg "-i.bak" or even "-i.bak-`date +%Y%m%d`" takes a backup
of the original version as well.
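
A small demonstration of the suffix form, assuming a sed whose -i accepts a suffix attached directly to the option (GNU sed and FreeBSD sed both document this). The scratch directory and file contents are placeholders.

```shell
#!/bin/sh
dir=$(mktemp -d)
printf 'From: a@example.com\n' > "$dir/rf"

# Edit rf in place and keep the original contents as rf.bak
sed -i.bak -e 's/^/blacklist-/' "$dir/rf"

cat "$dir/rf"        # edited version
cat "$dir/rf.bak"    # original, unchanged
```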

On a personal basis I have a script called "bsed":
https://bitbucket.org/cameron_simpson/css/src/tip/bin/bsed
that does essentially what GNU sed's -i option does (though it
rewrites the file instead of making a copy and swapping it in, for
reasons I can belabour if anyone really cares), partly to work with
non-GNU seds which don't have -i, and partly because it will
optionally report a diff and also check the sed output status; a
failure such as from a syntactically invalid sed script won't destroy
the input/output file.
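
To illustrate the general idea only (this is NOT the bsed script itself, and the function name `rewrite_with_sed` is made up), a rewrite-in-place wrapper that reports a diff and checks sed's exit status might be sketched like this:

```shell
#!/bin/sh
# Hypothetical sketch: apply sed script $1 to file $2, show a diff on
# stderr, then rewrite the file's contents instead of swapping in a copy.
rewrite_with_sed() {
    out=$(mktemp) || return 1
    if ! sed -e "$1" "$2" > "$out"; then
        rm -f "$out"
        return 1                       # sed failed: original left untouched
    fi
    diff "$2" "$out" >&2 || true       # report the changes; diff's status is ignored
    cat "$out" > "$2"                  # overwrite the contents of the existing file
    status=$?
    rm -f "$out"
    return $status
}

# Example call: rewrite_with_sed 's/^/blacklist-/' rf
```

One visible effect of rewriting rather than renaming is that the file keeps its permissions and any hard links to it; whether that is among the reasons alluded to above is not stated.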

Cheers,
--
Cameron Simpson <***@zip.com.au>

Don't have awk? Use this simple sh emulation:
#!/bin/sh
echo 'Awk bailing out!' >&2
exit 2
- Tom Horsley <***@csd.harris.com>