Discussion:
how to extract exact text
MOKRANI Rachid
2013-09-19 11:29:59 UTC
Hi,

I have the following file:

cat test.txt

Other text [bug 5] hardware
[bug 256] my software
My text [bug 1256]

How can I extract only the following text?

[bug 5]
[bug 256]
[bug 1256]

The text I need to extract always has the form [bug XXXXX], where XXXXX is a different number each time.

Regards.

Davide Brini
2013-09-19 11:49:05 UTC
On Thu, 19 Sep 2013 13:29:59 +0200, "MOKRANI Rachid"
Post by MOKRANI Rachid
Other text [bug 5] hardware
[bug 256] my software
My text [bug 1256]
How can I extract only the text
[bug 5]
[bug 256]
[bug 1256]
The text I need to extract is always [bug XXXXX]
XXXX is always different number.
The answer depends on whether there can be multiple occurrences of [bug
XXX] on the same line. If not, it is trivially done with

sed -n 's/.*\(\[bug [0-9]*\]\).*/\1/p' test.txt
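A quick check of the single-occurrence approach, with the closing bracket kept inside the capture group so that it appears in the output. The sample lines come from the question; the `sample`, `single` and `multi` variable names and the extra two-tag line are only illustrative. The last command shows why this form is limited to one tag per line: the leading `.*` is greedy, so only the last occurrence survives.

```shell
# Sample input from the question, plus one line without any tag.
sample='Other text [bug 5] hardware
[bug 256] my software
My text [bug 1256]
plain line'

# One tag per line: print only the captured "[bug N]" part.
single=$(printf '%s\n' "$sample" | sed -n 's/.*\(\[bug [0-9]*\]\).*/\1/p')
printf '%s\n' "$single"

# Two tags on one line: the greedy leading .* swallows the first one.
multi=$(printf '%s\n' 'a [bug 1] b [bug 2] c' | sed -n 's/.*\(\[bug [0-9]*\]\).*/\1/p')
printf '%s\n' "$multi"    # prints only [bug 2]
```

This uses only basic regular expressions, so it should not need GNU sed.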

If there can be several, the solution is more complex. Here is one that uses \x1 as a marker (needs
GNU sed) and also works in the single-occurrence case:

s/\[bug [0-9]*\]/\x1&/g
t ok
d
:ok
s/^[^\x1]*\x1//
s/\(\[bug [0-9]*\]\)[^\x1]*/\1/g
s/\x1/\n/g
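As a sketch of how the script above might be run, it can be passed to GNU sed as a multi-line `-e` argument (or saved to a file and used with `sed -f`). The `script` and `out` variable names and the input lines are made up for illustration; GNU sed is assumed, since `\x1` and `\n` in the replacement are GNU extensions.

```shell
# The marker script, unchanged, held in a shell variable (assumes GNU sed).
script='s/\[bug [0-9]*\]/\x1&/g
t ok
d
:ok
s/^[^\x1]*\x1//
s/\(\[bug [0-9]*\]\)[^\x1]*/\1/g
s/\x1/\n/g'

# First line has two tags, second has none, third has exactly one.
out=$(printf '%s\n' 'a [bug 1] b [bug 22] c' 'no match here' '[bug 333]' |
      sed -e "$script")
printf '%s\n' "$out"
```

Each tag is first prefixed with the marker byte, lines without any tag are deleted, the text between tags is stripped, and finally the markers become newlines, so every tag ends up on its own line.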

In this case it's probably easier to use GNU grep, e.g.

grep -Eo '\[bug [0-9]+\]' test.txt

or awk or perl, e.g.

gawk -v RS='\\[bug [0-9]+\\]' 'RT{print RT}' test.txt

perl -ne 'print "$_\n" for (/\[bug \d+\]/g)' test.txt
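To illustrate the multiple-occurrence case with the grep variant, here is a small check on made-up input (the `bugs` variable name and the input lines are only illustrative). Note that `-o` is not part of POSIX grep, so this assumes a grep that supports it, such as GNU grep.

```shell
# -o prints each match on its own line, even when one input line
# contains several of them; lines without a match produce nothing.
bugs=$(printf '%s\n' 'a [bug 1] b [bug 22] c' 'no match here' '[bug 333]' |
       grep -Eo '\[bug [0-9]+\]')
printf '%s\n' "$bugs"
```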
--
D.
Werfgam Nadler
2013-09-19 13:21:55 UTC
Hi,

Another, simpler way to extract multiple matches is:

sed '
/\n/!s/\(\[[^]]*\]\)/\n&\n/g;
/^\[[^]]*\]\n/P;
D
' myFile


[bug 5]
[bug 256]
[bug 1256]
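One thing worth noting about the P/D approach: as written, the pattern matches any bracketed text, not only `[bug N]`. The sketch below compares it with a narrowed variant on a made-up line that also contains `[note]`; the `any` and `bugs_only` names and the input are illustrative, and GNU sed is assumed because of `\n` in the replacement.

```shell
line='x [bug 5] y [note] z [bug 256]'

# Pattern as posted: wraps every [...] in newlines, then P prints each
# chunk that starts with a bracketed item and D drops it and loops.
any=$(printf '%s\n' "$line" | sed '
/\n/!s/\(\[[^]]*\]\)/\n&\n/g
/^\[[^]]*\]\n/P
D
')

# Narrowed variant: only "[bug <digits>]" is wrapped and printed.
bugs_only=$(printf '%s\n' "$line" | sed '
/\n/!s/\[bug [0-9][0-9]*\]/\n&\n/g
/^\[bug [0-9][0-9]*\]\n/P
D
')

printf '%s\n' "$any"          # includes [note]
printf '%s\n' "$bugs_only"    # only the two bug tags
```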


regards W.

