Discussion:
keep only numeric character
MOKRANI Rachid rachid.mokrani@ifpen.fr [sed-users]
2016-02-09 16:25:27 UTC
Permalink
Hi,

Could someone please help me with a sed command to get the result below?

Input.txt
"Paul";"01 26 30 69 69";"NY"
"Dan";"05 26 30 69";"CA"
"Jane";"26 30 69";"SL"
"Bill";"03 26 30 69 69";"BT"
"Steve";"03-26-30-69-69";"BT"
"Daniel";"NA";"BO"
"Karen";"01/02/03";"YO"


I would like to keep the second field only when it contains exactly 10 digits, and blank it out when it has fewer or more than 10 digits or contains non-numeric characters.

Output.txt
"Paul";"01 26 30 69 69";"NY"
"Dan";"";"CA"
"Jane";"";"SL"
"Bill";"03 26 30 69 69";"BT"
"Steve";"";"BT"
"Daniel";"";"BO"
"Karen";"";"YO"

Thanks.


__________________________
Please consider the environment before printing!
This message and any attachments are confidential and intended solely for the addressees. Any unauthorised use or dissemination is prohibited. IFP Energies nouvelles should not be liable for this message.
__________________________


Tim Chase sed@thechases.com [sed-users]
2016-02-09 16:44:55 UTC
Permalink
While ugly, you can do it in sed:

sed '/^\("[^"]*";"\)\(\( *[0-9]\)\{10\}\)\(".*"\)/!s/^\("[^"]*";"\)[^"]*\(".*"\)/\1\2/'


Might be clearer in awk:

awk -vOFS=\; -F\; '$2 !~ /^"(\s*[0-9]){10}\s*"$/{$2="\"\""}{print}'

which roughly translates to

$2                        where field #2
!~                        doesn't match
/^"(\s*[0-9]){10}\s*"$/   10 digits (ignoring optional spaces)
{$2="\"\""}               set the 2nd column to an empty pair of double quotes
{print}                   and print the resulting row

The "-vOFS=\;" and "-F\;" specify the output column-delimiter and the
input column-delimiter.
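
The "\s" escape and the "{10}" interval in that pattern are not understood by every awk implementation. As a rough sketch only (not the command posted above), the same rule can be expressed by counting digits instead. The function name keep_ten_digits is made up for illustration, and the sketch assumes the field of interest is always the second one:

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): blank out field 2 unless it
# holds exactly 10 digits, with spaces as the only other characters.
keep_ten_digits() {
  awk -F';' -v OFS=';' '
  {
    f = $2
    gsub(/"/, "", f)          # drop the surrounding double quotes
    bad = (f ~ /[^0-9 ]/)     # true if anything besides digits and spaces
    gsub(/ /, "", f)          # remove the spaces, leaving the digits only
    if (bad || length(f) != 10) $2 = "\"\""
    print
  }'
}

# Three lines taken from the sample input in the original question.
printf '%s\n' \
  '"Paul";"01 26 30 69 69";"NY"' \
  '"Dan";"05 26 30 69";"CA"' \
  '"Steve";"03-26-30-69-69";"BT"' | keep_ten_digits
```

Lines whose second field passes the check are printed untouched; the others are rebuilt with the ";" output separator after the field is replaced.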

-tim
Jim Hill gjthill@gmail.com [sed-users]
2016-02-09 17:13:19 UTC
Permalink
sed -E '/;"([0-9] *){10}";/! s/;".*";/;"";/' works on your sample, where
you're after the second of three fields.
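
For anyone wanting to try it, here is a small check of that one-liner against the sample data from the question. The wrapper name blank_field2 and the variable sample are only illustrative:

```shell
#!/bin/sh
# Hypothetical wrapper around the one-liner, used here only for checking.
blank_field2() {
  sed -E '/;"([0-9] *){10}";/! s/;".*";/;"";/'
}

# The seven input lines from the original question.
sample='"Paul";"01 26 30 69 69";"NY"
"Dan";"05 26 30 69";"CA"
"Jane";"26 30 69";"SL"
"Bill";"03 26 30 69 69";"BT"
"Steve";"03-26-30-69-69";"BT"
"Daniel";"NA";"BO"
"Karen";"01/02/03";"YO"'

printf '%s\n' "$sample" | blank_field2
```

Note that the greedy ".*" in the substitution relies on there being exactly three fields; with more fields it would likely swallow everything between the first and last ';"' and '";' pair.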


sharma__r@hotmail.com [sed-users]
2016-02-09 19:21:23 UTC
Permalink
All the action is in the 2nd field, so we fence it with newline characters, since a newline is guaranteed never to occur within an input line.


sed -e '
s/;/\n/;s/;/\n/; # fence the 2nd field with newlines

# Null the 2nd field in lines whose 2nd field contains a character
# that is neither a digit nor a space.
/\n".*[^0-9 ].*"\n/bnull

# If we reach here, the 2nd field holds only spaces and/or digits.
# We just need to determine whether there are exactly 10 digits.

# Store the line for later recall, then strip spaces globally
# (do not worry, only the 2nd field matters for the test below).
h;s/ //g

# Bingo!! The line has precisely 10 digits in its 2nd field.
# Since we stripped the spaces, recall the original line from the
# hold space, revert the newlines to field separators, and print
# the line without any modification.
/\n".\{10\}"\n/{
g;y/\n/;/;b
}

# These lines have only spaces and/or digits in the 2nd field, but the
# digit count is not 10 (possibly zero). The 2nd field needs to be
# nulled out, but first we recover the actual line from the hold space.
g
:null
s/\n.*\n/;"";/
' yourfile








dgoldman@ehdp.com [sed-users]
2016-02-09 21:50:20 UTC
Permalink
Post by MOKRANI Rachid ***@ifpen.fr [sed-users]
Input.txt
"Bill";"03 26 30 69 69";"BT"
"Steve";"03-26-30-69-69";"BT"
I would like to keep the second field only when it contains exactly
10 digits, and blank it out when it has fewer or more than 10 digits
or contains non-numeric characters.
Output.txt
"Bill";"03 26 30 69 69";"BT"
"Steve";"";"BT"
It seems a bit odd that you keep "Bill" but blank out "Steve". I guess you consider "Steve" non-numeric because of the hyphens. On the other hand, both "Steve" and "Bill" appear to contain the same ten digits, 0326306969. - Daniel
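
To make that observation concrete, here is a tiny sketch that strips everything except digits from a value. The helper name digits_only is hypothetical, and this is not something the original poster asked for:

```shell
#!/bin/sh
# Hypothetical helper: keep only the characters 0-9 from the argument.
digits_only() {
  printf '%s' "$1" | tr -cd '0-9'
}

digits_only '03 26 30 69 69'; echo   # the value from the "Bill" line
digits_only '03-26-30-69-69'; echo   # the value from the "Steve" line
```

Both calls print the same digit string, so whether "Steve" should be kept depends on whether hyphens count as acceptable separators.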



