Shell Scripting: how to pick value of an expression from each line of a file

Developer https://www.devze.com 2023-01-27 12:05 Source: Internet
I am new to shell scripting.

I have a file containing records of the form:

"text1:text2=value2,text3=value3,text4=value4,text5=value5"text1:text6:value6"
"text1:text2=value2,text3=value3,text4=value4,text5=value5"text1:text6:value6"
"text1:text2=value2,text3=value3,text4=value4,text5=value5"text1:text6:value6"
"text1:text2=value2,text3=value3,text4=value4,text5=value5"text1:text6:value6"

Now I want to write a shell script that picks out the value field for certain keys. For example, I want value2 and value5, and I know that they always appear right after text2= and text5=.

There are no blank spaces anywhere in a line. The file contains n lines, and I want two values from each line (value2 and value5), stored in variables for further processing.

Can someone help?

Thanks


Using sed:

while read -r text2var text5var
do
    #something with text2var and text5var
done < <(sed 's/.*:text2=\([^,]*\),.*,text5=\([^"]*\)".*/\1 \2/' inputfile)
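
As a quick sanity check, here is a minimal sketch that runs the same sed expression on a single sample record in the format from the question. The variable names sample and extracted are made up for this illustration, and a here-document is used in place of the loop so that it also runs in a plain POSIX shell.

```shell
# Sample record in the format from the question (variable name is made up).
sample='"text1:text2=value2,text3=value3,text4=value4,text5=value5"text1:text6:value6"'

# Same substitution as in the loop: capture what follows text2= and text5=.
extracted=$(printf '%s\n' "$sample" |
    sed 's/.*:text2=\([^,]*\),.*,text5=\([^"]*\)".*/\1 \2/')

# Split the space-separated result into the two variables.
read -r text2var text5var <<EOF
$extracted
EOF

printf 'text2var=%s text5var=%s\n' "$text2var" "$text5var"
```

For this record the two variables should end up holding value2 and value5. Note that the approach relies on the values containing no spaces, which the question guarantees.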

Using GNU AWK (gawk):

while read -r text2var text5var
do
    #something with text2var and text5var
done < <(gawk -F ',|:|"' '{sub("[^=]*=","",$3); sub("[^=]*=","",$6); print $3, $6}' inputfile)
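
To see why the fields of interest are $3 and $6, it may help to print every field with its index. The sketch below does that for one sample record. It uses plain awk with a bracket expression as the separator, on the assumption that the installed awk accepts a regular expression for -F (POSIX awk does); the names sample and fields are made up for the illustration.

```shell
# Sample record in the format from the question (variable name is made up).
sample='"text1:text2=value2,text3=value3,text4=value4,text5=value5"text1:text6:value6"'

# Split on any of , : or " and print each field as index:content.
fields=$(printf '%s\n' "$sample" |
    awk -F '[,:"]' '{ for (i = 1; i <= NF; i++) printf "%d:%s\n", i, $i }')

printf '%s\n' "$fields"
```

Because the record begins with a double quote, field 1 is empty, which is what pushes text2=value2 to field 3 and text5=value5 to field 6.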

To use other versions of AWK that don't have regular expressions for field separators, use a regex similar to the sed command or use a lot of splitting:

while read -r text2var text5var
do
    #something with text2var and text5var
done < <(awk -F ',' '{split($1,t2,"text2="); split($4,t5,"\""); split(t5[1],t5,"="); print t2[2], t5[2]}' inputfile)

Using cut:

while read -r text2var text5var
do
    #something with text2var and text5var
done < <(cut -d , -f 1,4 --output-delimiter='=' inputfile | cut -d '"' -f 2 | cut -d = -f 2,4 --output-delimiter=' ')

GNU cut may be required for the --output-delimiter option. It may be ugly, but at least cut is not invoked four times for every line of input.


I'm sure that a more elegant solution is possible, but this bash script simply loops through the input and extracts

  • the value between the first = and the following , and
  • the value between the fourth = and the following ":

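    while IFS= read -r line
    do
        # value2: second "="-separated field, cut off at the first comma
        value2=$(echo "$line" | cut -d = -f 2 | cut -d , -f 1)
        # value5: fifth "="-separated field, cut off at the first double quote
        value5=$(echo "$line" | cut -d = -f 5 | cut -d \" -f 1)
        echo "$value2 - $value5"   # do something with $value2 and $value5
    done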
    

You call the script like this:

bash myscript.sh < mytextfile.txt
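
The loop above starts four cut processes for every input line. If that becomes a problem on large files, one possible alternative is shell parameter expansion, which does the trimming inside the shell itself. This is a sketch of a different technique, not the script above; the function name extract_values is hypothetical, and it assumes every line contains text2= and text5= as described in the question.

```shell
# extract_values is a hypothetical helper name used only for this sketch.
extract_values() {
    while IFS= read -r line
    do
        rest=${line#*text2=}    # drop everything up to and including the first text2=
        value2=${rest%%,*}      # keep what precedes the first comma
        rest=${line#*text5=}    # drop everything up to and including the first text5=
        value5=${rest%%\"*}     # keep what precedes the first double quote
        printf '%s - %s\n' "$value2" "$value5"
    done
}

# Usage with one sample record in the format from the question:
result=$(printf '%s\n' '"text1:text2=value2,text3=value3,text4=value4,text5=value5"text1:text6:value6"' | extract_values)
printf '%s\n' "$result"
```

Unlike the cut version, this matches on the key names rather than on field positions, so it should keep working if the order of the key=value pairs changes.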


From the command line, with the text in q.txt:

gawk -F\" '{print $2}' < q.txt | gawk -F: '{print $2 }' | gawk -F, '{print $1 "=" $4}'| gawk -F= '{print $2 "," $4}'

I tried this in Cygwin bash and it works. I am not a programmer, but I use the Cygwin shell and thought it would be fun to try doing this with gawk.
