Regular expressions (Bash)
Revision by Jeroen Strompf on 31 Oct 2022, 12:19
Regular expressions, or regex to make it sound more mysterious, are the black magic of programming. It's like doing mathematics with text. Or, slightly more specifically: advanced pattern matching.
Places where you find regex:
- In grep, for filtering the output of a command
- In comparisons, through use of the operator =~
- Probably lots of other places.
I have the impression that regular expressions in Bash are not quite the same as in MySQL, hence some more details in this article. In Bash, the =~ operator matches against POSIX extended regular expressions (ERE).
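A minimal sketch of the two places listed above, side by side. The sample lines and the variable names (lines, filtered, re, s) are made up for illustration. It assumes grep is called with -E, so that both tools read the pattern as an extended regular expression.

```bash
# Hypothetical sample data, not taken from a real command
lines=$'alpha 1\nbeta\ngamma 22'

# 1) grep -E filters the lines of some output: keep lines ending in digits
filtered=$(printf '%s\n' "$lines" | grep -E '[0-9]+$')
echo "$filtered"

# 2) =~ tests one string inside [[ ]], using the same pattern
re='[0-9]+$'
s="gamma 22"
if [[ $s =~ $re ]]; then
    echo "'$s' ends in a number"
fi
```

Keeping the pattern in a variable (re) and using it unquoted avoids most quoting surprises inside [[ ]].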
Match a substring
Probably the easiest case - No special characters or whatever needed:
i="blub"; [[ $i =~ blubber ]] && echo "i contains the substring 'blubber' "   # False
i="blub"; [[ $i =~ blub ]]    && echo "i contains the substring 'blub' "      # True
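One thing to watch with substring matching: in current Bash versions (3.2 and later, with default settings), quoting the right-hand side of =~ makes it a literal string instead of a regex. A small sketch, with made-up variable names, to show the difference:

```bash
s="abc"
re='a.c'    # in a regex, '.' matches any single character

# Unquoted: the pattern is treated as a regular expression
[[ $s =~ $re ]] && regex_result="match" || regex_result="no match"

# Quoted: the pattern is treated as a literal string, so '.' must be a real dot
[[ $s =~ "$re" ]] && literal_result="match" || literal_result="no match"

echo "regex:   $regex_result"
echo "literal: $literal_result"
```

So for a plain substring test, quoting is the safe choice when the substring may contain characters such as . or [.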
Match a single digit
[] denotes a single-character comparison: the comparison is true as soon as the string contains any one of the characters listed within [].
[[ $i =~ [2] ]]  && echo "i contains '2'"
[[ $i =~ [12] ]] && echo "i contains '1' and/or '2'"
Match ranges of numbers or letters
i="blub"; [[ $i =~ [0-9] ]]    && echo "i contains a number"                      # False
i="blu1"; [[ $i =~ [0-9] ]]    && echo "i contains a number"                      # True
i="1111"; [[ $i =~ [0-9] ]]    && echo "i contains a number"                      # True
i="blub"; [[ $i =~ [A-Z] ]]    && echo "i contains at least one uppercase letter" # False
i="BLUB"; [[ $i =~ [A-Z] ]]    && echo "i contains at least one uppercase letter" # True
i="blub"; [[ $i =~ [a-z] ]]    && echo "i contains at least one lowercase letter" # True
i="blub"; [[ $i =~ [a-zA-Z] ]] && echo "i contains at least one letter"           # True
Using grep (the pattern is quoted, so the shell cannot expand it as a filename glob):
echo 12345 | grep '[0-9]'
12345
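As a possible alternative to ranges: POSIX character classes such as [[:digit:]] and [[:upper:]] express the same idea, and do not depend on how the current locale sorts letters. A sketch, where has_class is a hypothetical helper made up for this example:

```bash
# has_class STRING CLASS
# Succeeds if STRING contains at least one character of the POSIX class
# CLASS (for example: digit, upper, lower, alpha).
has_class() {
    local re="[[:$2:]]"
    [[ $1 =~ $re ]]
}

has_class "blu1" digit && echo "blu1 contains a digit"
has_class "BLUB" upper && echo "BLUB contains an uppercase letter"
has_class "blub" upper || echo "blub contains no uppercase letter"
```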
Sequences
- ^ : Beginning of the string
- $ : End of the string
- + : One or more of the preceding item (used in the examples below)
i="BLuB"; [[ $i =~ ^[A-Z]+$ ]] && echo "i contains only capital letters"   # False
i="BLUB"; [[ $i =~ ^[A-Z]+$ ]] && echo "i contains only capital letters"   # True
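Anchoring both ends is what turns "contains" into "consists only of", which makes it handy for input validation. A sketch, where is_integer is a hypothetical helper and the sample values are made up:

```bash
# is_integer STRING
# Succeeds if STRING is an optional minus sign followed by digits only.
# '?' means "zero or one of the preceding item".
is_integer() {
    local re='^-?[0-9]+$'
    [[ $1 =~ $re ]]
}

for v in 42 -7 4x2 ""; do
    if is_integer "$v"; then
        echo "'$v' is an integer"
    else
        echo "'$v' is not an integer"
    fi
done
```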
Capture group
Really cool stuff: Substring extraction (Bash)
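A minimal sketch of capture groups with =~, using a made-up sample date. After a successful match, Bash fills the array BASH_REMATCH: element 0 holds the whole match, elements 1, 2, ... hold the parts matched by the parenthesised groups.

```bash
s="2022-10-31"                              # hypothetical sample value
re='^([0-9]{4})-([0-9]{2})-([0-9]{2})$'     # {4} means "exactly four times"

if [[ $s =~ $re ]]; then
    year=${BASH_REMATCH[1]}
    month=${BASH_REMATCH[2]}
    day=${BASH_REMATCH[3]}
    echo "year=$year month=$month day=$day"
else
    echo "'$s' is not in YYYY-MM-DD form"
fi
```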
Logical OR
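A sketch of how this could look: within a regex, | separates alternatives, and parentheses limit how far the alternatives reach. The helper is_image and the file names are made up for this example.

```bash
# is_image NAME
# Succeeds if NAME ends in .jpg, .png or .gif.
# '\.' is a literal dot; (jpg|png|gif) means "jpg OR png OR gif".
is_image() {
    local re='\.(jpg|png|gif)$'
    [[ $1 =~ $re ]]
}

for f in photo.png notes.txt; do
    if is_image "$f"; then
        echo "$f looks like an image"
    else
        echo "$f does not look like an image"
    fi
done
```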
Filter using regex
How can I filter stuff? E.g.:
12.5(diameter) → 12.5
Some tentative impressions:
- https://stackoverflow.com/questions/5624969/how-to-reference-captures-in-bash-regex-replacement
- https://stackoverflow.com/questions/13043344/search-and-replace-in-bash-using-regular-expressions
- https://www.google.com/search?q=bash%20parameter%20substitution%20with%20regular%20expressions
- https://stackoverflow.com/questions/19246966/regular-expression-in-bash-filter - Not really
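Two possible ways to get from 12.5(diameter) to 12.5, as a sketch rather than a definitive answer. Note that Bash's own substitution syntax (${var//pattern/replacement}) works with glob patterns, not with regular expressions, so a regex-based filter has to go through =~ and BASH_REMATCH, or through an external tool such as sed. The variable names are made up.

```bash
s="12.5(diameter)"

# Approach 1: regex. Keep only the leading number.
# BASH_REMATCH[0] holds the part of the string that matched.
re='^[0-9]+(\.[0-9]+)?'
if [[ $s =~ $re ]]; then
    number=${BASH_REMATCH[0]}
fi
echo "regex:               $number"

# Approach 2: parameter expansion (glob pattern, no regex).
# Remove everything from the first '(' to the end of the string.
stripped=${s%%"("*}
echo "parameter expansion: $stripped"
```

The first approach also validates that the string really starts with a number; the second only cuts at the parenthesis.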
Case: Tags & associative arrays - 2022.10
Everything works:
################################################################################
# Compare tags
################################################################################
#
source load_site_array.sh

# Show all entries
########################################
#
# echo ${site_array[@]}    # Show all - Kinda messy
# echo ${#site_array[@]}   # Number of entries - Doesn't say much
#
# for i in "${site_array[@]}"
# do
#    echo "   Entry: $i"
# done

# for i in "${!site_array[@]}"
# do
#    echo "   Entry: $i"
# done

echo "site_array_rows: $site_array_rows"

for i in $(seq 0 "$site_array_rows")
do
   echo "Row $i:"
   echo "   Tag: ${site_array[$i,tag]}"
   echo "   URL: ${site_array[$i,url]}"

   # OK - Compare - letter 'a'
   ########################################
   #
   if [[ "${site_array[$i,tag]}" =~ [a] ]]
   then
      echo "   Tag contains an 'a'"
   else
      echo "   Tag doesn't contain an 'a'"
   fi

   # OK - Compare - word 'bal'
   ########################################
   #
   if [[ "${site_array[$i,tag]}" =~ bal ]]
   then
      echo "   Tag contains word 'bal'"
   else
      echo "   Tag doesn't contain word 'bal'"
   fi

   # OK - Compare - word '_bal_'
   ########################################
   #
   if [[ "${site_array[$i,tag]}" =~ _bal_ ]]
   then
      echo "   Tag contains word '_bal_'"
   else
      echo "   Tag doesn't contain word '_bal_'"
   fi

   # OK - Compare - word '_bal_' & '_dvb8_'
   ########################################
   #
   if [[ "${site_array[$i,tag]}" =~ _bal_ ]] && [[ "${site_array[$i,tag]}" =~ _dvb8_ ]]
   then
      echo "   Tag contains words '_bal_' & '_dvb8_'"
   else
      echo "   Tag doesn't contain words '_bal_' & '_dvb8_'"
   fi

   # OK - Compare - word '_bal_' & '_dvb8_' - Split lines
   ########################################
   #
   if [[ "${site_array[$i,tag]}" =~ _bal_ ]] && \
      [[ "${site_array[$i,tag]}" =~ _dvb8_ ]]
   then
      echo "   Tag contains words '_bal_' & '_dvb8_'"
   else
      echo "   Tag doesn't contain words '_bal_' & '_dvb8_'"
   fi

   # OK - Compare - 3 words + Split lines
   ########################################
   #
   if [[ "${site_array[$i,tag]}" =~ _bal_ ]] && \
      [[ "${site_array[$i,tag]}" =~ _dvb8_ ]] && \
      [[ "${site_array[$i,tag]}" =~ _cb_ ]]
   then
      echo "   Tag contains words '_bal_', '_dvb8_' & '_cb_'"
   else
      echo "   Tag doesn't contain words '_bal_', '_dvb8_' & '_cb_'"
   fi
done
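The script depends on load_site_array.sh, which is not shown here. To try the idea stand-alone, here is a sketch under assumptions: the array contents, the URLs and the helper has_all_tags are all made up, and site_array is assumed to be an associative array (declare -A, Bash 4 or later) with keys of the form "row,field". The helper folds the chained [[ ... ]] && [[ ... ]] tests into a loop.

```bash
# Hypothetical stand-in for load_site_array.sh
declare -A site_array=(
    [0,tag]="_bal_ _dvb8_ _cb_"  [0,url]="https://example.com/a"
    [1,tag]="_bal_"              [1,url]="https://example.com/b"
)
site_array_rows=1    # index of the last row, as used by 'seq 0 N'

# has_all_tags TAGSTRING WORD...
# Succeeds only if every WORD matches somewhere in TAGSTRING.
# Each WORD is used as a regex, like the unquoted patterns in the script.
has_all_tags() {
    local tag=$1
    shift
    local word
    for word in "$@"; do
        [[ $tag =~ $word ]] || return 1
    done
    return 0
}

for i in $(seq 0 "$site_array_rows"); do
    if has_all_tags "${site_array[$i,tag]}" _bal_ _dvb8_ _cb_; then
        echo "Row $i: tag contains '_bal_', '_dvb8_' & '_cb_'"
    else
        echo "Row $i: tag is missing at least one of the words"
    fi
done
```

With this helper, adding a fourth or fifth word is a matter of adding an argument instead of another line with && \.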