Regular expressions (Bash): difference between revisions

From De Vliegende Brigade
 
(9 intermediate revisions by the same user not shown)
Line 1: Line 1:
To use regular expressions in Bash comparisons, use operator <code>=~</code>, like
+
''Regular expressions'', or ''regex'' to make it sound more mysterious, are the black magic of programming. It's like doing mathematics with text. Or slightly more specific: ''advanced pattern matching''.
  
<pre>
+
Places where you find regex:
if [[ "$switches" =~ [f] ]]; then
+
 
  echo "f - Create folder structure"
+
* In <code>grep</code> for filtering output of a command
  mappenstructuur=true
+
* In comparisons, through use of the operator <code>=~</code>
fi
+
* Probably lots of other places.
</pre>
 
  
 
I have the impression that regular expressions (regex) in Bash may not be the same as in MySQL, hence some more details in this article.
 
Line 40: Line 39:
  
 
i="blub"; [[ $i =~ [a-zA-Z] ]] && echo "i contains at least one letter"  # True
 
 +
</pre>
 +
 +
Using grep:
 +
 +
<pre>
 +
echo 12345 | grep '[0-9]'
 +
12345
 
</pre>
 
</pre>
  
Line 55: Line 61:
  
 
Really cool stuff: [[Substring extraction (Bash)]]
 
 +
 +
== Logical OR ==
 +
 +
== Filter using regex ==
 +
 +
How can I filter stuff? E.g.:
 +
 +
<pre>
 +
12.5(diameter) → 12.5
 +
</pre>
 +
 +
Some tentative impressions:
 +
 +
* https://stackoverflow.com/questions/5624969/how-to-reference-captures-in-bash-regex-replacement
 +
* https://stackoverflow.com/questions/13043344/search-and-replace-in-bash-using-regular-expressions
 +
* https://www.google.com/search?q=bash%20parameter%20substitution%20with%20regular%20expressions
 +
* https://stackoverflow.com/questions/19246966/regular-expression-in-bash-filter - Not really
 +
 +
== Case: Tags & associative arrays - 2022.10 ==
 +
 +
Everything works:
 +
 +
<pre>
 +
################################################################################
 +
# Compare tags
 +
################################################################################
 +
#
 +
source load_site_array.sh
 +
 +
 +
# Show all entries
 +
########################################
 +
#
 +
# echo ${site_array[@]} # Show all - Kinda messy
 +
# echo ${#site_array[@]} # Show number of entries - Doesn't say much
 +
#
 +
# for i in "${site_array[@]}"
 +
# do
 +
# echo " Entry: $i"
 +
# done
 +
 +
 +
# for i in "${!site_array[@]}"
 +
# do
 +
# echo " Entry: $i"
 +
# done
 +
 +
echo "site_array_rows: $site_array_rows"
 +
 +
for i in `seq 0 $site_array_rows`
 +
do
 +
echo "Row $i:"
 +
echo " Tag: ${site_array[$i,tag]}"
 +
echo " URL: ${site_array[$i,url]}"
 +
 +
 +
# OK - Compare - letter 'a'
 +
########################################
 +
#
 +
if [[ "${site_array[$i,tag]}" =~ [a] ]]
 +
then
 +
#
 +
echo " Tag contains an 'a'"
 +
else
 +
echo " Tag doesn't contain an 'a'"
 +
fi
 +
 +
 +
# OK - Compare - word 'bal'
 +
########################################
 +
#
 +
if [[ "${site_array[$i,tag]}" =~ bal ]]
 +
then
 +
#
 +
echo " Tag contains word 'bal'"
 +
else
 +
echo " Tag doesn't contain word 'bal'"
 +
fi
 +
 +
 +
# OK - Compare - word '_bal_'
 +
########################################
 +
#
 +
if [[ "${site_array[$i,tag]}" =~ _bal_ ]]
 +
then
 +
#
 +
echo " Tag contains word '_bal_'"
 +
else
 +
echo " Tag doesn't contain word '_bal_'"
 +
fi
 +
 +
 +
# OK - Compare - word '_bal_' & '_dvb8_'
 +
########################################
 +
#
 +
if [[ "${site_array[$i,tag]}" =~ _bal_ ]] && [[ "${site_array[$i,tag]}" =~ _dvb8_ ]]
 +
then
 +
#
 +
echo " Tag contains words '_bal_' & '_dvb8_'"
 +
else
 +
echo " Tag doesn't contain words '_bal_' & '_dvb8_'"
 +
fi
 +
 +
 +
# OK - Compare - word '_bal_' & '_dvb8_' - Split lines
 +
########################################
 +
#
 +
if [[ "${site_array[$i,tag]}" =~ _bal_ ]] && \
 +
  [[ "${site_array[$i,tag]}" =~ _dvb8_ ]]
 +
then
 +
#
 +
echo " Tag contains words '_bal_' & '_dvb8_'"
 +
else
 +
echo " Tag doesn't contain words '_bal_' & '_dvb8_'"
 +
fi
 +
 +
 +
# OK - Compare - 3 words + Split lines
 +
########################################
 +
#
 +
if [[ "${site_array[$i,tag]}" =~ _bal_ ]] && \
 +
  [[ "${site_array[$i,tag]}" =~ _dvb8_ ]] && \
 +
  [[ "${site_array[$i,tag]}" =~ _cb_ ]]
 +
then
 +
#
 +
echo " Tag contains words '_bal_', '_dvb8_' & '_cb_'"
 +
else
 +
echo " Tag doesn't contain words '_bal_', '_dvb8_' & '_cb_'"
 +
fi
 +
#
 +
done
 +
</pre>
  
 
== See also ==
 
  
 +
* [[Parameter Substitution (Bash)]]
 
* [[Regex (MySQL)]]
 
 
* [[Bash - String comparison | String comparison (Bash)]]
 
Line 64: Line 203:
 
== Sources ==
 
  
* https://www.networkworld.com/article/2693361/operating-systems/unix-tip-using-bash-s-regular-expressions.html
+
* https://www.networkworld.com/article/2693361/operating-systems/unix-tip-using-bash-s-regular-expressions.html - Limited
 +
* https://linuxreviews.org/Bash_Guide_for_Beginners_Chapter_4._Regular_expressions - Great overview

Latest revision as of 14:19, 31 October 2022

Regular expressions, or regex to make it sound more mysterious, are the black magic of programming. It's like doing mathematics with text. Or slightly more specific: advanced pattern matching.

Places where you find regex:

  • In grep for filtering output of a command
  • In comparisons, through use of the operator =~
  • Probably lots of other places.

Bash's =~ operator uses POSIX extended regular expressions (ERE). That dialect is not necessarily the same as the one in MySQL, hence some more details in this article.

Match a substring

Probably the easiest case - No special characters or whatever needed:

i="blub"; [[ $i =~ blubber ]] && echo "i contains the substring 'blubber' "   # False
i="blub"; [[ $i =~ blub ]] && echo "i contains the substring 'blub' "   # True
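
One detail that matters for substring matches: in Bash 3.2 and later, any quoted part of the pattern on the right-hand side of =~ is matched literally, while unquoted parts are interpreted as regex. A minimal sketch of the difference (the variable names and values below are made up for illustration):

```shell
# Bash (not plain sh). Values are illustrative only.
i="blub"

# Unquoted: the dot is a regex metacharacter and matches any character.
if [[ $i =~ b.ub ]]; then unquoted="match"; else unquoted="no match"; fi

# Quoted: the dot is taken literally, so "blub" does not contain "b.ub".
if [[ $i =~ "b.ub" ]]; then quoted="match"; else quoted="no match"; fi

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

So for a plain substring test, quoting the right-hand side is the safer choice when the text may contain characters such as . or *.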

Match a single digit

[] denotes a bracket expression: it matches any single one of the characters listed inside it. Because =~ matches anywhere in the string, the comparison is true as soon as the string contains at least one of those characters.

[[ $i =~ [2] ]] && echo "i contains '2'"
[[ $i =~ [12] ]] && echo "i contains '1' and/or '2'"

Match ranges of numbers or letters

i="blub"; [[ $i =~ [0-9] ]] && echo "i contains a number"   # False
i="blu1"; [[ $i =~ [0-9] ]] && echo "i contains a number"   # True
i="1111"; [[ $i =~ [0-9] ]] && echo "i contains a number"   # True

i="blub"; [[ $i =~ [A-Z] ]] && echo "i contains at least one uppercase letter"   # False
i="BLUB"; [[ $i =~ [A-Z] ]] && echo "i contains at least one uppercase letter"   # True
i="blub"; [[ $i =~ [a-z] ]] && echo "i contains at least one lowercase letter"   # True

i="blub"; [[ $i =~ [a-zA-Z] ]] && echo "i contains at least one letter"   # True
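
Once a pattern contains spaces or several special characters, it tends to be easier to store it in a variable and use that variable unquoted after =~. A small sketch of that habit, assuming a hypothetical helper named matches and made-up test values:

```shell
# Bash (not plain sh). The pattern, helper name and values are illustrative.
re='^[a-z]+ [0-9]+$'      # lowercase word, one space, then a number

# Returns success when the first argument matches the pattern in $re.
matches() {
    [[ $1 =~ $re ]]       # $re must stay unquoted to be read as a regex
}

for i in "blub 42" "blub42"; do
    if matches "$i"; then
        echo "'$i' matches"
    else
        echo "'$i' does not match"
    fi
done
```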

Using grep:

echo 12345 | grep '[0-9]'
12345
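
A few more grep variations on the same idea, as a sketch. The options -E, -o and -q are available in GNU and BSD grep; the sample strings are made up:

```shell
# Quote the pattern: an unquoted [0-9] is also a shell glob and could be
# replaced by a matching file name (e.g. a file called "1") before grep runs.
echo 12345 | grep '[0-9]'

# -E enables extended regex; -o prints only the part that matched.
digits=$(echo "abc123def" | grep -Eo '[0-9]+')
echo "$digits"

# -q prints nothing and only sets the exit status, handy in conditions.
if echo "blub" | grep -q '[0-9]'; then
    has_digit=yes
else
    has_digit=no
fi
echo "$has_digit"
```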

Sequences

  • ^: Beginning of the string
  • $: End of the string
  • +: One or more of the preceding item
i="BLuB"; [[ $i =~ ^[A-Z]+$ ]] && echo "i contains only capital letters"   # False
i="BLUB"; [[ $i =~ ^[A-Z]+$ ]] && echo "i contains only capital letters"   # True
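
Anchoring both ends is the usual way to validate a whole value instead of looking for a match somewhere inside it. A minimal sketch, assuming a hypothetical helper named is_integer:

```shell
# Bash (not plain sh). is_integer is a made-up helper name.
# Succeeds only when the whole argument is an optional minus sign
# followed by one or more digits.
is_integer() {
    [[ $1 =~ ^-?[0-9]+$ ]]
}

for value in 42 -7 4x2 ""; do
    if is_integer "$value"; then
        echo "'$value' is an integer"
    else
        echo "'$value' is not an integer"
    fi
done
```

Without ^ and $, the value 4x2 would pass, because it contains digits somewhere.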

Capture group

Really cool stuff: Substring extraction (Bash)
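
With =~, the parts of the pattern wrapped in parentheses are capture groups. After a successful match, Bash stores the whole match in BASH_REMATCH[0] and the groups in BASH_REMATCH[1], BASH_REMATCH[2] and so on. A short sketch with a made-up date string:

```shell
# Bash (not plain sh). The input value is illustrative only.
entry="2022-10-31"
re='^([0-9]{4})-([0-9]{2})-([0-9]{2})$'   # three groups: year, month, day

if [[ $entry =~ $re ]]; then
    whole=${BASH_REMATCH[0]}   # the complete match
    year=${BASH_REMATCH[1]}
    month=${BASH_REMATCH[2]}
    day=${BASH_REMATCH[3]}
    echo "year=$year month=$month day=$day"
else
    echo "no match"
fi
```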

Logical OR
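
A sketch of two ways to express "this or that", assuming tag strings in the style of the case further down (the tag value here is made up). Inside the regex, | separates alternatives; outside it, two [[ ]] tests can be joined with ||:

```shell
# Bash (not plain sh). The tag value is illustrative only.
tag="_cb_dvb8_"

# Alternation inside one regex: matches _bal_ or _dvb8_.
re='_(bal|dvb8)_'
if [[ $tag =~ $re ]]; then
    either_regex=yes
    which=${BASH_REMATCH[1]}    # the alternative that actually matched
else
    either_regex=no
    which=""
fi

# The same test written as two comparisons joined with ||.
if [[ $tag =~ _bal_ ]] || [[ $tag =~ _dvb8_ ]]; then
    either_tests=yes
else
    either_tests=no
fi

echo "regex: $either_regex ($which), separate tests: $either_tests"
```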

Filter using regex

How can I filter stuff? E.g.:

12.5(diameter) → 12.5
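
Two possible approaches for this particular example, as a sketch rather than a definitive answer: plain parameter expansion to cut everything from the first parenthesis, or a capture group that keeps only the leading number. The variable names are made up:

```shell
# Bash (not plain sh). Variable names are illustrative only.
input="12.5(diameter)"

# Approach 1: parameter expansion, no regex involved.
# %% removes the longest suffix matching the pattern "(" followed by anything.
number_cut=${input%%"("*}

# Approach 2: capture the leading number with =~ and BASH_REMATCH.
re='^([0-9]+(\.[0-9]+)?)'
if [[ $input =~ $re ]]; then
    number_re=${BASH_REMATCH[1]}
else
    number_re=""
fi

echo "cut:   $number_cut"
echo "regex: $number_re"
```

The first approach trusts that the text before the parenthesis is the number; the second also checks that it looks like one.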

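Some tentative impressions:

  • https://stackoverflow.com/questions/5624969/how-to-reference-captures-in-bash-regex-replacement
  • https://stackoverflow.com/questions/13043344/search-and-replace-in-bash-using-regular-expressions
  • https://www.google.com/search?q=bash%20parameter%20substitution%20with%20regular%20expressions
  • https://stackoverflow.com/questions/19246966/regular-expression-in-bash-filter - Not really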

Case: Tags & associative arrays - 2022.10

Everything works:

################################################################################
# Compare tags
################################################################################
#
source load_site_array.sh


# Show all entries
########################################
#
# echo ${site_array[@]}	# Show all - Kinda messy
# echo ${#site_array[@]}	# Show number of entries - Doesn't say much
#
# for i in "${site_array[@]}"
# do
# 	echo "	Entry: $i"
# done


# for i in "${!site_array[@]}"
# do
# 	echo "	Entry: $i"
# done

echo "site_array_rows: $site_array_rows"

for i in `seq 0 $site_array_rows`
do
	echo "Row $i:"
	echo "	Tag: ${site_array[$i,tag]}"
	echo "	URL: ${site_array[$i,url]}"


	# OK - Compare - letter 'a'
	########################################
	#
	if [[ "${site_array[$i,tag]}" =~ [a] ]]
	then
		#
		echo "		Tag contains an 'a'"
	else
		echo "		Tag doesn't contain an 'a'"		
	fi		


	# OK - Compare - word 'bal'
	########################################
	#
	if [[ "${site_array[$i,tag]}" =~ bal ]]
	then
		#
		echo "		Tag contains word 'bal'"
	else
		echo "		Tag doesn't contain word 'bal'"		
	fi		


	# OK - Compare - word '_bal_'
	########################################
	#
	if [[ "${site_array[$i,tag]}" =~ _bal_ ]]
	then
		#
		echo "		Tag contains word '_bal_'"
	else
		echo "		Tag doesn't contain word '_bal_'"		
	fi


	# OK - Compare - word '_bal_' & '_dvb8_'
	########################################
	#
	if [[ "${site_array[$i,tag]}" =~ _bal_ ]] && [[ "${site_array[$i,tag]}" =~ _dvb8_ ]]
	then
		#
		echo "		Tag contains words '_bal_' & '_dvb8_'"
	else
		echo "		Tag doesn't contain words '_bal_' & '_dvb8_'"		
	fi


	# OK - Compare - word '_bal_' & '_dvb8_' - Split lines
	########################################
	#
	if [[ "${site_array[$i,tag]}" =~ _bal_ ]] && \
	   [[ "${site_array[$i,tag]}" =~ _dvb8_ ]]
	then
		#
		echo "		Tag contains words '_bal_' & '_dvb8_'"
	else
		echo "		Tag doesn't contain words '_bal_' & '_dvb8_'"		
	fi


	# OK - Compare - 3 words + Split lines
	########################################
	#
	if [[ "${site_array[$i,tag]}" =~ _bal_ ]] && \
	   [[ "${site_array[$i,tag]}" =~ _dvb8_ ]] && \
	   [[ "${site_array[$i,tag]}" =~ _cb_ ]]
	then
		#
		echo "		Tag contains words '_bal_', '_dvb8_' & '_cb_'"
	else
		echo "		Tag doesn't contain words '_bal_', '_dvb8_' & '_cb_'"
	fi
	#
done
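
The chained tests above grow by one line per word. A possible way to keep that compact is a small function that loops over the words. This is only a sketch: has_all_tags is a made-up name, and it assumes tag strings where each word is surrounded by underscores, as in the comparisons above.

```shell
# Bash (not plain sh). has_all_tags is a hypothetical helper:
#   has_all_tags TAGSTRING WORD...
# Succeeds only when every WORD occurs in TAGSTRING as _WORD_.
has_all_tags() {
    local tags=$1 word
    shift
    for word in "$@"; do
        # Fail as soon as one word is missing.
        [[ $tags =~ _${word}_ ]] || return 1
    done
    return 0
}

# Illustrative tag strings, not taken from load_site_array.sh.
if has_all_tags "_bal_dvb8_cb_" bal dvb8 cb; then
    echo "Tag contains words '_bal_', '_dvb8_' & '_cb_'"
fi
if ! has_all_tags "_bal_cb_" bal dvb8; then
    echo "Tag doesn't contain words '_bal_' & '_dvb8_'"
fi
```

Inside the loop of the case above, the call would look like has_all_tags "${site_array[$i,tag]}" bal dvb8 cb.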

See also

  • Parameter Substitution (Bash)
  • Regex (MySQL)
  • String comparison (Bash)

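Sources

  • https://www.networkworld.com/article/2693361/operating-systems/unix-tip-using-bash-s-regular-expressions.html - Limited
  • https://linuxreviews.org/Bash_Guide_for_Beginners_Chapter_4._Regular_expressions - Great overview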