Substring extraction (Bash)

From De Vliegende Brigade

How can you retrieve a substring in Bash, similar to the MySQL function substring_index? In September 2022, I ran into this problem when I needed to retrieve parts of the index of an associative array.

The problem

I use an associative array to emulate a multidimensional array. An example with two elements:

# Assigning values
########################################
#
declare -A imwiz   # required: without it, Bash treats imwiz as an indexed array
imwiz[393,title]="Hello, world!"
imwiz[393,name]="hello-world"
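The assignments above only behave as intended when the array is declared as associative. The following minimal sketch illustrates the difference; the array names plain and assoc are made up for this illustration:

```shell
#!/usr/bin/env bash
# Without "declare -A", Bash creates an indexed array and evaluates the
# subscript arithmetically: "393,title" yields the value of the unset
# variable "title", which is 0. Both assignments land on element 0.
unset plain title name
plain[393,title]="Hello, world!"
plain[393,name]="hello-world"
echo "plain: ${#plain[@]} element(s), indices: ${!plain[*]}"

# With "declare -A", the subscript is kept as a literal string
declare -A assoc
assoc[393,title]="Hello, world!"
assoc[393,name]="hello-world"
echo "assoc: ${#assoc[@]} element(s)"
```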

Inside a loop, each element of the array becomes available like this:

for i in "${!imwiz[@]}"
do
   echo "Index (i): $i"
   echo "Contents:  ${imwiz[$i]}"
   ...
done

  • How to efficiently extract the part before the comma from the index?
  • How to efficiently extract the part after the comma from the index?

Inventory

Parameter substitution example

This impressive answer uses nothing but parameter expansion:

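string="US/Central - 10:26 PM (CST)"
# Strip the shortest suffix matching " AM*" or " PM*": "US/Central - 10:26"
etime="${string% [AP]M*}"
# Strip the shortest prefix matching "* - ": "10:26"
etime="${etime#* - }"

echo "$etime"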

Leads to:

for i in 
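Applied to the comma-separated index from the problem statement, the same kind of expansion could look like the sketch below. The variable names key and field are hypothetical, and the index is assumed to contain exactly one comma:

```shell
#!/usr/bin/env bash
# Hypothetical index in the "key,field" form used above
i="393,title"

# Remove the shortest suffix matching ",*": keeps the part before the comma
key="${i%,*}"

# Remove the shortest prefix matching "*,": keeps the part after the comma
field="${i#*,}"

echo "key:   $key"
echo "field: $field"
```

If an index could contain more than one comma, ${i%%,*} (longest suffix) would cut at the first comma instead of the last one.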

Nicerobot's regular expression example

I suspect that this is the best solution: no external tools, no subshells, supposedly fast, and, as an extra, the chaining is probably what I actually need. A quick test (with an added line echo $NUM) showed that it indeed works. Now the details of this compact code:

FN=someletters_12345_moreleters.ext
[[ ${FN} =~ _([[:digit:]]{5})_ ]] && NUM=${BASH_REMATCH[1]}
echo $NUM

Details:

  • [[ ... ]]: A conditional expression
  • =~: Regular-expression match operator: the right-hand side is treated as an extended regular expression
  • _([[:digit:]]{5})_: Regular expression that matches exactly five digits between two underscores and stores the digits in a capture group (see below)
  • &&: Chains the commands: The following command will only be executed if the first command (the conditional expression) was successful
  • NUM: Just the name of a variable
  • BASH_REMATCH[1]: The text matched by the first capture group (BASH_REMATCH[0] holds the complete match)
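The same technique could split a "key,field" index into both parts in a single match. This is a sketch only: the pattern and the variable names re, key and field are assumptions, not part of the original answer:

```shell
#!/usr/bin/env bash
i="393,title"   # hypothetical index

# ^([^,]+) captures everything before the first comma,
# (.+)$    captures everything after it.
# Keeping the pattern in a variable avoids quoting surprises inside [[ ]].
re='^([^,]+),(.+)$'

if [[ $i =~ $re ]]; then
	key="${BASH_REMATCH[1]}"
	field="${BASH_REMATCH[2]}"
	echo "key:   $key"
	echo "field: $field"
fi
```

For a plain split on one comma, parameter expansion is probably simpler. The regular expression becomes attractive when the index also needs to be validated.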

Current solution

########################################################################
# test_loop_through_databases()
########################################################################
#
test_loop_through_databases()
{
	#
	# * This is a "test function": It contains a lot of comments and
	#   maybe some alternative code. It's meant for learning, not for
	#   production
	# * Keep this function: When I have to review this stuff in, e.g.,
	#   6 months, I might be really glad to have it
	# * By keeping this test code in a separate function, it isn't
	#   in the way of the production code (it quickly becomes
	#   annoying to me if production code contains comments that
	#   seem obvious to me at that time)
	# * When a row will be processed, use 'real' variables, not just
	#   "i"
	#
	#
	echo ""; echo ""; echo "# test_loop_through_databases()"; echo ""
	#
	for i in "${!site[@]}"
	do
		# Detect if we have a db entry (index ends in ",db")
		########################################
		#
		if [[ $i =~ ,db$ ]]; then


			# Retrieve database name
			########################################
			#
			site_db=${site[$i]}

			echo "	Database: ${site_db}"


			# Retrieve url (=index)
			########################################
			#
			# * Below, I want to retrieve the currency. To do so, I first
			#   need the first part of the index (=url)
			# * I suspect that this code is quite essential in many
			#   operations around 'pseudo multidimensional arrays'
			#
			site_index=$i
			echo "	site_index: $site_index"
			site_url=${site_index%,*}
			echo "	site_url: $site_url"


			# Retrieve currency
			########################################
			#
			# * First compile a new (compounded) index for retrieving the
			#   currency
			# * I'm surprised that I can have two parameter expansions
			#   in one expression. I thought that wasn't possible. This
			#   implies that I don't need to first explicitly create a
			#   new index
			#
			new_index=${i%,*},currency
			echo "	new_index: $new_index"
			site_currency=${site[${site_url},currency]}
			#
			echo "	Currency (direct):    ${site[${i%,*},currency]}"
			echo "	Currency (new_index): ${site[$new_index]}"
			echo "	Currency (real vars): ${site[${site_url},currency]}"
			echo "	site_currency:        $site_currency"
			#


			# Invoke update_prices_euro_site
			########################################
			#
			# [[ ${site[${i%,*},currency]} = "eur" ]] && update_prices_euro_site
			# [[ ${site[$site_url,currency]} = "eur" ]] && update_prices_euro_site
			#
			[[ $site_currency = "eur" ]] && update_prices_euro_site

		fi
	done
}
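To try the idea in isolation, here is a condensed, self-contained sketch. The site names, database names and currencies are invented sample data, and the euro_sites array is a hypothetical stand-in for the call to update_prices_euro_site:

```shell
#!/usr/bin/env bash
# Hypothetical sample data (invented for this sketch)
declare -A site
site[example.nl,db]="example_nl"
site[example.nl,currency]="eur"
site[example.co.uk,db]="example_uk"
site[example.co.uk,currency]="gbp"

euro_sites=()   # stand-in for invoking update_prices_euro_site

for i in "${!site[@]}"; do
	# Only handle the "<url>,db" entries
	[[ $i == *,db ]] || continue

	site_url="${i%,*}"                              # part before the comma
	site_currency="${site[${site_url},currency]}"   # sibling field of the same site

	echo "$site_url: db=${site[$i]} currency=$site_currency"
	[[ $site_currency == "eur" ]] && euro_sites+=("$site_url")
done

echo "Euro sites: ${euro_sites[*]}"
```

Note that Bash does not guarantee the iteration order of an associative array, so the order of the printed lines may differ between runs or versions.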


See also

Sources