Substring extraction (Bash)
Revision by Jeroen Strompf, 3 Oct 2022, 09:51
How to retrieve a substring, similar to the SQL function substring_index? In Sep. 2022, I ran into this problem when retrieving parts of the index of an associative array.
The problem
I use an associative array to emulate a multidimensional array, with the two dimensions joined by a comma in the index. An example with two elements:
# Assigning values
########################################
#
declare -A imwiz
imwiz[393,title]="Hello, world!"
imwiz[393,name]="hello-world"
Inside a loop, the elements of the array become available like this:
for i in "${!imwiz[@]}"
do
    echo "Index (i): $i"
    echo "Contents: ${imwiz[$i]}"
    ...
done
- How to efficiently extract the part before the comma from the index?
- How to efficiently extract the part after the comma from the index?
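A minimal sketch of how both questions could be answered with parameter expansion alone, assuming the index contains a comma that separates the two parts. The variable names row and column are made up for this illustration.

```bash
#!/usr/bin/env bash
# Sketch: split a "row,column" index using only Bash parameter expansion.
declare -A imwiz
imwiz[393,title]="Hello, world!"
imwiz[393,name]="hello-world"

for i in "${!imwiz[@]}"
do
    row="${i%%,*}"      # drop the longest suffix matching ",*": keeps the part before the first comma
    column="${i#*,}"    # drop the shortest prefix matching "*,": keeps the part after the first comma
    echo "Index: $i  row: $row  column: $column  contents: ${imwiz[$i]}"
done
```

With %% the longest match is removed, so the split happens at the first comma; a single % would split at the last comma instead. For an index with exactly one comma, both give the same result.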
Inventory
- awk
- Bash parameter substitution - Preferred
- cut
- expr
- Regular expressions - in combination with Bash
- sed
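To compare the options in this inventory, here is a rough sketch that applies each of them to the same index. It assumes the index 393,title from the example above; the variable names (before_awk, after_cut and so on) exist only for this comparison.

```bash
#!/usr/bin/env bash
# Sketch: the same split done with each approach from the inventory.
i="393,title"

# awk: -F sets the field separator, $1 is the first field
before_awk=$(awk -F, '{print $1}' <<< "$i")

# Bash parameter substitution: no subshell, no external process
before_param="${i%%,*}"
after_param="${i#*,}"

# cut: -d sets the delimiter, -f selects the field(s)
before_cut=$(cut -d, -f1 <<< "$i")
after_cut=$(cut -d, -f2- <<< "$i")

# expr: prints the part between \( and \) when the pattern matches from the start
before_expr=$(expr "$i" : '\([^,]*\),')

# Regular expression in Bash: capture groups end up in BASH_REMATCH
re='^([^,]*),(.*)$'
[[ $i =~ $re ]] && before_regex="${BASH_REMATCH[1]}" after_regex="${BASH_REMATCH[2]}"

# sed: delete everything from the first comma onwards
before_sed=$(sed 's/,.*//' <<< "$i")

echo "before: $before_awk $before_param $before_cut $before_expr $before_regex $before_sed"
echo "after:  $after_param $after_cut $after_regex"
```

Every $( ... ) starts a subshell and, for awk, cut, expr and sed, an external process as well. Inside a loop over many indexes, that is probably what makes parameter substitution and the built-in regular expression match the cheaper choices.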
Parameter substitution example
string="US/Central - 10:26 PM (CST)"
etime="${string% [AP]M*}"
etime="${etime#* - }"
echo "$etime"
Leads to:
for i in
Nicerobot's regular expression example
I suspect that this is the best solution: no outside tools, no subshells, supposedly fast, and as an extra, the chaining is probably what I actually need. A quick test showed that it indeed works (add a line echo $NUM). Now the details of this compact code:
FN=someletters_12345_moreleters.ext
[[ ${FN} =~ _([[:digit:]]{5})_ ]] && NUM=${BASH_REMATCH[1]}
echo $NUM
Details:
- [[ ... ]]: A condition
- =~: The condition will be a regular expression
- _([[:digit:]]{5})_: Regular expression that matches five digits between two underscores and stores the digits in a capture group (see below)
- &&: Chains the commands: the following command will only be executed if the first command (the conditional command) was successful
- NUM: Just the name of a variable
- BASH_REMATCH[1]: The first capture group of the last =~ match; BASH_REMATCH[0] holds the whole match
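The same technique might be applied to the array index from the problem statement. This is only a sketch: the pattern and the variable names re, row and column are chosen for this illustration, and it assumes an index of the form something,something.

```bash
#!/usr/bin/env bash
# Sketch: split an index such as "393,title" with a regular expression.
i="393,title"

# ^([^,]+)  capture group 1: one or more non-comma characters from the start
# ,         the separating comma itself
# (.+)$     capture group 2: everything after the first comma
re='^([^,]+),(.+)$'

if [[ $i =~ $re ]]; then
    whole="${BASH_REMATCH[0]}"    # the complete match
    row="${BASH_REMATCH[1]}"      # first capture group
    column="${BASH_REMATCH[2]}"   # second capture group
    echo "whole: $whole, row: $row, column: $column"
else
    echo "Index does not look like row,column: $i" >&2
fi
```

Keeping the pattern in a variable and using it unquoted on the right-hand side of =~ avoids quoting surprises, because a quoted pattern is matched as a literal string.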
Current solution
########################################################################
# test_loop_through_databases()
########################################################################
#
test_loop_through_databases() {

    #
    # * This is a "test function": It contains a lot of comments and
    #   maybe some alternative code. It's meant for learning, not for
    #   production
    # * Keep this function: When I have to review this stuff in e.g.,
    #   6 months, I might be really happy for it
    # * By keeping this test code in a separate function, it isn't
    #   in the way of the production code (because it becomes quickly
    #   annoying to me, if production code contains comments that are
    #   obvious for me at that time)
    # * When a row will be processed, use 'real' variables, not just
    #   "i"
    #
    #
    echo ""; echo ""; echo "# test_loop_through_databases()"; echo ""
    #
    for i in "${!site[@]}"
    do

        # Detect if we have a db entry
        ########################################
        #
        if [[ $i =~ db ]]; then

            # Retrieve database name
            ########################################
            #
            site_db=${site[$i]}
            echo " Database: ${site_db}"

            # Retrieve url (=index)
            ########################################
            #
            # * Below, I want to retrieve the currency. To do so, I first
            #   need the first part of the index (=url)
            # * I suspect that this code is quite essential in many
            #   operations around 'pseudo multidimensional arrays'
            #
            site_index=$i
            echo " site_index: $site_index"
            site_url=${site_index%,*}
            echo " site_url: $site_url"

            # Retrieve currency
            ########################################
            #
            # * First compile a new (compounded) index for retrieving the
            #   currency
            # * I'm surprised that I can have two parameter expansions
            #   in one expression. I thought that wasn't possible. This
            #   implies that I don't need to first explicitly create a
            #   new index
            #
            new_index=${i%,*},currency
            echo " new_index: $new_index"
            site_currency=${site[${site_url},currency]}
            #
            echo " Currency (direct): ${site[${i%,*},currency]}"
            echo " Currency (new_index): ${site[$new_index]}"
            echo " Currency (real vars): ${site[${site_url},currency]}"
            echo " site_currency: $site_currency"

            #
            # Invoke update_prices_euro_site
            ########################################
            #
            # [[ ${site[${i%,*},currency]} = "eur" ]] && update_prices_euro_site
            # [[ ${site[$site_url,currency]} = "eur" ]] && update_prices_euro_site
            # [[ $site_currency = "eur" ]] && update_prices_euro_site

        fi

    done
}
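For reference, a condensed sketch of the same lookup with data filled in, so that it can be run on its own. Everything about the data is an assumption: the real contents of site are not shown on this page, so the URLs, the database names, the field name db and the stand-in for update_prices_euro_site are invented for illustration. The array euro_dbs is likewise hypothetical and only there to make the result inspectable.

```bash
#!/usr/bin/env bash
# Sketch with made-up data: keys and values below are assumptions, not the real configuration.
declare -A site
site[example.com,db]="example_com_db"
site[example.com,currency]="eur"
site[example.org,db]="example_org_db"
site[example.org,currency]="usd"

# Stand-in for the real update_prices_euro_site, which is not shown on this page
update_prices_euro_site() {
    echo "Updating euro prices in database: $site_db"
}

euro_dbs=()   # hypothetical: collects the databases that were selected

for i in "${!site[@]}"
do
    # Only handle the db entries: the index has to end in ",db"
    [[ $i == *,db ]] || continue

    site_url="${i%,*}"                              # index without its last ",field" part
    site_db="${site[$i]}"
    site_currency="${site[${site_url},currency]}"   # compound index built in place

    if [[ $site_currency == "eur" ]]; then
        euro_dbs+=("$site_db")
        update_prices_euro_site
    fi
done
```

The glob test *,db is stricter than a bare =~ db, which would also match an index whose URL part happens to contain "db". Whether that matters depends on the actual keys in site.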