Associative arrays (Bash)
Version of 7 Nov 2022 17:30
An associative array is an array where the index can be symbolic, rather than only numerical. E.g.:
    declare -A j
    j[fruit]=apple
    j[color]=blue
- You can use associative arrays to mimic multi-dimensional arrays, with emphasis on mimic
- Associative arrays are new in Bash 4. To verify which version of Bash you have: bash --version
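A script can also check the version itself before relying on associative arrays. The following is a minimal sketch, assuming the built-in BASH_VERSINFO array is used for the test; the error message is just an illustration:

```bash
#!/usr/bin/env bash
# BASH_VERSINFO[0] holds the major version number of the running Bash.
if ((BASH_VERSINFO[0] < 4)); then
    echo "Associative arrays need Bash 4 or newer (found $BASH_VERSION)" >&2
    exit 1
fi

# From here on, declare -A is safe to use.
declare -A j
j[fruit]=apple
j[color]=blue
echo "fruit: ${j[fruit]} - color: ${j[color]}"
```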
There are no multidimensional arrays
Bash doesn't have multidimensional arrays (as of 2022.09.29). Associative arrays aren't multidimensional arrays either, but they can emulate them. This emulation has some limitations, which can be tricky if you're not aware of them.
As an example:
    unset j
    declare -A j

    j[0,0]="00"
    j[0,1]="01"
    j[0, 1]="0 1"

    for i in "${!j[@]}"
    do
        echo "Index: $i - Value: ${j[$i]}"
    done

    echo "Length of this array: ${#j[@]}"
Output:
    Index: 0, 1 - Value: 0 1
    Index: 0,1 - Value: 01
    Index: 0,0 - Value: 00
    Length of this array: 3
What this shows:
- The entry with index [0, 1] is not the same as the entry with index [0,1]. This shows that everything between [] is regarded as just one index and not as something multidimensional
- When retrieving the dimension of the array, it returns only one number. Because it's still just a vector.
But does this actually matter? Sometimes it probably doesn't: It took me a while between adopting associative arrays and realizing their limitations. So far, these are the issues I've encountered:
- Retrieve number of rows? There is no meaningful way of retrieving the number of rows, as the matrix is actually turned into a vector
- Iterate over rows? As you don't have real rows, you can't iterate over them either
- Retrieve a specific entry: This is probably the hardest challenge: How to retrieve an entry with a specific index? In an efficient way, concerning CPU time and code overhead?
- More?
These issues are discussed in the following chapters:
Retrieve number of rows or columns?
A problem with these 'emulated' multi-dimensional arrays: You can't read out the number of rows or columns, for there are no real rows and columns. Illustration:
    # The array below is 3x2
    #
    unset j
    declare -A j

    j[1,1]="11"
    j[1,2]="12"
    j[2,1]="21"
    j[2,2]="22"
    j[3,1]="31"
    j[3,2]="32"

    echo "Length: ${#j[@]}"
    echo "Complete array: ${j[@]}"
Output:
    Length: 6
    Complete array: 21 22 31 32 12 11
This can be a problem when you want to loop over the rows: you can't do that directly. But see below for a solution.
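If the dimensions are really needed, they can be derived from the indices. This is only a sketch, assuming every index has the form row,column without spaces; the helper arrays seen_rows and seen_cols are names made up for this example:

```bash
#!/usr/bin/env bash
declare -A j=([1,1]=11 [1,2]=12 [2,1]=21 [2,2]=22 [3,1]=31 [3,2]=32)

# Collect the distinct row and column parts of every index.
declare -A seen_rows=() seen_cols=()
for key in "${!j[@]}"
do
    seen_rows[${key%%,*}]=1    # part before the first comma
    seen_cols[${key##*,}]=1    # part after the last comma
done

echo "Rows: ${#seen_rows[@]} - Columns: ${#seen_cols[@]}"
```

This has to walk the whole array once, so storing the dimensions in separate variables is cheaper when they are known in advance.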
Iterate over rows?
If there is not really such a thing as rows and columns, then how to iterate over them?
That's probably not so difficult. Just remember that the 'multidimensional index' is just one index, and that you can iterate over it. E.g.:
    for i in "${!imwiz[@]}"
    do
        echo "Index: $i"
        echo "Value: ${imwiz[$i]}"
    done
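The loop above visits the entries in whatever order Bash happens to store them. To walk through them row by row, the indices can be sorted first. A minimal sketch, assuming numerical row,column indices and the external sort command; sorted_keys is a name made up for this example:

```bash
#!/usr/bin/env bash
declare -A j=([1,1]=11 [1,2]=12 [2,1]=21 [2,2]=22 [3,1]=31 [3,2]=32)

# Sort the indices numerically: first on the row part, then on the column part.
mapfile -t sorted_keys < <(printf '%s\n' "${!j[@]}" | sort -t, -k1,1n -k2,2n)

for key in "${sorted_keys[@]}"
do
    row=${key%%,*}
    col=${key##*,}
    echo "Row $row, column $col: ${j[$key]}"
done
```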
Retrieve a specific entry?
How to retrieve a specific entry? I don't have the final answer yet, but it's coming:
- Substring extraction (Bash)
- Below: Chapter about having a numerical index.
Scope
When an associative array is defined through declare -gA, the array is available in all recursively invoked functions within the same shell. E.g.:
    ################################################################################
    # Associative arrays & scope
    ################################################################################
    #
    function sub1()
    {
        echo "Within sub1: ${j[@]}"
        sub2
    }


    function sub2()
    {
        echo "Within sub2: ${j[@]}"
    }


    # Main
    ########################################
    #
    unset j
    declare -gA j

    j["foo"]=1
    j["bar"]=2

    echo "Within main: ${j[@]}"

    sub1
Output:
    Within main: 1 2
    Within sub1: 1 2
    Within sub2: 1 2
Results are the same if the two functions are defined in reverse order in this script.
This is a different situation from where data has to cross into another shell process; see the next section.
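The -g flag matters most when the array is declared inside a function: without it, declare creates a local variable that is gone once the function returns. A small sketch to illustrate; the function and array names are made up for this example:

```bash
#!/usr/bin/env bash
# Without -g: the array is local to the function.
make_local()
{
    declare -A local_arr
    local_arr[foo]=1
}

# With -g: the array is created at global scope.
make_global()
{
    declare -gA global_arr
    global_arr[foo]=1
}

make_local
make_global

echo "Entries in local_arr after the call:  ${#local_arr[@]}"
echo "Entries in global_arr after the call: ${#global_arr[@]}"
```

The first count is 0, the second is 1.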
Export to subshells
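Associative arrays cannot be exported to child shells the way ordinary variables or functions can [https://stackoverflow.com/questions/12944674/how-to-export-an-associative-array-hash-in-bash]. (A subshell created with ( ... ) is a copy of the current shell and does see the array; the limitation applies to separately started processes, such as a script called from this one.) To make the content of an associative array available there, you might have to use some tricks: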
- Export only needed values as variables
- Use files for storage & retrieval - Will probably have quite some overhead
- Convert the associative array to several regular arrays for each index - This doesn't seem too hard.
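One more trick, not in the list above and only a sketch: serialize the array definition with declare -p into an ordinary string variable, export that string, and let the child rebuild the array with eval. The variable name J_SERIALIZED is made up for this example, and eval should only be used on data you trust:

```bash
#!/usr/bin/env bash
declare -A j=([foo]=1 [bar]=2)

# declare -p prints a statement that recreates the array; store it as a string.
export J_SERIALIZED
J_SERIALIZED="$(declare -p j)"

# The child is a separate Bash process: it only sees the exported string
# and rebuilds the array from it.
child_output="$(bash -c '
    eval "$J_SERIALIZED"
    echo "child sees foo=${j[foo]} bar=${j[bar]}"
')"

echo "$child_output"
```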
Conventions
To alleviate some of the issues discussed before:
- When arrays have a numerical index, always start with the same number ⇒ I prefer base 1, just as in matrix algebra
- For two-dimensional arrays, the first dimension is always rows (x) and the second is always columns (y) - Just as in matrix algebra
- Don't use spaces around the , that separates indices: You need a uniform syntax, and using spaces actually messes up language highlighting in Sublime Text
- Store the dimensions in associated variables (when needed). E.g., in the example above: j_rows=3 and j_columns=2
Examples
As stated above, these are not really multidimensional arrays, just arrays with fancy indices. It doesn't matter if these indices are numerical or symbolic:
    declare -A j

    j[0,0,0]="000"
    j[0,0,1]="001"
    j[0,1,0]="010"
    j[0,1,1]="011"
    j[1,0,0]="100"
    j[1,0,1]="101"
    j[1,1,0]="110"
    j[1,1,1]="111"

    echo "${j[0,0,0]} ${j[0,0,1]} ${j[0,1,0]} ${j[0,1,1]}"
    echo "${j[1,0,0]} ${j[1,0,1]} ${j[1,1,0]} ${j[1,1,1]}"

    unset j
    declare -A j

    j[fruit,one]=Mango
    j[fruit,two]=Apple
    j[bird,1]=Cockatail
    j[bird,2]=Spottingbird
    j[flower,1]=Rose
    j[flower,2]=Sunflower
    j[animal]=Tiger

    for i in "${j[@]}"
    do
        echo "Entry: $i"
    done
Output:
    Entry: Cockatail
    Entry: Spottingbird
    Entry: Rose
    Entry: Sunflower
    Entry: Tiger
    Entry: Apple
    Entry: Mango
Note that the entries seem to appear in arbitrary order
Loop through an array
Again: These are not multidimensional arrays. They only seem that way.....
This works [2]:
    declare -A j

    j[fruit]=Mango
    j[bird]=Cockatail
    j[flower]=Rose
    j[animal]=Tiger

    for i in "${j[@]}"
    do
        echo "Entry: $i"
    done
Output:
    Entry: Mango
    Entry: Rose
    Entry: Tiger
    Entry: Cockatail
This works:
    declare -A j

    j[fruit,1]=Mango
    j[fruit,2]=Apple
    j[bird,1]=Cockatail
    j[bird,2]=Spottingbird
    j[flower,1]=Rose
    j[flower,2]=Sunflower
    j[animal,1]=Tiger
    j[animal,1]=Mouse    # Same index as the line above, so this overwrites Tiger

    for i in "${j[@]}"
    do
        echo "Entry: $i"
    done
Output:
    Entry: Cockatail
    Entry: Spottingbird
    Entry: Rose
    Entry: Sunflower
    Entry: Mouse
    Entry: Mango
    Entry: Apple
The only problem: The entries seem to be quite random. This is also the case if I insert the statement unset j at the beginning of the script.
Looping through the rows of an associative array: This one isn't as cool as the code before, because the index is given explicitly:
    for ((i=1; i<=number_of_sites; i++))
    do
        echo "Row ${i}: ${site[$i,1]} & ${site[$i,2]}"
    done
Loop through the index of an array
Use the symbol ! to refer to an array's index, rather than its content. Remember that with associative arrays, you define the index yourself. There is no numerical index:
    declare -A j

    j[fruit,1]=Mango
    j[fruit,2]=Apple
    j[bird,1]=Cockatail
    j[bird,2]=Spottingbird
    j[flower,1]=Rose
    j[flower,2]=Sunflower
    j[animal,1]=Tiger
    j[animal,1]=Mouse

    for i in "${!j[@]}"
    do
        echo "Index: $i"
    done
With output:
    Index: bird,1
    Index: bird,2
    Index: flower,1
    Index: flower,2
    Index: animal,1
    Index: fruit,1
    Index: fruit,2
Loop over index + value
Again, not very exciting, but maybe instructive at times:
    unset j
    declare -A j

    j[fruit,one]=Mango
    j[fruit,two]=Apple
    j[bird,1]=Cockatail
    j[bird,2]=Spottingbird
    j[flower,1]=Rose
    j[flower,2]=Sunflower
    j[animal]=Tiger

    for i in "${!j[@]}"
    do
        echo "Index: $i - Value: ${j[$i]}"
    done
Output:
    Index: bird,1 - Value: Cockatail
    Index: bird,2 - Value: Spottingbird
    Index: flower,1 - Value: Rose
    Index: flower,2 - Value: Sunflower
    Index: animal - Value: Tiger
    Index: fruit,two - Value: Apple
    Index: fruit,one - Value: Mango
Length of an array
Use the symbol # to retrieve the length of an array. Since associative arrays are just vectors with fancy indices, there is only one dimension to retrieve: Its length:
    unset j
    declare -A j

    j[0,0]="00"
    j[0,1]="01"
    j[1,0]="10"
    j[1,1]="11"
    j[2,0]="20"
    j[2,1]="21"

    echo "Length: ${#j[@]}"
Output:
Length: 6
Have a numerical index?
Example: Let's consider a matrix like:
    example.com   example_com   us_en
    example.nl    example_nl    nl_en
    example.de    example_de    de_en
Without numerical index
In Sep. 2022, I found it attractive to use an associative array like this:
    site[example.com,example_com]
    site[example.com,us_en]
    site[example.nl,example_nl]
    site[example.nl,nl_en]
    site[example.de,example_de]
    site[example.de,de_en]

    site_rows=3
- Advantage: No additional 'column' for primary keys - Small matrix
- Disadvantage: It becomes a bit tricky to collect the data that would be part of one 'row': I have to use one of the other fields as a makeshift primary key.
With numerical index
Let's include a numerical index like this:
    site[1,example.com example_com]
    site[1,example.com us_en]
    site[2,example.nl example_nl]
    site[2,example.nl nl_en]
    site[3,example.de example_de]
    site[3,example.de de_en]

    site_rows=3
- Disadvantage: Additional index - But no additional rows
- Advantage: It's easier to collect the data that would be part of a 'row' as there is now a genuine index.
However, in practice it's much easier to loop through this array and use its entries. E.g.:
    for ((i=1; i<=site_rows; i++))
    do
        #
        # Extract row entities
        ########################################
        #
        site_cat=${site[$i,cat]}
        site_url=${site[$i,url]}
        site_db=${site[$i,db]}


        # Check
        ########################################
        #
        echo ""; echo "### Loop - row: $i - site_url: $site_url"
        #
        echo " site_cat: $site_cat"
        echo " site_url: $site_url"
        echo " site_db: $site_db"


        # Execute
        ########################################
        #
        backup_database
        disable_woocommerce_attribute_lookup
        delete_transients
        wp_update_site
        #
    done
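The loop above expects entries like site[1,url] to exist already. One possible way to fill such an array is sketched below. The helper add_site is made up for this example, and the mapping of the matrix columns to the field names url, db and cat is an assumption:

```bash
#!/usr/bin/env bash
declare -A site
site_rows=0

# add_site URL DB CAT - Append one 'row' and keep the row counter up to date.
add_site()
{
    site_rows=$((site_rows + 1))
    site[$site_rows,url]=$1
    site[$site_rows,db]=$2
    site[$site_rows,cat]=$3
}

add_site example.com example_com us_en
add_site example.nl  example_nl  nl_en
add_site example.de  example_de  de_en

for ((i=1; i<=site_rows; i++))
do
    echo "Row $i: ${site[$i,url]} - ${site[$i,db]} - ${site[$i,cat]}"
done
```

This way site_rows can't get out of sync with the actual number of rows.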
Additional index vs. separate array
I experimented with separate arrays for different sites (as part of a server update script), but it didn't work very well: For every array I had to duplicate the loop to do the work. I also couldn't concatenate these arrays, for there wouldn't be a unique PK anymore.
It seemed much easier to create one large table, and include an index tag with values like e.g., zwk_woo to indicate customer zwk and that this is a WooCommerce site. In a loop, it would be easy to take these into account.
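Taking the tag into account inside the loop could look roughly like this. It is only a sketch: apart from zwk_woo, the tags, the site names and the array woo_sites are invented for the example, and the convention that a WooCommerce tag ends in _woo is an assumption:

```bash
#!/usr/bin/env bash
declare -A site

site[1,url]="example.com";  site[1,tag]="zwk_woo"
site[2,url]="example.nl";   site[2,tag]="zwk_wiki"
site[3,url]="example.org";  site[3,tag]="abc_woo"
site_rows=3

# Collect only the rows whose tag ends in _woo.
woo_sites=()
for ((i=1; i<=site_rows; i++))
do
    if [[ ${site[$i,tag]} == *_woo ]]
    then
        woo_sites+=("${site[$i,url]}")
    fi
done

echo "WooCommerce sites: ${woo_sites[*]}"
```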
Detect a missing entry?
Problem
I use an associative array for translations. Consider these items:
    ((i++))
    tr[$i,tag]="_empty_pt_strange"
    tr[$i,nl]="Zwuk - Overig"
    tr[$i,en]=""


    ((i++))
    tr[$i,tag]="_px_"
    tr[$i,nl]="Stofzuiger"
    tr[$i,de]="Staubsauger"
- In the first row, something is translated to an empty string - That's fine!
- In the second row, there is a German translation, but no English translation.
The problem: How to distinguish between an 'empty translation' and a missing translation?
Ideas
- In the first example, one of the tags is _empty_ so that could be used, but that's computationally intensive, plus human as I am, I'm likely to forget to include this tag at times.
- Can you distinguish between an empty entry and a non-existent entry? → Yes. See solution
- Can you detect a missing index? → This would be ideal → Nope
Solution
Check that the entry is set:
    unset j
    declare -gA j

    j[one]="Eén"
    j[two]=""

    [[ -v j[one] ]]   && echo " true - j[one]"     # True: Entry exists (and may be empty)
    [[ -v j[two] ]]   && echo " true - j[two]"     # True: Entry exists (and may be empty)
    [[ -v j[three] ]] && echo " true - j[three]"   # False: Entry doesn't exist - unset
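Applied to the translation table from the problem description, the same test could be wrapped in a small function. This is a sketch only: the function name translation_status is made up, and testing array entries with -v needs a reasonably recent Bash (4.3 or newer, as far as I know):

```bash
#!/usr/bin/env bash
declare -A tr

tr[1,tag]="_empty_pt_strange"
tr[1,nl]="Zwuk - Overig"
tr[1,en]=""

tr[2,tag]="_px_"
tr[2,nl]="Stofzuiger"
tr[2,de]="Staubsauger"

# translation_status ROW LANGUAGE - Prints "missing", "empty" or "set".
translation_status()
{
    local key="$1,$2"

    if [[ ! -v tr[$key] ]]
    then
        echo "missing"
    elif [[ -z ${tr[$key]} ]]
    then
        echo "empty"
    else
        echo "set"
    fi
}

echo "Row 1, en: $(translation_status 1 en)"
echo "Row 2, en: $(translation_status 2 en)"
echo "Row 2, de: $(translation_status 2 de)"
```

Row 1 has an empty English translation, row 2 has none at all, and the German entry of row 2 is simply set.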
Alternatives?
It's still messy. Let's have an open mind concerning alternatives:
Awk
And for something entirely different: I keep moving from spreadsheets to associative arrays and back. A while ago I saw on YouTube Gary Explains: EVERYONE Needs to Learn a Little Bit of AWK! - Maybe awk is what I have been looking for my whole life?
Database table
This actually sounds like a perfect job for a database table.
Python?
Maybe use Python for this, rather than Bash?
Spreadsheet?
Maybe retrieve data from a spreadsheet?
A basic reason for not using spreadsheets for this kind of data: Just like you wouldn't use a word processor for programming, but rather an editor, a spreadsheet is not precise enough. Think of auto-corrections like capitalisation and changed dashes, and of not being able to store whitespace reliably.
See also
- Awk
- Declare (Bash)
- String comparison (Bash)
- Subshells (Bash)
- Substring extraction (Bash)
- Unset (Bash)