Command interpreters and scripting languages like the Bash shell are essential tools of any operating system. Here’s how to use powerful data structures in Bash called associative arrays or hashes.
In Bash, a hash is a data structure that can contain many sub-variables, of the same or different kinds, but indexes them with user-defined text strings, or keys, instead of fixed numeric identifiers. Besides being extremely flexible, hashes also make scripts more readable. If you need to process the areas of certain countries, for example, a syntax like:
print area_of('Germany')
would be as self-documenting as it can be, right?
How to create and fill Bash hashes
Bash hashes must be declared with the uppercase A switch (meaning Associative Array), and can then be filled by listing all their key/value pairs with this syntax:
# Country areas, in square miles
declare -A area_of
area_of=( [Italy]="116347" [Germany]="137998" [France]="213011" [Poland]="120728" [Spain]="192476" )
The first thing to notice here is that the order in which the elements are declared is irrelevant. The shell will just ignore it, and store everything according to its own internal algorithms. As proof, this is what happens when you retrieve those data as they were stored:
echo "${area_of[*]}"
213011 120728 137998 192476 116347
echo "${!area_of[*]}"
France Poland Germany Spain Italy
By default, the asterisk inside the square brackets extracts all and only the values of a hash. Adding the exclamation mark, instead, retrieves the hash keys. But in both cases there is no easily recognizable order.
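A hash does not have to be filled in one statement. As a minimal sketch, reusing the figures above and assuming Bash 4.0 or later (the first release with associative arrays), elements can also be assigned one at a time or appended in groups:

```bash
#!/usr/bin/env bash
# Associative arrays need Bash 4.0 or later.
declare -A area_of

# Assign single elements, one key at a time.
area_of[Italy]=116347
area_of[Germany]=137998

# Append several pairs at once with +=, keeping the existing ones.
area_of+=( [France]=213011 [Poland]=120728 [Spain]=192476 )

echo "Number of countries: ${#area_of[@]}"
echo "Area of Spain: ${area_of[Spain]}"
```

Keys that contain spaces or special characters should be quoted, as in area_of["New Key"]=1.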
You may also populate a hash dynamically, by calling other programs. If you, for example, had another shell script called hash-generator, that outputs all the pairs as one properly formatted string:
#! /bin/bash
printf '[Italy]="116347" [Germany]="137998" [France]="213011" [Poland]="120728" [Spain]="192476"'
calling hash-generator in this way from the script that actually uses the area_of hash:
VALS=$( hash-generator )
eval "declare -A area_of=( $VALS )"
would fill that hash with exactly the same keys and values. Of course, the message here is that “hash-generator” can be any program, maybe much more powerful than Bash, as long as it can output data in that format. To fill a hash with the content of an already existing plain text file, instead, follow these suggestions from Stack Overflow.
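When the data source is not fully trusted, or is just a plain text file, the hash can be filled without eval. The following is only a sketch: generate_pairs is a hypothetical stand-in for any program or file that emits one key=value pair per line, which is an assumed format, not the one used by hash-generator above.

```bash
#!/usr/bin/env bash
# generate_pairs is a hypothetical stand-in for any program or file
# that emits one "key=value" pair per line.
generate_pairs() {
    printf '%s\n' 'Italy=116347' 'Germany=137998' 'France=213011'
}

declare -A area_of

# Split each line at the first "=": the key goes to country, the rest to area.
while IFS='=' read -r country area; do
    [[ -n $country ]] || continue   # skip empty lines
    area_of[$country]=$area
done < <(generate_pairs)

echo "Loaded ${#area_of[@]} countries; Germany: ${area_of[Germany]}"
```

To read from a file instead, replace the last part of the loop with done < somefile.txt. The process substitution keeps the loop in the current shell, so the hash is still filled after the loop ends.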
How to process hashes
The exact syntax to refer to a specific element of a hash, or delete it, is this:
echo "${area_of[Germany]}"
unset 'area_of[Germany]'
To erase a whole hash, pass just its name to unset, and then re-declare it:
unset area_of
declare -A area_of
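Put together, looking up, deleting and checking for an element could look like the sketch below. It uses a shortened copy of the area_of hash and the ${name[key]+word} expansion, which yields word only when the element exists:

```bash
#!/usr/bin/env bash
declare -A area_of=( [Italy]=116347 [Germany]=137998 [France]=213011 )

# Look up one element.
echo "Germany: ${area_of[Germany]}"

# Delete one element; the quotes stop the shell from treating [...] as a glob pattern.
unset 'area_of[Germany]'

# ${name[key]+word} expands to word only if the element is set.
if [[ -z ${area_of[Germany]+isset} ]]; then
    echo "Germany is no longer in the hash"
fi

echo "Remaining keys: ${#area_of[@]}"
```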
The number of key/value pairs stored in a hash is returned by the special expansion “${#HASHNAME[@]}” (don’t look at me, I did not invent this syntax). But if all you need is to process all the elements of a hash, regardless of their number or internal order, just follow this example:
for country in "${!area_of[@]}"
do
    echo "Area of $country: ${area_of[$country]} square miles"
done
whose output is:
Area of France: 213011 square miles
Area of Poland: 120728 square miles
Area of Germany: 137998 square miles
Area of Spain: 192476 square miles
Area of Italy: 116347 square miles
You can use basically the same procedure to create a “mirror” hash, with keys and values inverted:
declare -A country_whose_area_is
for country in "${!area_of[@]}"; do
country_whose_area_is[${area_of[$country]}]=$country
done
Among other things, this “mirroring” may be the easiest way to process the original hash looking at its values, instead of keys.
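One caveat: a mirror hash only works when the values are unique, because two countries with the same area would overwrite each other. A hedged sketch of the same loop with a duplicate check (the warning message and the choice to keep the first entry are illustrative assumptions):

```bash
#!/usr/bin/env bash
declare -A area_of=( [Italy]=116347 [Germany]=137998 [France]=213011 )
declare -A country_whose_area_is

for country in "${!area_of[@]}"; do
    area=${area_of[$country]}
    # Warn instead of silently overwriting if two countries share an area.
    if [[ -n ${country_whose_area_is[$area]+isset} ]]; then
        echo "Duplicate area $area: keeping ${country_whose_area_is[$area]}" >&2
        continue
    fi
    country_whose_area_is[$area]=$country
done

echo "137998 square miles: ${country_whose_area_is[137998]}"
```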
How to sort hashes
If hash elements are stored in semi-random sequences, what is the most efficient way to handle them in alphanumerical order? The answer depends on what exactly should be ordered, and when. In the many cases where only the final output of a loop needs sorting, all it takes is a sort command right after the closing statement:
for country in "${!area_of[@]}"
do
    echo "$country: ${area_of[$country]} square miles"
done | sort
This sorts the output by key (even if keys were not retrieved in that order!):
France: 213011 square miles
Germany: 137998 square miles
Italy: 116347 square miles
Poland: 120728 square miles
Spain: 192476 square miles
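If the sorted order is needed more than once, it may be simpler to sort the keys a single time and keep them in a regular array. A minimal sketch, assuming Bash 4.0 or later for mapfile and keys that contain no newlines (sorted_countries is a name introduced here for illustration):

```bash
#!/usr/bin/env bash
declare -A area_of=( [Italy]=116347 [Germany]=137998 [France]=213011 [Poland]=120728 [Spain]=192476 )

# Sort the keys once and store them in a regular indexed array.
# Assumes that no key contains a newline.
mapfile -t sorted_countries < <(printf '%s\n' "${!area_of[@]}" | sort)

# Every later loop can now walk the hash in alphabetical key order.
for country in "${sorted_countries[@]}"; do
    echo "$country: ${area_of[$country]} square miles"
done
```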
Sorting the same lines numerically, by country area, is almost as easy. Prepending the areas at the beginning of each line:
for aa in "${!area_of[@]}"
do
    printf "%s|%s = %s square miles\n" "${area_of[$aa]}" "$aa" "${area_of[$aa]}"
done
yields lines like these:
213011|France = 213011 square miles
120728|Poland = 120728 square miles
137998|Germany = 137998 square miles
that, while still unsorted, now start with just the strings on which we want to sort. Therefore, using sort again, but piped to the cut command with “|” as column separator:
for aa in "${!area_of[@]}"
do
    printf "%s|%s = %s square miles\n" "${area_of[$aa]}" "$aa" "${area_of[$aa]}"
done | sort -n | cut '-d|' -f2-
will sort by areas and then remove them, to finally produce the desired result:
Italy = 116347 square miles
Poland = 120728 square miles
Germany = 137998 square miles
Spain = 192476 square miles
France = 213011 square miles
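A word of caution: without options, sort compares text, not numbers, so it orders areas correctly only as long as they all have the same number of digits, as the five country areas here do. The sketch below shows the difference with two figures of different length, using the -n (numeric) switch of sort:

```bash
#!/usr/bin/env bash
# Two areas with a different number of digits.
lexical=$(printf '%s\n' 9500 116347 | sort | head -n 1)
numeric=$(printf '%s\n' 9500 116347 | sort -n | head -n 1)

# Text comparison ranks "1..." before "9...", whatever the actual magnitude.
echo "Plain sort puts first:   $lexical"
echo "Numeric sort puts first: $numeric"
```

For values of mixed length, sort -n is therefore the safer choice.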
Multi-level hashes
While Bash does not support nested, multi-level hashes, it is possible to emulate them with some auxiliary arrays. Consider this code, that stores the areas of European regions, while also cataloging them by country:
1 declare -a european_regions=('Bavaria' 'Lazio' 'Saxony' 'Tuscany')
2 declare -a european_countries=('Italy' 'Germany')
3 declare -A area_of_country_regions
4 area_of_country_regions=( [Lazio in Italy]="5000" [Tuscany in Italy]="6000" [Bavaria in Germany]="9500" [Saxony in Germany]="7200" )
5
6 for country in "${european_countries[@]}"
7 do
8 for region in "${european_regions[@]}"
9 do
10 cr="$region in $country"
11 if test "${area_of_country_regions[$cr]+isset}"
12 then
13 printf "Area of %-20.20s: %s\n" "$cr" "${area_of_country_regions[$cr]}"
14 fi
15 done
16 done
The code creates two normal arrays, one for countries and one for regions, plus one hash with composite keys that associate each region to its country and emulate a two-level hash. The code then generates all possible combinations of regions and countries, but only processes existing elements of area_of_country_regions, recognizing them with the +isset test of line 11. Rough, but effective, isn’t it?
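A possible variation, sketched here under the assumption that no country or region name contains the “|” character, is to walk only the keys that actually exist and split each one at a separator, which avoids generating every combination:

```bash
#!/usr/bin/env bash
# Composite keys in "country|region" form; "|" is an assumed separator.
declare -A area_of_country_regions=(
    ["Italy|Lazio"]=5000
    ["Italy|Tuscany"]=6000
    ["Germany|Bavaria"]=9500
    ["Germany|Saxony"]=7200
)

# Walk only the keys that exist, then split each one at the separator.
for key in "${!area_of_country_regions[@]}"; do
    country=${key%%|*}   # text before the first "|"
    region=${key#*|}     # text after the first "|"
    printf 'Area of %s in %s: %s\n' "$region" "$country" "${area_of_country_regions[$key]}"
done | sort
```

The auxiliary arrays are no longer needed, at the price of having to pick a separator that never appears in the data.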