JSON is ubiquitous. Maybe you work with a service that ingests or produces JSON output.
And maybe you want to be able to automate some tasks using shell scripts. However, the
hierarchical structure of JSON makes this difficult using
jq is a utility that lets you “slice and filter and map and transform structured data with the same ease that sed, awk, grep and friends let you play with text”. In this post, I’ll show how I’ve used
jq as part of a backup script.
I needed to back up some databases hosted using AWS’s RDS service. These databases can only be accessed from IP addresses in the same subnet. I needed to run the backups every day, but didn’t want to pay for and EC2 instance up to be active all day for less than 10 minutes of work. So, I configured an instance that I would use for the backups, and placed it in a “stopped” state. My script would then do the following.
- Start the EC2 instance
- Wait until the server has started.
- Get service IP address
- Wait for the instance to accept an SSH connection.
- Fetch DB credentials for Database
pg_dumpvia ssh and store the dump files locally.
The third and fifth steps require
Getting Server IP address.
Once the server is started it’s details can be accessed via
aws ec2 describe-instances.
This command will return details about CPU architecture, subnet configuration, mounted volumes.
All we need is the public IP address.
One of the basic functions of
jq filtering JSON objects.
1 2 INSTANCE_DESCRIPTION=$(aws ec2 describe-instances --instance-ids $PROXY_SERVER_ID) IP_ADDRESS=$(echo $INSTANCE_DESCRIPTION | jq -r '.Reservations.Instances.NetworkInterfaces.Association.PublicIp')
In the above snippet. First, the full description is loaded into
$INSTANCE_DESCRIPTION. Then this output is
jq. The public ip address is then accessed at the path
This syntax should be fairly familiar.
The initial dot operator (
.) is a reference to the root object.
The path then drills down the hierarchy using array accesses when necessary
jq allows access to multiple array indices.
Because we specified the instance id (
--instance-ids $PROXY_SERVER_ID), we only need the first (and only) index.
-r indicates that
jq will output “raw result”, rather than a string wrapped in quote marks.
Fetching DB credentials
Fetching the DB credentials is much the same. The raw blob is fetched via AWS command-line tools.
Then parse out the desired data use
1 2 3 4 5 6 7 8 9 10 11 DB_SECRET_PATH=[DB PATH WITHIN AWS SECRETS] SECRET=$(aws secretsmanager get-secret-value --secret-id $DB_SECRET_PATH | jq -r '.SecretString') DBUSER=$(echo $SECRET | jq '.username') DBNAME=$(echo $SECRET | jq '.dbname') PASS=$(echo $SECRET | jq '.password') HOST=$(echo $SECRET | jq '.host') DATE_STR=$(date +"%y-%m-%d") ssh ubuntu@$IP_ADDRESS -i $SSH_KEY \ "echo $PASS | pg_dump --host=$HOST --user=$DBUSER $DBNAME | gzip" \ > $BACKUP_DIR/db-name.dump.$DATE_STR.sql.gz
Parsing out the secret is a little more complicated because the secret body is a serialized JSON string within a JSON document.
1 SECRET=$(aws secretsmanager get-secret-value --secret-id $DB_SECRET_PATH | jq -r '.SecretString')
The result of
aws secretsmanager get-secret-value
includes the secret body at the path
.SecretString. Again the raw result is stored in an ENV variable.
Next, each value is picked off the resulting secret object.
1 2 3 4 DBUSER=$(echo $SECRET | jq '.username') DBNAME=$(echo $SECRET | jq '.dbname') PASS=$(echo $SECRET | jq '.password') HOST=$(echo $SECRET | jq '.host')
$INSTANCE_DESCRIPTION in the previous section, we pipe the JQ string through
jq and pick specific value by name.
The values are then used to connect to the database.
1 2 3 4 DATE_STR=$(date +"%y-%m-%d") ssh ubuntu@$IP_ADDRESS -i $SSH_KEY \ "echo $PASS | pg_dump --host=$HOST --user=$DBUSER $DBNAME | gzip" \ > $BACKUP_DIR/db-name.dump.$DATE_STR.sql.gz
BONUS! echoing password for pg_dump
Like other Postgres utilities
pg_dump doesn’t accept passwords as a command-line argument.
It reads them from standard input.
We can avoid user interaction can by piping the password into
1 echo $PASS | pg_dump --host=$HOST --user=$DBUSER $DBNAME
jq is a great tool, and is really easy to get a grip on if you are already familiar with JSON.
Hopefully you found this post helpful. Thanks for reading!