Query nested arrays in PostgreSQL JSON data

The following is an example showing how to query multiple nested arrays in JSON data in a PostgreSQL database.

Given

CREATE TABLE sample_json (
  id serial PRIMARY KEY,
  name varchar(64),
  json_data json
);


INSERT INTO sample_json (name, json_data)
VALUES
	(
	'NA',
	'
	{
	    "location": "US",
	    "topLevelArray": [
	        {
	            "id": 1,
	            "secondLevelArray": [
	                {
	                    "key": "someKey",
	                    "operator": "=",
	                    "value": 10
	                },
	                {
	                    "key": "foo",
	                    "operator": ">=",
	                    "value": 5
	                },
	                {
	                    "key": "someOtherKey",
	                    "operator": ">",
	                    "value": 647
	                }
	            ]
	        },
	        {
	            "id": 2,
	            "secondLevelArray": [
	                {
	                    "key": "blah",
	                    "operator": "<",
	                    "value": 7
	                }
	            ]
	        }
	    ]
	}
	'
	),
	(
	'EU',
	'
	{
	    "location": "poland",
	    "topLevelArray": [
	        {
	            "id": 2,
	            "secondLevelArray": [
	                {
	                    "key": "bar",
	                    "operator": "<",
	                    "value": 10
	                },
	                {
	                    "key": "moo",
	                    "operator": ">=",
	                    "value": 16
	                },
	                {
	                    "key": "baz",
	                    "operator": "!=",
	                    "value": 9
	                }
	            ]
	        }
	    ]
	}
	'	
	)
;

With the aforementioned data, let’s say we want to know the id of the rows that have an object in the secondLevelArray with operator equal to >= and the value of the key field.

The concept to understand to be able to search for all of the keys in the secondLevelArray where the operator is >= is the lateral join. The TLDR; is that a subquery appearing in the FROM clause can reference columns provided by preceding items. Or, you can write clauses in the FROM clause that read from the result of previous FROM clauses.

Let’s run some queries and go through them, line-by-line. First we will just select everything in the table.

SELECT * FROM sample_json;

|id |name|json_data                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
|---|----|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|1  |NA  |"
	{
	    \"location\": \"US\",
	    \"topLevelArray\": [
	        {
	            \"id\": 1,
	            \"secondLevelArray\": [
	                {
	                    \"key\": \"someKey\",
	                    \"operator\": \"=\",
	                    \"value\": 10
	                },
	                {
	                    \"key\": \"foo\",
	                    \"operator\": \">=\",
	                    \"value\": 5
	                },
	                {
	                    \"key\": \"someOtherKey\",
	                    \"operator\": \">\",
	                    \"value\": 647
	                }
	            ]
	        },
	        {
	            \"id\": 2,
	            \"secondLevelArray\": [
	                {
	                    \"key\": \"blah\",
	                    \"operator\": \"<\",
	                    \"value\": 7
	                }
	            ]
	        }
	    ]
	}
	"|
|2  |EU  |"
	{
	    \"location\": \"poland\",
	    \"topLevelArray\": [
	        {
	            \"id\": 2,
	            \"secondLevelArray\": [
	                {
	                    \"key\": \"bar\",
	                    \"operator\": \"<\",
	                    \"value\": 10
	                },
	                {
	                    \"key\": \"moo\",
	                    \"operator\": \">=\",
	                    \"value\": 16
	                },
	                {
	                    \"key\": \"baz\",
	                    \"operator\": \"!=\",
	                    \"value\": 9
	                }
	            ]
	        }
	    ]
	}
	"                                                                                                                                                                                                                                                                      |

As expected, we just get back everything.

Now, let’s start to drill down into the JSON object. First we will select the row id and just the data from the topLevelArray.

SELECT
	sj.id
	topLevelArray
FROM
	sample_json sj,
	json_array_elements(json_data -> 'topLevelArray') topLevelArray
;

|id |toplevelarray                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
|---|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|1  |"{
	            \"id\": 1,
	            \"secondLevelArray\": [
	                {
	                    \"key\": \"someKey\",
	                    \"operator\": \"=\",
	                    \"value\": 10
	                },
	                {
	                    \"key\": \"foo\",
	                    \"operator\": \">=\",
	                    \"value\": 5
	                },
	                {
	                    \"key\": \"someOtherKey\",
	                    \"operator\": \">\",
	                    \"value\": 647
	                }
	            ]
	        }"|
|1  |"{
	            \"id\": 2,
	            \"secondLevelArray\": [
	                {
	                    \"key\": \"blah\",
	                    \"operator\": \"<\",
	                    \"value\": 7
	                }
	            ]
	        }"                                                                                                                                                                                                                                                                                                                                    |
|2  |"{
	            \"id\": 2,
	            \"secondLevelArray\": [
	                {
	                    \"key\": \"bar\",
	                    \"operator\": \"<\",
	                    \"value\": 10
	                },
	                {
	                    \"key\": \"moo\",
	                    \"operator\": \">=\",
	                    \"value\": 16
	                },
	                {
	                    \"key\": \"baz\",
	                    \"operator\": \"!=\",
	                    \"value\": 9
	                }
	            ]
	        }"             |

The “columns” selected are the id and topLevelArray. The id is straightforward. The topLevelArray is a lateral join clause statement and we use the json_array_elements() function to select the contents of the json_data.topLevelArray key.

We can continue to traverse deeper into the JSON object with another lateral join clause that accesses the next nested array from the topLevelArray key.

SELECT
	sj.id,
	secondLevelElements
FROM
	sample_json sj,
	json_array_elements(json_data -> 'topLevelArray') topLevelArray,
	json_array_elements(topLevelArray -> 'secondLevelArray') secondLevelElements
;

|id |secondlevelelements                                                                                                                                   |
|---|----------------------------------------------------------------------------------------------------------------------------------------------------|
|1  |"{
	                    \"key\": \"someKey\",
	                    \"operator\": \"=\",
	                    \"value\": 10
	                }"      |
|1  |"{
	                    \"key\": \"foo\",
	                    \"operator\": \">=\",
	                    \"value\": 5
	                }"          |
|1  |"{
	                    \"key\": \"someOtherKey\",
	                    \"operator\": \">\",
	                    \"value\": 647
	                }"|
|1  |"{
	                    \"key\": \"blah\",
	                    \"operator\": \"<\",
	                    \"value\": 7
	                }"          |
|2  |"{
	                    \"key\": \"bar\",
	                    \"operator\": \"<\",
	                    \"value\": 10
	                }"          |
|2  |"{
	                    \"key\": \"moo\",
	                    \"operator\": \">=\",
	                    \"value\": 16
	                }"         |
|2  |"{
	                    \"key\": \"baz\",
	                    \"operator\": \"!=\",
	                    \"value\": 9
	                }"          |

We add an additional FROM clause. This one referencing the result of the previous FROM clause.

json_array_elements(json_data -> 'topLevelArray') topLevelArray,
json_array_elements(topLevelArray -> 'secondLevelArray') secondLevelElements

Now we have access to the elements in the secondLevelArray and we can add a WHERE clause to select only what we want from that nested array.

select
	sj.id,
	secondLevelElements ->> 'key'
from
	sample_json sj,
	json_array_elements(json_data -> 'topLevelArray') topLevelArray,
	json_array_elements(topLevelArray -> 'secondLevelArray') secondLevelElements
where
	secondLevelElements ->> 'operator' = '>='
;

|id |?column?|
|---|--------|
|1  |foo     |
|2  |moo     |

The result being the row id, and the value of the key field in inner most nested object.

Chili Garlic Tofu with Broccoli

Ingredients

  • 2 14 oz packages of extra firm tofu
  • 2 TB olive oil
  • Salt & pepper, to taste
  • 4 garlic cloves, minced
  • 1 lb. broccoli florets
  • 2 TB Sambal Olek or chili paste
  • 4 TB honey
  • 1 TB low sodium soy sauce
  • 4 sheets roasted seaweed
  • 1 TB sesame seeds
  • 1.5 cups basmati rice

Directions

  1. Blot tofu with paper towels while pressing down slightly to release water. Slice into 1/2 inch thick pieces and blot again with paper towels.
  2. Sauce: In a small bowl whisk together 2 TB Samba Olek, 4 TB honey, and 1 TB soy sauce, set aside.
  3. Cut seaweed sheets into thin strips to use as a garnish, set aside.
  4. Basmati Rice: Rinse the basmati rice in cold water to remove excess starch. Combine rice with 2 1/4 cups water, 1 TB olive oil and a pinch of salt in a medium sized pot. Stir once and bring to a boil over high heat. Cover, reduce heat to low and simmer for 10 minutes. Remove pan from heat and leave covered for 5 minutes. Remove lid and fluff with fork before serving.
  5. Place broccoli florets in a microwavable bowl with 1/4 cup water. Cover and microwave 4 minutes.
  6. Heat 2 TB olive oil in a large non-stick skillet over medium heat. Add the 4 cloves smashed garlic and sauté for 1-2 minutes. Place slices of tofu in pan and sear over medium heat, about 5 minutes per side. Do not fiddle with tofu, let it develop a crust so it does not stick to pan.
  7. Remove tofu from pan to a serving plate. Transfer steamed broccoli to the skillet and toss to coat in the garlic infused oil. Sprinkle with the sesame seeds, toss to combine and cook for 1 minute. Transfer to serving plates.
  8. Drizzle tofu with the chili sauce from Step 2 and garnish with seaweed stands.

Vegetarian Beef Tips and Asparagus Stir Fry

Ingredients

  • 2 bags Gardein Be’f Tips1
  • 1 1/4 cup soy sauce
  • 4 tablespoons vegetable oil
  • 3 cloves garlic, minced
  • 1 tablespoon ginger, minced
  • 1 teaspoon brown sugar
  • 1/2 red onion, chopped
  • 1 red bell pepper, chopped
  • approx 1 1/2 lbs of asparagus cut into 2 inch pieces
  • 2 tablespoons balsamic vinegar
  • 1 cup chicken broth
  • 2 teaspoons cornstarch
  • 4 cups cooked rice

Directions

  1. In a small bowl, combine soy sauce, and brown sugar and set aside
  2. Heat a skillet to medium-high and cook the beef tips in 2 tablespoons of oil. Put them in frozen and stir them every few minutes. Cook about 8 – 10 minutes until they are browned and then put them in a bowl and cover.
  3. In the same skillet, heat the remaining 2 tablespoons of oil. Once hot add the garlic and ginger and stir. Immediately add the onion and turn the heat to medium. Cook until onions are soft.
  4. Add asparagus and cook about 8 minutes
  5. Add the peppers and cook about 5 minutes
  6. Add the sauce and cook about 3 minutes
  7. In a separate bowl, combine vinegar, broth, and cornstarch. Dissolve cornstarch. Add to skillet. Bring to a boil. Reduce heat and simmer for 2 – 3 minutes
  8. Serve over rice

How to set up a locally hosted git server and create new repositories

You may have code and configurations that are required to stay on premise. As a result, you will need to setup your own git server and create and manage repositories locally. Following is an overview of how to set that up.

We will leave aside server setup, configuration, and networking and assume that we have a machine on which we will host the repos, git-server, and machines that will clone, pull, and push updates, clients.

Setting up the git-server

On the git-server add git-shell to the list of valid shells.

echo $(which git-shell) >> /etc/shells

Add a git user with a specific shell and setup the directory in which all of the repos will be stored.

useradd --shell $(which git-shell) --home-dir /home/git git
mkdir -p /home/git/.ssh
chown -R git: /home/git/.ssh
touch /home/git/.ssh/authorized_keys
chown -R git: /home/git/.ssh
chmod 600 /home/git/.ssh/authorized_keys
chmod 700 /home/git/.ssh
mkdir -p /var/lib/git_repo
chown git: /var/lib/git_repo
chmod 700 /var/lib/git_repo

Adding a new user

To add a new user that can commit to and pull repos via ssh concatenate their public ssh key to /home/git/.ssh/authorized_keys.

Creating a new repository

As root on the git-server

export GIT_REPO=<new-repo-name>.git
sudo -u git mkdir /var/lib/git_repo/$GIT_REPO
cd /var/lib/git_repo/$GIT_REPO && sudo -u git git init --bare

As an authorized git user on the client machine

mkdir example && cd example
git init
echo "# README" >> README.md
git add README.md
git commit -m 'Initial commit'
git remote add origin git@hpdl01:/var/lib/git_repo/<new-repo-name>.git
git push origin master

Now you can clone with

git clone git@hpdl01:/var/lib/git_repo/example.git

Setting up software RAID on Debian with mdadm

Software RAID has come a long way. Unless you have some very high-rate, high-volume, I/O workloads with SLAs that will otherwise cost you money, for the most part a software RAID will perform just fine. One of the primary benefits of using software RAID is the portability of your disks/data. If a box on which you have a software RAID dies somehow and at least some (depending on your RAID configuration) of drives survive, you can easily resurrect the RAID on another machine without having a duplicate RAID card on hand.

Following is an example of how to setup a RAID 1 array using mdadm on Debian. I have two 2TB drives, /dev/sdb and /dev/sdc.

Install mdadm

apt-get install mdadm

Create the array

mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc

Once you create the RAID device, you can check on the mirroring process (Not sure what it is syncing with a blank disk…)

# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdc[1] sdb[0]
1953382464 blocks super 1.2 [2/2] [UU]
[>....................] resync = 1.0% (20516736/1953382464) finish=190.1min speed=169401K/sec
bitmap: 15/15 pages [60KB], 65536KB chunk

Create a filesystem

mkfs.xfs /dev/md0

Create a mount point and mount the filesystem

mkdir /data
mount /dev/md0 /data

Ensure that the array is reassembled on boot by adding it to /etc/mdadm/mdadm.conf. We do this by scanning the active array and appending it’s details to the config file.

mdadm --detail --scan /dev/md0 >> /etc/mdadm/mdadm.conf

Upate the initramfs so that the array will be available on boot

update-initramfs -u

Add an entry to /etc/fstab

/dev/md0 /data               xfs     defaults,nofail,discard  0 0

Do a test reboot and make sure that it is mounted and the data accessible.

Enable and disable delivery to additional email addresses for sent and received messages with a Google Workspace account

I have a Google Workspace account for my domain and sometimes it is useful to have messages sent from or sent too other addresses in my domain also delivered to my address.

To set it up, login to your Google Admin console, then in the left-hand navigation click on Google Workspace, and then Gmail.

Look for Routing and click on it to enter the routing configurations.

You can then add, modify, delete, enable, or disable routing rules for different organizational units in your domain.

[SOLVED] debsig-verify for Failed verification error, “signatures using the SHA1 algorithm are rejected” and “Can’t check signature: Invalid digest algorithm”

If you are using debsig-verify for the verification of a downloaded .deb file and are unable to verify it, run it with the -d option to get more information. If you see the following two lines

gpg: Note: signatures using the SHA1 algorithm are rejected
gpg: Can't check signature: Invalid digest algorithm

It is likely that the PGP signature used to sign the package uses the SHA1 algorithm which has been deprecated in most of the recent Linux distros. If you can generate another PGP key with a different algorithm. If you are a consumer of this deb package and cannot get the maintainer to update their public key you can add a gpg configuration that will enable gpg to use the PGP signature

echo "allow-weak-digest-algos" >> /etc/gnupg/gpg.conf

And then retry with debsig-verify.

Query for Finding the nth Element in a PostgreSQL JSONB Array that Matches a Specified Value

The following is a sample query that you can use to search for rows that have the nth element in a JSONB array that match a specific value.

Given

CREATE TABLE sample_jsonb (
  id serial PRIMARY KEY,
  name varchar(64),
  json_data jsonb
);

INSERT INTO sample_jsonb(name, json_data)
VALUES
  ('foo',
  '{
  "key": "val1",
  "arr": ["homer", "bart", "barney"]
  }'),
  ('bar',
  '{
  "key": "val2",
  "arr": ["marge", "lisa", "maggie"]
  }'),
  ('baz',
  '{
  "key": "val2",
  "arr": ["bart", "milhouse", "nelson"]
  }')
;

We now have two records in the database where the string “bart” is contained in the arr array of values. The following query will select the row(s) where the “bart” string is in the 2nd (index 1) element of the array.

SELECT *
FROM sample_jsonb
WHERE
    (json_data->'arr'->1)::jsonb ? 'bart'

LVM Resize – reduce the size of one logical volume to enable expanding another

I’m running an Ubuntu workstation and when setting it up simply went the “next-next-next” route when setting up the encrypted disk via LVM. The default is to create a 1G swap partition which is just not enough when you attempt to run too many things and locks up and/or crashes the machine.

My goal was to reduce my /root partition and then use that space to extend my swap partition.

Ensure that you back up your data first! There is no guarantee when executing the following operations, even correctly, that you will not lose your data.

Decreasing the size of an LVM partition

Because I need to modify the /root partition I first needed to boot to a live image or rescue image. I happened to have a Debian 12 iso on a flash drive and after sorting out how to get my laptop to boot from it chose the rescue option when booting.

One of the unexpected options during boot to the rescue image was to decrypt the filesystem so I simply entered the passphase to decrypt it. If you are using a rescue image that does not include that feature checkout this post for details on how to decrypt it.

An overview of the steps is as follows

  1. Run a filesystem check on your filesystem
  2. Resize the filesystem contained in the logical volume
  3. Resize the logical volume

Run a filesystem check

Since you have booted from a rescue disk you first need to activate the volume group to work with it

vgchange -ay

Enter the following to the details for the volume group

root@marge:~# vgdisplay
  --- Volume group ---
  VG Name               marge-vg
  System ID             
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  3
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                2
  Open LV               2
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               <465.27 GiB
  PE Size               4.00 MiB
  Total PE              119108
  Alloc PE / Size       119108 / <465.27 GiB
  Free  PE / Size       0 / 0   
  VG UUID               m06QqY-lQ3N-Y0be-j7nw-V1jo-hHwK-1OrE6U

Then use lvdisplay to see the details of all of the logical volumes contained within that group

root@marge:~# lvdisplay
  --- Logical volume ---
  LV Path                /dev/marge-vg/root
  LV Name                root
  VG Name                marge-vg
  LV UUID                820Zm6-zOcV-9gMu-efCr-jnVv-nmzX-T04zFq
  LV Write Access        read/write
  LV Creation host, time marge, 2022-12-15 14:58:30 -0500
  LV Status              available
  # open                 1
  LV Size                <464.31 GiB
  Current LE             118863
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:1
   
  --- Logical volume ---
  LV Path                /dev/marge-vg/swap_1
  LV Name                swap_1
  VG Name                marge-vg
  LV UUID                MHyNcx-ASX7-bsDd-Wto3-k9u8-QjZA-Z85yIR
  LV Write Access        read/write
  LV Creation host, time marge, 2022-12-15 14:58:30 -0500
  LV Status              available
  # open                 2
  LV Size                980.00 MiB
  Current LE             245
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:2

Run the filesystem check as you cannot resize a filesystem that in a bad state. The LV Path is the path to the underlying file system.

e2fsck -fy /dev/marge-vg/root

Shrink the filesystem

Just to be safe, we will shrink the filesystem to be smaller than the target logical volume size. If we accidentally shrink the logical volume to be smaller than the filesystem that it contains this can result in corruption of your data. Once the operation is complete, we can reclaim that “lost” space. The last argument in the command specifies the desired target size of the filesystem.

resize2fs /dev/marge-vg/root 430G

Shrink the logical volume

Once the filesystem is shrunk we can now shrink the logical volume. Again, we will specify a specific target size for the logical volume of 434G. You will get a warning that the operation could result in data loss. If we have performed the previous steps successfully we are (relatively) safe to proceed.

lvreduce -L 434G /dev/marge-vg/root

lvdisplay should now show the smaller logical volume

root@marge:~# lvdisplay
  --- Logical volume ---
  LV Path                /dev/marge-vg/root
  LV Name                root
  VG Name                marge-vg
  LV UUID                820Zm6-zOcV-9gMu-efCr-jnVv-nmzX-T04zFq
  LV Write Access        read/write
  LV Creation host, time marge, 2022-12-15 14:58:30 -0500
  LV Status              available
  # open                 1
  LV Size                <434.00 GiB  <-- New LV Size
  Current LE             118863
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:1

Now that the logical volume has been reduced we can reclaim the additional space (the difference between the reduced filesystem size and the new logical volume size) by extending the filesystem to use all available space in the logical volume.

resize2fs /dev/marge-vg/root

Extend the other logical volume

Verify the amount of free space now available in the volume group

root@marge:~# vgdisplay
  --- Volume group ---
  VG Name               marge-vg
  System ID             
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  3
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                2
  Open LV               2
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               <465.27 GiB
  PE Size               4.00 MiB
  Total PE              119108
  Alloc PE / Size       111427 / <435.27 GiB
  Free  PE / Size       7618 / 30.00 GiB  <-- Free space
  VG UUID               m06QqY-lQ3N-Y0be-j7nw-V1jo-hHwK-1OrE6U

Run the following command to extend the logical volume. In this case, we will extend the swap volume by 30 G

lvextend -L +30G /dev/marge-vg/swap_1

Resize the filesystem for the extended logical volume

We have only resized the logical volume. The filesystem contained therein has not yet been resized to use the additional space

Because we are resizing a swap partition we need to do things a little differently. Run the following to recreate the swap partition in this logical volume

mkswap /dev/marge-vg/swap_1

For a “normal” partition you would run the following to extend the filesystem to the size of the logical volume.

resize2fs /dev/<vol-group>/<lv>

From here you should be able to reboot your system with your new LVM configuration.

Query for Finding an Element in a PostgreSQL JSONB Array

The following is a sample query that you can use to search for rows that have an element in a JSONB array.

Given

CREATE TABLE sample_jsonb (
  id serial PRIMARY KEY,
  name varchar(64),
  json_data jsonb
);

INSERT INTO sample_jsonb(name, json_data)
VALUES
  ('foo',
  '{
  "key": "val1",
  "arr": ["homer", "bart", "barney"]
  }'),
  ('bar',
  '{
  "key": "val2",
  "arr": ["marge", "lisa", "maggie"]
  }')
;

select count(*) from sample_jsonb where json_data->'arr' ? 'marge';