Pruning directories from find

I have no idea why, but for some reason I always have a hard time remembering the exact syntax for find when I want to prune some list of directories from a search.

Let’s say that you want to run find in a directory tree that contains a lot of .git directories, and you don’t want to search through the guts of each repo. In the following command the prune predicate comes ahead of the search for any file whose name matches ‘*.json’. Note that the prune test has to be -type d: the .git entries are directories, so a -type f test would never match them and nothing would be pruned.

find ./ -type d -iwholename '*.git' -prune -o -name '*.json' -print
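
A quick way to convince yourself that the prune works is to run it against a throwaway tree. The following is only a sketch: the directory and file names are made up for illustration, and it uses -type d so that the prune test applies to the .git directories themselves.

```shell
# Build a scratch tree: one JSON file in the project, one buried inside .git
# (all names here are hypothetical, for illustration only)
scratch=$(mktemp -d)
mkdir -p "$scratch/repo/.git/objects" "$scratch/repo/src"
touch "$scratch/repo/.git/objects/internal.json" "$scratch/repo/src/config.json"

# Prune every .git directory, print the JSON files found everywhere else
found=$(cd "$scratch" && find ./ -type d -iwholename '*.git' -prune -o -name '*.json' -print)
echo "$found"

# Clean up the scratch tree
rm -rf "$scratch"
```

Only the file under src should be listed; the one inside .git is never visited.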

Another way to do it is to exclude specific directories from a search. With the following command we first specify a set of directories to exclude from the search, by specific path and name, and then execute a search for the specific files.

find ./ -type d \( -path ./grpc-java -o -path ./go-in-mem-datastore \) -prune -o -name '*.json' -print
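
If the list of directories changes often, the -path tests can be generated instead of typed out. The following is a minimal sketch of that idea; find_json_excluding is a hypothetical helper name, it assumes at least one directory is given, and it assumes the directories sit directly under the current directory.

```shell
# Hypothetical helper: list *.json files under the current directory,
# skipping every top-level directory named as an argument (at least one).
# The body runs in a subshell so its variables do not leak into the caller.
find_json_excluding() (
  first=1
  for dir in "$@"; do
    # Rebuild the positional parameters as: -path ./a -o -path ./b ...
    if [ "$first" -eq 1 ]; then set --; first=0; else set -- "$@" -o; fi
    set -- "$@" -path "./$dir"
  done
  find ./ -type d \( "$@" \) -prune -o -name '*.json' -print
)

# Demo on a scratch tree, using the directory names from the command above
scratch=$(mktemp -d)
mkdir -p "$scratch/grpc-java" "$scratch/go-in-mem-datastore" "$scratch/keep"
touch "$scratch/grpc-java/a.json" "$scratch/go-in-mem-datastore/b.json" "$scratch/keep/c.json"
found=$(cd "$scratch" && find_json_excluding grpc-java go-in-mem-datastore)
echo "$found"
rm -rf "$scratch"
```

Only the JSON file under keep should be printed.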

Using cut with a delimiter of any amount of whitespace

The TL;DR is to first use tr to replace every horizontal whitespace character with a space, then squeeze each run of spaces down to a single space, and finally tell cut to use a single space as its delimiter. The following example assumes that you want to see from the 5th column to the end of the line.

<do-something-to-generate-input> | tr '[:blank:]' ' ' | tr -s ' ' | cut -d ' ' -f5-

Once tr -s has squeezed the repeated spaces into one, cut prints everything from the 5th field to the end of the line.
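
To see the pipeline at work without a real input command, here is a small sketch with made-up sample input. The function name from_fifth_field is hypothetical; it just wraps the same three commands.

```shell
# Hypothetical helper: normalise whitespace on stdin, then keep field 5 onward
from_fifth_field() {
  tr '[:blank:]' ' ' | tr -s ' ' | cut -d ' ' -f5-
}

# Sample input mixing tabs and runs of spaces between six "columns"
printf 'a \t b   c\td  e f\n' | from_fifth_field
```

One thing to watch for: if a line starts with whitespace, the squeezed line starts with a single space, so cut sees an empty first field and every column number shifts up by one.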

Running GUI apps locally as root in a non-root session

There are instances when you need to run an X Window application as root from a session that belongs to a non-root user. For me this is often running a terminator instance as root so that I can create tabs and split the window and still be root in each of those terminals.

In order for the root user to be able to connect to the X server you need to provide it with “credentials”. In this case it is on the same box and not over the network so the use of cookie authentication is acceptable.

As the user that is already authenticated to the X server, run the following command to get the cookie used to connect to the current $DISPLAY:

rchapin@leviathan:~$ xauth list $DISPLAY
leviathan/unix:0  MIT-MAGIC-COOKIE-1  5fb2c0e68f4618ee4fa2202e1e4ae937

su to root and then run the following to add that cookie to root’s authorization file:

root@leviathan:~# xauth add leviathan/unix:0 MIT-MAGIC-COOKIE-1 5fb2c0e68f4618ee4fa2202e1e4ae937

You should now be able to run X windows applications from that root terminal.
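
To avoid retyping the cookie by hand, the output of xauth list can be turned into the matching xauth add command with a little awk. This is only a sketch; xauth_add_cmd is a hypothetical helper name, and the sample line is the one shown above.

```shell
# Hypothetical helper: turn `xauth list` output into matching `xauth add` commands.
# Only lines with exactly three fields (display, protocol, cookie) are used.
xauth_add_cmd() {
  awk 'NF == 3 { print "xauth add", $1, $2, $3 }'
}

# Sample line copied from the xauth list output above
printf 'leviathan/unix:0  MIT-MAGIC-COOKIE-1  5fb2c0e68f4618ee4fa2202e1e4ae937\n' | xauth_add_cmd
```

In practice you would run xauth list "$DISPLAY" | xauth_add_cmd as the regular user and paste the printed command into the root terminal.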

VS Code “Test result not found for:” When Running Tests for a Python Project [SOLVED]

I was finally able to get Visual Studio Code set up correctly to run and debug unit and integration tests for a Python 3.8 project that I am working on (I’ll add a link to that post here once it is up).

After making some changes to the code and adding a test I got the following error when trying to debug the test:

Test result not found for: ./mylibs/integration_tests/myclient_integration_test.py::MyClientIntegrationTest::test_happy_path

An odd error message, to be sure.

After a little while I figured out that when this happens it is ultimately the result of a syntax or other interpretation error that occurs at runtime and that the IDE may not flag as a problem for you.

To find the typo, open the Output panel, click on the drop-down, and select Python Test Log to see the stack trace of the error.
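
For the plain syntax-error case, a check from the terminal can confirm the problem outside of the IDE. The following is a sketch under a few assumptions: python3 is on your PATH (ideally the interpreter from the project's virtual environment), and check_syntax is a hypothetical helper name. It only catches errors that show up when the file is compiled, not ones that appear at import time or later.

```shell
# Hypothetical helper: byte-compile each file given and report the ones that fail.
# Assumes python3 is on the PATH; py_compile only detects compile-time errors.
check_syntax() {
  rc=0
  for f in "$@"; do
    if ! python3 -m py_compile "$f"; then
      echo "syntax problem in: $f" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

For the error above you would run check_syntax ./mylibs/integration_tests/myclient_integration_test.py and look at the traceback it prints.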

Compiling Python Under Linux

The following should work with just about any version of Python. I am currently using it to compile 3.10.x on distros where packages for that version are not readily available. It is a quick how-to on getting Python compiled under both RedHat/CentOS/AlmaLinux and Debian-based systems.

Download the Tarball for the Version You Want To Install

Download the tar.gz archive for the version that you want to install from here. Verify the download and then save the path to this file for later.

Install Dependencies

This assumes that you already have the “build-essentials” and kernel headers installed on the box, which is an exercise for the reader.

RedHat/CentOS/Almalinux

yum install -y bzip2-devel expat-devel gdbm-devel ncurses-devel openssl-devel readline-devel wget sqlite-devel tk-devel xz-devel zlib-devel libffi-devel gmp-devel libmpc-devel mpfr-devel liblzma-devel

Debian

apt install -y build-essential zlib1g-dev libncurses5-dev libgdbm-dev libnss3-dev libssl-dev libsqlite3-dev libreadline-dev libffi-dev curl libbz2-dev liblzma-dev

Compile Python

The following script enables a non-root user to unpack, compile, and install Python into their home directory. Copy the script below to /var/tmp/compile-python.sh, make it executable, and then run it as follows:

/var/tmp/compile-python.sh <path-to-tarball>

#!/bin/bash

set -u
set -e

# The path to the downloaded tarball
py_tarball=$1

# Name of the directory inside the archive, e.g. Python-<version>
# (python.org names the gzipped source tarball Python-<version>.tgz)
export PY_DIR=$(basename "$py_tarball" .tgz)
# Install prefix; only the directory name is lower-cased, not the whole path
export PY_PREFIX="$HOME/usr/local/$(echo "$PY_DIR" | tr '[:upper:]' '[:lower:]')"

# Remove any previous install of this version, then create the directories
rm -rf "$PY_PREFIX"
mkdir -p ~/usr/local/src ~/usr/local/bin ~/usr/local/include "$PY_PREFIX"
tar -xzf "$py_tarball" -C ~/usr/local/src/
cd ~/usr/local/src/"$PY_DIR"
./configure --prefix="$PY_PREFIX" --exec-prefix="$PY_PREFIX"
make && make install
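
The two names that the script derives from the tarball path can be checked in isolation before kicking off a long build. The following is a sketch of that derivation using basename and tr; the tarball path is a hypothetical example, and it assumes the archive is named like Python-<version>.tgz.

```shell
# Hypothetical tarball path, for illustration only
py_tarball=/var/tmp/Python-3.10.4.tgz

# Directory name inside the archive: strip the leading path and the .tgz suffix
py_dir=$(basename "$py_tarball" .tgz)
# Install prefix: lower-case only the directory name, not the whole path
py_prefix="$HOME/usr/local/$(printf '%s' "$py_dir" | tr '[:upper:]' '[:lower:]')"

echo "$py_dir"
echo "$py_prefix"
```

The second value printed is the directory that Python gets installed into, and therefore the value to use for PYTHON_HOME.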

Then add the bin directory of the new install to your PATH in ~/.bash_profile:

PYTHON_HOME=~/usr/local/python-<version>

export PATH=$PATH:$PYTHON_HOME/bin

Using fc to Edit and Re-execute Bash Commands

I recently learned about the Bash built-in fc. It is a great tool that enables you to edit and re-execute commands from your bash history.

Oftentimes there is a command in your history that you want to run again with a modification or two, rather than just grepping through the history and re-executing it as-is. With fc you first edit the command in your favorite editor, and when you close the editor fc executes it.

For me, vim is my editor of choice. Add the following to your .bashrc and fc will automatically open vim for you.

export FCEDIT=vim

Then, simply run fc passing it the id of the command in your history that you want to edit and then execute.

fc 1234

Creating a Launch Config in VSCode to Debug a Python Invoke Script

I regularly use Python Invoke and Fabric to automate various tasks, from deploying code to developing my own sets of tools for various projects. Following is an example of how to write a launch.json launch configuration for VS Code so that you can step through the tasks.py code and debug it.

This assumes that you have created a virtual environment and pip installed invoke into it, and that you have defined a task in your tasks.py file as follows:

from invoke import task

@task()
def do_something(ctx, some_path, some_other_path):
    # Do something with data in these dirs . . .
    pass

The following is a template for a launch configuration that you can use to debug your task.

{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "invoke",
      "type": "python",
      "request": "launch",
      // The complete path to the invoke python script in your virtual environment
      "program": "/my/virtualenv/path/bin/invoke",
      "justMyCode": false,
      // The args that you would otherwise enter on the command line
      // when invoking your task
      "args": [
        "do-something",
        "--some-path",
        "/var/tmp/a/",
        "--some-other-path",
        "/var/tmp/b/"
      ],
      "cwd": "/the/path/to/the/dir/that/contains/your/tasks/script"
    }
  ]
}
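
Note that the args use dashes where the Python code uses underscores: by default Invoke exposes task and parameter names on the command line with underscores converted to dashes. The following sketch prints the command line that the launch configuration is equivalent to; to_cli_name is a hypothetical helper, not part of Invoke.

```shell
# Hypothetical helper: convert a Python identifier to the dashed form used on the CLI
to_cli_name() {
  printf '%s\n' "$1" | tr '_' '-'
}

# Names taken from the do_something task above; the values are the example paths
echo "invoke $(to_cli_name do_something)" \
  "--$(to_cli_name some_path) /var/tmp/a/" \
  "--$(to_cli_name some_other_path) /var/tmp/b/"
```

Running the printed command from the directory that contains tasks.py is a quick way to check the arguments before debugging them in the IDE.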

Using bq load Command to Load logicalType Partitioned Data into a BigQuery Table

Following is the bq load command that you need to issue if you want to load data from an Avro file into a BigQuery table that is partitioned on an Avro field defined with a logicalType.

Given the following schema

{
  "type" : "record",
  "name" : "logicalType",
  "namespace" : "com.ryanchapin.tests",
  "fields" : [ {
    "name" : "id",
    "type" : [ "null", "string" ],
    "default" : null
  }, {
    "name" : "value",
    "type" : [ "null", "long" ],
    "default" : null
  }, {
    "name" : "day",
    "type" : {
      "type" : "int",
      "logicalType" : "date"
    }
  } ]
}

And the following BigQuery schema

[
  {
    "name": "id",
    "mode": "NULLABLE",
    "type": "STRING"
  },
  {
    "name": "value",
    "mode": "NULLABLE",
    "type": "INT64"
  },
  {
    "name": "day",
    "mode": "REQUIRED",
    "type": "DATE"
  }
]

Assuming that you have a correct avro data file (an exercise for the reader) that contains records that include values in the day column that are the number of days since the epoch, you can run the following bq load command to load that data into your table.

bq --project_id my_project load --source_format=AVRO --time_partitioning_type=DAY --time_partitioning_field=day --use_avro_logical_types my_dataset.my_table 'gs://my_bucket/*.avro'
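
When building or checking test data, it helps to be able to convert between a calendar date and the number of days since the epoch that the Avro date logicalType stores. The following is a small sketch of both directions. The function names are hypothetical, and both assume GNU date (the -d option behaves differently on BSD/macOS).

```shell
# Hypothetical helpers; both assume GNU date

# Calendar date (YYYY-MM-DD) -> days since 1970-01-01, as the Avro date type stores it
days_since_epoch() {
  echo $(( $(date -u -d "$1" +%s) / 86400 ))
}

# Days since 1970-01-01 -> calendar date
date_from_days() {
  date -u -d "@$(( $1 * 86400 ))" +%F
}

days_since_epoch 2024-01-01   # prints 19723
date_from_days 19723          # prints 2024-01-01
```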