Passing an Array as an Argument to a Bash Function

If you want to pass an array of items to a bash function, the simple answer is that you need to pass the expanded values.  You can pass the data as a single quoted value, assuming that the elements are whitespace delimited, or you can pass it as a string with some other delimiter and then split it inside the function using an updated IFS (Internal Field Separator).

The following example takes the output of a Hive query (a single column, one value per line), wraps it in quotes, and passes it as a single value to the function.

#!/bin/bash

#
# This function will accept the expanded elements of the array
#
function foo() {
   # Loop through elements in the first argument passed.
   # In this case, each is separated by whitespace so we do
   # not need to change the IFS
   for i in $1
   do
      echo "i = $i"
   done
}

# Dynamically build our hive query
HIVE_QRY="use somedb; select some_column from some_table;"

# Dynamically build the hive command to execute
CMD="hive -e '$HIVE_QRY'"

# Execute the hive query in a subshell and store the result in
# the 'QRY_RETVAL' variable
QRY_RETVAL=$(eval "$CMD")

# Call the foo function and pass it the output of the query, /QUOTED/
# so that it will be passed as a single argument and not a series
# of arguments for each row returned by the query
foo "${QRY_RETVAL}"
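
The two approaches described at the top of this post can also be sketched side by side. This is only an illustration and not part of the script above: the function names print_args and split_on_commas, the comma delimiter, and the sample values are all assumptions made for the example.

```bash
#!/bin/bash

# Hypothetical helper: receives each array element as its own argument.
print_args() {
   local item
   for item in "$@"
   do
      echo "arg = $item"
   done
}

# Hypothetical helper: receives one comma-delimited string and splits it
# by changing IFS. Declaring IFS local keeps the change inside the function.
split_on_commas() {
   local IFS=','
   local item
   for item in $1
   do
      echo "item = $item"
   done
}

# Expand the array so that every element arrives as a separate argument
ITEMS=("one" "two words" "three")
print_args "${ITEMS[@]}"

# Pass a single delimited string and let the function split it
split_on_commas "alpha,beta,gamma delta"
```

Quoting "${ITEMS[@]}" keeps an element such as "two words" together as one argument, while the local IFS in the second function lets "gamma delta" survive as one item because only commas split the string.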

Restarting Individual Services or the Entire HDP Stack in the Hortonworks Virtual Sandbox

I’m using the Hortonworks Virtual Sandbox for development and testing and wanted to restart the HDP stack without (of course) having to restart the VM.

It took me a little while to figure out how to go about it as Internet searches on the topic revealed very little.

It turns out that Hortonworks have set up their own service on the box, startup_script.

If you take a look at /etc/init.d/startup_script you will see that it calls a number of other shell scripts in the /usr/lib/hue/tools/start_scripts/ directory.

To restart the whole stack simply issue the following command:

service startup_script restart

Vim Search and Replacing with Backreferences

It is often helpful to write search and replace commands that save segments of the matched text to use in the replacement string.

Source text:

This is some text (we want to change) with a phone number (301) 555-1234.
We want to remove the parenthesis, but only from the phone number string.

In this example we have a text file with a phone number in it.  We want to remove the parentheses that surround ONLY the area code in the phone number, plus the trailing space, and replace them with the area code plus a trailing "-".

The search and replace command follows.  The forward slashes are separators in the command; they are omitted in the explanation below:

:1,$s/(\([0-9]\+\))\s/\1-/g

Here it is explained:

:1,$s

Tells Vim to execute the search and replace command from line 1 to the end of the document ($ is the last line; % on its own is shorthand for the same range).

(\([0-9]\+\))\s

The regex, including the backreference, that will find the area code including the parentheses and a trailing space.  The \( and \) parentheses wrap the numerical part of the string that we are trying to save and mark it as a backreference.  The \+ indicates that we want one or more of the numerical characters (the plus char must be escaped).  The \s indicates a whitespace character that we also want to match.

\1-

The characters that we want to use to replace what we have found.  The \1 indicates that we want to use the first backreference that was matched, followed by a '-' character.

g

Tells Vim to execute the search and replace on all occurrences found in a line.
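
The same backreference syntax can be tried outside the editor with sed. The following is a sketch under a few assumptions: the function name fix_area_code is made up for the example, and [0-9][0-9]* and a literal space stand in for \+ and \s so that the expression stays within POSIX basic regular expressions.

```bash
#!/bin/bash

# Hypothetical helper: applies the area-code rewrite to each line of stdin.
# In a basic regular expression a bare ( or ) is a literal character, and
# \( ... \) marks the group that \1 refers to in the replacement.
fix_area_code() {
   sed 's/(\([0-9][0-9]*\)) /\1-/g'
}

echo "This is some text (we want to change) with a phone number (301) 555-1234." | fix_area_code
```

This prints the sample line with the phone number changed to 301-555-1234 and the first parenthesized phrase left alone, because that phrase does not contain only digits.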

Removing the Last Token From a String in Bash with awk

Let’s say that you have a number of files and, for each one, you want to create a containing directory named with all but the last token of the file name.

Much easier to explain with an example.  Given this list of files:

ls -1
foo_10_10_sometrash
foo_1_sometrash
foo_2_sometrash
foo_3_sometrash
foo_4_sometrash
foo_5_5_sometrash
foo_5_sometrash
foo_6_6_sometrash
foo_7_7_sometrash
foo_8_8_sometrash
foo_9_9_sometrash

You want to create a directory for each of the files as follows:

foo_5_sometrash should have a directory named foo_5.

Further, let’s assume that you have thousands, or hundreds of thousands of files.  In that case doing it via a script while you get a cup of coffee is the preferred solution.

The work is done in a for loop that iterates over the output of ls and runs a nested awk command on each file name.

for i in `ls -1`; do DIRNAME=$(echo "$i" | awk -F_ '{$NF=""; print $0}' | sed 's/ /_/g' | sed 's/_$//'); mkdir -p "$DIRNAME"; done

Here is the command broken down:

DIRNAME=$(....)

will set the var $DIRNAME to the result of the code within the parentheses.

awk -F_ '{$NF=""; print $0}'

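will set the field separator to a ‘_’, the character on which you will be ‘splitting’ your string.  $NF="" will set the last field to an empty string.  Because a field was modified, awk rebuilds $0 by joining the fields with its default output separator, a single space, so print $0 prints the line with spaces in place of the underscores, plus a trailing space.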

The following sed commands will replace the spaces generated by the awk command with the original separators, ‘_’, and then remove the spurious, trailing ‘_’.
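
As an alternative to the awk and sed pipeline, bash parameter expansion can remove the last token without starting any extra processes. This is a different technique from the one described above, shown as a minimal sketch; the function name strip_last_token is made up for the example.

```bash
#!/bin/bash

# Hypothetical helper: prints its argument with the last "_"-separated
# token removed. ${1%_*} deletes the shortest suffix matching "_*".
strip_last_token() {
   echo "${1%_*}"
}

strip_last_token "foo_5_sometrash"      # prints foo_5
strip_last_token "foo_10_10_sometrash"  # prints foo_10_10
```

A name that contains no ‘_’ is printed unchanged, since there is no matching suffix to remove.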

Backspace, Delete, and/or Return Key Stops Working in Oracle SQL Developer

So, I fire up SQL Developer to run some queries against a QC server and for some reason, I am no longer able to use the backspace, delete, or return keys to edit .sql files opened in the program.

I tried opening a new .sql file, and restarting SQL Developer.  I then tried restarting Windows.  None of those worked.

After a bit of searching I found a forum posting indicating that going to Tools/Preferences/Accelerators and clicking the “Load Preset…” button in the bottom right of the dialog box would fix the problem.  My guess is that a key mapping preference file had somehow become corrupted and that replacing it with a default fixes the problem.

After doing so, I was back in business.

How To Benchmark Disk I/O

Here is a quick snippet on how to benchmark disk I/O with dd.

time sh -c "dd if=/dev/zero of=/home/rchapin/test.zeros bs=1024k count=10000 && sync"

10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 81.4124 s, 129 MB/s

real    1m21.950s
user    0m0.810s
sys     0m5.474s

The command above does a write test of 10 GB.

You can do a similar test and read from that file generated and write to another file or /dev/null to get an idea of the read speeds.
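
A read test along those lines might look like the sketch below. The scratch path under /tmp, the small default block count, and the function names are assumptions chosen so that the example runs quickly; they are not from the original command. Keep in mind that if the file was just written, the operating system may serve the read from its cache, in which case the reported speed says more about memory than about the disk.

```bash
#!/bin/bash

# Assumed values for illustration: a scratch file under /tmp and a small
# block count so the sketch finishes quickly. Raise COUNT for a real test.
TEST_FILE="${TEST_FILE:-/tmp/dd_test.zeros}"
COUNT="${COUNT:-16}"

# Write COUNT blocks of 1 MiB of zeros, then flush them to disk.
write_test() {
   dd if=/dev/zero of="$TEST_FILE" bs=1024k count="$COUNT" && sync
}

# Read the file back and discard the data by sending it to /dev/null.
read_test() {
   dd if="$TEST_FILE" of=/dev/null bs=1024k
}

write_test
read_test
```

Remove the scratch file with rm when you are done so that it does not keep using disk space.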

See this link for more information.

Creating an Array in Bash from a File With Each Element on a Separate Line

Let’s say that you have a file and you would like to convert each line in the file to an element in an array.

The key is understanding the IFS (Internal Field Separator) and how to manipulate it.  The default IFS is whitespace (a space, tab, or newline), so if you create an array from a whitespace-delimited list of strings, each token becomes an element in the array.

ARRAY=(a b d c)

Will result in an array with a single letter in each element.

To do the same thing with the contents of a file in which each element is on a separate line, first set the IFS to just the newline and carriage-return characters.  Then use the contents of the file as the input for the array.

# Save our existing IFS
OIFS="$IFS"

# Set our IFS to a new-line/carriage return
IFS=$'\r\n'

# Create the array with the contents of a file
TEST_ARRAY=($(cat some_file.txt))

# Reset our IFS
IFS="$OIFS"

for i in "${TEST_ARRAY[@]}"
do
   echo "$i"
done
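
If you are on bash 4 or later, the mapfile builtin (also available as readarray) can fill the array without touching the IFS at all. This is an alternative to the approach above, shown as a minimal sketch; the sample file contents and the FILE_LINES variable name are made up for the example.

```bash
#!/bin/bash

# Build a small sample file so the sketch is self-contained.
SAMPLE_FILE=$(mktemp)
printf '%s\n' "first line" "second line" "third" > "$SAMPLE_FILE"

# Read one array element per line. -t strips the trailing newline
# from each element.
mapfile -t FILE_LINES < "$SAMPLE_FILE"
rm -f "$SAMPLE_FILE"

echo "count = ${#FILE_LINES[@]}"
for line in "${FILE_LINES[@]}"
do
   echo "line = $line"
done
```

Because the lines are never word-split by the shell, spaces within a line are kept and characters such as * are not expanded into file names.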