Welcome to my website. I am always posting links to photo albums, art, technology and other creations. Everything that you will see on my numerous personal sites is powered by the formVista™ Website Management Engine.


  • My Git Cheat Sheet
    02/25/2016 9:44PM

    Some standard configs:

    git config --global user.name "Ryan Chapin"
    git config --global user.email "rchapin@nbinteractive.com"
    git config --global core.editor vim

    Ignore files in your repo:

    .gitignore files contain patterns for files or directories to ignore

    You can also set up a global .gitignore file for all Git repos via the core.excludesfile setting
    $ git config --global core.excludesfile ~/path/to/some/.gitignore

    A repo's own .gitignore can be committed and shared; the global one applies only on your machine.

    Git does not track empty directories, so to keep one you need to put some file in it, .gitkeep for example:
    $ touch .gitkeep
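
    As a rough sketch of how these pieces fit together, the following creates a throwaway repo, adds a .gitignore, and keeps an otherwise empty directory with a .gitkeep file. The directory and file names (data/, debug.log, notes.txt) and the two patterns are made up for illustration; git check-ignore is used only to confirm which paths match.

    ```shell
    # Hypothetical scratch repo; every name below is made up for illustration.
    repo="$(mktemp -d)"
    cd "$repo" || exit 1
    git init -q .

    # Ignore every *.log file and the whole build/ directory.
    printf '%s\n' '*.log' 'build/' > .gitignore

    # Keep an otherwise empty directory in the repo with a placeholder file.
    mkdir -p data
    touch data/.gitkeep

    touch debug.log notes.txt

    # git check-ignore exits 0 when the path matches an ignore pattern.
    if git check-ignore -q debug.log; then log_ignored=yes; else log_ignored=no; fi
    if git check-ignore -q notes.txt; then txt_ignored=yes; else txt_ignored=no; fi
    echo "debug.log ignored: $log_ignored"
    echo "notes.txt ignored: $txt_ignored"
    ```

    From here, git add .gitignore data/.gitkeep would stage both files so the ignore rules and the directory travel with the repo.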

    Git Cheat-Sheet:

    # check out the master branch and point HEAD at it
    $ git checkout master

    # check out the 'branch_name' branch and point HEAD at it
    $ git checkout <branch_name>

    # list all branches in the current repo, local and remote-tracking
    $ git branch -a

    # -------------------------------------
    # Creating and working with repos:

    # create a bare repo (one without a working tree) to act as the remote on a
    # server; by convention the repo directory name ends in .git
    $ git init --bare

    # show existing defined remote repos
    $ git remote

    # show details of a named remote repo
    $ git remote show <name>

    # -------------------------------------
    # Adding and removing files to be committed
    # Add all files already tracked that have been changed
    $ git add -u

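    # Add all changes in the whole working tree: new, modified, and deleted files
    $ git add -A

    # Same, but limited to the current directory and below (Git 2.x behavior)
    $ git add .

    # remove a file from the index, i.e. undo a git add before committing
    $ git reset name_of_file_in_index.txt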

    # -------------------------------------
    # Tags:
    # create a tag:
    $ git tag version0.01

    # create an annotated tag with a message, then push all tags to the remote
    $ git tag -a 1.0.0 -m 'First complete version of the hadoop-utility-scripts'
    $ git push origin --tags

    # create a tag for a specific commit id
    $ git tag version0.01 <commit_id>

    # checkout a certain tag
    $ git checkout <tag_name>

    # delete a tag (I did it from the master branch)
    # first list the existing tags
    $ git tag -l

    # delete it locally
    $ git tag -d foo-blah

    # delete it on the remote
    $ git push origin :refs/tags/foo-blah

    # get the commit id for a given tag
    $ git rev-list -1 <tag>

    # -------------------------------------
    # Branches
    # list local branches
    $ git branch

    # list remote-tracking branches
    $ git branch -r

    # list both remote-tracking and local branches
    $ git branch -a

    # switch HEAD pointer to a branch
    $ git checkout <branch_name>

    # create a branch
    $ git checkout -b <new-branch-name>

    # delete a branch
    $ git branch -d <branch-name-to-delete>

    # delete a remote branch
    $ git push origin --delete <branch-name-to-delete>

    # push a branch
    $ git push origin <branch-name>

    # Set up a tracking branch from a remote in your local repo:
    $ git checkout -b <branch-name> origin/<branch-name>

    # Apply a specific commit from one branch to another (cherry-picking):
    # First figure out the commit id(s) that you want to apply
    # Then check out the existing target branch (the one that should receive the commit);
    # note that -b would create a new branch instead
    $ git checkout <target-branch-name>

    # Then apply just that commit (abacac stands in for the real commit id)
    $ git cherry-pick abacac

    # Rename a branch locally and remotely
    # Rename branch locally
    $ git branch -m old_branch new_branch

    # Delete the old branch on the remote
    $ git push origin :old_branch

    # Push the new branch, set local branch to track the new remote
    $ git push --set-upstream origin new_branch


    # -------------------------------------
    # Updating remote URLs:
    $ git remote set-url origin ssh://git@some.host/path/repo.git

    # Confirm by using
    $ git remote -v

    # -------------------------------------
    # Resetting/Fixing things:

    # Reset your local branch to the state of its remote-tracking branch.  This discards uncommitted changes and any local commits that are not on the remote-tracking branch

    $ git reset --hard origin/BRANCH

    # -------------------------------------
    # Solution for
    # 'fatal: No path specified. See 'man git-pull' for valid url syntax'
    # error message when attempting to push to a remote repo
    # This is the result of git not being configured properly for this user and
    # this repository.
    # 1. Get the correct url for your repository.  For example:
    #    git@github.com:rchapin/some-repo.git
    #    Where 'git@github.com' is the top-level url for your repo,
    #    'rchapin' is your user name, and 'some-repo.git' is the repo to
    #    which you are trying to push.
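    # 2. Issue the following commands (be sure to update them to match your url,
    #    user name, and repo), in the repository directory.  The first removes the
    #    existing 'origin' remote; 'git remote rm' needs the name of the remote.
    $ git remote rm origin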
    $ git remote add origin 'git@github.com:rchapin/some-repo.git'
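
    To make the cherry-pick steps from the Branches section concrete, here is a minimal sketch in a throwaway repo. The branch names (main-line, feature), file names, and the user identity are all hypothetical; the identity is set only in the scratch repo so the commits succeed.

    ```shell
    # Hypothetical scratch repo; branch and file names are made up for illustration.
    repo="$(mktemp -d)"
    cd "$repo" || exit 1
    git init -q .
    git config user.name "Example User"
    git config user.email "user@example.com"

    # One commit on the target branch.
    git checkout -q -b main-line
    echo one > a.txt
    git add a.txt
    git commit -q -m "add a.txt"

    # Two more commits on a feature branch.
    git checkout -q -b feature
    echo two > b.txt
    git add b.txt
    git commit -q -m "add b.txt"
    echo three > c.txt
    git add c.txt
    git commit -q -m "add c.txt"

    # Figure out the id of the commit to bring over (here: the one that added b.txt).
    pick="$(git rev-list -1 HEAD~1)"

    # Check out the existing target branch (no -b) and apply just that commit.
    git checkout -q main-line
    git cherry-pick "$pick" > /dev/null

    picked_subject="$(git log -1 --format=%s)"
    echo "latest commit on main-line: $picked_subject"
    ```

    After this runs, main-line should contain a.txt and b.txt but not c.txt, since only the one commit was picked.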

  • Looping Through a List of Files with Spaces in the File Name with Bash
    02/16/2016 8:58PM

    If you have a list of files that you want to operate on in a loop in bash, and some of them have spaces in their names, the default IFS (Internal Field Separator) includes the space character, so each such file name is split into several tokens.

    The simple approach is to temporarily set the IFS to a newline, as follows.  This can be done in a shell script, but the following example is directly on the command line for 'one-liner' usage.

    $ OIFS="$IFS"

    $ IFS=$'\n'

    $ for i in $(find ./ -type f -iname '*some_criteria*'); do echo "something with $i"; done

    $ IFS="$OIFS"

    The previous commands will:

    1. Save the existing IFS
    2. Update the IFS to a newline char
    3. Execute your loop with the results of a find command
    4. Reset the IFS
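
    The same four steps can be sketched as a small script. The scratch directory and the two file names are hypothetical, and the printf line is just a portable way of writing the newline that the one-liner spells as $'\n'. The first loop shows the problem with the default IFS; the second applies the fix.

    ```shell
    # Hypothetical scratch directory with two files whose names contain a space.
    dir="$(mktemp -d)"
    touch "$dir/first file.txt" "$dir/second file.txt"

    # With the default IFS each name is split at the space.
    words=0
    for f in $(find "$dir" -type f -iname '*file*'); do
        words=$((words + 1))
    done

    count=0
    OIFS="$IFS"                     # 1. save the existing IFS
    IFS="$(printf '\nx')"           # 2. set the IFS to a newline char
    IFS="${IFS%x}"                  #    (the trailing x protects the newline)
    for f in $(find "$dir" -type f -iname '*file*'); do   # 3. loop over find results
        count=$((count + 1))
        echo "found: $f"
    done
    IFS="$OIFS"                     # 4. reset the IFS

    echo "tokens with default IFS: $words"
    echo "files with newline IFS:  $count"
    ```

    Note that this still splits on newlines, so a file name that itself contains a newline would be broken apart.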
  • How To Remove the Byte Order Mark (BOM) from UTF-8 Encoded Text Files
    01/28/2016 1:01PM

    The easiest way that I have seen so far for doing so is to use tail and simply read everything except the first three bytes (start reading at the 4th byte), as follows:

    $ tail --bytes=+4 text_file.txt > text_file-wo-bom.txt
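
    A slightly more defensive variant, sketched below, checks that the file really starts with the UTF-8 BOM (bytes EF BB BF) before dropping anything. The sample file and its contents are hypothetical, and the short option -c +4 is used in place of --bytes=+4.

    ```shell
    # Hypothetical sample file: a UTF-8 BOM (EF BB BF) followed by "hello".
    dir="$(mktemp -d)"
    printf '\357\273\277hello\n' > "$dir/with-bom.txt"

    # Read the first three bytes as hex so they can be compared as plain text.
    first3="$(head -c 3 "$dir/with-bom.txt" | od -An -tx1 | tr -d ' \n')"

    if [ "$first3" = "efbbbf" ]; then
        # BOM present: start reading at the 4th byte.
        tail -c +4 "$dir/with-bom.txt" > "$dir/without-bom.txt"
    else
        # No BOM: keep the file as it is.
        cp "$dir/with-bom.txt" "$dir/without-bom.txt"
    fi

    result="$(cat "$dir/without-bom.txt")"
    echo "$result"
    ```

    The check matters when the command is run over a mix of files, since tail alone would remove the first three bytes of a file that has no BOM.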

  • [SOLVED] java.lang.NoSuchMethodError: org.apache.avro.generic.GenericData.createDatumWriter When Using Avro Data with MapReduce
    01/14/2016 2:56PM

    I am working on a project and have decided to use Avro for the data serialization format.

    I encountered the following error when trying to set up the unit test to test the mapper implementation through Eclipse:

    java.lang.NoSuchMethodError: org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
        at org.apache.avro.hadoop.io.AvroSerialization.getSerializer(AvroSerialization.java:114)
        at org.apache.hadoop.io.serializer.SerializationFactory.getSerializer(SerializationFactory.java:82)
        at org.apache.hadoop.mrunit.internal.io.Serialization.copy(Serialization.java:67)
        at org.apache.hadoop.mrunit.internal.io.Serialization.copy(Serialization.java:98)
        at org.apache.hadoop.mrunit.internal.io.Serialization.copyWithConf(Serialization.java:111)
        at org.apache.hadoop.mrunit.TestDriver.copy(TestDriver.java:676)
        at org.apache.hadoop.mrunit.TestDriver.copyPair(TestDriver.java:680)
        at org.apache.hadoop.mrunit.MapDriverBase.addInput(MapDriverBase.java:120)
        at org.apache.hadoop.mrunit.MapDriverBase.addInput(MapDriverBase.java:130)
        at org.apache.hadoop.mrunit.MapDriverBase.addAll(MapDriverBase.java:141)
        at org.apache.hadoop.mrunit.MapDriverBase.withAll(MapDriverBase.java:247)
        at com.ryanchapin.hadoop.mapreduce.mrunit.UserDataSortTest.testMapper(UserDataSortTest.java:111)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
        at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
        at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
        at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
        at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
        at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
        at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
        at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
        at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
        at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:86)
        at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:459)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:675)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:382)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)

    After digging through the source code and finding that the method did, in fact, exist, I tried running the same unit test via the Maven CLI.  It worked just fine.

    After more digging, it turned out that the classpath in Eclipse was using avro-1.7.4 from the hadoop-common and hadoop-mapreduce-client-core jars in my project, and not the 1.7.7 version that I was trying to use.

    To see the difference between running it via the Maven CLI and running it in Eclipse, I went through the following steps:

    First, I added the following code to my test to print out the classpath at runtime:

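        // Print out the classpath (needs java.net.URL and java.net.URLClassLoader;
        // the cast only works on Java 8 and earlier)
        ClassLoader sysClassLoader = ClassLoader.getSystemClassLoader();
        URL[] urls = ((URLClassLoader) sysClassLoader).getURLs();
        for (int i = 0; i < urls.length; i++) {
            System.out.println(urls[i].getFile());
        }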

    Then I ran it in Eclipse and saved the console output.

    Then I added a sleep call for 100 seconds in the same place in the code.  This enabled me to run the test again from the terminal and, while it was paused, copy the project/target/surefire/ directory, which contained the surefirebooter.jar.

    After copying that jar to a temporary directory, I unpacked it and compared the versions of Avro between the Eclipse classpath and the classpath from the terminal, and noticed that they were different.  Inspecting the dependency tree of my project made it clear that 1.7.4 came in with the Hadoop jars I was using.
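
    The comparison itself can be sketched with a couple of shell commands. The two classpath dumps below are hypothetical stand-ins (one entry per line, made-up paths) for the saved Eclipse console output and the classpath recovered from the surefire jar; only the two Avro version numbers come from the situation described here.

    ```shell
    dir="$(mktemp -d)"

    # Hypothetical classpath dumps, one entry per line; the paths are made up.
    printf '%s\n' \
        '/home/user/.m2/repository/junit/junit/4.11/junit-4.11.jar' \
        '/home/user/.m2/repository/org/apache/avro/avro/1.7.4/avro-1.7.4.jar' \
        > "$dir/eclipse-classpath.txt"
    printf '%s\n' \
        '/home/user/.m2/repository/junit/junit/4.11/junit-4.11.jar' \
        '/home/user/.m2/repository/org/apache/avro/avro/1.7.7/avro-1.7.7.jar' \
        > "$dir/maven-classpath.txt"

    # Pull out just the avro jar name from each dump.
    eclipse_avro="$(grep -o 'avro-[0-9][0-9.]*\.jar' "$dir/eclipse-classpath.txt")"
    maven_avro="$(grep -o 'avro-[0-9][0-9.]*\.jar' "$dir/maven-classpath.txt")"

    echo "Eclipse: $eclipse_avro"
    echo "Maven:   $maven_avro"

    if [ "$eclipse_avro" != "$maven_avro" ]; then
        echo "Avro versions differ between the two classpaths"
    fi
    ```

    For the dependency tree step, mvn dependency:tree -Dincludes=org.apache.avro is one way to see which artifacts pull Avro in.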

    Ultimately, I ended up changing my version of Avro to 1.7.4 in my pom, matching the version that came in with Hadoop, to eliminate the conflict.

  • Configuring Hidden, Invisible, or Whitespace Characters in The Eclipse Text Editor
    01/08/2016 10:27AM

    The newer (I am currently using Mars, 4.5.0) versions of Eclipse provide very good tools for configuring the visibility of whitespace characters in code.

    To customize your settings, go to Window -> Preferences -> General -> Editors -> Text Editors.

    On that page there will be a checkbox next to "Show whitespace characters (configure visibility)".

    Clicking on the 'configure visibility' link will allow you to choose what is shown and the opacity of the whitespace characters, which is a really nice touch.
