Welcome to my website. I am always posting links to photo albums, art, technology, and other creations. Everything you see on my personal sites is powered by the formVista™ Website Management Engine.


  • How To Remove the Byte Order Mark (BOM) from UTF-8 Encoded Text Files
    01/28/2016 1:01PM

    The easiest way I have seen to do so is to use tail and simply read everything except the first three bytes (that is, start reading at the 4th byte), as follows:

    $ tail --bytes=+4 text_file.txt > text_file-wo-bom.txt
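    Before stripping anything, it can be worth confirming that the file actually starts with the three BOM bytes (EF BB BF). A small sketch, using placeholder file names and octal escapes for portability:

    ```shell
    # Create a sample UTF-8 file that starts with a BOM
    # (\357\273\277 is EF BB BF in octal)
    printf '\357\273\277hello\n' > text_file.txt

    # Show the first three bytes; a BOM appears as "ef bb bf"
    head -c 3 text_file.txt | od -An -tx1

    # Strip the BOM by reading from the 4th byte onward
    tail --bytes=+4 text_file.txt > text_file-wo-bom.txt
    cat text_file-wo-bom.txt
    ```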

  • [SOLVED] java.lang.NoSuchMethodError: org.apache.avro.generic.GenericData.createDatumWriter When Using Avro Data with MapReduce
    01/14/2016 2:56PM

    I am working on a project and have decided to use Avro for the data serialization format.

    I encountered the following error when trying to set up the unit test to test the mapper implementation through Eclipse:

    java.lang.NoSuchMethodError: org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
        at org.apache.avro.hadoop.io.AvroSerialization.getSerializer(AvroSerialization.java:114)
        at org.apache.hadoop.io.serializer.SerializationFactory.getSerializer(SerializationFactory.java:82)
        at org.apache.hadoop.mrunit.internal.io.Serialization.copy(Serialization.java:67)
        at org.apache.hadoop.mrunit.internal.io.Serialization.copy(Serialization.java:98)
        at org.apache.hadoop.mrunit.internal.io.Serialization.copyWithConf(Serialization.java:111)
        at org.apache.hadoop.mrunit.TestDriver.copy(TestDriver.java:676)
        at org.apache.hadoop.mrunit.TestDriver.copyPair(TestDriver.java:680)
        at org.apache.hadoop.mrunit.MapDriverBase.addInput(MapDriverBase.java:120)
        at org.apache.hadoop.mrunit.MapDriverBase.addInput(MapDriverBase.java:130)
        at org.apache.hadoop.mrunit.MapDriverBase.addAll(MapDriverBase.java:141)
        at org.apache.hadoop.mrunit.MapDriverBase.withAll(MapDriverBase.java:247)
        at com.ryanchapin.hadoop.mapreduce.mrunit.UserDataSortTest.testMapper(UserDataSortTest.java:111)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
        at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
        at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
        at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
        at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
        at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
        at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
        at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
        at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
        at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:86)
        at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:459)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:675)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:382)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)

    After digging through the source code and finding that the method did, in fact, exist, I tried running the same unit test via the Maven CLI. It worked just fine.

    After more digging, it turned out that the classpath in Eclipse was resolving avro-1.7.4 from the hadoop-common and hadoop-mapreduce-client-core jars in my project, and not the 1.7.7 version that I was trying to use.

    To see what the difference was between running it via the Maven CLI and running it in Eclipse, I went through the following steps:

    First, I added the following code to my test to print out the classpath at runtime:

        // Print out the classpath
        // (requires imports of java.net.URL and java.net.URLClassLoader;
        // the cast to URLClassLoader works on Java 8 and earlier)
        ClassLoader sysClassLoader = ClassLoader.getSystemClassLoader();
        URL[] urls = ((URLClassLoader) sysClassLoader).getURLs();
        for (int i = 0; i < urls.length; i++) {
            System.out.println(urls[i].getFile());
        }

    I then ran it in Eclipse and saved off the console output.

    Next, I added a 100-second sleep call at the same place in the code. This gave me time to run the test again from the terminal and, while it slept, copy the project/target/surefire/ directory, which contained the surefirebooter.jar.

    After copying that jar to a temporary directory, I unpacked it and compared the version of avro on the Eclipse classpath with the version on the classpath from the terminal, and they were indeed different. Inspecting the dependency tree of my project, it was clear that 1.7.4 was being pulled in by the hadoop jars I was using.

    Ultimately, I updated the avro version in my pom to 1.7.4 to eliminate the conflict.
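    A sketch of what that version pin looks like in the pom, assuming the standard Avro coordinates (org.apache.avro:avro); verify the winning version against your own dependency tree, e.g. with mvn dependency:tree:

    ```xml
    <dependency>
      <groupId>org.apache.avro</groupId>
      <artifactId>avro</artifactId>
      <version>1.7.4</version>
    </dependency>
    ```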

  • Configuring Hidden, Invisible, or Whitespace Characters in The Eclipse Text Editor
    01/08/2016 10:27AM

    Newer versions of Eclipse (I am currently using Mars, 4.5.0) provide very good tools for configuring the visibility of whitespace characters in code.

    To customize your settings go to Window -> Preferences -> General -> Editors -> Text Editors.

    On that page there is a checkbox next to "Show whitespace characters (configure visibility)".

    Clicking on the 'configure visibility' link will allow you to choose what is shown and the opacity of the whitespace characters, which is a really nice touch.

  • [SOLVED] Configuring chrooted bind and rndc-confgen Hangs Not Generating a Key
    12/02/2015 12:47PM

    I am putting together a chrooted installation of named and ran into a problem where attempting to generate an rndc.key with rndc-confgen would just hang, never returning and never generating a key.

    After doing some searching, I discovered that rndc-confgen was blocking while trying to read entropy from /dev/random, and that I needed to point it at /dev/urandom (and at the chroot directory) as follows:

    # rndc-confgen -a -r /dev/urandom  -t /var/named/chroot
    wrote key file "/etc/rndc.key"
    wrote key file "/var/named/chroot/etc/rndc.key"

    This generated the key files that I expected.

  • Using the Eclipse Memory Analyzer (MAT) Remotely on Large Heap Dumps
    11/23/2015 10:42AM

    Sometimes your Java application will fail and generate an enormous heap dump, one that may be too large to transfer to your local machine, or too large to analyze there for lack of RAM, time, or both.

    One solution is to install the MAT tool on the remote server and generate an HTML output of the analysis to download and view locally.  This saves the headache of attempting to get X Windows installed on the remote machine and get all of the ssh tunneling sorted out (which is of course an option as well).

    First, download the stand-alone Memory Analyzer (an Eclipse RCP application), transfer it to your server, and unpack it. Then determine how large the heap dump is and, if necessary, modify the MemoryAnalyzer.ini file so that the JVM is started with enough RAM for your heap dump.

    In this example, I have an 11GB heap dump and have modified the last two lines of MemoryAnalyzer.ini (adding an -Xms setting).
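    The original .ini listing did not survive here, but the tail of a MemoryAnalyzer.ini modified this way would look something like the following (the 12g sizes are an assumption chosen to hold an 11GB dump; everything above -vmargs is the stock launcher configuration):

    ```
    -vmargs
    -Xms12g
    -Xmx12g
    ```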


    Do an initial run to parse the heap dump.  This will generate intermediary data that can be used by subsequent runs to make future analysis faster.

    ./ParseHeapDump.sh /path/to/heap-dump

    After that completes, you can run any of a number of different analyses on the data. The following illustrates how to search for memory leak suspects.

    ./ParseHeapDump.sh /path/to/heap-dump org.eclipse.mat.api:suspects

    Additional reports:
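    The report list itself did not survive here. To my knowledge, the report ids that ship with MAT alongside the leak-suspects report include an overview report and a top-components report, invoked the same way (treat the exact ids as assumptions to verify against your MAT version):

    ```
    ./ParseHeapDump.sh /path/to/heap-dump org.eclipse.mat.api:overview
    ./ParseHeapDump.sh /path/to/heap-dump org.eclipse.mat.api:top_components
    ```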


    To give credit where it is due, this is basically a copy of a post by Ashwin Jayaprakash, but I wanted to capture it here as well.
