Debugging MapReduce MRv2 Code in Eclipse

Following is how to set-up your environment to be able to set breakpoints, step-through, and debug your MapReduce code in Eclipse.

All of the this was done on a machine running Linux, but should work just fine for any *nix machine, and perhaps Windows running Cygwin (assuming that you can get Hadoop and its naitive libraries compiled under Windows).

This also assumes that you are building your project with maven.

Install a pseudo-distributed hadooop cluster on your development box.  (Yes, → Continue reading “Debugging MapReduce MRv2 Code in Eclipse”

Unit Testing Private Static Methods With Primitive Array Arguments

When writing unit tests to cover your entire program you will undoubtedly come across the need to test private methods.  There are arguments that these methods should be tested via integration tests, but there are sometimes when it makes more sense to test all of the permutations in a unit test. This can be achieved using reflection in Java JUnit tests.

What is a little tricky, and was not completely obvious, was how to use reflection to test a private → Continue reading “Unit Testing Private Static Methods With Primitive Array Arguments”

One-Liner for Converting CRLF to LF in Text Files

If you have text files created under DOS/Windows and need to convert the CRLF (carriage return and line feed) characters to LF (line feed) character, here is a quick one-liner.

cat file.txt | perl -ne 's/\x0D\x0A/\x0A/g; print' file.txt.mod

You can also use dos2unix, however, especially under Cygwin I have seen dos2unix fail without giving any meaningful information about why it was unable to complete the task.  In that case, you can just do it by hand. → Continue reading “One-Liner for Converting CRLF to LF in Text Files”