[SOLVED] java.lang.NoSuchMethodError: org.apache.avro.generic.GenericData.createDatumWriter When Using Avro Data with MapReduce

I am working on a project and have decided to use Avro for the data serialization format.

I encountered the following error when trying to set up the unit test to test the mapper implementation through Eclipse:

java.lang.NoSuchMethodError: org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
    at org.apache.avro.hadoop.io.AvroSerialization.getSerializer(AvroSerialization.java:114)
    at org.apache.hadoop.io.serializer.SerializationFactory.getSerializer(SerializationFactory.java:82)
    at org.apache.hadoop.mrunit.internal.io.Serialization.copy(Serialization.java:67)
    at org.apache.hadoop.mrunit.internal.io.Serialization.copy(Serialization.java:98)
    at org.apache.hadoop.mrunit.internal.io.Serialization.copyWithConf(Serialization.java:111)
    at org.apache.hadoop.mrunit.TestDriver.copy(TestDriver.java:676)
    at org.apache.hadoop.mrunit.TestDriver.copyPair(TestDriver.java:680)
    at org.apache.hadoop.mrunit.MapDriverBase.addInput(MapDriverBase.java:120)
    at org.apache.hadoop.mrunit.MapDriverBase.addInput(MapDriverBase.java:130)
    at org.apache.hadoop.mrunit.MapDriverBase.addAll(MapDriverBase.java:141)
    at org.apache.hadoop.mrunit.MapDriverBase.withAll(MapDriverBase.java:247)
    at com.ryanchapin.hadoop.mapreduce.mrunit.UserDataSortTest.testMapper(UserDataSortTest.java:111)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
    at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:86)
    at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:459)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:675)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:382)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)

After digging through the source code and finding that method did, infact, exist.  I tried running the same unit test via the maven cli.  It worked just fine.

After more digging, it turns out that what was happening was that the classpath in Eclipse was using avro-1.7.4 from the hadoop-common and hadoop-mapreduce-client-core jars in my project, and not the 1.7.7 version that I was trying to use.

To see what the difference between running it via the maven cli and running it in eclipse, I went through the following steps:

Added the following code to my test code to print out the classpath at runtime:

// Print out the classpath
ClassLoader sysClassLoader = ClassLoader.getSystemClassLoader();
URL[] urls = ((URLClassLoader)sysClassLoader).getURLs();
System.out.println("---------------------------------------");
for(int i=0; i< urls.length; i++) {
    System.out.println(urls[i].getFile());
}
System.out.println("---------------------------------------");

Then ran it, in Eclipse and saved off the console output.

Then, I added a sleep call for 100 seconds in the same place in the code.  This enabled me to run the test again from the terminal and copy the project/target/surefire/ directory which contained the surefirebooter.jar.  Click here to read more about that project.

After copying that jar to a temporary directory, I unpacked it and then compared the versions of avro between the Eclipse classpath and the classpath from the terminal and noticed that they were different.  Inspecting the dependency tree of my project it was clear that 1.7.4 was part of the hadooop jars I was using.

Ultimately, I ended up updating my version of avro to 1.7.4 in my pom to eliminate the conflict.

Leave a Reply