Wednesday 18 November 2015

Exception in using UDF


You may sometimes hit the exception below when using a custom UDF in Hive:

java.io.FileNotFoundException: File does not exist: hdfs:


Here is the complete stack trace:

java.io.FileNotFoundException: File does not exist: hdfs://localhost:54310/usr/local/hivetmp/amit.pathak/9381feb3-6c5f-469b-b6b1-9af55abbdabd/udf.jar
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1122)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1114)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1114)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)


For the exact problem, see the linked issue.




This issue mainly occurs when you use a UDF in a join or in a CREATE TABLE tablename AS statement.

I know of two ways to fix it.

1) Use the add file command instead of add jar (adding the jar as a file makes sure it exists in the distributed cache).

Before the change:

add jar '/user/hive/udf.jar';
create temporary function convertToJulian as 'com.convertToJulian';

After the change:

add file '/user/hive/udf.jar';
create temporary function convertToJulian as 'com.convertToJulian';

2) Keep the same directory structure on the local file system and on the Hadoop file system.

For example, if your UDF jar is stored at this path on the local file system:

/user/hive/amit/udf.jar

then you also need to create the same directory structure in the Hadoop file system and put your UDF jar in that directory.
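
The second fix can be sketched as a small shell helper. This is only a sketch under assumptions: the function name mirror_jar_to_hdfs and the DRY_RUN switch are made up for illustration, and it assumes the hdfs command is on your PATH and that you are allowed to write to the target directory.

```shell
# Hypothetical helper: copy a local jar to the SAME absolute path on HDFS.
# With DRY_RUN=1 the hdfs commands are only printed, not executed.
mirror_jar_to_hdfs() {
    local_jar="$1"
    # The HDFS directory mirrors the local directory of the jar.
    hdfs_dir=$(dirname "$local_jar")

    run() {
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "$@"
        else
            "$@"
        fi
    }

    # Create the directory (and any missing parents) on HDFS.
    run hdfs dfs -mkdir -p "$hdfs_dir" || return 1
    # Upload the jar, overwriting an older copy if one is already there.
    run hdfs dfs -put -f "$local_jar" "$hdfs_dir/"
}
```

For the path used above you would call mirror_jar_to_hdfs /user/hive/amit/udf.jar, and you can check the result afterwards with hdfs dfs -ls /user/hive/amit. Running it with DRY_RUN=1 first shows the two hdfs commands without touching the cluster.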








