python - Pydoop error: RuntimeError: java home not found, try setting JAVA_HOME on remote server using CDH5.4
Objective: read a remote file stored in HDFS using pydoop, from my laptop. I'm using PyCharm Professional Edition and Cloudera CDH 5.4.
PyCharm configuration on my laptop: in Project Interpreter (under Settings), I have pointed the Python interpreter at the remote server: ssh://remote-server-ip-address:port-number/home/ashish/anaconda/bin/python2.7
Now, there is a file stored in HDFS at the location /home/ashish/pencil/somefilename.txt
I then installed pydoop on the remote server using pip install pydoop, and it installed fine. Next I wrote this code to read the file from the HDFS location:
import pydoop.hdfs as hdfs

with hdfs.open('/home/ashish/pencil/somefilename.txt') as file:
    for line in file:
        print(line, '\n')
On execution I get this error:
Traceback (most recent call last):
  File "/home/ashish/pycharm_proj/remote_server_connect/hdfsconxn.py", line 7, in <module>
    import pydoop.hdfs as hdfs
  File "/home/ashish/anaconda/lib/python2.7/site-packages/pydoop/hdfs/__init__.py", line 82, in <module>
    from . import common, path
  File "/home/ashish/anaconda/lib/python2.7/site-packages/pydoop/hdfs/path.py", line 28, in <module>
    from . import common, fs as hdfs_fs
  File "/home/ashish/anaconda/lib/python2.7/site-packages/pydoop/hdfs/fs.py", line 34, in <module>
    from .core import core_hdfs_fs
  File "/home/ashish/anaconda/lib/python2.7/site-packages/pydoop/hdfs/core/__init__.py", line 49, in <module>
    _core_module = init(backend=hdfs_core_impl)
  File "/home/ashish/anaconda/lib/python2.7/site-packages/pydoop/hdfs/core/__init__.py", line 29, in init
    jvm.load_jvm_lib()
  File "/home/ashish/anaconda/lib/python2.7/site-packages/pydoop/utils/jvm.py", line 33, in load_jvm_lib
    java_home = get_java_home()
  File "/home/ashish/anaconda/lib/python2.7/site-packages/pydoop/utils/jvm.py", line 28, in get_java_home
    raise RuntimeError("java home not found, try setting JAVA_HOME")
RuntimeError: java home not found, try setting JAVA_HOME

Process finished with exit code 1
My guess is that it may not be able to find py4j. The location of py4j is:
/home/ashish/anaconda/lib/python2.7/site-packages/py4j
And when I echo JAVA_HOME on the remote server,
echo $JAVA_HOME
I get this location:
/usr/java/jdk1.7.0_67-cloudera
I am new to programming in Python and to CentOS setup. Can you please suggest how I can solve this problem?
thanks
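The shell on the server reports a JDK path, yet pydoop raises "java home not found". One possible explanation, and it is only an assumption here, is that the interpreter PyCharm starts over SSH does not inherit the variables exported by the login shell. A quick check is to print what the Python process itself sees. The helper name `java_home_report` below is made up for this illustration.

```python
import os


def java_home_report(environ=None):
    """Describe JAVA_HOME as seen by this Python process (hypothetical helper)."""
    # Default to the real process environment; a dict can be passed for testing.
    environ = os.environ if environ is None else environ
    value = environ.get("JAVA_HOME")
    if not value:
        return "JAVA_HOME is not set in this process"
    if not os.path.isdir(value):
        return "JAVA_HOME is set to %s, but that directory does not exist" % value
    return "JAVA_HOME = %s" % value


# Run this with the same interpreter that runs the failing script.
print(java_home_report())
```

Running it through the remote interpreter configured in PyCharm should show whether the variable actually reaches the process that imports pydoop.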
Well, it looks like I solved it. What I did was add:
sys.path.append('/usr/java/jdk1.7.0_67-cloudera')
I updated my code to:
import os, sys
sys.path.append('/usr/java/jdk1.7.0_67-cloudera')

input_file = '/home/ashish/pencil/somedata.txt'
with open(input_file) as f:
    for line in f:
        print line
This code reads the file from HDFS on the remote server and prints the output in the PyCharm console on my laptop.
By using sys.path.append() I don't have to manually change the hadoop.sh file, which could cause conflicts with other Java configuration files.
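One caveat: sys.path only affects where Python looks for modules, and the updated snippet uses the built-in open rather than pydoop.hdfs.open, so it may not be exercising pydoop at all. If the goal is for pydoop itself to find Java, an alternative that follows the hint in the error message is to set JAVA_HOME in the process environment before importing pydoop.hdfs. This is a minimal sketch, assuming pydoop reads JAVA_HOME from the environment at import time and that the JDK lives at the path reported on the server. `ensure_java_home` and `read_hdfs_lines` are hypothetical helper names.

```python
import os


def ensure_java_home(jdk_path, environ=os.environ):
    """Set JAVA_HOME to jdk_path unless already set; return the value in effect."""
    return environ.setdefault("JAVA_HOME", jdk_path)


def read_hdfs_lines(hdfs_path):
    """Yield lines from a file in HDFS via pydoop (requires pydoop and Hadoop)."""
    # Imported lazily so that JAVA_HOME is in place before pydoop loads the JVM.
    import pydoop.hdfs as hdfs
    with hdfs.open(hdfs_path) as f:
        for line in f:
            yield line


# Intended usage on the remote server (paths taken from the post above):
#   ensure_java_home('/usr/java/jdk1.7.0_67-cloudera')
#   for line in read_hdfs_lines('/home/ashish/pencil/somefilename.txt'):
#       print(line)
```

Setting the variable inside the script keeps the change local to this process, so the cluster's own configuration files stay untouched. The same variable could probably also be set in the PyCharm run configuration instead.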