python - Pydoop error: RuntimeError: java home not found, try setting JAVA_HOME on remote server using CDH5.4


Objective: read a remote file stored in HDFS using Pydoop, from my laptop. I'm using PyCharm Professional Edition and Cloudera CDH 5.4.

PyCharm configuration on my laptop: in Project Interpreter (under Settings), I have pointed the Python interpreter at the remote server: ssh://remote-server-ip-address:port-number/home/ashish/anaconda/bin/python2.7

There is a file stored in HDFS at /home/ashish/pencil/somefilename.txt

I then installed Pydoop on the remote server using pip install pydoop, and it installed successfully. Next I wrote this code to read the file from the HDFS location:

import pydoop.hdfs as hdfs
with hdfs.open('/home/ashish/pencil/somefilename.txt') as file:
    for line in file:
        print(line, '\n')

On execution I get this error:

Traceback (most recent call last):
  File "/home/ashish/pycharm_proj/remote_server_connect/hdfsconxn.py", line 7, in <module>
    import pydoop.hdfs as hdfs
  File "/home/ashish/anaconda/lib/python2.7/site-packages/pydoop/hdfs/__init__.py", line 82, in <module>
    from . import common, path
  File "/home/ashish/anaconda/lib/python2.7/site-packages/pydoop/hdfs/path.py", line 28, in <module>
    from . import common, fs as hdfs_fs
  File "/home/ashish/anaconda/lib/python2.7/site-packages/pydoop/hdfs/fs.py", line 34, in <module>
    from .core import core_hdfs_fs
  File "/home/ashish/anaconda/lib/python2.7/site-packages/pydoop/hdfs/core/__init__.py", line 49, in <module>
    _core_module = init(backend=hdfs_core_impl)
  File "/home/ashish/anaconda/lib/python2.7/site-packages/pydoop/hdfs/core/__init__.py", line 29, in init
    jvm.load_jvm_lib()
  File "/home/ashish/anaconda/lib/python2.7/site-packages/pydoop/utils/jvm.py", line 33, in load_jvm_lib
    java_home = get_java_home()
  File "/home/ashish/anaconda/lib/python2.7/site-packages/pydoop/utils/jvm.py", line 28, in get_java_home
    raise RuntimeError("java home not found, try setting JAVA_HOME")
RuntimeError: java home not found, try setting JAVA_HOME

Process finished with exit code 1

My guess is that maybe it is not able to find py4j. The location of py4j is:

/home/ashish/anaconda/lib/python2.7/site-packages/py4j 

And when I echo JAVA_HOME on the remote server,

echo $JAVA_HOME

I get this location:

/usr/java/jdk1.7.0_67-cloudera 
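
Since the shell prints a value for JAVA_HOME but the script still fails, one thing worth checking is whether the variable is visible to the Python process that PyCharm starts over SSH. A non-interactive SSH session does not necessarily load the same profile scripts as a login shell, so the two environments can differ. The following is a minimal diagnostic sketch; the helper name `describe_java_home` is made up for illustration and is not part of Pydoop.

```python
from __future__ import print_function
import os


def describe_java_home(environ=None):
    """Report whether JAVA_HOME is visible in the given environment mapping.

    Defaults to the environment of the current Python process.
    """
    env = os.environ if environ is None else environ
    value = env.get("JAVA_HOME")
    if not value:
        return "JAVA_HOME is not set for this process"
    if not os.path.isdir(value):
        return "JAVA_HOME is set to %s, but that directory does not exist" % value
    return "JAVA_HOME is set to %s" % value


if __name__ == "__main__":
    # Run this through the same PyCharm remote interpreter as the failing script.
    print(describe_java_home())
```

If this reports that JAVA_HOME is not set when run from PyCharm, the value seen in the terminal is not reaching the interpreter process.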

I am new to programming in Python and to CentOS setup. Could you please suggest how I can solve this problem?

Thanks

Well, it looks like I solved it. What I did was use

sys.path.append('/usr/java/jdk1.7.0_67-cloudera') 

I updated the code:

import os, sys
sys.path.append('/usr/java/jdk1.7.0_67-cloudera')
input_file = '/home/ashish/pencil/somedata.txt'
with open(input_file) as f:
    for line in f:
        print line

This code reads the file from HDFS on the remote server and prints the output in the PyCharm console on my laptop.

By using sys.path.append() I don't have to manually change the hadoop.sh file, which could cause conflicts with other Java configuration files.
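
One caveat: sys.path only controls where Python looks for modules, and the snippet above uses the built-in open(), which reads from the local filesystem and does not go through Pydoop. If the goal is still to use hdfs.open(), an alternative worth trying is to set JAVA_HOME in os.environ before importing pydoop.hdfs, since that is the variable the error message asks for. The following is a minimal sketch, assuming the JDK path printed by echo $JAVA_HOME on the remote server. The helper names `ensure_java_home` and `print_hdfs_file` are hypothetical, and this has not been verified against CDH 5.4; other Hadoop-related environment variables may also need to be set.

```python
from __future__ import print_function
import os

# Assumption: the JDK location that `echo $JAVA_HOME` printed on the remote server.
JDK_PATH = "/usr/java/jdk1.7.0_67-cloudera"


def ensure_java_home(jdk_path=JDK_PATH, environ=None):
    """Set JAVA_HOME if it is missing and return the value now in effect.

    An existing JAVA_HOME is left untouched.
    """
    env = os.environ if environ is None else environ
    return env.setdefault("JAVA_HOME", jdk_path)


def print_hdfs_file(hdfs_path):
    """Print an HDFS file line by line using Pydoop."""
    ensure_java_home()
    # Import only after JAVA_HOME is set: the traceback shows the lookup
    # happening while pydoop.hdfs is being imported.
    import pydoop.hdfs as hdfs
    with hdfs.open(hdfs_path) as f:
        for line in f:
            print(line, end="")
```

Calling print_hdfs_file('/home/ashish/pencil/somefilename.txt') from the script would then go through Pydoop. Setting the variable inside the script only affects that one process, so, like the sys.path approach, it leaves the system-wide Hadoop configuration files unchanged.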

