To deal with this, you should set SPARK_HOME before you create SparkConf:
import os
os.environ["SPARK_HOME"] = "/home/dirk/spark-1.4.1-bin-hadoop2.6"  # set before SparkConf is constructed

from pyspark import SparkConf
conf = SparkConf().setMaster('local').setAppName('a')
Every Spark application must contain a SparkContext object to interact with Spark: it is what allows us to create the base RDDs, and it is also used to initialize StreamingContext, SQLContext and HiveContext. The SparkConf constructor, however, assumes that SPARK_HOME on the master is already set. It calls pyspark.context.SparkContext._ensure_initialized, which calls pyspark.java_gateway.launch_gateway, which tries to access SPARK_HOME and fails:
File "test.py", line 10, in <module> conf=(SparkConf().setMaster('local').setAppName('a').setSparkHome('/home/dirk/spark-1.4.1-bin-hadoop2.6/bin')) File "/home/dirk/spark-1.4.1-bin-hadoop2.6/python/pyspark/conf.py", line 104, in __init__ SparkContext._ensure_initialized() File "/home/dirk/spark-1.4.1-bin-hadoop2.6/python/pyspark/context.py", line 229, in _ensure_initialized SparkContext._gateway = gateway or launch_gateway() File "/home/dirk/spark-1.4.1-bin-hadoop2.6/python/pyspark/java_gateway.py", line 48, in launch_gateway SPARK_HOME = os.environ["SPARK_HOME"] File "/usr/lib/python2.7/UserDict.py", line 23, in __getitem__ raise KeyError(key) KeyError: 'SPARK_HOME'
When I try:
from pyspark import SparkContext, SparkConf
sc = SparkContext()
I get:
KeyError: 'SPARK_HOME'
pyspark relies on the Spark SDK, so you need to have that installed before using pyspark. Once it is installed, you need to set the environment variable SPARK_HOME to tell pyspark where to look for your Spark installation. If you're on a *nix system you can do so by adding the following to your .bashrc:
export SPARK_HOME=<location of spark install>
If you're using Windows, there's a more convoluted way of setting variables via the GUI. From the command prompt (DOS) you can use SET in place of export:
SET SPARK_HOME=<location of spark install>
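If you would rather not depend on the shell configuration, you can also guard against a missing SPARK_HOME from the script itself, before anything from pyspark is constructed. A minimal sketch (the fallback path is only an example; point it at your own install):

import os
import sys

# Use a known install location only if SPARK_HOME is not already set in the environment
os.environ.setdefault("SPARK_HOME", "/home/dirk/spark-1.4.1-bin-hadoop2.6")  # example path

if not os.path.isdir(os.environ["SPARK_HOME"]):
    sys.exit("SPARK_HOME does not point at a Spark installation: " + os.environ["SPARK_HOME"])

from pyspark import SparkConf
conf = SparkConf().setMaster('local').setAppName('a')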
With Jupyter it seems that it is no longer possible to run custom startup files as it was with IPython profiles, so the easiest way is to run the pyspark init code manually at the beginning of your notebook, or to follow the alternative way described further down. Install Scala and Apache Spark with Homebrew (the SPARK_HOME path comes from Homebrew's Cellar, as used below) and add the following to your shell profile:
# For Apache Spark
if which java > /dev/null; then export JAVA_HOME=$(/usr/libexec/java_home); fi

brew update
brew install scala
brew install apache-spark

# For an ipython notebook and pyspark integration
if which pyspark > /dev/null; then
  export SPARK_HOME="/usr/local/Cellar/apache-spark/2.1.0/libexec/"
  export PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/build:$PYTHONPATH
  export PYTHONPATH=$SPARK_HOME/python/lib/py4j-0.10.4-src.zip:$PYTHONPATH
fi
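After reloading the shell profile, a quick sanity check is to import pyspark from a plain Python session; something along these lines (assuming the Homebrew paths above, and that this Spark release exposes a version attribute):

import os
import pyspark

print(os.environ.get("SPARK_HOME"))  # should show the Cellar libexec path from the exports above
print(pyspark.__version__)           # e.g. 2.1.0 for the Homebrew install above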
Since profiles are not supported in Jupyter, you will now see the following deprecation warning if you try to start the notebook the old way:

$ ipython notebook --profile=pyspark
[TerminalIPythonApp] WARNING | Subcommand `ipython notebook` is deprecated and will be removed in future versions.
[TerminalIPythonApp] WARNING | You likely want to use `jupyter notebook` in the future
[W 01:45:07.821 NotebookApp] Unrecognized alias: '--profile=pyspark', it will probably have no effect.
Run jupyter instead:

$ jupyter notebook
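A notebook started this way knows nothing about Spark yet, so the pyspark init mentioned above has to run manually in the first cell. A minimal sketch, assuming the Spark 2.1.0 Homebrew paths from the exports above:

import os
import sys

os.environ.setdefault("SPARK_HOME", "/usr/local/Cellar/apache-spark/2.1.0/libexec/")
spark_home = os.environ["SPARK_HOME"]

# Mirror the PYTHONPATH exports so this kernel can import pyspark and py4j
sys.path.insert(0, os.path.join(spark_home, "python"))
sys.path.insert(0, os.path.join(spark_home, "python", "lib", "py4j-0.10.4-src.zip"))

from pyspark import SparkConf, SparkContext
sc = SparkContext(conf=SparkConf().setMaster("local").setAppName("notebook"))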
You can also force the pyspark shell command to run an IPython web notebook instead of the command-line interactive interpreter. To do so you have to add the following env variables:

export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS=notebook
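With these two variables set, running the pyspark command should open a Jupyter notebook instead of the command-line interpreter, and the kernel already has a SparkContext bound to sc, so a first cell can use it directly, for example:

# In a notebook launched via `pyspark` with the variables above,
# `sc` is created by the pyspark startup code, so no manual init is needed.
print(sc.version)
print(sc.parallelize(range(100)).sum())  # 4950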