alirezahajibagheri/spark-travis-test
Testing spark using travis
travis-pytest-spark
A minimal setup of tests (using pytest) with pyspark.
The tests just create a simple Spark DataFrame and assert things on it.
The main portion is the Travis CI setup located in the .travis.yml file.
The key section is as follows:
before_install:
- mkdir -p /opt
- wget -q -O /opt/spark.tgz http://www.gtlib.gatech.edu/pub/apache/spark/spark-2.4.3/spark-2.4.3-bin-hadoop2.7.tgz
- tar xzf /opt/spark.tgz -C /opt/
- rm /opt/spark.tgz
- export SPARK_HOME=/opt/spark-2.4.3-bin-hadoop2.7
- export PATH=$PATH:/opt/spark-2.4.3-bin-hadoop2.7/bin

Setting the SPARK_HOME environment variable allows pytest-spark
to find the Spark JARs and actually execute a standalone Spark
instance.
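For context, the before_install section above would sit inside a `.travis.yml` shaped roughly like this. Only the before_install lines come from the source; the language, install, and script entries are a plausible completion, not taken from the repo:

```yaml
# Hypothetical full .travis.yml; only before_install is from the source.
language: python
python:
  - "3.6"
before_install:
  - mkdir -p /opt
  - wget -q -O /opt/spark.tgz http://www.gtlib.gatech.edu/pub/apache/spark/spark-2.4.3/spark-2.4.3-bin-hadoop2.7.tgz
  - tar xzf /opt/spark.tgz -C /opt/
  - rm /opt/spark.tgz
  - export SPARK_HOME=/opt/spark-2.4.3-bin-hadoop2.7
  - export PATH=$PATH:$SPARK_HOME/bin
install:
  - pip install -r requirements.txt
script:
  - pytest
```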
Running locally
You will need Spark installed on your machine, with it on your PATH and its
location set in SPARK_HOME (for pytest).
# create env
conda create -n travis-spark python=3.6
# clone and go to folder
git clone https://github.com/fdosani/travis-pytest-spark.git
cd travis-pytest-spark
# install deps
pip install -r requirements.txt
# run tests
pytest
Created August 29, 2019
Updated August 29, 2019