Apache Airflow using Docker, with authentication
Airflow is a platform to programmatically author, schedule and monitor workflows.
To get started with Airflow's LocalExecutor, first install Docker and docker-compose. This Docker boilerplate is based on the official python:3.6-slim image and uses the official Postgres image as the backend.
Clone this boilerplate repository:
git clone https://github.com/saianupkumarp/airflow-docker.git
Once cloned, change into the repository directory and build the Airflow image:
docker build --rm -t airflow-image .
Optionally, install extra Airflow packages and/or Python dependencies at build time by adding the following line to the Dockerfile above line 62:
&& pip install -r requirements.txt \
Optionally, install extra Airflow packages and/or Python dependencies at build time with the following commands:
docker build --rm --build-arg AIRFLOW_DEPS="datadog,dask" -t airflow-image .
docker build --rm --build-arg PYTHON_DEPS="flask_oauthlib>=0.9" -t airflow-image .
or combined
docker build --rm --build-arg AIRFLOW_DEPS="datadog,dask" --build-arg PYTHON_DEPS="flask_oauthlib>=0.9" -t airflow-image .
By default, docker-airflow runs Airflow with the SequentialExecutor:
docker run -d -p 8080:8080 airflow-image webserver
If you want to run another executor, use the other docker-compose.yml files provided in this repository.
For the LocalExecutor:
docker-compose -f docker-compose-LocalExecutor.yml up -d
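For reference, a LocalExecutor compose file of this kind generally pairs a Postgres service with a single Airflow container. The fragment below is an illustrative sketch only, assuming the image name from the build step above; the docker-compose-LocalExecutor.yml shipped in the repository is authoritative:

```yaml
# Illustrative sketch only; refer to docker-compose-LocalExecutor.yml
# in the repository for the real file.
version: "3"
services:
  postgres:
    image: postgres:9.6
    environment:
      - POSTGRES_USER=airflow
      - POSTGRES_PASSWORD=airflow
      - POSTGRES_DB=airflow
  webserver:
    image: airflow-image
    depends_on:
      - postgres
    environment:
      - LOAD_EX=n
    ports:
      - "8080:8080"
    command: webserver
```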
NB: Example DAGs are not loaded by default (LOAD_EX=n); to load them, set the LOAD_EX environment variable to y:
docker run -d -p 8080:8080 -e LOAD_EX=y airflow-image
If you want to use Ad Hoc Query, make sure you've configured the connection: go to Admin -> Connections, edit "airflow_db", and set these values (equivalent to the values in airflow.cfg / docker-compose*.yml):
- Host: postgres
- Schema: airflow
- Login: airflow
- Password: airflow
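These four values are simply the components of the SQLAlchemy URI for the metadata database. As a quick illustration (the helper below is hypothetical, not part of the repository), they assemble as follows:

```python
# Hypothetical helper, shown only to illustrate how the connection
# values above map onto the SQLAlchemy URI used by Airflow's backend.
def airflow_db_uri(host, schema, login, password, port=5432):
    # Postgres URI shape: postgresql://<login>:<password>@<host>:<port>/<schema>
    return f"postgresql://{login}:{password}@{host}:{port}/{schema}"

print(airflow_db_uri("postgres", "airflow", "airflow", "airflow"))
# -> postgresql://airflow:airflow@postgres:5432/airflow
```

This is the same URI passed to create_engine in the user-creation script later in this article.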
For encrypted connection passwords (with the Local or Celery executor), every container must use the same fernet_key. By default, docker-airflow generates the fernet_key at startup, as already configured in docker-compose-LocalExecutor.yml; to keep the key identical across containers, set it as an environment variable in the docker-compose file (i.e. docker-compose-LocalExecutor.yml). To generate a fernet_key:
docker run airflow-image python -c "from cryptography.fernet import Fernet; FERNET_KEY = Fernet.generate_key().decode(); print(FERNET_KEY)"
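To pin the generated key across containers, it can be set as an environment variable in the compose file. A minimal sketch, assuming the FERNET_KEY variable name used by docker-airflow-style entrypoints (check the repository's entrypoint for the exact name):

```yaml
# Illustrative fragment; replace the placeholder with the key
# printed by the command above.
services:
  webserver:
    environment:
      - FERNET_KEY=<paste the generated key here>
```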
Authentication
By default, web authentication is enabled in the config file at config/airflow.cfg; to disable it, change the boolean value from "True" to "False". The configuration that enables it is shown below:
# Set to true to turn on authentication:
# https://airflow.apache.org/security.html#web-authentication
authenticate = True
auth_backend = airflow.contrib.auth.backends.password_auth
Follow the steps below to create users.
Get the container id:
docker container ls
Open a bash shell in the container using its id:
# With root user
docker exec -it -u root <container id> bash
From the airflow folder, start a Python console and execute the following script:
python
import airflow
from airflow import models, settings
from airflow.contrib.auth.backends.password_auth import PasswordUser
from sqlalchemy import create_engine

user = PasswordUser(models.User())
user.username = ''
user.email = ''
user.password = ''
# Make this value True if you want the user to be an administrator
user.superuser = False

engine = create_engine("postgresql://airflow:airflow@postgres:5432/airflow")
session = settings.Session(bind=engine)
session.add(user)
session.commit()
session.close()
exit()
Type exit and hit Enter to leave the container shell.