Elastic Beanstalk - python application server structure and celery installation
Elastic beanstalk python application is deployed under /opt/python/
.
The application is running under Apache web server.
Source folder structure is this:
bin
httpdlaunch - a tool script to set environment variables and launch httpd
bundle - dir with app source code, used during updates
current - symlink to the recent source code version under bundle
app - application sources
env - shell script with environment variables (passed from EB environment settings)
etc - supervisord config
log - supervisord logs
run - virtual environments
Apache logs, deployment logs and system messages log are under /var/log
.
Another important directory is /opt/elasticbeanstalk
- here there are EB tool scripts and app server restart hooks.
Apache is managed by supervisord, to check the status run this command:
sudo /usr/local/bin/supervisorctl -c /opt/python/etc/supervisord.conf status
And you can restart apache like this:
sudo /usr/local/bin/supervisorctl -c /opt/python/etc/supervisord.conf restart httpd
How to launch python application manually
I have a flask application and usually it is started as python application.py
.
To run it on the server instance you need to init virtual environment and set environment variables first:
source /opt/python/run/venv/bin/activate && source /opt/python/current/env
Now you can start the application manually:
cd /opt/python/current/app
python application.py
Or start flask shell with flask shell
.
How to install celery
Celery is a distributed task queue.
Our requirements are following:
- the celery application should be launched automatically when new version is deployed to Elastic Beanstalk
- the celery application should be watched by supervisord and restarted in the case of failure
The final project structure will be this:
.ebextensions/ - elastic beanstal configs
myapp.config - main config
deploy.sh - deployment script, launched from main config
utils.sh - utility functions for deployment script
files/
celeryd.conf - celery app config for supervisord
99_restart_services.sh - app server restart / reload hook
scripts/
celeryd - script to run celery application
celery.py - celery application
application.py - main application
requirements.txt - project requirements for pip
First, add the following line to the requirements.txt in the root folder of your project (these requirements will be installed with pip automatically):
...
Celery==3.1.17
Note that the version number can be different, 3.1.17 is just an actual version at the moment of writing.
Then create a main celery application file. In my case, this is celery.py
file in the root folder.
Here I don’t describe the celery application file content - do whatever you need from celery here.
Now modify the elasticbeanstalk config (.elasticbeanstak/myapp.config) to include the deployment script:
container_commands:
004-start-container-commands:
command: logger "Start deploy script" -t "DEPLOY"
005-command:
command: chmod +x .ebextensions/deploy.sh
006-deploy:
command: .ebextensions/deploy.sh 2>&1 | /usr/bin/logger -t "DEPLOY" ; test ${PIPESTATUS[0]} -eq 0
200-end-container-commands:
command: logger "End container commands" -t "DEPLOY"
Here we configure ElasticBeanstalk to launch the custom deployment script.
In general I find the approach with shell script for beanstalk configuration to be much more convenient than the standard way to put shell script code to the text config file. More details about this method can be found in the great Innocuous looking Evil Devil article. Check also related posts.
This deployment script .ebextensions/deploy.sh
:
#!/usr/bin/env bash
set -e
SCRIPT_PATH=`dirname $0`
source $SCRIPT_PATH/utils.sh
# Check for leader: see utils.sh
if is_leader; then
echo "Start leader deploy"
else
echo "Start non-leader deploy"
fi
# copy celery app config
copy_ext $SCRIPT_PATH/files/celeryd.conf /opt/python/etc/celeryd.conf 0755 root root
# copy restart hook to different hooks folders
copy_ext $SCRIPT_PATH/files/99_restart_services.sh /opt/elasticbeanstalk/hooks/appdeploy/enact/99_restart_services.sh 0755 root root
copy_ext $SCRIPT_PATH/files/99_restart_services.sh /opt/elasticbeanstalk/hooks/configdeploy/enact/99_restart_services.sh 0755 root root
copy_ext $SCRIPT_PATH/files/99_restart_services.sh /opt/elasticbeanstalk/hooks/restartappserver/enact/99_restart_services.sh 0755 root root
# include celeryd.conf into the supervisord.conf
script_add_line /opt/python/etc/supervisord.conf "include" "[include]"
script_add_line /opt/python/etc/supervisord.conf "celeryd.conf" "files=celeryd.conf "
# Reread the supervisord config
supervisorctl -c /opt/python/etc/supervisord.conf reread
# Update supervisord in cache without restarting all services
supervisorctl -c /opt/python/etc/supervisord.conf update
# Start/Restart celeryd through supervisord
supervisorctl -c /opt/python/etc/supervisord.conf restart celeryd
A small inconvenience with supervisord is that configuration for all apps managed by supervisord should be inside the main supervisord.conf file or additional configs should be included into it. So in any case we need to modify the main supervisord config. The script above does this by adding a following lines to it:
[include]
; it is possible to include multiple files, names should be separated by space
files=celeryd.conf
The celeryd.conf
is copied into the same folder where supervisord.conf resides.
The .ebextensions/utils.sh
script contains additional functions used by deployment script:
#!/usr/bin/env bash
set -e
SCRIPT_PATH=`dirname $0`
# An error exit function
error_exit() {
echo "$1" 1>&2
exit 1
}
# Copy + chmod + chown
# copy_ext source target 0755 user:group
copy_ext() {
#cp + chmod + chown
local source=$1
local target=$2
local permission=$3
local user=$4
local group=$5
if ! cp $source $target; then
error_exit "Can not copy ${source} to ${target}"
fi
if ! chmod -R $permission $target; then
error_exit "Can not do chmod ${permission} for ${target}"
fi
if ! chown $user:$group $target; then
error_exit "Can not do chown ${user}:${group} for ${target}"
fi
echo "cp_ext: ${source} -> ${target} chmod ${permission} & chown ${user}:${group}"
}
is_leader() {
# Check for leader: /opt/elasticbeanstalk/bin/leader-test.sh:
# use as
# if is_leader; then
# dosmth
# else
# doelse
# fi
if [[ "$EB_IS_COMMAND_LEADER" == "true" ]]; then
# to be used in if's, so '0' means true (like for script exit code - 0 is success)
#return 0
#more clear (true returns 0)
true
else
# to be used in if's, so '1' means false
#return 1
#more clear (false returns non zero)
false
fi
}
script_add_line() {
local target_file=$1
local check_text=$2
local add_text=$3
if grep -q "$check_text" "$target_file"
then
echo "Modification ${check_text} found in ${target_file}"
else
echo ${add_text} >> ${target_file}
echo "Modification ${add_text} added to ${target_file}"
fi
}
The celeryd.conf in .ebextensions/files/celeryd.conf
is a supervisord config:
[program:celeryd]
command=/opt/python/current/app/scripts/celeryd
directory=/opt/python/current/app
user=wsgi
numprocs=1
stdout_logfile=/opt/python/log/celery-worker.log
stderr_logfile=/opt/python/log/celery-worker.log
autostart=true
autorestart=true
startsecs=10
; Need to wait for currently executing tasks to finish at shutdown.
; Increase this if you have very long running tasks.
stopwaitsecs = 60
; When resorting to send SIGKILL to the program to terminate it
; send SIGKILL to its whole process group instead,
; taking care of its children as well.
killasgroup=true
; if rabbitmq is supervised, set its priority higher
; so it starts first
; priority=998
Restart hook .ebextensions/files/99_restart_services.sh
restarts celery application when application server is reloaded or restarted:
#!/bin/bash
set -xe
# check if we already have the celeryd service
/usr/bin/supervisorctl -c /opt/python/etc/supervisord.conf status | grep celeryd
if [[ $? ]]; then
/usr/bin/supervisorctl -c /opt/python/etc/supervisord.conf restart celeryd
fi
eventHelper.py --msg "Application server successfully restarted." --severity INFO
Finally the script under scripts/celeryd
is used to set environment variables, activate virtual environment and run celery application:
#!/bin/bash
source /opt/python/current/env
source /opt/python/run/venv/bin/activate
cd /opt/python/current/app
# Note: exec is important here - this way supervisord will control
# the python script and not the bash script
# See also: http://sortedaffairs.tumblr.com/post/49113594655/managing-virtualenv-apps-with-supervisor
#
exec /opt/python/run/venv/bin/celery worker -A tasks --loglevel=INFO
Actually we configured Elastic Beanstalk to run and manage an additional application. The same approach can be used for any kind of application, not necessary celery - it can be any other application even written in other than python language.
Bonus: how to install ZeroMQ libaray
To install ZeroMQ python library add it to requirements.txt:
Flask==0.9
boto==2.34.0
...
pyzmq==13.0.2
Installation process also includes a compilation phase and you can get an error like this:
gcc: error trying to exec 'cc1plus': execvp: No such file or directory
To solve this problem, add following packages into the elasticbeanstalk config:
packages:
yum:
gcc: []
gcc-c++: []