This repository was archived by the owner on Feb 10, 2019. It is now read-only.

Docker images for Open Source bigdata/hadoop projects

elek/bigdata-docker

Repository has been moved to https://flokkr.github.io

Docker images for Open Source bigdata/hadoop projects

This is the umbrella project for all of my Docker images for Apache Hadoop and other big data projects, both Apache and non-Apache.

The Docker images live in separate repositories:

name                         function
bigdata-base                 Special image containing the base packages and the config loading scripts
docker-hadoop                Apache Hadoop components (hdfs/yarn)
docker-spark                 Apache Spark
docker-zeppelin              Apache Zeppelin
docker-zookeeper             Apache ZooKeeper
docker-kafka                 Apache Kafka
docker-hbase                 Apache HBase
docker-phoenix               Apache Phoenix
docker-livy                  Cloudera Livy
docker-hive (experimental)   Apache Hive
docker-storm                 Apache Storm
krb5 (for development only)  MIT Kerberos server
docker-consul-composer       Special image to dynamically start compose containers based on docker-compose in a Consul server

Each Docker image contains the component extracted from its open source distribution, plus some advanced configuration loading mechanisms.

Currently there are two main configuration use cases:

  • For local docker-compose files we use the envtoconf utility, which converts environment variables to configuration files according to a naming convention (e.g. CORE-SITE.XML_fs.default.name="hdfs://localhost:9000" is converted to a well-formed Hadoop XML configuration).
  • For a multi-host environment we use the consul-launcher, which downloads the configuration from Consul and launches the specific starter. (It also listens for changes and restarts the process, similar to consul-template.)
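
To make the naming convention concrete, here is a minimal Python sketch of the kind of conversion described in the first item. It is not the envtoconf implementation: the function names `group_by_file` and `to_hadoop_xml` are hypothetical, and the rules used here (the file name is everything before the first underscore, only `.XML` prefixes are handled, and file names are lower-cased) are assumptions made for illustration.

```python
import os
from xml.sax.saxutils import escape


def group_by_file(environ):
    """Group FILE_key=value variables by target file name.

    Assumption: the file name is everything before the first underscore
    and must end in ".XML"; all other variables are ignored.
    """
    files = {}
    for name, value in environ.items():
        prefix, sep, key = name.partition("_")
        if not sep or not key or not prefix.upper().endswith(".XML"):
            continue
        # Assumption: the generated file name is the lower-cased prefix.
        files.setdefault(prefix.lower(), {})[key] = value
    return files


def to_hadoop_xml(properties):
    """Render a dict of properties as a Hadoop-style <configuration> document."""
    lines = ['<?xml version="1.0"?>', "<configuration>"]
    for key in sorted(properties):
        lines.append("  <property>")
        lines.append(f"    <name>{escape(key)}</name>")
        lines.append(f"    <value>{escape(properties[key])}</value>")
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the files that would be generated from the current environment.
    for filename, props in group_by_file(os.environ).items():
        print(f"# {filename}")
        print(to_hadoop_xml(props))
```

With the variable from the example above, `group_by_file` yields a single `core-site.xml` entry whose `fs.default.name` property is rendered as one `<property>` element.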

There is also a simple Python tool to upload configuration (the consul directory of this repository) to Consul. It is only required if Consul is used for configuration management.
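
As a rough illustration of what such an upload involves, the sketch below walks a directory and stores each file in Consul's key/value store through the HTTP endpoint `PUT /v1/kv/<key>`. It is not the tool shipped in this repository: the function names `plan_uploads` and `upload` are hypothetical, and both the key layout (the file path relative to the directory) and the agent address `http://localhost:8500` are assumptions.

```python
import os
import urllib.parse
import urllib.request


def plan_uploads(root):
    """Map every file under root to a (key, content) pair.

    Assumption: the KV key is the file's path relative to root,
    written with forward slashes.
    """
    uploads = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            key = os.path.relpath(path, root).replace(os.sep, "/")
            with open(path, "rb") as handle:
                uploads.append((key, handle.read()))
    return sorted(uploads)


def upload(root, consul_url="http://localhost:8500"):
    """PUT each planned entry to Consul's /v1/kv/<key> endpoint."""
    for key, content in plan_uploads(root):
        request = urllib.request.Request(
            f"{consul_url}/v1/kv/{urllib.parse.quote(key)}",
            data=content,
            method="PUT",
        )
        with urllib.request.urlopen(request) as response:
            response.read()
```

Keeping the planning step separate from the network call makes it possible to inspect which keys would be written before anything is sent to Consul.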

This repository contains example cluster configurations that use different ways to configure (environment variables, consul, spring config server) and provision (docker-compose, ansible, consul-composer) the products.

Examples

directory  configuration type     docker container starter          provisioning              cluster type   network (*)
simple     environment variables  docker-compose                                              local          Using host network
compose    environment variables  docker-compose                                              local          Using dedicated docker network
consul     consul                 consul-composer (docker-compose)  consul-compose (+ansible)  local/cluster  Using host network
ansible    environment variables  docker (ansible module)           ansible                   cluster        Using host network

(*) Host network is not a limitation; the examples just use this simplified approach.

Locally with host network and docker-compose

This is the simplest option. All of the applications use the host's network, so the default ports are available on localhost. See the simple subdirectory for the docker-compose file for this option.
