Sunday, 30 December 2018

How-to Series: CredHub Login Causes a "validity check failed" Error? Fix It in a Surprising Way!


Issue / Problem Statement

Today, while rebuilding my demo Concourse cluster, which has UAA (see my PR #85) + CredHub (see my PR #111) integrated, I encountered a weird error while trying to run my scripts:

$ cat > connect-concourse-credhub.sh <<EOF
source connect-bosh-credhub.sh

A="\$( credhub get -n /bosh-lite/concourse/credhub_admin_secret -j | jq -r .value )"
B="\$( credhub get -n /bosh-lite/concourse/atc_tls -j | jq -r .value.ca )"

export CREDHUB_SERVER=https://concourse.test:8844
export CREDHUB_CLIENT=credhub-admin
export CREDHUB_SECRET=\${A}
export CREDHUB_CA_CERT=\${B}

credhub login
EOF

The idea was very simple:

  1. Retrieve the credentials from BOSH CredHub;
  2. Export the CredHub environment variables and then log in


Surprisingly, I kept getting the below error while trying to execute the credhub login command:

invalid_token: An I/O error occurred while reading from the JWK Set source: sun.security.validator.ValidatorException: PKIX path validation failed: java.security.cert.CertPathValidatorException: validity check failed

The thing was, while checking UAA and CredHub, everything looked okay!

Resetting CredHub with `rm -rf ~/.credhub` and redeploying Concourse didn't help either.

Solution

I had to take another look at the error message: validity check failed.
What could cause a validity check failure? Too many possibilities!

But it was obvious that the error was NOT thrown at the backend, but at the frontend (the client side). And since it complained about the token, let's check the token first.

$ credhub --token

This returns the token that was used for the login.

Open the jwt.io website, paste the retrieved token into the "Encoded" box, and you will see something like this:

{
  "jti": "5697ecd271b042fcbc65831bd25264c9",
  "sub": "credhub-admin",
  "authorities": [
    "credhub.write",
    "credhub.read"
  ],
  "scope": [
    "credhub.write",
    "credhub.read"
  ],
  "client_id": "credhub-admin",
  "cid": "credhub-admin",
  "azp": "credhub-admin",
  "revocable": true,
  "grant_type": "client_credentials",
  "rev_sig": "6131f65e",
  "iat": 1546099456,
  "exp": 1546103056,
  "iss": "https://concourse.test:8443/oauth/token",
  "zid": "uaa",
  "aud": [
    "credhub-admin",
    "credhub"
  ]
}
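
If you'd rather not paste a token into a website, the payload can also be decoded locally. Below is a minimal sketch, assuming the token is the last field of the `credhub --token` output and that jq is available; the small awk step just restores the base64url padding:

$ credhub --token \
    | awk '{print $NF}' \
    | cut -d'.' -f2 \
    | tr '_-' '/+' \
    | awk '{m=length($0)%4; if (m==2) print $0"=="; else if (m==3) print $0"="; else print $0}' \
    | base64 -d \
    | jq .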

I checked the "exp" claim, which is the expiry time as a Unix epoch timestamp, and converted it at https://www.epochconverter.com/.

Which showed:

GMT: Saturday, December 29, 2018 5:04:16 PM
Your time zone: Sunday, December 30, 2018 1:04:16 AM GMT+08:00
Relative: A day ago

OMG, this just-issued token had expired a day ago!!!

Why? I checked the VM time -- yeah, it was obviously wrong.

So I changed it back to the right date/time:

# date +%Y%m%d -s "20181231"
# date +%T -s "13:00:00"

After restarting the processes, everything went well:

./connect-concourse-credhub.sh
Setting the target url: https://192.168.50.6:8844
Login Successful
Setting the target url: https://concourse.test:8844
Login Successful

Saturday, 10 November 2018

Using BOSH BootLoader (BBL) to Bootstrap Control Plane


Introduction

Managing PCF or other cloud platforms requires a solid Control Plane so that we can drive the platforms in an automated way.
There is a reference architecture for how to build a Control Plane for PCF. But frankly speaking, it's a great generic design for all BOSH-managed clusters.

What Components are in Control Plane?

If you ask me a question like that, I'd start with "it depends".

But it's common to have components like:

  • A Jumpbox where you can start things from;
  • A BOSH Director so that all Control Plane workloads can be managed by, and benefit from, BOSH's great capabilities;
  • Some Control Plane workloads, such as Concourse and Prometheus.

There are already BOSH releases available for all the Control Plane workloads mentioned above, so deploying them is just a simple `bosh deploy`.
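
For instance, deploying a workload like Concourse from its BOSH release boils down to something roughly like the sketch below; the manifest, ops file and variable names here are placeholders of mine, not the official ones:

$ bosh upload-stemcell <stemcell.tgz>
$ bosh -d concourse deploy concourse.yml \
    -o my-ops.yml \
    -v external_url=https://concourse.example.com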

Here I'd like to focus on how to bootstrap our Control Plane.

Ways to Bootstrap Control Plane


There are many ways to bootstrap our Control Plane, but below are two of the frequently used patterns:

  1. Manually provision a Jumpbox -> install the necessary tools on the Jumpbox -> use bosh create-env to create a BOSH Director -> use bosh deploy to deploy BOSH releases
  2. Use BOSH BootLoader's (bbl) bbl up to create a Jumpbox and a BOSH Director -> use bosh deploy to deploy BOSH releases

As you may have seen, using BBL can simplify the process of building our Control Plane. Furthermore, as it offers a series of out-of-the-box features, we can benefit a lot if we go this way.

Let's find out how.


The BBL Way


1. Prepare An "init.sh" File

It's a good practice to prepare a simple init.sh file to export the required parameters, even though you could simply run the export commands directly.

vSphere:

$ cat > init.sh <<EOF
export BBL_IAAS=vsphere
export BBL_VSPHERE_VCENTER_USER=
export BBL_VSPHERE_VCENTER_PASSWORD=
export BBL_VSPHERE_VCENTER_IP=
export BBL_VSPHERE_VCENTER_DC=
export BBL_VSPHERE_VCENTER_CLUSTER=
export BBL_VSPHERE_VCENTER_RP=
export BBL_VSPHERE_NETWORK=
export BBL_VSPHERE_VCENTER_DS=
export BBL_VSPHERE_SUBNET=
export BBL_VSPHERE_VCENTER_DISKS=
export BBL_VSPHERE_VCENTER_TEMPLATES=
export BBL_VSPHERE_VCENTER_VMS=
EOF

GCP:
cat > init.sh <<EOF
export BBL_IAAS=gcp
export BBL_GCP_REGION=
export BBL_GCP_SERVICE_ACCOUNT_KEY=
EOF


For other IaaS platforms, please refer to the BBL documentation.
Now we can simply source the file so that the required parameters are exposed for further processing.
$ source init.sh

2. Generate a Plan: bbl plan

It's another good practice to generate a plan before actually executing, because we may want to customize something.
$ bbl plan


This command will generate a series of files based on the default settings:
-rw-r--r--  1 user1 user1  587 Nov 10 20:47 bbl-state.json
drwxrwxr-x 19 user1 user1 4096 Nov 10 20:47 bosh-deployment
drwxrwxr-x  2 user1 user1 4096 Nov 10 20:47 cloud-config
-rwxr-x---  1 user1 user1  644 Nov 10 20:47 create-director.sh
-rwxr-x---  1 user1 user1  569 Nov 10 20:47 create-jumpbox.sh
-rwxr-x---  1 user1 user1  644 Nov 10 20:47 delete-director.sh
-rwxr-x---  1 user1 user1  569 Nov 10 20:47 delete-jumpbox.sh
-rwxrwxr-x  1 user1 user1  568 Nov  5 14:17 init.sh
drwxrwxr-x  7 user1 user1 4096 Nov 10 20:47 jumpbox-deployment
drwxrwxr-x  3 user1 user1 4096 Nov 10 20:47 terraform
drwxr-----  2 user1 user1 4096 Nov 10 20:47 vars



As you can see, it generates a couple of files and folders.

Among them, there are a few things to highlight here:
bosh-deployment               --- this is a copy of bosh-deployment
cloud-config                  --- there is a cloud config file which you can customize
create-director.sh            --- a script file which will be used to create director
create-jumpbox.sh             --- a script file which will be used to create jumpbox
delete-director.sh            --- a script file which will be used to delete director
delete-jumpbox.sh             --- a script file which will be used to delete jumpbox
init.sh                       --- our init file
jumpbox-deployment            --- this is a copy of jumpbox-deployment
terraform                     --- a folder which contains all terraform files
vars                          --- a folder which contains var files

It's NOT recommended to change these generated files directly if you want to customize things, as another bbl plan run will overwrite your customization.

There are some conventions for customizing the deployment. Below are the frequently used methods:
  • By adding a *-override.sh file for the generated scripts, e.g. create-jumpbox.sh or create-director.sh;
  • By adding a new YAML file as an ops file.

For example, if we want to customize the way the director is created later, say adding a custom DNS, we can do something like this:
$ cp create-director.sh create-director-override.sh
$ vi create-director-override.sh
#!/bin/sh
bosh create-env \
  ${BBL_STATE_DIR}/bosh-deployment/bosh.yml \
  --state  ${BBL_STATE_DIR}/vars/bosh-state.json \
  --vars-store  ${BBL_STATE_DIR}/vars/director-vars-store.yml \
  --vars-file  ${BBL_STATE_DIR}/vars/director-vars-file.yml \
  -o  ${BBL_STATE_DIR}/bosh-deployment/vsphere/cpi.yml \
  -o  ${BBL_STATE_DIR}/bosh-deployment/jumpbox-user.yml \
  -o  ${BBL_STATE_DIR}/bosh-deployment/uaa.yml \
  -o  ${BBL_STATE_DIR}/bosh-deployment/credhub.yml \
  -o  ${BBL_STATE_DIR}/bosh-deployment/vsphere/resource-pool.yml \
  -o  ${BBL_STATE_DIR}/bosh-deployment/misc/dns.yml \
  -v  vcenter_user="${BBL_VSPHERE_VCENTER_USER}" \
  -v  vcenter_password="${BBL_VSPHERE_VCENTER_PASSWORD}" \
  -v internal_dns=[10.193.239.2] \
  -v internal_cidr=10.193.239.0/24 \
  -v internal_ip=10.193.239.41

In this case, the bbl tool will use the -override.sh file instead of the originally generated file.


Or if you want to add another ops file to customize the default network:
$ cat > ops-cloud-config-network.yml <<EOF
- type: replace
  path: /networks/name=default/subnets/0/static?
  value: [10.193.239.50-10.193.239.70]

- type: replace
  path: /networks/name=default/subnets/0/reserved?
  value: [10.193.239.1-10.193.239.49]
EOF

$ cp ops-cloud-config-network.yml cloud-config/

In this case, the newly added file will be automatically picked up and merged as another ops file.
Of course, if you really want to update the cloud config further, you can do it anytime, even after the deployment.
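
For example, re-applying an updated cloud config later is just a regular BOSH command. A sketch, assuming the bbl-generated cloud-config folder contains a base cloud-config.yml, and reusing the ops file we just added:

$ eval "$(bbl print-env)"
$ bosh update-cloud-config cloud-config/cloud-config.yml \
    -o cloud-config/ops-cloud-config-network.yml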

3. Execute It: bbl up

Once you've done the customization, you can issue the bbl up command to execute it.

$ bbl up
step: terraform init
step: terraform apply
step: creating jumpbox
Deployment manifest: '/home/user1/bbl/jumpbox-deployment/jumpbox.yml'
Deployment state: '/home/user1/bbl/vars/jumpbox-state.json'

Started validating
  Downloading release 'os-conf'... Skipped [Found in local cache] (00:00:00)
  Validating release 'os-conf'... Finished (00:00:00)
  Downloading release 'bosh-vsphere-cpi'... Skipped [Found in local cache] (00:00:00)
  Validating release 'bosh-vsphere-cpi'... Finished (00:00:00)
  Validating cpi release... Finished (00:00:00)
  Validating deployment manifest... Finished (00:00:00)
  Downloading stemcell... Skipped [Found in local cache] (00:00:00)
  Validating stemcell... Finished (00:00:03)
Finished validating (00:00:03)

Started installing CPI
  Compiling package 'ruby-2.4-r3/8471dec5da9ecc321686b8990a5ad2cc84529254'... Finished (00:01:44)
  Compiling package 'vsphere_cpi/3049e51ead9d72268c1f6dfb5b471cbc7e2d6816'... Finished (00:00:50)
  Compiling package 'iso9660wrap/82cd03afdce1985db8c9d7dba5e5200bcc6b5aa8'... Finished (00:00:00)
  Installing packages... Finished (00:00:00)
  Rendering job templates... Finished (00:00:00)
  Installing job 'vsphere_cpi'... Finished (00:00:00)
Finished installing CPI (00:02:35)

Starting registry... Finished (00:00:00)
Uploading stemcell 'bosh-vsphere-esxi-ubuntu-trusty-go_agent/3468.17'... Finished (00:00:30)

Started deploying
  Creating VM for instance 'jumpbox/0' from stemcell 'sc-4227b41a-f52a-4192-bfce-02f7cf802067'... Finished (00:00:21)
  Waiting for the agent on VM 'vm-3f9dccaf-a8e5-4214-bc6e-8413c6ff4dfb' to be ready... Finished (00:00:18)
  Rendering job templates... Finished (00:00:00)
  Updating instance 'jumpbox/0'... Finished (00:00:11)
  Waiting for instance 'jumpbox/0' to be running... Finished (00:00:00)
  Running the post-start scripts 'jumpbox/0'... Finished (00:00:01)
Finished deploying (00:00:57)

Stopping registry... Finished (00:00:00)
Cleaning up rendered CPI jobs... Finished (00:00:00)

Succeeded
step: created jumpbox
step: creating bosh director
Deployment manifest: '/home/user1/bbl/bosh-deployment/bosh.yml'
Deployment state: '/home/user1/bbl/vars/bosh-state.json'

Started validating
  Downloading release 'bosh'... Skipped [Found in local cache] (00:00:00)
  Validating release 'bosh'... Finished (00:00:00)
  Downloading release 'bpm'... Skipped [Found in local cache] (00:00:00)
  Validating release 'bpm'... Finished (00:00:01)
  Downloading release 'bosh-vsphere-cpi'... Skipped [Found in local cache] (00:00:00)
  Validating release 'bosh-vsphere-cpi'... Finished (00:00:00)
  Downloading release 'os-conf'... Skipped [Found in local cache] (00:00:00)
  Validating release 'os-conf'... Finished (00:00:00)
  Downloading release 'uaa'... Skipped [Found in local cache] (00:00:00)
  Validating release 'uaa'... Finished (00:00:03)
  Downloading release 'credhub'... Skipped [Found in local cache] (00:00:00)
  Validating release 'credhub'... Finished (00:00:01)
  Validating cpi release... Finished (00:00:00)
  Validating deployment manifest... Finished (00:00:00)
  Downloading stemcell... Skipped [Found in local cache] (00:00:00)
  Validating stemcell... Finished (00:00:05)
Finished validating (00:00:32)

Started installing CPI
  Compiling package 'ruby-2.4-r4/0cdc60ed7fdb326e605479e9275346200af30a25'... Finished (00:01:46)
  Compiling package 'iso9660wrap/82cd03afdce1985db8c9d7dba5e5200bcc6b5aa8'... Finished (00:00:00)
  Compiling package 'vsphere_cpi/e1a84e5bd82eb1abfe9088a2d547e2cecf6cf315'... Finished (00:00:52)
  Installing packages... Finished (00:00:00)
  Rendering job templates... Finished (00:00:00)
  Installing job 'vsphere_cpi'... Finished (00:00:00)
Finished installing CPI (00:02:40)

Starting registry... Finished (00:00:00)
Uploading stemcell 'bosh-vsphere-esxi-ubuntu-xenial-go_agent/97.12'... Finished (00:00:40)

Started deploying
  Creating VM for instance 'bosh/0' from stemcell 'sc-dea8d9a0-e423-4b92-8cc3-b72dae27f65e'... Finished (00:00:27)
  Waiting for the agent on VM 'vm-c5b55b02-91ca-4986-9eab-558fc35245f7' to be ready... Finished (00:00:15)
  Creating disk... Finished (00:00:06)
  Attaching disk 'disk-2afadc6c-5d44-4681-8fea-8b1b2ceefe78' to VM 'vm-c5b55b02-91ca-4986-9eab-558fc35245f7'... Finished (00:00:22)
  Rendering job templates... Finished (00:00:09)
  Compiling package 'ruby-2.4-r4/0cdc60ed7fdb326e605479e9275346200af30a25'... Skipped [Package already compiled] (00:00:00)
  Compiling package 'openjdk_1.8.0/4d45452ce6bd79122873640ac63cae4d9b419ed4'... Skipped [Package already compiled] (00:00:00)
  Compiling package 'bpm-runc/c0b41921c5063378870a7c8867c6dc1aa84e7d85'... Skipped [Package already compiled] (00:00:00)
  Compiling package 'golang/27413c6b5a88ea20a24a9eed74d4b090b7b88331'... Skipped [Package already compiled] (00:00:01)
  Compiling package 'golang-1.9-linux/8d6c67abda8684ce454f0bc74050a213456573ff'... Skipped [Package already compiled] (00:00:01)
  Compiling package 'mysql/898f50dde093c366a644964ccb308a5281c226de'... Skipped [Package already compiled] (00:00:00)
  Compiling package 'libpq/e2414662250d0498c194c688679661e09ffaa66e'... Skipped [Package already compiled] (00:00:00)
  Compiling package 'ruby-2.4-r4/0cdc60ed7fdb326e605479e9275346200af30a25'... Finished (00:01:51)
  Compiling package 'health_monitor/2ea21f1adae7dd864b38dff926675ba4fca89ef0'... Skipped [Package already compiled] (00:00:00)
  Compiling package 'verify_multidigest/8fc5d654cebad7725c34bb08b3f60b912db7094a'... Skipped [Package already compiled] (00:00:00)
  Compiling package 'davcli/f8a86e0b88dd22cb03dec04e42bdca86b07f79c3'... Skipped [Package already compiled] (00:00:00)
  Compiling package 'lunaclient/b922e045db5246ec742f0c4d1496844942d6167a'... Skipped [Package already compiled] (00:00:00)
  Compiling package 'bosh-gcscli/fce60f2d82653ea7e08c768f077c9c4a738d0c39'... Skipped [Package already compiled] (00:00:00)
  Compiling package 'credhub/62f912abb406d6d9b49393be629713fd407328c7'... Skipped [Package already compiled] (00:00:00)
  Compiling package 'nginx/5a68865452a3bdcc233867edbbb59c1e18658f6b'... Skipped [Package already compiled] (00:00:00)
  Compiling package 'gonats/73ec55f11c24dd7c02288cdffa24446023678cc2'... Skipped [Package already compiled] (00:00:00)
  Compiling package 'bpm/d139b63561eaa3893976416be9668dea539bf17d'... Skipped [Package already compiled] (00:00:00)
  Compiling package 'configurator/d19e331ac9c867c132d19426007802f86070526a'... Skipped [Package already compiled] (00:00:00)
  Compiling package 'uaa/87da0e8d38c63e84fda7069ce77285419399623d'... Skipped [Package already compiled] (00:00:01)
  Compiling package 's3cli/3097f27cb9356172c9ae52de945821c4e338c87a'... Skipped [Package already compiled] (00:00:00)
  Compiling package 'uaa_utils/90097ea98715a560867052a2ff0916ec3460aabb'... Skipped [Package already compiled] (00:00:00)
  Compiling package 'postgres-9.4/52b3a31d7b0282d342aa7a0d62d8b419358c6b6b'... Skipped [Package already compiled] (00:00:00)
  Compiling package 'iso9660wrap/82cd03afdce1985db8c9d7dba5e5200bcc6b5aa8'... Finished (00:00:01)
  Compiling package 'director/81a742bcbbb4f6eabea846365b9fd491d8d2fff8'... Skipped [Package already compiled] (00:00:00)
  Compiling package 'vsphere_cpi/e1a84e5bd82eb1abfe9088a2d547e2cecf6cf315'... Finished (00:00:53)
  Updating instance 'bosh/0'... Finished (00:01:00)
  Waiting for instance 'bosh/0' to be running... Finished (00:01:38)
  Running the post-start scripts 'bosh/0'... Finished (00:00:02)
Finished deploying (00:06:57)

Stopping registry... Finished (00:00:00)
Cleaning up rendered CPI jobs... Finished (00:00:00)

Succeeded
step: created bosh director
step: generating cloud config
step: applying cloud config
step: applying runtime config

As a result, the below items will be ready and deployed:
  1. A BOSH Director
  2. A Jumpbox VM
  3. A set of randomly generated BOSH director credentials
  4. A generated keypair allowing you to SSH into the BOSH Director and any instances BOSH deploys
  5. A copy of the manifest the BOSH Director was deployed with
  6. A basic cloud config

4. Verify the deployment

The easiest way to verify the environment may be as below:

$ eval "$(bbl print-env)"
$ bosh vms
Using environment 'https://10.193.239.41:25555' as client 'admin'

Succeeded

Yes, the BOSH Director is ready to rock!

What else? Well, you can do some interesting things, like SSH into the Jumpbox VM:
$ bbl ssh --jumpbox
...

jumpbox/0:~$


Or SSH into the BOSH Director VM for troubleshooting purposes:
$ bbl ssh --director
...

bosh/0:~$ sudo su -
bosh/0:~# monit summary
The Monit daemon 5.2.5 uptime: 10m

Process 'nats'                      running
Process 'postgres'                  running
Process 'blobstore_nginx'           running
Process 'director'                  running
Process 'worker_1'                  running
Process 'worker_2'                  running
Process 'worker_3'                  running
Process 'worker_4'                  running
Process 'director_scheduler'        running
Process 'director_sync_dns'         running
Process 'director_nginx'            running
Process 'health_monitor'            running
Process 'uaa'                       running
Process 'credhub'                   running
System 'system_localhost'           running

What's next? Having a BOSH Director is really a great start for deploying cool software, like Concourse, Prometheus, Minio, MySQL, Kafka or whatever.

Meanwhile, you can explore BBL further by visiting its GitHub repo, here.

Enjoy!

Saturday, 27 October 2018

How-to Series: How to `bosh scp` files into BOSH-managed VMs' Folders


Issue / Problem Statement


If we bosh scp directly to folders under /var/vcap/jobs/xxx/config, we will encounter a Permission denied error like:

$ bosh -e lite -d concourse4 scp concourse.yml web:/var/vcap/jobs/uaa/config/

Using environment '192.168.50.6' as client 'admin'

Using deployment 'concourse4'

Task 3687. Done
web/4de05ffc-ae08-4dfa-a52a-a751e62b624f: stderr | Unauthorized use is strictly prohibited. All access and activity
web/4de05ffc-ae08-4dfa-a52a-a751e62b624f: stderr | is subject to logging and monitoring.
web/4de05ffc-ae08-4dfa-a52a-a751e62b624f: stderr | scp: /var/vcap/jobs/uaa/config//concourse.yml: Permission denied

Interestingly, if we bosh scp to some folders like `/tmp`, it's fine:

$ bosh -e lite -d concourse4 scp concourse.yml web:/tmp/
Using environment '192.168.50.6' as client 'admin'

Using deployment 'concourse4'

Task 3689. Done
web/4de05ffc-ae08-4dfa-a52a-a751e62b624f: stderr | Unauthorized use is strictly prohibited. All access and activity
web/4de05ffc-ae08-4dfa-a52a-a751e62b624f: stderr | is subject to logging and monitoring.

Let's verify it:

$ bosh -e lite -d concourse4 ssh web -c 'ls -la /tmp/'
Using environment '192.168.50.6' as client 'admin'

Using deployment 'concourse4'

Task 3693. Done
web/4de05ffc-ae08-4dfa-a52a-a751e62b624f: stderr | Unauthorized use is strictly prohibited. All access and activity
web/4de05ffc-ae08-4dfa-a52a-a751e62b624f: stderr | is subject to logging and monitoring.
web/4de05ffc-ae08-4dfa-a52a-a751e62b624f: stdout | total 2128
web/4de05ffc-ae08-4dfa-a52a-a751e62b624f: stdout | drwxrwx--T  4 root                 vcap                    4096 Oct 27 13:27 .
web/4de05ffc-ae08-4dfa-a52a-a751e62b624f: stdout | drwxr-xr-x 34 root                 root                    4096 Oct 25 08:55 ..
web/4de05ffc-ae08-4dfa-a52a-a751e62b624f: stdout | -rw-r--r--  1 root                 root                  233394 Oct 25 08:57 ca-certificates.crt
web/4de05ffc-ae08-4dfa-a52a-a751e62b624f: stdout | -rw-r--r--  1 bosh_99ef981c59fa482 bosh_99ef981c59fa482    2496 Oct 27 13:27 concourse.yml
web/4de05ffc-ae08-4dfa-a52a-a751e62b624f: stdout | -rwxr-xr-x  1 root                 root                 1923450 Aug 24 05:46 garden-init
web/4de05ffc-ae08-4dfa-a52a-a751e62b624f: stdout | drwxr-xr-x  2 root                 root                    4096 Oct 25 08:57 hsperfdata_root
web/4de05ffc-ae08-4dfa-a52a-a751e62b624f: stdout | drwxr-x---  2 vcap                 vcap                    4096 Oct 25 08:57 hsperfdata_vcap
web/4de05ffc-ae08-4dfa-a52a-a751e62b624f: stderr | Connection to 10.244.0.104 closed.


Yes, we can see that the file concourse.yml has been successfully copied into our BOSH-managed VM node web.
Now, if we really want to scp file(s) to the right place under /var/vcap/jobs/xxx/config, how do we do it?

Solution


As we've seen, we can copy files over to /tmp, so let's wrap it up with an extra `bosh ssh` step.

$ bosh -e lite -d concourse4 scp concourse.yml web:/tmp/
$ bosh -e lite -d concourse4 ssh web -c 'sudo cp /tmp/concourse.yml /var/vcap/jobs/uaa/config/'
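
If you do this often, the two steps can be wrapped into a small helper script. This is just a sketch of mine, not an official tool; the script name and arguments are made up:

$ cat > bosh-scp-sudo.sh <<'EOF'
#!/bin/bash
# Copy a local file into a privileged folder on a BOSH-managed VM:
# scp to /tmp first, then sudo-copy it into place.
# Usage: ./bosh-scp-sudo.sh <env> <deployment> <instance> <local-file> <remote-dir>
set -euo pipefail

ENV="$1"; DEPLOYMENT="$2"; INSTANCE="$3"; SRC="$4"; DEST_DIR="$5"
TMP_PATH="/tmp/$(basename "${SRC}")"

# Step 1: copy to a world-writable location
bosh -e "${ENV}" -d "${DEPLOYMENT}" scp "${SRC}" "${INSTANCE}:${TMP_PATH}"

# Step 2: copy it into the target folder with sudo, then clean up
bosh -e "${ENV}" -d "${DEPLOYMENT}" ssh "${INSTANCE}" \
  -c "sudo cp ${TMP_PATH} ${DEST_DIR}/ && rm -f ${TMP_PATH}"
EOF
$ chmod +x bosh-scp-sudo.sh
$ ./bosh-scp-sudo.sh lite concourse4 web concourse.yml /var/vcap/jobs/uaa/config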

Let's verify it again:

$ bosh -e lite -d concourse4 ssh web -c 'sudo ls -al /var/vcap/jobs/uaa/config/'
Using environment '192.168.50.6' as client 'admin'

Using deployment 'concourse4'

Task 3699. Done
web/4de05ffc-ae08-4dfa-a52a-a751e62b624f: stderr | Unauthorized use is strictly prohibited. All access and activity
web/4de05ffc-ae08-4dfa-a52a-a751e62b624f: stderr | is subject to logging and monitoring.
web/4de05ffc-ae08-4dfa-a52a-a751e62b624f: stdout | total 60
web/4de05ffc-ae08-4dfa-a52a-a751e62b624f: stdout | drwxr-x--- 3 root vcap  4096 Oct 27 13:34 .
web/4de05ffc-ae08-4dfa-a52a-a751e62b624f: stdout | drwxr-x--- 5 root vcap  4096 Oct 25 08:56 ..
web/4de05ffc-ae08-4dfa-a52a-a751e62b624f: stdout | -rw-r----- 1 root vcap   420 Oct 25 08:56 bpm.yml
web/4de05ffc-ae08-4dfa-a52a-a751e62b624f: stdout | -rw-r--r-- 1 root root  2496 Oct 27 13:34 concourse.yml
...
web/4de05ffc-ae08-4dfa-a52a-a751e62b624f: stderr | Connection to 10.244.0.104 closed.

Yes, we made it! The file has been copied into the desired folder, /var/vcap/jobs/uaa/config/.

I also posted it in my gist, here.



References:

1. https://bosh.io/docs/cli-v2/#scp

Tuesday, 23 October 2018

How-to Series: How to Quickly Setup OpenLDAP for Testing Purposes


Issue / Problem Statements


Like it or not, LDAP still plays a heavy role in most organizations as the user store and authentication authority.
Setting up a proper LDAP server may require some LDAP knowledge and experience, so being able to set one up in a quick (say, <10 minutes) and easy way becomes an obvious requirement.

Solution:


Highlights:


  • Use Docker
  • OpenLDAP
  • With sample data (which can be customized of course) pre-loaded
  • Provide sample `ldapsearch` queries

Steps:


1. Prepare users and group data, the *.ldif files

$ mkdir testdata

$ cat testdata/1-users.ldif
dn: ou=people,dc=bright,dc=com
ou: people
description: All people in organisation
objectclass: organizationalunit

# admin1
dn: cn=admin1,ou=people,dc=bright,dc=com
objectClass: inetOrgPerson
sn: admin1
cn: admin1
uid: admin1
mail: admin1@bright.com
# secret
userPassword: {SSHA}RRN6AM9u0tpTEOn6oBcIt9X3BbFPKVk5

# admin2
dn: cn=admin2,ou=people,dc=bright,dc=com
objectClass: inetOrgPerson
sn: admin2
cn: admin2
uid: admin2
mail: admin2@bright.com
# secret
userPassword: {SSHA}RRN6AM9u0tpTEOn6oBcIt9X3BbFPKVk5

# developer
dn: cn=developer1,ou=people,dc=bright,dc=com
objectClass: inetOrgPerson
sn: developer1
cn: developer1
mail: developer1@bright.com
userPassword: {SSHA}RRN6AM9u0tpTEOn6oBcIt9X3BbFPKVk5

$ cat testdata/2-groups.ldif
dn: ou=groups,dc=bright,dc=com
objectClass: organizationalUnit
ou: groups

dn: cn=admins,ou=groups,dc=bright,dc=com
objectClass: groupOfNames
cn: admins
member: cn=admin1,ou=people,dc=bright,dc=com
member: cn=admin2,ou=people,dc=bright,dc=com

dn: cn=developers,ou=groups,dc=bright,dc=com
objectClass: groupOfNames
cn: developers
member: cn=admin1,ou=people,dc=bright,dc=com
member: cn=developer1,ou=people,dc=bright,dc=com
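
A side note on the userPassword values above: each {SSHA} hash corresponds to the plain-text password "secret". If you want different passwords, one way to generate the hashes is with slappasswd (a sketch, assuming the OpenLDAP server utilities are installed locally, or run it inside the container):

$ slappasswd -h '{SSHA}' -s 'my-new-password'
{SSHA}...   # paste this output into the userPassword attribute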


2. Start It Up

$ docker run --name my-openldap-container \
    --env LDAP_ORGANISATION="Bright Inc" \
    --env LDAP_DOMAIN="bright.com" \
    --env LDAP_ADMIN_PASSWORD="secret" \
    --volume "$(pwd)"/testdata:/container/service/slapd/assets/config/bootstrap/ldif/custom \
    -p 10389:389 \
    --detach \
    osixia/openldap:1.2.2 --copy-service --loglevel debug

Now the LDAP service is exposed from container port 389 to local port 10389.

Note: to clean it up, use the command below:
$ docker stop $(docker ps -aqf "name=my-openldap-container") && \
  docker rm $(docker ps -aqf "name=my-openldap-container")

3. Test It Out

$ ldapsearch -LLL -x \
    -H ldap://localhost:10389 \
    -D "cn=admin,dc=bright,dc=com" -w secret \
    -b 'dc=bright,dc=com' \
    dn
dn: dc=bright,dc=com
dn: cn=admin,dc=bright,dc=com
dn: ou=people,dc=bright,dc=com
dn: cn=admin1,ou=people,dc=bright,dc=com
dn: cn=admin2,ou=people,dc=bright,dc=com
dn: cn=developer1,ou=people,dc=bright,dc=com
dn: ou=groups,dc=bright,dc=com
dn: cn=admins,ou=groups,dc=bright,dc=com
dn: cn=developers,ou=groups,dc=bright,dc=com

$ ldapsearch -LLL -x \
    -H ldap://localhost:10389 \
    -D "cn=admin,dc=bright,dc=com" -w secret \
    -b 'dc=bright,dc=com' \
    '(&(objectClass=groupOfNames)(member=cn=admin2,ou=people,dc=bright,dc=com))' \
    cn
dn: cn=admins,ou=groups,dc=bright,dc=com
cn: admins

That's it. Hope it helps!

Thursday, 11 October 2018

How-to Series: How To Correctly Enable Self-signed Cert for PCF - NSX-T Integration


Issue Description:

While trying to enable NSX-T for Pivotal Cloud Foundry (PCF), OpsMan complains about:

Error connecting to NSX IP: The NSX server's certificate is not signed with the provided NSX CA cert. Please provide the correct NSX CA cert', type: IaasConfigurationVerifier




Issue Analysis:

As the connectivity between PCF and the NSX-T Manager is secured by SSL, one has to configure the CA certificate, in PEM format, that authenticates the connection to the NSX-T Manager.
Refer to the screenshot below for the configuration requirements.


Now the issue is how to get the right CA cert and what the right way to handle this situation should be.

Solution:

If you're using a simple self-signed cert, you can probably try the "quick" solution below -- simply use the openssl tool to fetch the cert, as suggested in the doc here:

openssl s_client -showcerts \
  -connect NSX-MANAGER-ADDRESS:443 < /dev/null 2> /dev/null | openssl x509

That way you can simply get the cert and paste it into the configuration.

But sometimes it won't work, so I'd always recommend a "better" way; the steps are as below:

1. Generate NSX-T CSR

Log on to the NSX Manager and navigate to System|Trust, click CSRs tab and then “Generate CSR”, populate the certificate request and click Save.
Select the new CSR and click Actions|Download CSR PEM to save the exported CSR in PEM format.


2. Ask Your CA to Sign This CSR

Submit the CSR file to your CA to get it signed and get back the new certificate.

Some organizations have a CA chain and use an intermediate CA to sign the actual CSR. In this case, please remember to make it a full cert chain by concatenating the certs (simply copy & paste them one after another in a good text editor like Visual Studio Code, or see the one-liner after this list) in this order:
- Newly signed NSX Certificate
- Intermediate CA
- Root CA
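
On the command line, the concatenation is just a cat (the file names here are placeholders for your own cert files):

$ cat nsx-signed.crt intermediate-ca.crt root-ca.crt > nsx-full-chain.crt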

But sometimes there is no official internal CA. In that case, I'd always recommend generating one so that you have better control and an easier trust chain when handling big distributed systems like PCF.
I compiled a simple two-step process for doing that:
- Gist of generate-internal-ca.md
- Gist of sign-certs-with-internal-ca.md


3. Configure NSX Manager

Now, assuming our CSR has already been signed, we have the concatenated certs in hand.

In NSX Manager, select the CSRs tab and click Actions | Import Certificate for CSR.
In the window, paste in the concatenated certificates from above and click save.

Now you’ll have a new certificate and CA certs listed under Certificates tab.


4. Apply the Certs

Under the Certificates tab, copy the full ID of the newly added certificate (not the CA).
Note: the UI only shows a portion of the ID by default; click it to display the full ID and copy it to the clipboard.

Launch RESTClient in Firefox or Advanced REST client in Chrome.
Make sure the request has the below configuration:
- URL: https://<NSX Manager IP or FQDN>/api/v1/node/services/http?action=apply_certificate&certificate_id=<certificate ID copied in previous step>
- Authentication: Basic Authentication, enter the NSX Manager credentials for Username and Password
- Method: POST

Click the SEND button and make sure that the response status code is 200.
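
If you'd rather skip the browser extension, the same call can be made with curl (a sketch; -k is only tolerable here because we're in the middle of replacing the self-signed cert):

$ curl -k -u 'admin:<password>' -X POST \
    "https://<NSX Manager IP or FQDN>/api/v1/node/services/http?action=apply_certificate&certificate_id=<certificate ID>"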

Refresh the browser session to the NSX Manager GUI to confirm the new certificate is in use.


Ref: 
1. https://brianragazzi.wordpress.com/2018/03/08/replacing-the-self-signed-certificate-on-nsx-t/

Monday, 18 September 2017

Deploy Vault, the great credential management tool, by BOSH

Yes, We Need a Credential Management Tool

Credential management is critical in the microservices world.
Within a microservices architecture, there are a series of services/apps deployed into our environment. To provide a service (let's use MySQL as an example) to identified applications, creating credentials is mandatory.
Now, how do we hand over the credentials to these identified applications? How do we make sure the credentials won't be leaked to developers/operators?

Yes, in short, we need credential management tools like Vault.

Since BOSH is a great platform for deploying and managing software clusters, it's a good idea to let Vault work with BOSH.

Preparation & Requisites

1. Prepare Infra

The process below is based on Google Cloud Platform (GCP).
But, as BOSH does a great job of abstracting the IaaS, changing to another IaaS, for example AWS or Azure, is a trivial thing to handle.

What we need to prepare are:

  • VPC with subnet(s). In my example, my VPC is as below
    • VPC Network: bosh-sandbox
    • Subnet: bosh-releases
    • Region: us-central1
    • IP address range: 10.0.100.0/22
  • A Service Account with the necessary Roles (see the gcloud sketch after this list), for example:
    • App Engine Admin
    • Compute Instance Admin (v1)
    • Compute Network Admin
    • Compute Storage Admin
  • A Ubuntu Jumpbox (within the same VPC)
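
Creating the service account and its key can be done in the GCP console, or roughly like this with the gcloud CLI (a sketch; the account name and key file name are placeholders of mine, and the roles mirror the bullets above):

$ gcloud iam service-accounts create bosh-sandbox-sa --display-name "BOSH sandbox"
# repeat for each role listed above (appengine.appAdmin, compute.instanceAdmin.v1,
# compute.networkAdmin, compute.storageAdmin)
$ gcloud projects add-iam-policy-binding <PROJECT ID> \
    --member "serviceAccount:bosh-sandbox-sa@<PROJECT ID>.iam.gserviceaccount.com" \
    --role "roles/compute.instanceAdmin.v1"
$ gcloud iam service-accounts keys create gcp-credentials.json \
    --iam-account "bosh-sandbox-sa@<PROJECT ID>.iam.gserviceaccount.com"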

2. Create BOSH Director

$ git clone https://github.com/cloudfoundry/bosh-deployment
$ bosh create-env bosh-deployment/bosh.yml \
    --state=state.json \
    --vars-store=creds.yml \
    -o bosh-deployment/gcp/cpi.yml \
    -v director_name=bosh-gcp \
    -v internal_cidr=10.0.100.0/22 \
    -v internal_gw=10.0.100.1 \
    -v internal_ip=10.0.100.6 \
    --var-file gcp_credentials_json=<CREDENTIAL JSON FILE> \
    -v project_id=<PROJECT ID> \
    -v zone=us-central1-a \
    -v tags=[internal,bosh] \
    -v network=bosh-sandbox \
    -v subnetwork=bosh-releases

3. Alias BOSH Env

$ bosh alias-env sandbox -e 10.0.100.6 --ca-cert <(bosh int ./creds.yml --path /director_ssl/ca)

4. Export & Login to the Director

$ export BOSH_CLIENT=admin && export BOSH_CLIENT_SECRET=`bosh int ./creds.yml --path /admin_password`

5. Prepare & Update Cloud Config

$ bosh -e sandbox update-cloud-config cloud-config-gcp.yml
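
The cloud-config-gcp.yml itself isn't shown in this post. A minimal sketch that would line up with the names used in the deploy step further down (AZ z1, vm_type small, network network-z1-only) could look roughly like this; the machine type, reserved/static ranges and tags are assumptions to adjust for your own VPC:

$ cat > cloud-config-gcp.yml <<EOF
azs:
- name: z1
  cloud_properties: {zone: us-central1-a}

vm_types:
- name: small
  cloud_properties:
    machine_type: n1-standard-1
    root_disk_size_gb: 20
    root_disk_type: pd-standard

networks:
- name: network-z1-only
  type: manual
  subnets:
  - range: 10.0.100.0/22
    gateway: 10.0.100.1
    azs: [z1]
    reserved: [10.0.100.1-10.0.100.9]
    static: [10.0.100.10-10.0.100.20]
    cloud_properties:
      network_name: bosh-sandbox
      subnetwork_name: bosh-releases
      tags: [internal]

compilation:
  workers: 2
  reuse_compilation_vms: true
  az: z1
  vm_type: small
  network: network-z1-only
EOF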

6. Upload Stemcells

$ bosh -e sandbox upload-stemcell stemcells/light-bosh-stemcell-3421.11-google-kvm-ubuntu-trusty-go_agent.tgz

Prepare Manifest File

$ vi vault.yml

With below content:
---
name: vault

instance_groups:
- instances: 1
  name: vault
  networks: [
    {
      name: ((VAULT_NW_NAME)), 
      static_ips: ((VAULT_STATIC_IP))
    }
  ]
  persistent_disk: 4096
  properties:
    vault:
      backend:
        use_file: true
      ha:
        redirect: null
  vm_type: ((VAULT_VM_TYPE))
  stemcell: trusty
  azs: [((VAULT_AZ_NAME))]
  jobs:
  - name: vault
    release: vault

releases:
- name: vault
  version: 0.6.2
  url: https://bosh.io/d/github.com/cloudfoundry-community/vault-boshrelease?v=0.6.2
  sha1: 36fd3294f756372ff9fbbd6dfac11fe6030d02f9

stemcells:
- alias: trusty
  os: ubuntu-trusty
  version: latest

update:
  canaries: 1
  canary_watch_time: 1000-30000
  max_in_flight: 50
  serial: false
  update_watch_time: 1000-30000

Deploy Vault

$ bosh -e sandbox -d vault deploy vault.yml \
    -v VAULT_NW_NAME=network-z1-only \
    -v VAULT_STATIC_IP=10.0.100.10 \
    -v VAULT_VM_TYPE=small \
    -v VAULT_AZ_NAME=z1

Once it's done, we can verify:
$ bosh -e sandbox -d vault vms
Using environment '10.0.100.6' as client 'admin'

Task 118. Done

Deployment 'vault'

Instance                                    Process State  AZ  IPs          VM CID                                   VM Type
vault/8753e1bb-e486-44a2-8bba-25fa4c2278b4  running        z1  10.0.100.10  vm-86c20491-ebdc-4948-631f-e0ac733d0690  small

1 vms

Succeeded

Post Actions

After the deployment is done, enable Vault by following the steps below.

1. Prepare Tools

$ wget https://releases.hashicorp.com/vault/0.8.2/vault_0.8.2_linux_amd64.zip
$ unzip vault_0.8.2_linux_amd64.zip
$ chmod +x vault && sudo mv vault /usr/local/bin/
$ vault -v
    Vault v0.8.2 ('9afe7330e06e486ee326621624f2077d88bc9511')

2. Unseal Vault

$ export VAULT_ADDR=http://10.0.100.10:8200
$ vault init
    Unseal Key 1: xxx
    Unseal Key 2: xxx
    Unseal Key 3: xxx
    Unseal Key 4: xxx
    Unseal Key 5: xxx
    Initial Root Token: xxx

    Vault initialized with 5 keys and a key threshold of 3. Please
    securely distribute the above keys. When the vault is re-sealed,
    restarted, or stopped, you must provide at least 3 of these keys
    to unseal it again.

    Vault does not store the master key. Without at least 3 keys,
    your vault will remain permanently sealed.
$ vault unseal
    Key (will be hidden):
    Sealed: true
    Key Shares: 5
    Key Threshold: 3
    Unseal Progress: 1
    Unseal Nonce: 1ffc83e5-2e5f-1038-4e68-0223f1544746
$ vault unseal
    Key (will be hidden):
    Sealed: true
    Key Shares: 5
    Key Threshold: 3
    Unseal Progress: 2
    Unseal Nonce: 1ffc83e5-2e5f-1038-4e68-0223f1544746
$ vault unseal
    Key (will be hidden):
    Sealed: false
    Key Shares: 5
    Key Threshold: 3
    Unseal Progress: 0
    Unseal Nonce:
$ vault auth
    Token (will be hidden):
    Successfully authenticated! You are now logged in.
    token: 5e3f7eba-2e27-fc74-7c55-f6084bad4b00
    token_duration: 0
    token_policies: [root]

3. Try Putting Values

Now, you can put secrets in the vault, and read them back out. For example:

$ vault write secret/test mykey=myvalue
    Success! Data written to: secret/test
$ vault read secret/test
    Key              Value
    ---              -----
    refresh_interval 768h0m0s
    mykey            myvalue
$ vault delete secret/test
    Success! Deleted 'secret/test' if it existed.

Yeah! Vault, managed by BOSH, is ready to rock.

Note: this is just the basic setup; please refer to vault-boshrelease for more advanced topics like HA, zero-downtime upgrades, etc.




Thursday, 20 October 2016

Integrated Continuous Deployment (CD) Solution with Mesos, Zookeeper, Marathon on top of CI

Overview

After developing our RESTful webservices (see post here) and completing our CI exercises (see post here), we now eventually reach the Continuous Deployment (CD) part.
Obviously, we're going to experiment with a Docker-based CD process.

The below components will be engaged to build up our integrated CD solution:

  • The Docker image we built and published previously in our private Docker Registry;
  • Apache Mesos: the distributed cluster manager for our CPU, memory, storage, and other compute resources within our clusters;
  • Apache ZooKeeper: a highly reliable distributed coordination tool for the Apache Mesos cluster;
  • Mesosphere Marathon: the container orchestration platform for Apache Mesos.

What we’re going to focus on in this chapter is highlighted below:



Update docker-compose.yml By Adding CD Components

Having set up the infrastructure for CI previously, it’s pretty easy to add more components to our Docker Compose configuration file for our CD solution, which includes:
  • ZooKeeper
  • Mesos: Master, Slave
  • Marathon

ZooKeeper

~/docker-compose.yml
…
# Zookeeper
zookeeper:
  image: jplock/zookeeper:3.4.5
  ports:
    - "2181"

Mesos: mesos-master, mesos-slave

~/docker-compose.yml
…
# Mesos Master
mesos-master:
  image: mesosphere/mesos-master:0.28.1-2.0.20.ubuntu1404
  hostname: mesos-master
  links:
    - zookeeper:zookeeper
  environment:
    - MESOS_ZK=zk://zookeeper:2181/mesos
    - MESOS_QUORUM=1
    - MESOS_WORK_DIR=/var/lib/mesos
    - MESOS_LOG_DIR=/var/log
  ports:
    - "5050:5050"

# Mesos Slave
mesos-slave:
  image: mesosphere/mesos-slave:0.28.1-2.0.20.ubuntu1404
  links:
    - zookeeper:zookeeper
    - mesos-master:mesos-master
  environment:
    - MESOS_MASTER=zk://zookeeper:2181/mesos
    - MESOS_EXECUTOR_REGISTRATION_TIMEOUT=5mins
    - MESOS_CONTAINERIZERS=docker,mesos
    - MESOS_ISOLATOR=cgroups/cpu,cgroups/mem
    - MESOS_LOG_DIR=/var/log
  volumes:
    - /var/run/docker.sock:/run/docker.sock
    - /usr/bin/docker:/usr/bin/docker

    - /sys:/sys:ro
    - mesosslace-stuff:/var/log

    - /lib/x86_64-linux-gnu/libsystemd-journal.so.0:/lib/x86_64-linux-gnu/libsystemd-journal.so.0
    - /usr/lib/x86_64-linux-gnu/libapparmor.so.1:/usr/lib/x86_64-linux-gnu/libapparmor.so.1
    - /lib/x86_64-linux-gnu/libcgmanager.so.0:/lib/x86_64-linux-gnu/libcgmanager.so.0
    - /lib/x86_64-linux-gnu/libnih.so.1:/lib/x86_64-linux-gnu/libnih.so.1
    - /lib/x86_64-linux-gnu/libnih-dbus.so.1:/lib/x86_64-linux-gnu/libnih-dbus.so.1
    - /lib/x86_64-linux-gnu/libgcrypt.so.11:/lib/x86_64-linux-gnu/libgcrypt.so.11
  expose:
    - "5051"

Marathon

~/docker-compose.yml
…
# Marathon
marathon:
  image: mesosphere/marathon
  links:
    - zookeeper:zookeeper
  ports:
    - "8080:8080"
  command: --master zk://zookeeper:2181/mesos --zk zk://zookeeper:2181/marathon

The Updated docker-compose.yml

Now the final updated ~/docker-compose.yml is as below:
# Git Server
gogs:
  image: 'gogs/gogs:latest'
  #restart: always
  hostname: '192.168.56.118'
  ports:
    - '3000:3000'
    - '1022:22'
  volumes:
    - '/data/gogs:/data'

# Jenkins CI Server
jenkins:
  image: devops/jenkinsci
#  image: jenkinsci/jenkins
#  links:
#    - marathon:marathon
  volumes:
    - /data/jenkins:/var/jenkins_home

    - /opt/jdk/java_home:/opt/jdk/java_home
    - /opt/maven:/opt/maven
    - /data/mvn_repo:/data/mvn_repo

    - /var/run/docker.sock:/var/run/docker.sock
    - /usr/bin/docker:/usr/bin/docker
    - /etc/default/docker:/etc/default/docker

    - /lib/x86_64-linux-gnu/libsystemd-journal.so.0:/lib/x86_64-linux-gnu/libsystemd-journal.so.0
    - /usr/lib/x86_64-linux-gnu/libapparmor.so.1:/usr/lib/x86_64-linux-gnu/libapparmor.so.1
    - /lib/x86_64-linux-gnu/libcgmanager.so.0:/lib/x86_64-linux-gnu/libcgmanager.so.0
    - /lib/x86_64-linux-gnu/libnih.so.1:/lib/x86_64-linux-gnu/libnih.so.1
    - /lib/x86_64-linux-gnu/libnih-dbus.so.1:/lib/x86_64-linux-gnu/libnih-dbus.so.1
    - /lib/x86_64-linux-gnu/libgcrypt.so.11:/lib/x86_64-linux-gnu/libgcrypt.so.11
  ports:
    - "8081:8080"

# Private Docker Registry
docker-registry:
  image: registry
  volumes:
    - /data/registry:/var/lib/registry
  ports:
    - "5000:5000"

# Zookeeper
zookeeper:
  image: jplock/zookeeper:3.4.5
  ports:
    - "2181"

# Mesos Master
mesos-master:
  image: mesosphere/mesos-master:0.28.1-2.0.20.ubuntu1404
  hostname: mesos-master
  links:
    - zookeeper:zookeeper
  environment:
    - MESOS_ZK=zk://zookeeper:2181/mesos
    - MESOS_QUORUM=1
    - MESOS_WORK_DIR=/var/lib/mesos
    - MESOS_LOG_DIR=/var/log
  ports:
    - "5050:5050"

# Mesos Slave
mesos-slave:
  image: mesosphere/mesos-slave:0.28.1-2.0.20.ubuntu1404
  links:
    - zookeeper:zookeeper
    - mesos-master:mesos-master
  environment:
    - MESOS_MASTER=zk://zookeeper:2181/mesos
    - MESOS_EXECUTOR_REGISTRATION_TIMEOUT=5mins
    - MESOS_CONTAINERIZERS=docker,mesos
    - MESOS_ISOLATOR=cgroups/cpu,cgroups/mem
    - MESOS_LOG_DIR=/var/log
  volumes:
    - /var/run/docker.sock:/run/docker.sock
    - /usr/bin/docker:/usr/bin/docker

    - /sys:/sys:ro
    - mesosslace-stuff:/var/log

    - /lib/x86_64-linux-gnu/libsystemd-journal.so.0:/lib/x86_64-linux-gnu/libsystemd-journal.so.0
    - /usr/lib/x86_64-linux-gnu/libapparmor.so.1:/usr/lib/x86_64-linux-gnu/libapparmor.so.1
    - /lib/x86_64-linux-gnu/libcgmanager.so.0:/lib/x86_64-linux-gnu/libcgmanager.so.0
    - /lib/x86_64-linux-gnu/libnih.so.1:/lib/x86_64-linux-gnu/libnih.so.1
    - /lib/x86_64-linux-gnu/libnih-dbus.so.1:/lib/x86_64-linux-gnu/libnih-dbus.so.1
    - /lib/x86_64-linux-gnu/libgcrypt.so.11:/lib/x86_64-linux-gnu/libgcrypt.so.11
  expose:
    - "5051"

# Marathon
marathon:
  image: mesosphere/marathon
  links:
    - zookeeper:zookeeper
  ports:
    - "8080:8080"
  command: --master zk://zookeeper:2181/mesos --zk zk://zookeeper:2181/marathon

Spin Up the Docker Containers

$ docker-compose up

Verify Our Components

Jenkins: http://192.168.56.118:8081/
Marathon: http://192.168.56.118:8080/

Deployment By Using Mesos + Marathon

Marathon provides RESTful APIs for managing the applications we're going to load onto the platform, such as deploying an application, deleting an application, etc.
So the idea we're going to illustrate is to:
  1. Prepare the payload, i.e. the Marathon deployment file;
  2. Wrap the interactions with Marathon in simple deployment scripts which can eventually be "plugged" into Jenkins as one more build step to kick off the CD pipeline.
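
As a quick taste of that API (a sketch, assuming jq is installed on the host), listing the currently deployed applications is a single call:

$ curl -s http://192.168.56.118:8080/v2/apps | jq '.apps[].id'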

Marathon Deployment File

Firstly, we need to create the Marathon deployment file, which is in JSON format, for scheduling our application on Marathon.
Let's name it marathon-app-springboot-jersey-swagger.json and put it under /data/jenkins/workspace/springboot-jersey-swagger (which will eventually be mapped to the Jenkins container's path "/var/jenkins_home/workspace/springboot-jersey-swagger", or "${WORKSPACE}/", so that it's reachable in Jenkins).
$ sudo touch /data/jenkins/workspace/springboot-jersey-swagger/marathon-app-springboot-jersey-swagger.json
$ sudo vim /data/jenkins/workspace/springboot-jersey-swagger/marathon-app-springboot-jersey-swagger.json

/data/jenkins/workspace/springboot-jersey-swagger/marathon-app-springboot-jersey-swagger.json
{
    "id": "springboot-jersey-swagger", 
    "container": {
      "docker": {
        "image": "192.168.56.118:5000/devops/springboot-jersey-swagger:latest",
        "network": "BRIDGE",
        "portMappings": [
          {"containerPort": 8000, "servicePort": 8000}
        ]
      }
    },
    "cpus": 0.2,
    "mem": 256.0,
    "instances": 1
}

Deployment Scripts

Once we have the Marathon deployment file, we can compile the deployment script that will remove the currently running application and redeploy it using the new image.
There are better upgrade strategies available out of the box, but I won't discuss them here.
Let's put the script under /data/jenkins/workspace/springboot-jersey-swagger (which will eventually be mapped into the Jenkins container so that it's reachable in Jenkins) and don't forget to grant it execution permission.
$ sudo touch /data/jenkins/workspace/springboot-jersey-swagger/deploy-app-springboot-jersey-swagger.sh
$ sudo chmod +x /data/jenkins/workspace/springboot-jersey-swagger/deploy-app-springboot-jersey-swagger.sh
$ sudo vim /data/jenkins/workspace/springboot-jersey-swagger/deploy-app-springboot-jersey-swagger.sh

#!/bin/bash

# We can make it much more flexible if we expose some of below as parameters
APP="springboot-jersey-swagger"
MARATHON="192.168.56.118:8080"
PAYLOAD="${WORKSPACE}/marathon-app-springboot-jersey-swagger.json"
CONTENT_TYPE="Content-Type: application/json"

# Delete the old application
# Note: again, this can be enhanced for a better upgrade strategy
curl -X DELETE -H "${CONTENT_TYPE}" http://${MARATHON}/v2/apps/${APP}

# Wait for a while
sleep 2

# Post the application to Marathon
curl -X POST -H "${CONTENT_TYPE}" http://${MARATHON}/v2/apps -d@${PAYLOAD}

Plug It In By Adding One More Build Step In Jenkins

Simply add one more build step after the Maven step.
It's an "Execute Shell" build step and the commands are really simple:
cd ${WORKSPACE}
./deploy-app-springboot-jersey-swagger.sh
Save it.

Trigger the Process Again

By default, any code check-in will automatically trigger the build and deployment process.
But for testing purposes, we can click "Build Now" to trigger it manually.
The Console Output is really verbose but it can help you understand the whole process. Check it out here.

Once the process is done in Jenkins (we can monitor the progress by viewing the Console Output), we can see our application appear in Marathon's "Deployments" tab and then quickly move to the "Applications" tab, which means the deployment is done.
Note: if an application keeps pending in deployment because it is "waiting for resource offer", it's most likely because there aren't sufficient resources (e.g. memory) available in Mesos. In this case, we may need to add more Mesos Slaves or fine-tune the JSON deployment file to request fewer resources.



One may notice that our application shows up as "Unknown". Why? It's because we haven't set up a health check mechanism yet, so Marathon doesn't really know whether the application is healthy or not.

Verify Our App

Once the application is deployed by Mesos + Marathon, new Docker container(s) will be spun up automatically.
But how can we access our application deployed by Mesos + Marathon? The port configuration part of Marathon may be a bit confusing and we will figure it out later. For now, we have to check which port our app is exposed on, using the command below:
$ docker ps
CONTAINER ID        IMAGE                                                         COMMAND                  CREATED             STATUS              PORTS                                          NAMES
a020940ae0b5        192.168.56.118:5000/devops/springboot-jersey-swagger:latest   "java -jar /springboo"   8 minutes ago       Up 8 minutes        0.0.0.0:31489->8000/tcp                        mesos-e51e55d3-cd64-4369-948f-aaa433f12c1e-S0.b0e0d351-4da0-4af5-be1b-5e7dab876d06
...

As highlighted above, the port exposed on the host is 31489, so our application URL will be:
http://{the docker host ip}:31489, or http://192.168.56.118:31489 in my case.
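
Alternatively, Marathon itself can tell us the assigned host port(s) through its tasks API (a sketch, assuming jq is available on the host):

$ curl -s http://192.168.56.118:8080/v2/apps/springboot-jersey-swagger/tasks \
    | jq '.tasks[] | {host: .host, ports: .ports}'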

Yes, once you reach here, our application's continuous deployment is now part of our integrated CI + CD pipeline. We did it!

Conclusion

Now it's time to recap our DevOps story.
I've updated the diagram a bit to better illustrate the process and the components we've engaged.

The Development As Usual

Firstly, as usual, we develop our RESTful webservices following a Microservices architecture with components like Java, Spring Boot, Jersey, Swagger, Maven and more.
Our source code is hosted by Gogs, which is our private Git Repository.
We develop, do unit test, check in code -- everything is just as usual.

Related posts:

The Continuous Integration (CI) Comes To Help

Jenkins, acting as our CI Server, comes to help: it detects changes in our Git Repository and triggers our integrated CI + CD pipeline.
For the CI part, Jenkins engages Maven to run through the tests, both unit tests and SIT if any, and to build our typical Spring Boot artifact, an all-in-one fat Jar file. Meanwhile, with the Docker Maven plugin, docker-maven-plugin, Maven can take one step further and help build our desired Docker image where our application Jar file is hosted.
The Docker image is then published to our private Docker Registry, and our CI part ends.

Related posts:

The Final Show in Continuous Deployment (CD)

Apache Mesos, together with Marathon and ZooKeeper, forms the CD part.
The trigger point is still in Jenkins: by adding one more build step, once the CI part is done, Jenkins executes our shell script where the interactions with Marathon are wrapped.
By posting a predefined JSON payload, the deployment file, to Marathon, Marathon deploys our application to the Mesos platform. As a result, Docker container(s) will be spun up automatically with specific resources (e.g. CPU, RAM) allocated for our application.
Accordingly, the web services offered by our application are up and running.
That's our whole story of "from code to online services".

Future Improvements?

Yes, of course. After all, it's still a DevOps prototype.
Along the way, for simplicity, there are some topics or concerns that haven't been properly addressed yet, which include but are not limited to:
  • Using multi-server cluster to make it "real";
  • Introducing security and ACL to bring in necessary control;
  • Applying HAProxy setup for the service clustering;
  • Enabling auto scaling;
  • Adding health check mechanism for our application into Marathon;
  • Fine-tuning the application upgrade strategy;
  • Engaging Consul for automatic service discovery;
  • Having a centralized monitoring mechanism;
  • Trying it out with AWS or Azure;
  • And more

Final Words

Continuous Delivery (= Continuous Integration + Continuous Deployment) is a big topic and has various solutions in the market now, be they commercial or open source.
I started working on this set-up mainly to get my hands dirty and also to help developers and administrators learn how to use a series of open source components to build up a streamlined CI + CD pipeline, so that we can sense the beauty of continuous delivery.
As mentioned in the previous chapter, "Future Improvements", a lot more is still left to explore. But that's also why architects like me love the way IT rocks, isn't it?

Enjoy DevOps and appreciate your comments if any!