Most of us are familiar with the usual hierarchical organisation of an IT department, where an individual developer sits at the bottom of the totem pole and has to escalate upwards to get any issue resolved. Something as simple as creating a Dev environment to test one’s work has to go through many layers of managerial approval and, once approved, requires liaising with an infrastructure team to actually create the environment. These processes are born out of the desire to take away the gun before someone shoots themselves in the foot. While it’s true that security is paramount to any organisation, we also have to take into account that artificial restrictions which rely on other humans to enforce them cause a significant dip in productivity while one individual waits on others for approval or access to tools.
In a more modern ‘digital’ age where everything is automated, why is it that these processes are still so heavily reliant on humans? This reliance introduces far too many inefficiencies and stymies the smooth operation of the development organisation. Think of it as an engine that cannot operate on all cylinders. Some of those cylinders are waiting for fuel or the spark plug or another resource and are unable to perform up to their full potential.
Now imagine an organisation where a developer can spin up, say, a staging environment on their own, based on pre-approved privileges, in a matter of minutes rather than waiting days or even weeks. This sort of self-service, without multiple layers of approval and without relying on a separate infrastructure team to create the resources, will not only empower the dev team and drastically improve its productivity, but also make for a very scalable DevOps experience: a smaller team of DevOps engineers will be able to run the infrastructure because they will not be frequently called away to attend to mundane tickets asking them to create environments for every developer.
This may seem to be the holy grail for organisations embarking on a DevOps journey on the public cloud. However, a process like this is readily achievable. This relates to the work Automation Logic has been doing for the past few months on creating a modular and easily deployable cloud platform. I will not delve into too much detail about the cloud platform itself, as that is not the intention of this blog post. Instead, I will talk about how governance is implemented in the automated processes that surround the AWS version of the cloud platform.
AWS Service Catalog is a management tool provided by AWS that allows the creation of a portfolio of product templates and enables role-based access to these products. In simpler terms, we can create a set of instructions and then decide who is able to execute them and how. These instructions are stored as AWS CloudFormation templates, and we use AWS IAM roles and groups to determine user access rights. While Service Catalog is an oft-overlooked tool in Amazon’s toolbox, it would hardly be worth writing a blog post about it without there being a big reveal.
The real innovation born out of the efforts here at Automation Logic is that we have automated the Service Catalog itself. Allow me to present a future for DevOps on AWS that is not only completely feasible but also instinctively feels like a massive improvement on the processes and systems of the present. We are already at the point where infrastructure as code in the form of templates is being used more and more. The solution described here takes that a step further by automating the release of those templates and using them to create a one-click deploy solution, while providing role-based access control over who can use these resources.
We start with the template itself. This piece of infrastructure, represented as code, is authored by the infrastructure / DevOps team. The template is then pushed to a source control repository which is accessible only by the DevOps team, to prevent unauthorised changes to the templates. In our example, we use AWS CodeCommit (easily replaced by GitHub) as the source control repository.
The next step is to build and then invoke a specialised AWS Lambda function whose purpose is to create or synchronise the Service Catalog. This is achieved by using AWS CodePipeline in conjunction with AWS CodeBuild. These two tools are part of the Continuous Integration / Continuous Delivery tooling provided by AWS and, as such, can be replaced by equivalent tools such as Jenkins. Once triggered, the Lambda function analyses a mappings file, also present in the repository, that describes the path to each template file and how it should be treated by Service Catalog. The Lambda function then creates the template as a product within a portfolio on the Service Catalog and adds constraints on the use of the product. If the product already exists and changes to its template are detected, the Lambda function creates a new version of the product instead.
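The decision the Lambda function makes for each entry in the mappings file can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the mapping fields (`template`, `product`, `portfolio`), the `plan_sync` helper and the use of a content hash to detect changes are all assumptions made for the example. A real implementation would turn each planned action into the corresponding Service Catalog API calls.

```python
import hashlib
import json

# Hypothetical mappings file content. The field names are assumptions made
# for illustration, not the format used by the actual project.
MAPPINGS_JSON = """
[
  {"template": "templates/java-platform.yaml",
   "product": "java-platform", "portfolio": "platforms"},
  {"template": "templates/nodejs-platform.yaml",
   "product": "nodejs-platform", "portfolio": "platforms"}
]
"""


def template_digest(body: str) -> str:
    """Return a stable fingerprint of a template body, used to detect changes."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def plan_sync(mappings, template_bodies, catalog_state):
    """Decide what to do with each mapped template.

    mappings:        list of dicts parsed from the mappings file
    template_bodies: template path -> template text from the repository
    catalog_state:   product name -> digest of its latest published version
    Returns a list of (action, product) tuples in mapping order.
    """
    actions = []
    for entry in mappings:
        product = entry["product"]
        digest = template_digest(template_bodies[entry["template"]])
        if product not in catalog_state:
            # Product is unknown to the catalog: create it from scratch.
            actions.append(("create_product", product))
        elif catalog_state[product] != digest:
            # Product exists but its template changed: publish a new version.
            actions.append(("create_new_version", product))
        else:
            actions.append(("no_change", product))
    return actions


if __name__ == "__main__":
    mappings = json.loads(MAPPINGS_JSON)
    bodies = {
        "templates/java-platform.yaml": "java template, revision 2",
        "templates/nodejs-platform.yaml": "nodejs template, revision 1",
    }
    # Pretend the Java product was published earlier from an older revision.
    state = {"java-platform": template_digest("java template, revision 1")}
    for action, product in plan_sync(mappings, bodies, state):
        print(action, product)
```

Keeping the planning step separate from the API calls makes the logic easy to test without an AWS account, which is why the sketch stops at a list of actions.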
On the other end, a developer who needs to create an environment can open the Service Catalog interface and will only see the products (templates) they have access to. They can then launch a product with what is close to a one-click deployment. The products they have access to are pre-determined and pre-approved, so the developer benefits from the efficiency of self-service rather than waiting for managerial approval. The beauty of this system is that the developer does not need permission to create any of the underlying resources in the template, only permission to use the template itself. In simpler terms, a developer within your organisation can be given access only to use the Service Catalog and launch a product within it, without being given access to launch EC2 instances, even if the product’s template creates EC2 resources. This has a significant effect on the overall security posture of an organisation. By limiting users to Service Catalog and withholding access to the underlying resources of a template, we can prevent accidents and misuse of resources.
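To make the separation of permissions concrete, here is a rough sketch of what a developer's IAM policy might look like, together with a deliberately simplified check. The policy contents and the `is_allowed` helper are assumptions for illustration. Real IAM evaluation also considers resources, conditions and other policy types, and a real end-user policy may need additional permissions. The point is only that the developer is allowed to provision products while `ec2:RunInstances` is never granted. In Service Catalog the resources themselves are typically created by a role attached to the product as a launch constraint.

```python
from fnmatch import fnmatchcase

# Hypothetical developer policy: Service Catalog end-user actions only.
# Note that no EC2 permissions appear anywhere in it.
DEVELOPER_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "servicecatalog:SearchProducts",
                "servicecatalog:DescribeProduct",
                "servicecatalog:ListLaunchPaths",
                "servicecatalog:ProvisionProduct",
            ],
            "Resource": "*",
        }
    ],
}


def is_allowed(policy, action):
    """Very simplified policy check, for illustration only.

    An action is allowed when at least one Allow statement matches it and
    no Deny statement does. Resources and conditions are ignored.
    """
    allowed = False
    for statement in policy["Statement"]:
        patterns = statement["Action"]
        if isinstance(patterns, str):
            patterns = [patterns]
        # IAM action names are case-insensitive and may contain wildcards.
        matches = any(
            fnmatchcase(action.lower(), pattern.lower()) for pattern in patterns
        )
        if matches and statement["Effect"] == "Deny":
            return False
        if matches and statement["Effect"] == "Allow":
            allowed = True
    return allowed


if __name__ == "__main__":
    for action in ("servicecatalog:ProvisionProduct", "ec2:RunInstances"):
        print(action, "->", "allowed" if is_allowed(DEVELOPER_POLICY, action) else "denied")
```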
An additional improvement is the ability to constrain the use of a product. The same template may be used to create a product, and two users with different permissions may both be given access to it. However, we can add constraints to the product which restrict what each of them is able to do with it.
Let us consider the example of two templates in Service Catalog: a Java platform template and a Node.js platform template. A developer from the Java team is given access to the Java template via Service Catalog and can only run it in ‘Dev Mode’ through the use of parameters in the template, which may, for example, restrict them to t2.micro (lower capability virtual machine) instances. This developer from the Java team will not be able to see or launch the Node.js template within the Service Catalog, and vice versa. Now consider an infrastructure team member who is responsible for production environments for both the Java and Node.js platforms. This user can be given access to view and launch both templates and also to run them in ‘Production Mode’, which does not restrict them to lower capability resources. We can therefore use permissions and constraints to control access to resources with very high granularity.
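A ‘Dev Mode’ restriction like the one above could be expressed as a Service Catalog template constraint, which is written as a set of rules with assertions over the template's parameters. The sketch below shows one such rule and a tiny local checker. The parameter name `InstanceType`, the rule name and the `violations` helper are assumptions for illustration, and the checker only understands the single `Fn::Contains` form used here.

```python
# Hypothetical template constraint for 'Dev Mode'. It assumes the product's
# template exposes a parameter called InstanceType.
DEV_MODE_CONSTRAINT = {
    "Rules": {
        "DevModeInstanceType": {
            "Assertions": [
                {
                    "Assert": {
                        "Fn::Contains": [["t2.micro"], {"Ref": "InstanceType"}]
                    },
                    "AssertDescription": "Dev Mode only allows t2.micro instances",
                }
            ]
        }
    }
}


def violations(constraint, parameters):
    """Return the descriptions of all assertions the parameters fail.

    Only handles assertions of the form
    {"Fn::Contains": [[allowed values], {"Ref": "ParameterName"}]},
    which is enough for this example.
    """
    failed = []
    for rule in constraint["Rules"].values():
        for assertion in rule["Assertions"]:
            allowed_values, reference = assertion["Assert"]["Fn::Contains"]
            if parameters.get(reference["Ref"]) not in allowed_values:
                failed.append(assertion["AssertDescription"])
    return failed


if __name__ == "__main__":
    print(violations(DEV_MODE_CONSTRAINT, {"InstanceType": "t2.micro"}))
    print(violations(DEV_MODE_CONSTRAINT, {"InstanceType": "m5.large"}))
```

A ‘Production Mode’ user would simply be given access through a portfolio that does not attach this constraint, or that attaches one with a wider list of allowed values.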
We have described a solution that automates AWS Service Catalog and empowers individual users with a self-service system, while maintaining a high degree of security and control over privileges to prevent misuse. This system significantly improves the efficiency of the development team and is very scalable thanks to the automation of key organisational processes.
This solution is available under an open source Apache license and can be accessed here: https://bitbucket.org/automationlogic/aws-cloud-platform/overview
I encourage you to take a look and, if you are interested, to contribute back to the project.