Recently, a colleague and I started looking into Azure Policy and how to manage Azure policies as code. With this approach you define and assign policies using code rather than by clicking through the Azure portal. I’m writing this post to give a glimpse of where we are today and what we have learned so far. This article includes some experimental research, so please feel free to reach out with your feedback.
Why write policy as code?
Writing policy as code has many advantages, so I’ll mention the ones I like the most. Personally, I find version control the most appealing. Knowing who changed what, and what those changes were, can be a big deal when detecting security incidents. More importantly, it makes peer review of new code possible, which is a great way to prevent damaging changes of different kinds. If a breaking change is introduced anyway, version control lets you quickly roll back to a known good state. Peer review also lets you introduce separation of duties, because you can enforce that the person committing code (containing new policies) cannot start the deployment pipeline. See? So many benefits!
In addition, changes to our code can be tested as part of our integration pipelines. Doing so can significantly reduce the risk of critical errors in production environments.
Different alternatives
There are multiple ways to manage Azure policies with code. For example, Microsoft recently published the article “Azure Enterprise Policy as Code – A New Approach”, describing a framework that combines policies and configuration files written in JSON with deployment scripts in PowerShell. The article argues against using Terraform for managing Azure policies, but I guess the mix of my curiosity and stubbornness caused me to try anyway. It also helped that many commenters on that article said they had successfully used Terraform to achieve the same result as the authors described.
It should be mentioned that we took a lot of inspiration from Microsoft’s article and framework. I am planning to test the framework in the upcoming weeks, and will make sure to write a short article on it.
Managing Azure policies with Terraform
The Terraform Azure provider has resources for defining policies and policy sets and assigning them to different scopes in Azure. However, using these resources to define policies means you must declare Terraform variables for every key-value pair you want your policy to include. While it may seem natural that it is set up this way, I prefer to be able to define policies in JSON. Policies written by others are usually written in JSON, and having to reformat them every time gets cumbersome after a while. The image below shows how you define policies with the Azure Terraform provider.
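As a rough sketch of what defining a policy directly with the provider can look like, here is a custom policy using the azurerm_policy_definition resource. The resource label, policy name, display name, parameter name, and category are all illustrative assumptions, not taken from our repository. Note how the rule, parameters, and metadata that you would normally find as JSON have to be re-expressed in HCL.

```hcl
# Illustrative sketch: every name and value below is an assumption.
resource "azurerm_policy_definition" "whitelist_regions" {
  name         = "whitelist_regions"
  policy_type  = "Custom"
  mode         = "Indexed"
  display_name = "Allow resources only in whitelisted regions"

  # Metadata, parameters, and the rule are passed as JSON strings,
  # so the original JSON has to be rewritten as HCL objects (or pasted
  # into heredoc strings).
  metadata = jsonencode({
    category = "General"
  })

  parameters = jsonencode({
    listOfRegionsAllowed = {
      type = "Array"
      metadata = {
        displayName = "Allowed regions"
        description = "Regions where resources may be deployed."
      }
    }
  })

  policy_rule = jsonencode({
    "if" = {
      "not" = {
        field = "location"
        "in"  = "[parameters('listOfRegionsAllowed')]"
      }
    }
    "then" = {
      effect = "deny"
    }
  })
}
```

This works, but each policy copied from elsewhere needs the same manual translation, which is the friction described above.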
Luckily, we found a set of Terraform modules by Sadik Tekin and Chris O’Malley that make defining policies in JSON possible. Using these modules, you create a folder for each policy category and then create a JSON file for every policy. For example, to define an Azure policy to whitelist regions, your Terraform code would look like this:
For this code to work, a “policies” folder must exist in your repository. This folder contains a subfolder for each policy category. In this case the category is “General”, in which you would create the “whitelist_regions.json” file defining the policy using JSON. So the folder structure looks like this:
policies → General → whitelist_regions.json
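The original snippet is not reproduced here, so the following is a minimal sketch rather than our exact code. It assumes the definition submodule of the policy-as-code module published in the Terraform Registry, with input names as I recall them from the module's README; they may differ between module versions, so check the documentation of the version you pin. The management group name is a hypothetical placeholder.

```hcl
# Hypothetical management group name; replace with your own.
data "azurerm_management_group" "org" {
  name = "example_management_group"
}

# Assumed module source and input names; verify against the module README.
module "whitelist_regions" {
  source              = "gettek/policy-as-code/azurerm//modules/definition"
  policy_name         = "whitelist_regions" # expected to match the JSON file name
  display_name        = "Allow resources only in whitelisted regions"
  policy_category     = "General"           # expected to match the folder name
  management_group_id = data.azurerm_management_group.org.id
}
```

The appeal of this design is that the Terraform code only carries a few identifying values, while the policy rule itself stays in a JSON file that can be copied from other sources unchanged.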
Testing and deploying
We created a dev and a test subscription so that policies can be written and tested before they are deployed to production. When a developer writes new policies, they can test them directly from their local machine against the dev subscription. When a policy is ready for production, the developer creates a pull request, which triggers the test pipeline on approval. If the test pipeline runs successfully, the production pipeline may be started manually. Currently, our pipelines don’t do any fancy testing, but hopefully we can do some cool work here in the future. The image below describes our current setup.
At the moment, we are using Terraform workspaces to separate our environments. The advantage of workspaces is that we can easily assign different configuration values based on the environment. I’ve included some images below showing how we can assign a built-in policy set to different environments based on the active workspace.
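A minimal sketch of the idea, assuming workspaces named dev, test, and prod: per-environment settings live in a map keyed by workspace name, and terraform.workspace selects the entry for the active workspace. The subscription IDs, the variable name, and the resource labels are placeholders, not our real configuration.

```hcl
# Placeholder variable: display name of the built-in policy set to assign.
variable "policy_set_display_name" {
  type        = string
  description = "Display name of the built-in policy set (initiative)."
}

locals {
  # Per-environment settings keyed by workspace name (placeholder IDs).
  environments = {
    dev = {
      subscription_id = "/subscriptions/00000000-0000-0000-0000-000000000001"
    }
    test = {
      subscription_id = "/subscriptions/00000000-0000-0000-0000-000000000002"
    }
    prod = {
      subscription_id = "/subscriptions/00000000-0000-0000-0000-000000000003"
    }
  }

  # terraform.workspace is the name of the currently selected workspace.
  env = local.environments[terraform.workspace]
}

# Look up the built-in policy set by its display name.
data "azurerm_policy_set_definition" "builtin" {
  display_name = var.policy_set_display_name
}

# Assign the policy set to the subscription of the active environment.
resource "azurerm_subscription_policy_assignment" "builtin" {
  name                 = "builtin-${terraform.workspace}"
  subscription_id      = local.env.subscription_id
  policy_definition_id = data.azurerm_policy_set_definition.builtin.id
}
```

With this layout, running `terraform workspace select test` before `terraform plan` targets the test subscription. Planning in a workspace that has no entry in the map (such as default) fails on the lookup, which acts as a small guard against applying to an unintended environment.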
The disadvantage of this method is that all workspaces use the same backend, which is not recommended because of the lack of isolation between the environments. I found the article “How to manage multiple environments with Terraform”, which compares using workspaces, branches, and Terragrunt to manage multiple environments. It provides excellent insights into the different approaches. The article concludes that Terragrunt is the best option (for them) because of its effective isolation and the ease of having different settings in each environment. This is something we will look into shortly and post updates on.
So far, I think our overall approach looks promising, but time will tell if it can withstand increased complexity as we extend our scope to enterprise level.
I hope this article can be helpful to someone planning to manage Azure policies using Terraform. Feedback on all of this is highly appreciated, so don’t hesitate to get in touch with me for comments or questions.