Introduction
Terraform is an open-source Infrastructure as Code (IaC) tool developed by HashiCorp. It is used to define and provision the complete infrastructure using an easy-to-learn declarative language.
It is an infrastructure provisioning tool that lets you store your cloud infrastructure setup as code. It is similar to tools such as CloudFormation, which automates AWS infrastructure but works only on AWS; Terraform can be used across many cloud platforms.
Infrastructure as Code (IaC) is a widespread term among DevOps professionals. It is the practice of managing and provisioning the complete IT infrastructure (comprising both physical and virtual machines) using machine-readable definition files. It is a software engineering approach to operations that helps automate an entire data center through programming scripts.
Benefits of Terraform
Multi-Cloud Support
Supports AWS, Azure, GCP, DigitalOcean, and 100+ other providers.
Orchestration
Handles full infrastructure orchestration, not just configuration management.
Immutable Infrastructure
Resources are replaced rather than modified in place, keeping deployments predictable and reproducible.
HCL Language
Uses HashiCorp Configuration Language - easy to read and write.
Portable
Easily portable configurations across different cloud providers.
Client-Only
No server-side agents required - runs from your local machine or CI/CD.
Terraform Architecture
Installation of Terraform
On Debian/Ubuntu:
$ curl -fsSL https://apt.releases.hashicorp.com/gpg | sudo apt-key add -
$ sudo apt-add-repository "deb [arch=amd64] https://apt.releases.hashicorp.com $(lsb_release -cs) main"
$ sudo apt-get update && sudo apt-get install terraform
On RHEL/CentOS:
$ sudo yum install -y yum-utils
$ sudo yum-config-manager --add-repo https://rpm.releases.hashicorp.com/RHEL/hashicorp.repo
$ sudo yum -y install terraform
Terraform Core Concepts
Terraform core uses two input sources to do its job:
Terraform Configuration
You define what needs to be created or provisioned in .tf files.
State File
Terraform keeps an up‑to‑date state of the current infrastructure.
The core compares the current state with the desired configuration and figures out what needs to be created, updated, or deleted to reach the desired state.
Core Concepts Overview
Provider: Plugin that interacts with cloud provider APIs (AWS, Azure, GCP, etc.) to manage resources.
Resource: Block defining one or more infrastructure objects (EC2 instances, VPCs, storage buckets).
Data Source: Read-only reference to existing infrastructure objects managed outside Terraform.
State: JSON file (terraform.tfstate) storing cached information about managed infrastructure.
Module: Directory containing reusable Terraform templates and configurations.
Variable: Input parameter for customizing module behavior (Input, Local, Output types).
Provider
A provider works much like an operating system's device driver: it exposes a set of resource types through a common abstraction, hiding from users the details of how each resource is created, modified, and destroyed.
Terraform downloads providers automatically from its public registry as needed, based on the resources of a given project. It can also use custom plugins, which must be manually installed by the user. Finally, some built-in providers are part of the main binary and are always available.
Although not strictly necessary, it's considered good practice to explicitly declare which provider we'll use and to pin its version. A basic AWS provider configuration looks like this:
provider "aws" {
  # Credentials are shown inline for illustration only; in real projects,
  # prefer environment variables or shared credentials files.
  access_key = "B5KG6Fe5GUKIATUF5UD"
  secret_key = "R4gb65y56GBF6765ejYSJA4YtaZ+T6GY7H"
  region     = "us-east-2"
}
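Since Terraform 0.13, the recommended way to pin a provider's version is a required_providers block inside a terraform block; a minimal sketch (the version constraint shown is an assumption, adjust it to your needs):

```hcl
terraform {
  required_providers {
    aws = {
      # Where to download the provider from
      source = "hashicorp/aws"
      # Assumed version constraint; any 4.x release is accepted
      version = "~> 4.0"
    }
  }
}
```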
Resources
In Terraform, a resource is anything that can be a target for CRUD operations in the context of a given provider. Some examples are an EC2 instance, an Azure MariaDB, or a DNS entry.
Let's look at a simple resource definition:
resource "aws_instance" "web" {
ami = "some-ami-id"
instance_type = "t2.micro"
}
First, we always have the resource keyword that starts a definition. Next, we have the resource type, which usually follows the provider_type convention. In the above example, aws_instance is a resource type defined by the AWS provider, used to define an EC2 instance. After that, there's the user-defined resource name, which must be unique for this resource type in the same module -- more on modules later.
Finally, we have a block containing a series of arguments used as a resource specification. A key point about resources is that once created, we can use expressions to query their attributes. Also, and equally important, we can use those attributes as arguments for other resources.
To illustrate how this works, let's expand the previous example by creating our EC2 instance in a non-default VPC:
resource "aws_vpc" "apps" {
  cidr_block = "10.0.0.0/16"
}
resource "aws_subnet" "frontend" {
  vpc_id     = aws_vpc.apps.id
  cidr_block = "10.0.1.0/24"
}
resource "aws_instance" "web" {
  ami           = "some-ami-id"
  instance_type = "t2.micro"
  subnet_id     = aws_subnet.frontend.id
}
Here, we use the id attribute from our VPC resource as the value for the frontend subnet's vpc_id argument. Next, the subnet's id attribute becomes the subnet_id argument of the EC2 instance. Please note that this bare reference syntax requires Terraform version 0.12 or later. Previous versions used the more cumbersome ${expression} syntax, which is still available but considered legacy.
This example also shows one of Terraform's strengths: regardless of the order in which we declare resources in our project, it will figure out the correct order in which it must create or update them based on a dependency graph it builds when parsing them.
count and for_each Meta Arguments
These meta-arguments allow us to create multiple instances of any resource. The main difference between them is that count expects a non-negative number, whereas for_each accepts a map or a set of strings.
For instance, let's use count to create some EC2 instances on AWS:
resource "aws_instance" "server" {
  count         = var.server_count
  ami           = "ami-xxxxxxx"
  instance_type = "t2.micro"
  tags = {
    Name = "WebServer - ${count.index}"
  }
}
Within a resource that uses count, we can use the count object in expressions. This object has only one property: index, which holds the index (zero-based) of each instance.
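Since count turns the resource into a list of instances, we can reference a single instance by index or all of them with a splat expression. A short sketch based on the server resource above:

```hcl
# ID of the first server instance
output "first_server_id" {
  value = aws_instance.server[0].id
}

# IDs of every server instance, as a list
output "server_ids" {
  value = aws_instance.server[*].id
}
```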
Likewise, we can use the for_each meta argument to create those instances based on a map:
variable "instances" {
  type = map(string)
}
resource "aws_instance" "server" {
  for_each      = var.instances
  ami           = each.value
  instance_type = "t2.micro"
  tags = {
    Name = each.key
  }
}
This time, we've used a map from labels to AMI (Amazon Machine Image) IDs to create our servers. Within our resource, we can use the each object, which gives us access to the current key and value for a particular instance.
A key point about count and for_each is that, although we can assign expressions to them, Terraform must be able to resolve their values before performing any resource action. As a result, we cannot use an expression that depends on output attributes from other resources.
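To sketch this restriction: count may reference variables or locals, which are resolvable during planning, but not attributes that only exist after apply (resource names here are hypothetical):

```hcl
resource "aws_instance" "ok" {
  count         = var.server_count # known at plan time: fine
  ami           = "some-ami-id"
  instance_type = "t2.micro"
}

# By contrast, a line such as
#   count = aws_instance.ok[0].id == "" ? 0 : 1
# in another resource would fail, because the id attribute is only known
# after apply, so Terraform cannot resolve the count during planning.
```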
Data Sources
Data sources work much like "read-only" resources: we can fetch information about existing infrastructure objects, but we can't create or change them. They are typically used to obtain parameters needed to create other resources.
A typical example is the aws_ami data source available in the AWS provider, which we use to recover attributes from an existing AMI:
data "aws_ami" "ubuntu" {
  most_recent = true

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-trusty-14.04-amd64-server-*"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }

  owners = ["099720109477"]
}
This example defines a data source called "ubuntu" that queries the AMI registry and returns several attributes related to the located image. We can then use those attributes in other resource definitions, prepending the data prefix to the attribute name:
resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t2.micro"
}
State
The state of a Terraform project is a file that stores all details about resources that were created in the context of a given project. For instance, if we declare an azurerm_resource_group resource in our project and run Terraform, the state file will store its identifier.
The primary purpose of the state file is to provide information about already existing resources, so when we modify our resource definitions, Terraform can figure out what it needs to do.
An important point about state files is that they may contain sensitive information. Examples include initial passwords used to create a database, private keys, and so on.
Terraform uses the concept of a backend to store and retrieve state files. The default backend is the local backend, which uses a file in the project's root folder as its storage location. We can also configure an alternative remote backend by declaring it in a terraform block in one of the project's .tf files:
terraform {
  backend "s3" {
    bucket = "some-bucket"
    key    = "some-storage-key"
    region = "us-east-1"
  }
}
Modules
Terraform modules are the main feature that allows us to reuse resource definitions across multiple projects or simply have a better organization in a single project. This is much like what we do in standard programming: instead of a single file containing all code, we organize our code across multiple files and packages.
A module is just a directory containing one or more resource definition files. In fact, even when we put all our code in a single file/directory, we're still using modules - in this case, just one. The important point is that sub-directories are not included as part of a module. Instead, the parent module must explicitly include them using the module declaration:
module "networking" {
  source           = "./networking"
  create_public_ip = true
}
Here we're referencing a module located at the "networking" sub-directory and passing a single parameter to it - a boolean value in this case.
It's important to note that count and for_each could not be applied to module blocks prior to Terraform 0.13; later versions support both meta-arguments on modules.
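Note that since Terraform 0.13 this restriction has been lifted: a module can be instantiated several times with for_each. A sketch reusing the networking module above (the environment input variable is hypothetical):

```hcl
module "networking" {
  for_each = toset(["dev", "qa", "prod"])
  source   = "./networking"

  # Hypothetical input variable used to distinguish the instances
  environment = each.key
}
```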
Input Variables
Any module, including the top-level (root) one, can define several input variables using variable block definitions:
variable "myvar" {
  type        = string
  default     = "Some Value"
  description = "MyVar description"
}
A variable has a type, which can be a string, map, or set, among others. It also may have a default value and description. For variables defined at the top-level module, Terraform will assign actual values to a variable using several sources:
- The -var command-line option
- .tfvars files, passed via command-line options or scanned from well-known files/locations
- Environment variables starting with TF_VAR_
- The variable's default value, if present
As for variables defined in nested or external modules, any variable that has no default value must be supplied using arguments in a module reference. Terraform will generate an error if we try to use a module that requires a value for an input variable but we fail to supply one.
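A sketch of these sources in action, using the myvar variable defined above (the values are hypothetical):

```hcl
# terraform.tfvars -- picked up automatically from the project directory
myvar = "value from a tfvars file"

# Alternatively, on the command line:
#   terraform apply -var 'myvar=value from the CLI'
#
# Or through an environment variable:
#   export TF_VAR_myvar="value from the environment"
```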
Once defined, we can use variables in expressions using the var prefix:
resource "xxx_type" "some_name" {
  arg = var.myvar
}
Output Values
By design, a module's consumer has no access to any resources created within the module. Sometimes, however, we need some of those attributes to use as input for another module or resource. To address those cases, a module can define output blocks that expose a subset of the created resources:
output "web_addr" {
  value       = aws_instance.web.private_ip
  description = "Web server's private IP address"
}
Here we're defining an output value named "web_addr" containing the IP address of an EC2 instance that our module created. Now any module that references our module can use this value in expressions as module.module_name.web_addr, where module_name is the name we've used in the corresponding module declaration.
Local Variables
Local variables work like standard variables, but their scope is limited to the module where they're declared. The use of local variables tends to reduce code repetition, especially when dealing with output values from modules:
locals {
  vpc_id = module.network.vpc_id
}
module "network" {
  source = "./network"
}
module "service1" {
  source = "./service1"
  vpc_id = local.vpc_id
}
module "service2" {
  source = "./service2"
  vpc_id = local.vpc_id
}
Here, the local variable vpc_id receives the value of an output variable from the network module. Later, we pass this value as an argument to both service1 and service2 modules.
Workspaces
Terraform workspaces allow us to keep multiple state files for the same project. When we run Terraform for the first time in a project, the generated state file will go into the default workspace. Later, we can create a new workspace with the terraform workspace new command, optionally supplying an existing state file as a parameter.
We can use workspaces pretty much as we'd use branches in a regular VCS. For instance, we can have one workspace for each target environment - DEV, QA, PROD - and, by switching workspaces, we can terraform apply changes as we add new resources.
Given the way this works, workspaces are an excellent choice to manage multiple versions - or "incarnations" if you like - of the same set of configurations. This is great news for everyone who's had to deal with the infamous "works in my environment" problem, as it allows us to ensure that all environments look the same.
In some scenarios, it may be convenient to disable the creation of some resources based on the particular workspace we're targeting. For those occasions, we can use the terraform.workspace predefined variable. This variable contains the name of the current workspace, and we can use it as any other in expressions.
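For instance, a sketch that provisions a resource only in the prod workspace (the resource and AMI names are hypothetical):

```hcl
resource "aws_instance" "monitoring" {
  # Created only when the current workspace is "prod"
  count         = terraform.workspace == "prod" ? 1 : 0
  ami           = "some-ami-id"
  instance_type = "t2.micro"

  tags = {
    Name = "monitoring-${terraform.workspace}"
  }
}
```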
Terraform Lifecycle
Terraform follows a four-stage lifecycle for infrastructure management:
terraform init
Initializes the working directory containing Terraform configuration files.
$ terraform init
Initializing the backend...
Initializing provider plugins...
- Checking for available provider plugins...
- Downloading plugin for provider `local` (hashicorp/local) 2.2.3...
Terraform has been successfully initialized!
This step scans project files and downloads required providers.
terraform plan
Creates an execution plan showing what actions Terraform will perform.
$ terraform plan
...
Terraform will perform the following actions:
  # local_file.hello will be created
  + resource "local_file" "hello" {
      + content              = "Hello, Terraform"
      + directory_permission = "0777"
      + file_permission      = "0777"
      + filename             = "hello.txt"
      + id                   = (known after apply)
    }
Plan: 1 to add, 0 to change, 0 to destroy.
Think of this as a "dry run" - it shows changes without applying them.
terraform apply
Executes the planned changes to create, update, or delete infrastructure resources.
$ terraform apply
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
+ create
Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
local_file.hello: Creating...
local_file.hello: Creation complete after 0s [id=392b5481eae4ab2178340f62b752297f72695d57]
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
Always review the plan output before confirming apply in production environments.
terraform destroy
Removes all infrastructure resources managed by the current state file.
$ terraform destroy --auto-approve
Terraform will perform the following actions:
  # local_file.hello will be destroyed
  - resource "local_file" "hello" {
      - content              = "Hello, Terraform" -> null
      - directory_permission = "0777" -> null
      - file_permission      = "0777" -> null
      - filename             = "hello.txt" -> null
      - id                   = "392b5481eae4ab2178340f62b752297f72695d57" -> null
    }
Plan: 0 to add, 0 to change, 1 to destroy.
This action is irreversible - all managed resources will be permanently deleted.
terraform init
Let's walk through a small piece of Terraform code to understand the basics of the lifecycle.
Create a main.tf file and add the following contents to it:
resource "local_file" "hello" {
  content              = "Hello, Terraform"
  directory_permission = "0777"
  file_permission      = "0777"
  filename             = "hello.txt"
}
Since this is the first time we're running this project, we need to initialize it with the init command:
$ terraform init
Initializing the backend...
Initializing provider plugins...
- Checking for available provider plugins...
- Downloading plugin for provider `local` (hashicorp/local) 2.2.3...
Terraform has been successfully initialized!
In this step, Terraform scans our project files and downloads any required providers (the local provider, in our case).
terraform plan
Next, we use the plan command to verify what actions Terraform will perform to create our resources. This step works much like the "dry run" feature available in other build systems, such as GNU make:
$ terraform plan
... messages omitted
Terraform will perform the following actions:
  # local_file.hello will be created
  + resource "local_file" "hello" {
      + content              = "Hello, Terraform"
      + directory_permission = "0777"
      + file_permission      = "0777"
      + filename             = "hello.txt"
      + id                   = (known after apply)
    }
Plan: 1 to add, 0 to change, 0 to destroy.
... messages omitted
Here, Terraform is telling us that it needs to create a new resource, which is expected since it doesn't exist yet. We can also see the values we've set in our resource definition, including the two permission attributes; the id, however, is only known after the resource is actually created.
terraform apply
We can now proceed to actual resource creation using the apply command:
$ terraform apply
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
  # local_file.hello will be created
  + resource "local_file" "hello" {
      + content              = "Hello, Terraform"
      + directory_permission = "0777"
      + file_permission      = "0777"
      + filename             = "hello.txt"
      + id                   = (known after apply)
    }
Plan: 1 to add, 0 to change, 0 to destroy.
Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
local_file.hello: Creating...
local_file.hello: Creation complete after 0s [id=392b5481eae4ab2178340f62b752297f72695d57]
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
We can now verify that the file has been created with the specified content:
$ cat hello.txt
Hello, Terraform
All good! Now, let's see what happens if we rerun the apply command, this time using the -auto-approve flag so Terraform proceeds immediately without asking for confirmation:
$ terraform apply -auto-approve
local_file.hello: Refreshing state... [id=392b5481eae4ab2178340f62b752297f72695d57]
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
This time, Terraform did nothing because the file already existed. That's not all, though. Sometimes a resource exists, but someone may have changed one of its attributes, a scenario that is usually referred to as "configuration drift". Let's see how Terraform behaves in this scenario:
$ echo foo > hello.txt
$ terraform plan
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.
local_file.hello: Refreshing state... [id=392b5481eae4ab2178340f62b752297f72695d57]
------------------------------------------------------------------------
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
  # local_file.hello will be created
  + resource "local_file" "hello" {
      + content              = "Hello, Terraform"
      + directory_permission = "0777"
      + file_permission      = "0777"
      + filename             = "hello.txt"
      + id                   = (known after apply)
    }
Plan: 1 to add, 0 to change, 0 to destroy.
Terraform has detected the change in the hello.txt file's content and generated a plan to restore it. Since the local provider lacks support for in-place modification, the plan consists of a single step: recreating the file.
We can now run apply again and, as a result, it will restore the file's intended content:
$ terraform apply -auto-approve
... messages omitted
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
$ cat hello.txt
Hello, Terraform
terraform destroy
This command destroys all resources created from the current directory's configuration, as tracked by the state file.
$ terraform destroy --auto-approve
Terraform will perform the following actions:
  # local_file.hello will be destroyed
  - resource "local_file" "hello" {
      - content              = "Hello, Terraform" -> null
      - directory_permission = "0777" -> null
      - file_permission      = "0777" -> null
      - filename             = "hello.txt" -> null
      - id                   = "392b5481eae4ab2178340f62b752297f72695d57" -> null
    }
Plan: 0 to add, 0 to change, 1 to destroy.