Deploying AKS with Terraform: Azure Infrastructure as Code

November 10, 2024 · 12 min read · Azure, Terraform, AKS

Terraform and Azure make a powerful combination for deploying AKS reproducibly. VNet, managed identities, node pools, ACR integration: a complete guide with code examples.

Why Terraform for Azure?

Bicep is Azure's native Infrastructure as Code language, but Terraform has become the multi-cloud standard in many teams. Its main advantage is a single tool for managing AWS, Azure, GCP, and third-party providers (Datadog, Cloudflare, GitHub). The azurerm provider, maintained by HashiCorp and Microsoft, is comprehensive and very active, with more than 700 supported resources.

This guide covers deploying a production-ready AKS cluster with Terraform: networking, identities, node pools, ACR, and a CI/CD pipeline.

Prerequisites: Service Principal and backend

Before writing any Terraform, set up authentication and the state backend:

# Create a Service Principal for Terraform
az ad sp create-for-rbac --name "sp-terraform-aks" --role Contributor \
  --scopes "/subscriptions/${SUBSCRIPTION_ID}" --json-auth

# Create the Storage Account for the Terraform backend
az group create --name rg-terraform-state --location francecentral
az storage account create --name stterraformstate001 \
  --resource-group rg-terraform-state --sku Standard_LRS \
  --allow-blob-public-access false
az storage container create --name tfstate --account-name stterraformstate001

Terraform project structure

infra/
├── main.tf          # Providers + backend
├── variables.tf     # Input variables
├── outputs.tf       # Outputs (kubeconfig, etc.)
├── network.tf       # VNet, subnets, NSG
├── aks.tf           # AKS cluster + node pools
├── acr.tf           # Azure Container Registry
├── identity.tf      # Managed Identity + role assignments
└── environments/
    ├── dev.tfvars
    └── prd.tfvars
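
As a rough illustration of the per-environment files in the layout above, environments/dev.tfvars might look like the following. The variable names match the var.* references used in this article; the values, including the Kubernetes version, are placeholders rather than recommendations.

```hcl
# environments/dev.tfvars (illustrative values only)
environment        = "dev"
location           = "francecentral"
kubernetes_version = "1.30" # placeholder: pick a version currently offered by AKS in your region
```

The file is passed at plan time with -var-file=environments/dev.tfvars, mirroring the prd.tfvars usage in the pipeline.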

Provider and backend configuration

terraform {
  required_version = ">= 1.6"
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.110"
    }
  }
  backend "azurerm" {
    resource_group_name  = "rg-terraform-state"
    storage_account_name = "stterraformstate001"
    container_name       = "tfstate"
    key                  = "aks-prod.tfstate"
  }
}

provider "azurerm" {
  features {
    key_vault {
      purge_soft_delete_on_destroy = false
    }
  }
}
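
The resources in the following sections reference var.environment, var.location, var.kubernetes_version, azurerm_resource_group.main, and azurerm_container_registry.main, none of which the article defines. A minimal sketch of those supporting definitions could look like this; the resource names, the registry SKU, and the default region are assumptions made for illustration.

```hcl
# variables.tf (sketch): inputs referenced as var.* in this article
variable "environment" {
  type        = string
  description = "Short environment name, e.g. dev or prd"
}

variable "location" {
  type        = string
  description = "Azure region for all resources"
  default     = "francecentral" # assumption: same region as the state backend
}

variable "kubernetes_version" {
  type        = string
  description = "AKS Kubernetes version"
}

# Resource group referenced as azurerm_resource_group.main
resource "azurerm_resource_group" "main" {
  name     = "rg-aks-${var.environment}" # hypothetical naming
  location = var.location
}

# acr.tf (sketch): registry referenced as azurerm_container_registry.main
resource "azurerm_container_registry" "main" {
  name                = "acraks${var.environment}001" # hypothetical; must be globally unique and alphanumeric
  resource_group_name = azurerm_resource_group.main.name
  location            = var.location
  sku                 = "Standard" # assumption; choose the tier that fits your needs
  admin_enabled       = false      # rely on the AcrPull role assignment instead of admin keys
}
```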

Networking: VNet and subnets

resource "azurerm_virtual_network" "main" {
  name                = "vnet-aks-${var.environment}"
  resource_group_name = azurerm_resource_group.main.name
  location            = var.location
  address_space       = ["10.0.0.0/16"]
}

resource "azurerm_subnet" "nodes" {
  name                 = "snet-aks-nodes"
  resource_group_name  = azurerm_resource_group.main.name
  virtual_network_name = azurerm_virtual_network.main.name
  address_prefixes     = ["10.0.1.0/24"]
}

resource "azurerm_subnet" "pods" {
  name                 = "snet-aks-pods"
  resource_group_name  = azurerm_resource_group.main.name
  virtual_network_name = azurerm_virtual_network.main.name
  address_prefixes     = ["10.0.4.0/22"] # a /22 must start on a multiple of 4; 10.0.2.0/22 is not a valid prefix

  delegation {
    name = "aks-delegation"
    service_delegation {
      name    = "Microsoft.ContainerService/managedClusters"
      actions = ["Microsoft.Network/virtualNetworks/subnets/join/action"]
    }
  }
}
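
The project layout lists an NSG in network.tf, but the listing above stops at the subnets. One possible shape for it is sketched below; the NSG name and the single inbound rule are hypothetical and would need to be adapted to the traffic your workloads actually expect.

```hcl
# Sketch only: an NSG attached to the node subnet
resource "azurerm_network_security_group" "nodes" {
  name                = "nsg-aks-nodes-${var.environment}" # hypothetical naming
  resource_group_name = azurerm_resource_group.main.name
  location            = var.location

  # Hypothetical rule: allow inbound HTTPS from the Internet
  security_rule {
    name                       = "allow-https-inbound"
    priority                   = 100
    direction                  = "Inbound"
    access                     = "Allow"
    protocol                   = "Tcp"
    source_port_range          = "*"
    destination_port_range     = "443"
    source_address_prefix      = "Internet"
    destination_address_prefix = "*"
  }
}

# Bind the NSG to the subnet that hosts the AKS nodes
resource "azurerm_subnet_network_security_group_association" "nodes" {
  subnet_id                 = azurerm_subnet.nodes.id
  network_security_group_id = azurerm_network_security_group.nodes.id
}
```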

Managed identities

resource "azurerm_user_assigned_identity" "aks" {
  name                = "id-aks-${var.environment}"
  resource_group_name = azurerm_resource_group.main.name
  location            = var.location
}

# Kubelet identity (used to pull images from ACR)
resource "azurerm_user_assigned_identity" "kubelet" {
  name                = "id-aks-kubelet-${var.environment}"
  resource_group_name = azurerm_resource_group.main.name
  location            = var.location
}

# AcrPull: keyless access to ACR
resource "azurerm_role_assignment" "acr_pull" {
  principal_id                     = azurerm_user_assigned_identity.kubelet.principal_id
  role_definition_name             = "AcrPull"
  scope                            = azurerm_container_registry.main.id
  skip_service_principal_aad_check = true
}

# Network Contributor on the nodes subnet
resource "azurerm_role_assignment" "network" {
  principal_id         = azurerm_user_assigned_identity.aks.principal_id
  role_definition_name = "Network Contributor"
  scope                = azurerm_subnet.nodes.id
}
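
One assignment the listing above does not include: when a cluster brings its own kubelet identity, AKS expects the control-plane identity to hold the Managed Identity Operator role on that kubelet identity. A possible sketch follows; the resource name kubelet_operator is made up for illustration.

```hcl
# Sketch: let the control-plane identity assign the kubelet identity to the nodes
resource "azurerm_role_assignment" "kubelet_operator" {
  principal_id         = azurerm_user_assigned_identity.aks.principal_id
  role_definition_name = "Managed Identity Operator"
  scope                = azurerm_user_assigned_identity.kubelet.id
}
```

Adding depends_on = [azurerm_role_assignment.kubelet_operator] to the cluster resource is one way to make sure the role exists before AKS validates the identities.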

AKS cluster

resource "azurerm_kubernetes_cluster" "main" {
  name                = "aks-${var.environment}"
  resource_group_name = azurerm_resource_group.main.name
  location            = var.location
  dns_prefix          = "aks-${var.environment}"
  kubernetes_version  = var.kubernetes_version
  sku_tier            = "Standard"  # SLA 99.95%

  # Workload Identity + OIDC
  workload_identity_enabled = true
  oidc_issuer_enabled       = true

  default_node_pool {
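    name                         = "system"
    vm_size                      = "Standard_D4s_v5"
    node_count                   = 3
    zones                        = ["1", "2", "3"]
    vnet_subnet_id               = azurerm_subnet.nodes.id
    only_critical_addons_enabled = true # adds the CriticalAddonsOnly taint
    # pod_subnet_id is omitted: Azure CNI Overlay assigns pod IPs from a private pod CIDR,
    # so snet-aks-pods is only needed for dynamic pod IP allocation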

    upgrade_settings {
      max_surge = "33%"
    }
  }

  identity {
    type         = "UserAssigned"
    identity_ids = [azurerm_user_assigned_identity.aks.id]
  }

  kubelet_identity {
    client_id                 = azurerm_user_assigned_identity.kubelet.client_id
    object_id                 = azurerm_user_assigned_identity.kubelet.principal_id
    user_assigned_identity_id = azurerm_user_assigned_identity.kubelet.id
  }

  network_profile {
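    network_plugin      = "azure"
    network_plugin_mode = "overlay"
    load_balancer_sku   = "standard"
    # The default AKS service CIDR (10.0.0.0/16) would overlap this VNet; the range below is illustrative
    service_cidr        = "172.16.0.0/16"
    dns_service_ip      = "172.16.0.10"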
  }

  # azurerm 3.x exposes the auto-upgrade channel as a top-level argument, not a block
  automatic_channel_upgrade = "patch"

  maintenance_window_auto_upgrade {
    frequency   = "Weekly"
    interval    = 1
    day_of_week = "Sunday"
    start_time  = "02:00"
    utc_offset  = "+01:00"
    duration    = 4
  }

  azure_active_directory_role_based_access_control {
    managed            = true
    azure_rbac_enabled = true
  }

  monitor_metrics {}
}
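
The cluster enables Workload Identity and the OIDC issuer, but a workload only benefits once an Azure identity is federated with its Kubernetes service account. A hedged sketch of that link is shown below; the identity name, the default namespace, and the service account name app are all hypothetical.

```hcl
# Sketch: identity that a workload would assume through Workload Identity
resource "azurerm_user_assigned_identity" "app" {
  name                = "id-app-${var.environment}" # hypothetical naming
  resource_group_name = azurerm_resource_group.main.name
  location            = var.location
}

# Trust tokens issued by the cluster's OIDC issuer for one service account
resource "azurerm_federated_identity_credential" "app" {
  name                = "fic-app-${var.environment}" # hypothetical naming
  resource_group_name = azurerm_resource_group.main.name
  parent_id           = azurerm_user_assigned_identity.app.id
  audience            = ["api://AzureADTokenExchange"]
  issuer              = azurerm_kubernetes_cluster.main.oidc_issuer_url
  subject             = "system:serviceaccount:default:app" # assumed namespace and service account
}
```

The service account in the cluster would then carry the identity's client ID in the azure.workload.identity/client-id annotation.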

Application node pool

resource "azurerm_kubernetes_cluster_node_pool" "app" {
  name                  = "app"
  kubernetes_cluster_id = azurerm_kubernetes_cluster.main.id
  vm_size               = "Standard_D8s_v5"
  node_count            = 2
  min_count             = 2
  max_count             = 10
  enable_auto_scaling   = true
  zones                 = ["1", "2", "3"]
  vnet_subnet_id        = azurerm_subnet.nodes.id
  # pod_subnet_id is omitted: Azure CNI Overlay assigns pod IPs from a private pod CIDR

  node_labels = {
    "role" = "app"
  }
}

resource "azurerm_kubernetes_cluster_node_pool" "spot" {
  name                  = "spot"
  kubernetes_cluster_id = azurerm_kubernetes_cluster.main.id
  vm_size               = "Standard_D4s_v5"
  priority              = "Spot"
  eviction_policy       = "Delete"
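  spot_max_price        = -1 # -1: pay the current Spot price, capped at the on-demand price
  node_count            = 0
  min_count             = 0
  max_count             = 20
  enable_auto_scaling   = true
  vnet_subnet_id        = azurerm_subnet.nodes.id # same node subnet as the other pools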

  node_labels = {
    "kubernetes.azure.com/scalesetpriority" = "spot"
  }

  node_taints = ["kubernetes.azure.com/scalesetpriority=spot:NoSchedule"]
}
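
The project layout mentions an outputs.tf exposing the kubeconfig among other values. A minimal sketch of what it might contain is given below; the output names are assumptions, and the kubeconfig is marked sensitive because it still ends up in the state file.

```hcl
# outputs.tf (sketch)
output "cluster_name" {
  value = azurerm_kubernetes_cluster.main.name
}

output "kube_config" {
  value     = azurerm_kubernetes_cluster.main.kube_config_raw
  sensitive = true # hidden from CLI output, but still stored in state
}

output "oidc_issuer_url" {
  value = azurerm_kubernetes_cluster.main.oidc_issuer_url
}

output "acr_login_server" {
  value = azurerm_container_registry.main.login_server
}
```

A sensitive output can be read explicitly with terraform output -raw kube_config when needed.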

GitHub Actions CI/CD pipeline

name: Terraform AKS Deploy
on:
  push:
    branches: [main]
    paths: ["infra/**"]

permissions:
  id-token: write
  contents: read

jobs:
  terraform:
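    runs-on: ubuntu-latest
    environment: production
    # Assumption: the azurerm provider and backend authenticate via OIDC through these ARM_* variables
    env:
      ARM_USE_OIDC: "true"
      ARM_CLIENT_ID: ${{ secrets.AZURE_CLIENT_ID }}
      ARM_TENANT_ID: ${{ secrets.AZURE_TENANT_ID }}
      ARM_SUBSCRIPTION_ID: ${{ secrets.AZURE_SUBSCRIPTION_ID }}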
    steps:
      - uses: actions/checkout@v4

      - name: Azure Login (OIDC)
        uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}

      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: "1.9.x"

      - run: terraform init
        working-directory: infra

      - run: terraform plan -var-file=environments/prd.tfvars -out=tfplan
        working-directory: infra

      - run: terraform apply tfplan
        working-directory: infra

Best practices and pitfalls to avoid

  • Never commit the state file: always use a remote backend (Azure Storage, Terraform Cloud)
  • Pin versions: provider ~> 3.110 in required_providers and Terraform >= 1.6 in required_version
  • Declare additional node pools as separate azurerm_kubernetes_cluster_node_pool resources rather than inside the cluster resource, so that pool changes do not force the cluster to be recreated
  • Add lifecycle { ignore_changes = [node_count] } to pools with auto-scaling enabled; without it, every apply tries to reset the pool to the configured node_count and works against the autoscaler
  • Use Terraform workspaces or per-environment .tfvars files rather than Git branches

Conclusion

Terraform on Azure lets you deploy a complete, secure, and reproducible AKS cluster in a few hundred lines. The combination of a custom VNet, Azure CNI Overlay, Workload Identity, and Managed Prometheus is the production baseline in 2025.

Move2Cloud helps its clients put this IaC stack in place, from design through CI/CD integration, with reusable Terraform modules and multi-environment governance.
