Career Guides · 13 min · 2026-06-12 · TechCerted Editorial

The 11 Tools Every Working Cloud Architect Actually Uses

Six cost nothing. Five cost roughly $65 per user per month combined. Here is the stack that appears in every senior cloud architect's daily workflow -- with the one tool that will determine your on-call quality of life.

We have covered cloud architecture careers on this site for three years, and the single most useful exercise we have done is this: map what tools appear in cloud architect job postings against what practitioners actually cite on r/aws, r/devops, and professional Slack communities. The gap between those two lists shows you exactly where vendor marketing budgets are larger than actual adoption rates. After running that comparison across 2024 and 2025 data, eleven tools remain in both lists consistently. Six of the eleven cost $0. The other five run roughly $65 per user per month combined at standard enterprise pricing. The $300,000-per-year ITSM suites vendors push at re:Invent? Almost none of the working architects we track use them as daily tools. Here is the actual stack.

The full stack at a glance: what each tool costs and why it earns its spot

The eleven tools span five workflow categories: infrastructure provisioning, observability, container management, architecture documentation, and cost plus incident management. Every one of them appears in senior cloud architect job descriptions at mid-market companies and enterprise shops alike. The table below uses standard public pricing as of June 2026. Enterprise negotiations routinely reduce list prices by 20 to 40 percent on annual contracts, but the ratios between tools stay consistent regardless of deal size.

The 11-tool cloud architect stack -- June 2026 pricing

| Tool | Price | Notes |
| --- | --- | --- |
| Terraform (HashiCorp) | $0 | Source-available under the BSL since 2023; HCP Terraform Plus from $20/user/month for policy-as-code and audit logs |
| AWS CLI | $0 | Free; normal AWS API rates apply for resources provisioned or queried |
| draw.io (diagrams.net) | $0 | Fully free and open source; desktop-installable; Confluence plugin available separately |
| k9s (Kubernetes TUI) | $0 | MIT license; the standard terminal UI for real-time Kubernetes cluster inspection |
| Helm | $0 | Apache 2.0 license; 75% adoption among Kubernetes users (CNCF Annual Survey 2024) |
| Grafana + Prometheus | $0 | Both fully open source; Grafana Cloud free tier covers 10,000 series and 3 users |
| AWS Cost Explorer | $0.01 per API request | Effectively free at normal usage; AWS Trusted Advisor included with Business Support plan |
| Datadog Infrastructure Pro | ~$15/host/month | Infrastructure metrics, dashboards, and alerting; log ingestion and APM are billed separately; annual contract reduces to ~$11/host/month |
| PagerDuty Professional | ~$20/user/month | Incident management and on-call scheduling; Business plan at $35/user/month adds AIOps |
| Confluence Cloud Standard | ~$5.75/user/month | Architecture decision records and team documentation; free for up to 10 users |
| GitHub Actions (paid tier) | ~$0.008/minute above free tier | Free for public repos and 2,000 min/month private; GitHub Copilot adds $10-19/user/month |

Total: ~$60-$65/user/month for the five paid tools at standard pricing

Infrastructure provisioning: Terraform, AWS CLI, and GitHub Actions

Terraform is the tool a working cloud architect cannot avoid in 2026. It commands 76% of the IaC market among cloud-native practitioners and appears as a baseline expectation across cloud architecture and infrastructure engineering job descriptions (CNCF Survey 2024). The reason is cloud-agnosticism: an architect designing multi-cloud or hybrid infrastructure does not want to learn separate IaC tooling for AWS, Azure, and GCP. Terraform's HCL syntax is declarative, version-controllable, and diff-readable by non-engineers -- which matters when presenting an infrastructure change to a security team or a CTO who does not read code. The core CLI is free to use; HCP Terraform Plus starts at $20/user/month and adds policy-as-code enforcement and audit logs for organizations that need governance controls. One important note for 2026: HashiCorp's 2023 switch from an open-source license to the Business Source License (BSL) triggered a community fork, OpenTofu, which now holds roughly 12% practitioner adoption and is growing at 300% annually (Pragmatic Engineer Survey 2026). If you are preparing for a Terraform interview, know the distinction between the two. The <a href='/certifications/terraform-associate'>HashiCorp Terraform Associate exam</a> costs $70.50 via mindhub.com and is now commonly requested alongside AWS certifications in senior architect postings.

The AWS CLI is not glamorous, but working architects use it every single day. No one who provisions infrastructure with any regularity does so exclusively through the AWS Console. The CLI enables scripting, bulk operations, and the kind of rapid environment inspection that the console makes painfully slow. Common architect use cases: checking IAM policy conditions across multiple accounts, pulling CloudWatch logs in bulk during incident triage, scripting one-off resource modifications, and querying AWS Config for compliance state. It is free. If you are newer to the <a href='/learn/what-does-a-cloud-architect-do-2026'>cloud architect role</a>, investing time in the AWS CLI is one of the highest-return-per-hour skills in your first year -- it unlocks automation patterns the console never will.
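To make the "bulk log pull across accounts" use case concrete, here is a minimal Python sketch that builds one `aws logs filter-log-events` command per account profile. It only constructs and prints the commands; it does not run them. The profile names, region, and log group are hypothetical placeholders, and the helper function `log_query_command` is our own illustration, not part of the AWS CLI.

```python
import shlex
from datetime import datetime, timedelta, timezone


def log_query_command(profile: str, region: str, log_group: str,
                      pattern: str, since: datetime) -> list[str]:
    """Build (but do not run) an `aws logs filter-log-events` command."""
    start_ms = int(since.timestamp() * 1000)  # the CLI expects epoch milliseconds
    return [
        "aws", "logs", "filter-log-events",
        "--profile", profile,
        "--region", region,
        "--log-group-name", log_group,
        "--filter-pattern", pattern,
        "--start-time", str(start_ms),
        "--output", "json",
    ]


if __name__ == "__main__":
    one_hour_ago = datetime.now(timezone.utc) - timedelta(hours=1)
    # Hypothetical named profiles, one per AWS account.
    for profile in ("prod-payments", "prod-platform"):
        cmd = log_query_command(profile, "us-east-1", "/ecs/api", "ERROR", one_hour_ago)
        print(shlex.join(cmd))  # paste into a shell, or hand to subprocess.run
```

Keeping the command as a list of arguments rather than one string avoids quoting mistakes when a filter pattern contains spaces, which is the kind of detail that matters during incident triage.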

GitHub Actions replaced Jenkins as the dominant CI/CD platform across most mid-market cloud environments by 2024. Architects do not typically write the pipeline configuration files -- that belongs to platform and DevOps engineers -- but they design the pipeline topology: which environments exist, what gates sit between them, and what infrastructure automation runs at deploy time. GitHub Actions free tier covers public repositories and 2,000 minutes per month on private repos. Above that threshold, compute costs approximately $0.008 per minute. The more significant budget question for most architecture teams in 2026 is GitHub Copilot: at $10 to $19 per user per month, it has become standard at most engineering organizations and meaningfully accelerates Terraform module review and CloudFormation template analysis work.
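The overage arithmetic is simple enough to sketch. The snippet below assumes the two figures quoted above (2,000 free private-repo minutes and roughly $0.008 per minute) and ignores differences between runner types, so treat it as a rough estimator rather than a billing calculator.

```python
FREE_MINUTES_PER_MONTH = 2_000  # free private-repo minutes quoted in this guide
RATE_PER_MINUTE = 0.008         # approximate per-minute rate quoted in this guide


def actions_overage_cost(minutes_used: int) -> float:
    """Return the estimated monthly bill for minutes beyond the free allowance."""
    billable = max(0, minutes_used - FREE_MINUTES_PER_MONTH)
    return round(billable * RATE_PER_MINUTE, 2)


if __name__ == "__main__":
    for minutes in (1_500, 2_000, 10_000):
        print(f"{minutes:>6} min -> ${actions_overage_cost(minutes):.2f}")
```

At 10,000 pipeline minutes a month the estimate comes to $64, which is why the Copilot line item is usually the larger budget conversation.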

The 2024 Annual Survey confirms GitHub Actions leads CI/CD platform adoption at 51% among cloud-native organizations. Helm remains the preferred Kubernetes package management tool at 75% adoption. And GitOps has become mainstream: 77% of surveyed organizations have adopted GitOps principles in their deployment workflows, a 17-percentage-point increase over the prior year.
Cloud Native Computing Foundation · CNCF Annual Survey 2024

Observability: Datadog and Grafana with Prometheus

Every cloud architect eventually has to pick a monitoring and observability stack, and this choice shapes on-call quality of life for years. Datadog is the paid option that dominates enterprise environments where budget exists. Architects at organizations with comfortable tooling budgets tend to land in Datadog environments (for context, ZipRecruiter 2026 puts median total compensation for the role at $147,236). At roughly $15 per host per month for the Infrastructure Pro plan, a 50-host environment runs approximately $750 per month or $9,000 per year. That is real money for an early-stage startup and a rounding error for a company with $5 million in annual cloud spend. Datadog's advantage is time-to-insight: logs, metrics, APM traces, and alerts all live in one interface with pre-built dashboards for AWS services, Kubernetes, and most common SaaS integrations. The architecture team does not need to build and maintain the observability infrastructure -- they instrument, route, and alert. Datadog holds approximately 52% of the commercial observability market (Grafana Labs Survey 2025). For how observability skills map to specific compensation bands, see the <a href='/learn/cloud-architect-salary-guide-2026'>cloud architect salary guide</a> we published earlier this year.
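The per-host math scales linearly, so it is worth having as a one-liner when sizing a budget request. This sketch assumes the ~$15 list price quoted above and covers infrastructure monitoring only; it does not model separately billed items.

```python
def datadog_infra_cost(hosts: int, per_host_month: float = 15.0) -> tuple[float, float]:
    """Return (monthly, annual) list-price cost for infrastructure monitoring."""
    monthly = hosts * per_host_month
    return monthly, monthly * 12


if __name__ == "__main__":
    for hosts in (50, 200, 500):
        monthly, annual = datadog_infra_cost(hosts)
        print(f"{hosts:>4} hosts: ${monthly:,.0f}/month, ${annual:,.0f}/year")
```

Passing a different `per_host_month` value lets you rerun the same numbers against a negotiated rate.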

Grafana plus Prometheus is the free alternative that larger or more cost-conscious teams use. Prometheus is a time-series metrics database and collection system. Grafana is the visualization and alerting layer on top of it. Both are fully open source and self-hostable. The Grafana Labs 2025 Observability Survey (n=1,255) found that 67% of organizations run Prometheus in production, with total usage including proofs of concept reaching 86% (Grafana Labs Survey 2025). Three-quarters of practitioners surveyed use open-source licensing for at least some of their observability stack. The trade-off is operational ownership: someone on your team must manage Prometheus retention configuration, Grafana dashboard definitions, and alertmanager routing rules. At small-to-medium scale with no dedicated SRE, this maintenance overhead typically consumes more architect time than a Datadog subscription would cost. At larger scale with a platform team, the economics flip.
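One way to reason about where the economics flip is a rough break-even model. Everything numeric in the example call is an assumption for illustration: 20 hours of monthly upkeep and a $100 loaded hourly rate are hypothetical inputs, and the model treats self-hosted upkeep as a fixed monthly effort, which real environments will not match exactly.

```python
import math


def self_hosted_monthly_cost(ops_hours: float, hourly_rate: float,
                             infra_cost: float = 0.0) -> float:
    """Engineer time plus any hosting cost for running Prometheus and Grafana."""
    return ops_hours * hourly_rate + infra_cost


def break_even_hosts(ops_hours: float, hourly_rate: float,
                     infra_cost: float = 0.0, per_host_month: float = 15.0) -> int:
    """Smallest host count at which per-host licensing matches or exceeds self-hosting."""
    monthly = self_hosted_monthly_cost(ops_hours, hourly_rate, infra_cost)
    return math.ceil(monthly / per_host_month)


if __name__ == "__main__":
    # Hypothetical inputs: 20 hours/month of upkeep at a $100/hour loaded rate.
    print(break_even_hosts(ops_hours=20, hourly_rate=100))
```

Under those assumed inputs, per-host licensing stays cheaper until somewhere past a hundred hosts. Plug in your own hours and rates before drawing conclusions.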

76% -- Infrastructure practitioners citing Terraform as primary IaC tool (CNCF Survey 2024)
80% -- Enterprises running Kubernetes in production in 2024 (CNCF Survey 2024)
$201,062 -- Average US cloud architect total annual compensation (Glassdoor 2025)

Our 2025 Observability Survey confirms that organizations are embracing a diverse, open source-centric approach to observability.

Tom Wilkie, CTO at Grafana Labs (Grafana Labs 2025 Observability Survey)

Container management: Helm and k9s

Helm is the package manager for Kubernetes and has been the dominant application packaging standard since 2018. Where Terraform provisions cloud infrastructure, Helm manages the deployment of applications into that infrastructure. Architects typically do not write Helm charts themselves -- that belongs to platform engineering -- but they design the chart architecture, the values hierarchy across environments, and the release promotion strategy between staging and production. Helm's 75% adoption among Kubernetes users (CNCF Survey 2024) makes it non-optional knowledge at the senior cloud architect level. The combination of Terraform plus Helm covers the full infrastructure-to-application provisioning chain that most cloud architects own in mid-market environments. Helm is free under the Apache 2.0 license.
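The "values hierarchy across environments" idea is easiest to see as layered overrides: a base set of values, with each environment changing only what differs. The Python sketch below mimics that layering with a recursive dictionary merge. It is an illustration of the design concept, not Helm's implementation, and the keys and image name are hypothetical.

```python
from copy import deepcopy


def merge_values(base: dict, override: dict) -> dict:
    """Return base with override layered on top; nested maps merge key by key."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = deepcopy(value)  # scalars and lists are replaced outright
    return merged


# Hypothetical chart defaults shared by every environment.
base = {
    "replicaCount": 1,
    "image": {"repository": "example/app", "tag": "1.0.0"},
    "resources": {"limits": {"cpu": "250m"}},
}
# Production overrides only the settings that differ.
production = {"replicaCount": 3, "resources": {"limits": {"cpu": "1"}}}

if __name__ == "__main__":
    print(merge_values(base, production))
```

The design payoff is that the production layer stays short enough to review in a pull request, while everything it does not mention is inherited from the base.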

k9s is a free, open-source terminal-based UI for Kubernetes clusters that has become the standard tool for real-time cluster inspection. It replaces typing `kubectl get pods -n namespace --watch` repeatedly with a live, keyboard-navigable view of your cluster state across all namespaces. Cloud architects use k9s differently than developers do: it is the fastest way to inspect a cluster's actual resource allocation during an incident, spot unhealthy workloads before they become pages, and understand what a team's infrastructure looks like in production versus what the Terraform plan intended. k9s is MIT-licensed and completely free. Kubernetes knowledge at the cluster management level -- not just deploying to it -- is now requested in a majority of senior cloud architect roles. The <a href='/certifications/cka-kubernetes'>Certified Kubernetes Administrator exam</a> costs $395 through the Linux Foundation and validates the deeper cluster management skills that appear in principal architect job descriptions.

Architecture documentation: draw.io and Confluence

Architecture diagrams are a primary output of the cloud architect role -- arguably the most-shared artifact the role produces. The free and near-universal choice for creating them is draw.io (also known as diagrams.net). It is open source, desktop-installable, browser-accessible without an account, and ships with AWS, Azure, and GCP shape libraries built from the vendors' official icon sets. The output format is XML-based, version-controllable in git, and renders in Confluence and Notion with plugins. For the <a href='/careers/cloud-architect'>cloud architect career path</a>, producing clear, accurate architecture diagrams is a core job skill. draw.io is the tool senior architects consistently choose over Lucidchart and Miro for solo and small-team work: no subscription cost, no vendor lock-in, and cloud stencils that are kept current.

Confluence is where the architecture lives as structured documentation, specifically through Architecture Decision Records (ADRs). An ADR is a short, formal document recording why a system is designed the way it is: the context, the decision, the alternatives considered, and the consequences accepted. Cloud architects who maintain a current ADR library spend dramatically less time re-explaining settled decisions to new engineers, re-litigating design choices when team members rotate, and reconstructing the rationale for legacy infrastructure before a migration. Atlassian reports Confluence adoption across more than 312,000 companies globally (Atlassian 2025). The Cloud Standard plan at $5.75 per user per month is not a high cost relative to the value of a well-maintained ADR library; the free tier covers up to 10 users.
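The four parts of an ADR named above (context, decision, alternatives, consequences) map cleanly onto a small template. The sketch below renders one as plain text that can be pasted into a wiki page. The class, field names, numbering scheme, and sample content are our own illustration, not a Confluence feature or a formal ADR standard.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ADR:
    number: int
    title: str
    context: str
    decision: str
    alternatives: list[str] = field(default_factory=list)
    consequences: list[str] = field(default_factory=list)
    status: str = "Accepted"
    decided_on: date = field(default_factory=date.today)

    def render(self) -> str:
        """Return the record as plain text, one section per ADR component."""
        def bullets(items: list[str]) -> list[str]:
            return [f"- {item}" for item in items] or ["- None recorded"]

        lines = [
            f"ADR-{self.number:04d}: {self.title}",
            f"Status: {self.status}",
            f"Date: {self.decided_on.isoformat()}",
            "",
            "Context",
            self.context,
            "",
            "Decision",
            self.decision,
            "",
            "Alternatives considered",
            *bullets(self.alternatives),
            "",
            "Consequences",
            *bullets(self.consequences),
        ]
        return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical example record.
    adr = ADR(
        number=7,
        title="Run batch workers on Fargate Spot",
        context="Batch jobs are interruptible and dominate the compute bill.",
        decision="Move batch workers from EC2 On-Demand to Fargate Spot.",
        alternatives=["Stay on EC2 On-Demand", "EC2 Spot with a custom scheduler"],
        consequences=["Jobs must tolerate interruption and retry"],
    )
    print(adr.render())
```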

Cost and incident management: AWS Cost Explorer and PagerDuty

Cloud architects own cost in a specific sense: every architectural decision has a direct cost implication, and the architect is accountable for knowing what that implication is before the decision is made. AWS Cost Explorer is the free, native tool for this analysis. It provides cost breakdowns by AWS service, linked account, cost allocation tag, and resource type -- the data architects need to make comparisons like 'moving this workload from EC2 On-Demand to Fargate Spot saves $X per month at the cost of $Y in refactoring time.' Cost Explorer is effectively free at standard usage volumes (the console is free to use; programmatic API requests cost $0.01 each). AWS Trusted Advisor, included with Business Support plans, surfaces rightsizing recommendations and idle resource flags automatically. Cost optimization skills now appear in 45% of senior cloud architect job descriptions (LinkedIn Workforce Insights 2026) -- making Cost Explorer proficiency a direct hiring signal.
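The "$X per month versus $Y in refactoring time" comparison reduces to a payback period. In this sketch every input is hypothetical: the monthly saving, the hours of refactoring work, and the loaded hourly rate are placeholders you would replace with figures from Cost Explorer and your own estimates.

```python
def payback_months(monthly_saving: float, refactor_hours: float,
                   hourly_rate: float) -> float:
    """Months of savings needed to repay a one-off refactoring effort."""
    if monthly_saving <= 0:
        raise ValueError("monthly_saving must be positive for a payback to exist")
    return (refactor_hours * hourly_rate) / monthly_saving


if __name__ == "__main__":
    # Hypothetical: $1,200/month saved, 80 hours of refactoring at $100/hour.
    months = payback_months(monthly_saving=1_200, refactor_hours=80, hourly_rate=100)
    print(f"Payback in about {months:.1f} months")
```

Writing the decision down as a payback period, rather than as a raw monthly saving, makes it easier to compare against other work competing for the same engineering time.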

PagerDuty is the on-call and incident management platform that has been the professional standard since 2014. At approximately $20 per user per month for the Professional plan, it handles on-call rotation scheduling, escalation policy configuration, alert routing from monitoring tools like Datadog, and post-incident timeline reconstruction. Cloud architects care about PagerDuty for two reasons. First, they are typically in the on-call rotation for their infrastructure, usually as a level-two escalation above the platform engineering team. Second, the architect is responsible for designing the alerting thresholds and escalation paths that PagerDuty enforces at 3am. A badly configured escalation policy generates noise on every transient metric spike. A well-configured one routes only actionable signals and auto-resolves the rest. The difference between those two configurations is felt immediately by every engineer on the rotation.
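The gap between a noisy and a quiet policy often comes down to whether a single bad sample can page someone. The sketch below shows the idea with a "sustained breach" check in plain Python. It is not PagerDuty's API, and in practice this logic usually lives in the monitoring tool's alert rule; the threshold and sample values are hypothetical.

```python
def should_page(samples: list[float], threshold: float, sustained: int) -> bool:
    """Page only if the most recent `sustained` samples all exceed the threshold."""
    if sustained <= 0:
        raise ValueError("sustained must be at least 1")
    if len(samples) < sustained:
        return False  # not enough history yet to call it sustained
    return all(value > threshold for value in samples[-sustained:])


if __name__ == "__main__":
    cpu_spike = [41, 43, 44, 96]    # one transient spike at the latest sample
    cpu_outage = [41, 92, 95, 97]   # three consecutive breaches

    # sustained=1 pages on the transient spike; sustained=3 ignores it.
    print(should_page(cpu_spike, threshold=90, sustained=1))
    print(should_page(cpu_spike, threshold=90, sustained=3))
    print(should_page(cpu_outage, threshold=90, sustained=3))
```

The trade-off is detection delay: requiring three consecutive breaches means a real outage pages a few evaluation intervals later, which is usually an acceptable price for not waking someone over a blip.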

Pros
  • Datadog: single interface for logs, metrics, APM traces, and security signals -- cuts the cognitive overhead of switching between systems during an active incident
  • Datadog: 650+ pre-built integrations covering AWS services, Kubernetes, and most common SaaS tools save weeks of custom dashboard and alert configuration work
  • Grafana plus Prometheus: zero licensing cost and all observability data stays in your own infrastructure with no vendor lock-in or per-host pricing surprises
  • Grafana plus Prometheus: scales horizontally to any data volume without incremental licensing cost -- the economics improve as infrastructure grows beyond 100 hosts
  • PagerDuty: battle-tested incident workflow tooling with a large practitioner base; most DevOps and SRE professionals already know it, which cuts onboarding time to near zero
Cons
  • Datadog: per-host pricing scales linearly with infrastructure -- a 500-host environment at list price runs $90,000 per year, which requires architecture-level budget justification
  • Datadog: log ingestion and retention pricing is a separate charge from infrastructure monitoring; large log volumes add $0.10 per GB ingested and many teams cap retention to control cost
  • Grafana plus Prometheus: requires dedicated operational ownership -- storage sizing, retention policies, HA configuration, and alertmanager routing rules all need active maintenance
  • Grafana plus Prometheus: correlating signals across logs (Loki), metrics (Prometheus), and traces (Tempo) requires integrating three separately managed components, adding operational surface area
  • PagerDuty: the AIOps features that meaningfully reduce alert noise are locked behind the $35/user/month Business plan -- teams often discover this gap only after a high-alert-volume incident
Verdict: Start with the six free tools; add the paid ones as team and infrastructure scale demands it

Every tool on this list earns its spot. The six free tools -- Terraform, AWS CLI, draw.io, k9s, Helm, and Grafana plus Prometheus -- are non-negotiable regardless of company size or budget and should be in every cloud architect's workflow from day one. The paid tools follow a clear trigger: add Datadog when Grafana plus Prometheus requires more SRE time to operate than a Datadog subscription costs, which typically happens around 10 to 15 production services on a team of 8 to 12 engineers. Add PagerDuty when your on-call rotation has more than three people and you need formal escalation policies rather than ad-hoc group messages. Add Confluence when your ADR count exceeds 20 documents and team wiki searches start failing. The tools in this list are not aspirational -- they show up in senior cloud architects' daily workflows. For the certification path that maps to this tooling: the <a href='/certifications/aws-solutions-architect'>AWS Solutions Architect Associate</a> ($150) is the standard first credential; Terraform Associate ($70.50) is the right second if your target roles require IaC depth. See how these credentials map to specific comp tiers in our <a href='/learn/l5-cloud-architect-faang-salary-2026'>FAANG cloud architect salary breakdown</a>.

What most tool guides miss: the real cost is not the licensing

Every tool guide, including this one, understates the actual cost of a toolchain by listing only licensing fees. What most guides skip is the operational overhead calculation. The real cost of a monitoring stack is not Datadog's $15 per host per month -- it is $15 per host per month plus the two engineer-weeks it takes to properly instrument a new service, plus the quarterly work of tuning alert thresholds as the product changes, plus the on-call engineer time saved or spent when an incident fires at 2am. The tools above score well on that broader calculation. Terraform's learning curve is steep -- a junior engineer writing Terraform for the first time will make state file mistakes that take a senior architect two hours to untangle -- but the operational return once the team knows the tool well is very high. Datadog's licensing cost is real, but the alternative of building equivalent observability on Grafana plus Prometheus plus Loki often consumes engineering time that most teams would rather spend on product work.

The tool most conspicuously absent from vendor presentations but present in every real-world architect's workflow is a structured documentation system for decisions. draw.io handles the diagrams; Confluence handles the ADRs. But the discipline of writing the ADR at all -- before the infrastructure is built, not after -- is the practice that separates architects whose codebases are maintainable from those whose codebases require archaeological excavation before any change. The tool itself matters less than the habit. The teams that maintain strong ADR practices are the ones where infrastructure changes take a day, not a sprint. That is a compounding return on 20 minutes of documentation per decision, and no licensing fee can substitute for it.

Frequently asked questions

Is Terraform still worth learning in 2026 given AWS CDK and Pulumi exist?

Yes. Terraform commands 76% of the IaC market and appears in the majority of cloud architect and infrastructure engineering job postings (CNCF Survey 2024). AWS CDK is compelling for AWS-only shops where the team prefers Python or TypeScript over HCL. Pulumi is growing fast (45% year-over-year user growth) but holds a fraction of the job-posting volume. Learn Terraform first, add CDK if your target roles require AWS-native tooling. The Terraform Associate exam ($70.50 via mindhub.com) is increasingly co-listed with AWS certifications in senior architect job descriptions.

How much does a cloud architect's full toolchain cost per user per month?

The six free tools cost nothing. The paid line items -- Datadog (priced per host, not per user), PagerDuty (~$20/user/month), Confluence (~$5.75/user/month), GitHub Copilot ($10 to $19/user/month), and GitHub Actions above the free tier -- run roughly $35 to $50 per user per month with Copilot at $10 and up to one Datadog host per user. At the $19 Copilot tier with some GitHub Actions overage, the total lands near the $60 to $65 figure quoted at the top of this guide. Enterprise deals reduce list prices by 20 to 40 percent. The most common budget shock is Datadog at scale: a 200-host environment at list price runs $36,000 per year.
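Because Datadog is priced per host while the other tools are priced per user, the per-user figure depends on how many hosts you attribute to each person. This sketch sums the list prices quoted in this guide. The hosts-per-user ratio is an assumption you supply, and GitHub Actions overage and Cost Explorer API charges are left out.

```python
def per_user_monthly(datadog_hosts_per_user: float, copilot: float = 10.0,
                     pagerduty: float = 20.0, confluence: float = 5.75,
                     datadog_per_host: float = 15.0) -> float:
    """Sum per-user list prices plus a per-user share of Datadog host cost."""
    datadog_share = datadog_hosts_per_user * datadog_per_host
    return round(pagerduty + confluence + copilot + datadog_share, 2)


if __name__ == "__main__":
    print(per_user_monthly(0))                # no Datadog hosts attributed
    print(per_user_monthly(1))                # one Datadog host per user
    print(per_user_monthly(1, copilot=19.0))  # higher Copilot tier
```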

Do I need all eleven tools to work as a cloud architect?

No. The six free tools are the floor. You can do serious cloud architecture work with Terraform, AWS CLI, draw.io, k9s, Helm, and Grafana plus Prometheus at no licensing cost. The paid tools become necessary at team and infrastructure scale. Many architects at companies with fewer than 20 engineers and fewer than 30 production services never touch Datadog or PagerDuty professionally -- they use CloudWatch alarms and shared Slack channels for on-call. The value of this list is understanding what senior-level market expectations look like, not installing all eleven tools on day one.

Is Datadog worth the cost versus self-hosting Grafana plus Prometheus?

At under 15 production services with a team that has no dedicated SRE resources, Datadog usually wins the total-cost analysis because the engineering time to operate Grafana plus Prometheus plus Loki correctly costs more than the Datadog subscription. At 50 or more services with a dedicated platform team, Grafana plus Prometheus often becomes the better economic choice. Start with Datadog's 14-day free trial, instrument your services properly, and make the build-versus-buy decision once you know your actual observability requirements and alert volume.

Should I get the Terraform Associate, the CKA (Certified Kubernetes Administrator), or the AWS Solutions Architect Associate first?

AWS Solutions Architect Associate first. It adds an average $26,000 to annual salary (ThinkCloudly 2026), costs $150, and appears in more cloud architect job descriptions than any other single credential. Terraform Associate ($70.50 via mindhub.com) is a strong second if your target roles list Terraform as required. CKA ($395) is the right third if you are in a Kubernetes-heavy environment or targeting roles with platform engineering overlap. The sequence matches the market signal from job-posting frequency.

What tool is most often missing from a junior cloud architect's workflow?

Architecture Decision Records in a structured tool like Confluence. Junior architects draft designs, build infrastructure, and move on without formalizing the decision context. Senior architects who have lived through the consequences -- re-explaining the same choices to new engineers, re-litigating settled debates, losing institutional knowledge when key people leave -- build ADR habits early and keep them. A good ADR takes 20 minutes to write and saves five hours of future explanation per decision per year. Start writing them now, even for personal and side-project infrastructure.