4.4 Creating Your First Cluster

This walkthrough takes you from a configuration file to a running HPC cluster; provisioning typically completes in 10-15 minutes.

Cluster Architecture Overview

A typical ParallelCluster setup consists of:

| Component | Description |
| --- | --- |
| Head Node | Controls compute nodes, hosts the Slurm scheduler, and provides your login environment |
| Compute Nodes | Dynamically provisioned when jobs are submitted; scale to zero when idle |
| Shared Storage | FSx for Lustre or EBS volumes shared across all nodes (mounted at /shared) |
| Placement Groups | Keep instances physically close for maximum network performance |
| Scheduler | Slurm manages job queues and resource allocation |

Step 1: Prepare Your Configuration

ParallelCluster uses a YAML config file to describe your cluster. You can either:

  • Use the PCUI wizard to build one interactively
  • Write one manually (or use a template)

A config file defines your head node size, compute queues, storage, networking, and more. Because it is a plain YAML file, you can share it with colleagues so they can launch identical clusters.

Example Configuration

HeadNode:
  InstanceType: c6a.xlarge
  Networking:
    SubnetId: subnet-xxxxx
  Dcv:
    Enabled: true

Scheduling:
  Scheduler: slurm
  SlurmQueues:
    - Name: compute
      ComputeResources:
        - Name: hpc-nodes
          Instances:
            - InstanceType: hpc6a.48xlarge
          MinCount: 0
          MaxCount: 4
          Efa:
            Enabled: true

SharedStorage:
  - Name: FsxLustre
    StorageType: FsxLustre
    MountDir: /shared
    FsxLustreSettings:
      StorageCapacity: 1200
      DeploymentType: SCRATCH_2

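The example above does not enable a placement group, even though the architecture overview lists one. If you want compute nodes packed physically close together for the lowest inter-node latency, a placement group can be enabled per queue in its Networking section. A sketch of that addition (the SubnetIds value is a placeholder, as in the example above):

```yaml
# Hypothetical fragment showing the compute queue from the example
# with a cluster placement group enabled. Placement groups pack
# instances onto nearby hardware for the lowest network latency.
Scheduling:
  Scheduler: slurm
  SlurmQueues:
    - Name: compute
      Networking:
        SubnetIds:
          - subnet-xxxxx
        PlacementGroup:
          Enabled: true
```

With Enabled: true, ParallelCluster creates and manages the placement group for you; keep in mind that very large MaxCount values can make it harder for EC2 to find capacity inside a single placement group.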
See the Reference section for full config file documentation and templates.

Step 2: Create the Cluster

  1. Open the ParallelCluster UI
  2. Click Create Cluster
  3. Follow the step-by-step wizard to configure your cluster
  4. Use Dry Run to validate your configuration before deploying
  5. Click Create to launch

Via CLI

pcluster create-cluster \
  --cluster-name my-hpc-cluster \
  --cluster-configuration cluster-config.yaml
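The Dry Run validation mentioned in the PCUI steps is also available from the CLI: the same create-cluster command accepts a dry-run flag that checks the configuration without provisioning anything. A sketch, reusing the cluster name and config file from above:

```shell
# Validate the configuration only; no resources are created.
# The command returns validation messages (errors and warnings).
pcluster create-cluster \
  --cluster-name my-hpc-cluster \
  --cluster-configuration cluster-config.yaml \
  --dryrun true
```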

Step 3: Wait for Provisioning

The cluster will take 10-15 minutes to provision. You can monitor progress:

  • PCUI: Watch the cluster status change from “Creating” to “Running”
  • CLI: Run pcluster list-clusters to check status
Tip

Only one cluster of a given name can exist at any time per AWS Region per account.
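If you prefer to script the wait rather than watch PCUI, you can poll describe-cluster until the status leaves CREATE_IN_PROGRESS. A sketch, assuming the cluster name from the CLI example above:

```shell
# Poll until provisioning finishes (CREATE_COMPLETE) or fails
# (CREATE_FAILED). describe-cluster returns JSON; --query extracts
# just the clusterStatus field.
while true; do
  status=$(pcluster describe-cluster \
    --cluster-name my-hpc-cluster \
    --query clusterStatus)
  echo "Cluster status: ${status}"
  case "${status}" in
    *CREATE_IN_PROGRESS*) sleep 60 ;;
    *) break ;;
  esac
done
```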

Key Configuration Concepts

| Concept | Description |
| --- | --- |
| Queues | Define groups of compute nodes with specific instance types |
| MinCount / MaxCount | Control auto-scaling bounds (set MinCount to 0 for cost savings) |
| EFA | Elastic Fabric Adapter for high-speed inter-node networking (100 Gbps) |
| Placement Groups | Keep nodes physically close for lowest latency |
| Shared Storage | FSx for Lustre provides high-performance shared filesystem |
Warning

Cost Reminder: Compute nodes scale to zero when idle, but the head node and storage run continuously. Delete your cluster when you’re done to stop all charges.
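Tearing down is a single command (shown here with the example cluster name from above); it removes the head node, compute fleet, and the cluster's managed resources:

```shell
# Delete the cluster and stop its ongoing charges.
# Note: by default, storage the cluster created for you (such as the
# FSx for Lustre filesystem in the example config) is deleted too,
# so copy anything you need off /shared first.
pcluster delete-cluster --cluster-name my-hpc-cluster
```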

Your cluster is running — time to connect: Connecting to Your Cluster