Cruz Compute Controller

Metal to Workload.
One Platform.

Cruz Compute Controller, C³, manages the complete infrastructure lifecycle — from bare metal rack discovery to running AI workloads — across every GPU vendor and every scheduler, from a single control plane.

Metal-to-Workload Pipeline

Six Stages. Fully Automated.

Every step from powering on a rack to monitoring a running AI training job, orchestrated through a single pipeline with autonomous agents at each stage.

Stage 1

🔍

Discover

Redfish / BMC auto-discovery of GPU racks and servers

Stage 2

📦

Cluster

Form bare metal or Kubernetes clusters from discovered nodes

Stage 3

⚙️

Platform

Compose scalable units with networking, storage & GPU topology

Stage 4

📋

Schedule

Multi-scheduler orchestration: SLURM, Run:ai, and KAI unified

Stage 5

🚀

Workload

Submit training jobs, inference endpoints, and MIG-partitioned workloads

Stage 6

📊

Observe

GPU telemetry, FinOps metering, tenant cost allocation

Two Products, One Platform

Infrastructure Meets AI Operations

C3 is architecturally split into two complementary products that cover the full Metal-to-Workload lifecycle.

🏗️

Open Rack Controller

Rack-scale infrastructure management. Discover hardware, form clusters, provision platforms — from a single rack BMC to a fleet of GPU servers.

Bare metal discovery via Redfish, iDRAC, and BMC protocols
Rack management with full power, thermal, and topology visibility
Kubernetes cluster lifecycle — create, discover, health-check
CNI detection, driver registry, and network validation
Composable platform provisioning across scalable units
GPU driver registry and automated installation

Supported Rack Platforms

Aivres NVL72, Aivres HGX B300, Celestica Helios, NVIDIA MGX, Supermicro
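
To picture what Redfish-based bare metal discovery involves, here is a minimal sketch that walks a Redfish Systems collection and builds a flat inventory. It is an illustration under stated assumptions, not C3's implementation: the `fetch` callable, the `discover_systems` function, and the canned BMC responses are all hypothetical. Only the `/redfish/v1/Systems` path and the `Members`, `@odata.id`, `Id`, `Model`, and `PowerState` fields come from the DMTF Redfish standard.

```python
from typing import Callable, Dict, List

# Hypothetical transport: maps a Redfish path to its decoded JSON body.
# A real client would issue authenticated HTTPS GETs against the BMC.
Fetch = Callable[[str], dict]


def discover_systems(fetch: Fetch) -> List[Dict[str, str]]:
    """Walk the Redfish Systems collection and return a flat inventory."""
    collection = fetch("/redfish/v1/Systems")
    inventory = []
    for member in collection.get("Members", []):
        # Each member is a link to a ComputerSystem resource.
        system = fetch(member["@odata.id"])
        inventory.append({
            "id": system.get("Id", ""),
            "model": system.get("Model", ""),
            "power_state": system.get("PowerState", ""),
        })
    return inventory


# Canned responses standing in for a BMC (illustrative values only).
FAKE_BMC = {
    "/redfish/v1/Systems": {
        "Members": [{"@odata.id": "/redfish/v1/Systems/node-1"}]
    },
    "/redfish/v1/Systems/node-1": {
        "Id": "node-1", "Model": "ExampleGPUServer", "PowerState": "On"
    },
}

inventory = discover_systems(FAKE_BMC.__getitem__)
print(inventory)
```

Passing the transport in as a callable keeps the walk logic testable without a live BMC.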

🧠

AI Cluster Manager

GPU-aware workload orchestration. Allocate GPUs, manage schedulers, run training jobs, and meter consumption — across every vendor.

Multi-scheduler orchestration — SLURM, Run:ai, and KAI from one API
GPU allocation with full-device and MIG slice granularity
NVLink and UALoE topology-aware scheduling
Real-time GPU telemetry — utilization, temperature, power, throttling
Multi-tenant GPU quotas with N:N scheduler-tenant mapping
FinOps cost metering per tenant, per workload, per GPU-hour
Inference endpoint management — vLLM, Triton, KServe

Supported GPU Vendors

NVIDIA Blackwell, NVIDIA Hopper, AMD MI450, AMD MI300X, Intel Gaudi

Deployment Architecture

From Rack-Scale to Fleet-Scale

C3 manages GPU infrastructure in two complementary modes for maximum flexibility and density.

Rack Management

The rack is the compute unit

A single rack BMC is the entry point. All GPUs are interconnected within the rack via NVLink Switch Trays or UALoE fabric — no inter-rack GPU networking required. Ideal for maximum-density AI training with tightly coupled GPU interconnects.

GPUs per Rack

130 TB/s NVLink

BMC Entry Point

Liquid Cooling Ready

Scalable Units

Compose clusters from distributed GPU servers

Multiple discrete GPU servers are networked together into composable platforms. Each server's BMC is discovered individually or through a K8s control plane. Inter-node fabric (InfiniBand, RoCEv2, Ethernet) is provisioned as part of the platform.

Servers per Unit

800G Inter-Node Fabric

K8s Native Orchestration

Multi-Scheduler Support

Why C3

Built Different

Purpose-built for the GPU datacenter era — a ground-up platform for multi-vendor, multi-scheduler GPU infrastructure.

🌐

Multi-Vendor GPU

Manage NVIDIA Blackwell, AMD MI450, and Intel accelerators from the same control plane. Your GPU vendor wins on silicon merit, not management lock-in.

🔄

Multi-Scheduler

SLURM, Run:ai, and KAI orchestrated through a unified abstraction layer. Mix schedulers across partitions within the same platform.
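
One way to picture a unified abstraction layer over several schedulers is an adapter interface plus a partition-to-scheduler map. The sketch below is hypothetical throughout: `JobSpec`, `SchedulerAdapter`, `UnifiedScheduler`, and the fake adapters are illustrative names, not C3's API, and the adapters only return strings rather than calling a real scheduler.

```python
from dataclasses import dataclass
from typing import Dict, Protocol


@dataclass(frozen=True)
class JobSpec:
    name: str
    gpus: int
    partition: str


class SchedulerAdapter(Protocol):
    """Common interface every scheduler backend is assumed to implement."""
    def submit(self, job: JobSpec) -> str: ...


class FakeSlurmAdapter:
    """Stand-in backend; a real adapter would talk to SLURM."""
    def submit(self, job: JobSpec) -> str:
        return f"slurm:{job.name}"


class FakeKaiAdapter:
    """Stand-in backend; a real adapter would talk to KAI."""
    def submit(self, job: JobSpec) -> str:
        return f"kai:{job.name}"


class UnifiedScheduler:
    """Routes each job to the scheduler bound to its partition."""

    def __init__(self, partition_map: Dict[str, SchedulerAdapter]) -> None:
        self._partition_map = partition_map

    def submit(self, job: JobSpec) -> str:
        adapter = self._partition_map.get(job.partition)
        if adapter is None:
            raise ValueError(f"no scheduler bound to partition {job.partition!r}")
        return adapter.submit(job)


router = UnifiedScheduler({"train": FakeSlurmAdapter(), "infer": FakeKaiAdapter()})
train_id = router.submit(JobSpec("llm-pretrain", gpus=8, partition="train"))
infer_id = router.submit(JobSpec("chat-endpoint", gpus=1, partition="infer"))
print(train_id, infer_id)
```

Binding schedulers per partition is what lets different schedulers coexist on the same platform in this sketch.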

💰

55–72% Cost Savings

Break free from proprietary networking lock-in. Open Ethernet with C3 delivers equivalent performance at a fraction of the BOM cost of an InfiniBand-only build.

🧬

Topology Aware

NVLink domain and UALoE fabric topology drives GPU-aware scheduling. Jobs land on physically optimal GPU groups, not random allocations.
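
As a simplified illustration of topology-aware placement, the sketch below keeps a job inside a single interconnect domain and prefers the smallest domain that still fits (best fit), so larger domains stay free for larger jobs. The `place_job` function, the domain names, and the best-fit policy are assumptions for illustration, not a description of C3's scheduler.

```python
from typing import Dict, List, Optional


def place_job(free_gpus: Dict[str, List[str]], gpus_needed: int) -> Optional[List[str]]:
    """Pick GPUs from the smallest interconnect domain that fits the job."""
    candidates = [
        (len(gpus), domain)
        for domain, gpus in free_gpus.items()
        if len(gpus) >= gpus_needed
    ]
    if not candidates:
        # No single domain fits; a real scheduler might queue or span domains.
        return None
    _, best_domain = min(candidates)
    return sorted(free_gpus[best_domain])[:gpus_needed]


# Hypothetical free-GPU map keyed by interconnect domain.
free = {
    "nvlink-domain-a": ["a0", "a1", "a2", "a3", "a4", "a5", "a6", "a7"],
    "nvlink-domain-b": ["b0", "b1", "b2", "b3"],
}
placement = place_job(free, 4)
print(placement)
```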

📐

MIG Slicing

Full MIG partition lifecycle — create, delete, query profiles. Run inference on GPU slices while training owns the rest.

📊

FinOps Built In

Per-tenant, per-workload, per-GPU-hour cost metering. Know exactly what each team consumes and allocate GPU spend with precision.
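
The arithmetic behind per-tenant, per-workload, per-GPU-hour metering can be sketched as follows. The `UsageRecord` type, the `meter` function, and the flat rate of 2.0 per GPU-hour are hypothetical; real metering would likely vary rates by GPU model and draw usage from telemetry.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class UsageRecord:
    tenant: str
    workload: str
    gpus: int
    hours: float


def meter(records: Iterable[UsageRecord],
          rate_per_gpu_hour: float) -> Dict[Tuple[str, str], float]:
    """Sum cost per (tenant, workload) as gpus * hours * rate."""
    totals: Dict[Tuple[str, str], float] = defaultdict(float)
    for record in records:
        gpu_hours = record.gpus * record.hours
        totals[(record.tenant, record.workload)] += gpu_hours * rate_per_gpu_hour
    return dict(totals)


# Illustrative usage data, not real measurements.
records = [
    UsageRecord("team-a", "pretrain", gpus=8, hours=10.0),
    UsageRecord("team-a", "pretrain", gpus=8, hours=5.0),
    UsageRecord("team-b", "inference", gpus=1, hours=24.0),
]
costs = meter(records, rate_per_gpu_hour=2.0)
print(costs)
```

Here team-a's pretraining workload accrues 120 GPU-hours and team-b's inference workload 24 GPU-hours, which the flat rate turns into 240.0 and 48.0 respectively.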

Agentic AI Infrastructure

Human in the Middle.
Not in the Weeds.

Meet Ask Ada — your GPU infrastructure co-pilot.

Nine autonomous agents manage every stage of the Metal-to-Workload pipeline. They discover racks, form clusters, provision platforms, allocate GPUs, and submit workloads — continuously and without scripts.

Humans don't push buttons. They supervise. Every critical decision surfaces for approval. Agents learn from supervisory feedback through adaptive memory, getting smarter with every deployment cycle.
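
The supervision model described above, where critical decisions surface for approval and routine ones proceed on their own, can be sketched as a simple gate. Everything here is hypothetical: the `Action` type, the `run_with_supervision` function, and the `approve` callback are illustrative and do not reflect how Ask Ada is actually built.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Action:
    description: str
    critical: bool


def run_with_supervision(actions: Iterable[Action],
                         approve: Callable[[Action], bool]) -> List[str]:
    """Run routine actions directly; ask a human before critical ones."""
    log = []
    for action in actions:
        if action.critical and not approve(action):
            log.append(f"skipped: {action.description}")
            continue
        log.append(f"executed: {action.description}")
    return log


# Illustrative plan; the approver below rejects every critical action.
plan = [
    Action("refresh rack inventory", critical=False),
    Action("power-cycle rack 7", critical=True),
]
log = run_with_supervision(plan, approve=lambda action: False)
print(log)
```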

🔍

Discovery

📦

Inventory

🔗

Cluster

⚙️

Platform

🧠

Orchestrator

📋

Scheduler

🚀