Cruz Compute Controller - C3 | Dorado Software
Cruz Compute Controller

Metal to Workload.
One Platform.

Cruz Compute Controller, C3, manages the complete infrastructure lifecycle — from bare metal rack discovery to running AI workloads — across every GPU vendor and every scheduler, from a single control plane.

Six Stages. Fully Automated.

Every step, from powering on a rack to monitoring a running AI training job, is orchestrated through a single pipeline with autonomous agents at each stage.

  • 🔍 Stage 1, Discover: Redfish / BMC auto-discovery of GPU racks and servers
  • 📦 Stage 2, Cluster: Form bare metal or Kubernetes clusters from discovered nodes
  • ⚙️ Stage 3, Platform: Compose scalable units with networking, storage & GPU topology
  • 📋 Stage 4, Schedule: Multi-scheduler orchestration with SLURM, Run:ai, and KAI unified
  • 🚀 Stage 5, Workload: Submit training jobs, inference endpoints, MIG-partitioned workloads
  • 📊 Stage 6, Observe: GPU telemetry, FinOps metering, tenant cost allocation

Infrastructure Meets AI Operations

C3 is architecturally split into two complementary products that cover the full Metal-to-Workload lifecycle.

🏗️

Open Rack Controller

Rack-scale infrastructure management. Discover hardware, form clusters, provision platforms — from a single rack BMC to a fleet of GPU servers.
  • Bare metal discovery via Redfish, iDRAC, and BMC protocols
  • Rack management with full power, thermal, and topology visibility
  • Kubernetes cluster lifecycle — create, discover, health-check
  • CNI detection, driver registry, and network validation
  • Composable platform provisioning across scalable units
  • GPU driver registry and automated installation
Supported Rack Platforms
Aivres NVL72, Aivres HGX B300, Celestica Helios, NVIDIA MGX, Supermicro
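
To make the Redfish discovery step concrete, here is a minimal sketch of walking a BMC's Systems collection. It is an illustration, not C3's implementation: the `fetch` callable, the `discover_systems` name, and the canned payloads are hypothetical. The `/redfish/v1/Systems` path and the `Members`, `@odata.id`, `Manufacturer`, `Model`, and `PowerState` fields follow the DMTF Redfish schema.

```python
from typing import Callable, Dict, List

# Hypothetical fetch signature: takes a Redfish path, returns parsed JSON.
Fetch = Callable[[str], dict]


def discover_systems(fetch: Fetch) -> List[Dict[str, str]]:
    """Walk the standard Redfish Systems collection and summarize each member."""
    collection = fetch("/redfish/v1/Systems")
    systems = []
    for member in collection.get("Members", []):
        path = member["@odata.id"]
        system = fetch(path)
        systems.append({
            "path": path,
            "manufacturer": system.get("Manufacturer", "unknown"),
            "model": system.get("Model", "unknown"),
            "power_state": system.get("PowerState", "unknown"),
        })
    return systems


# Canned responses standing in for a real BMC (illustrative values only).
FAKE_BMC = {
    "/redfish/v1/Systems": {
        "Members": [{"@odata.id": "/redfish/v1/Systems/1"}],
    },
    "/redfish/v1/Systems/1": {
        "Manufacturer": "ExampleCo",
        "Model": "GPU-Node",
        "PowerState": "On",
    },
}


def fake_fetch(path: str) -> dict:
    """Look up a canned payload instead of calling a live BMC."""
    return FAKE_BMC[path]


if __name__ == "__main__":
    print(discover_systems(fake_fetch))
```

In practice `fetch` would wrap an authenticated HTTPS call to the BMC; keeping it injectable makes the traversal logic testable without hardware.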
🧠

AI Cluster Manager

GPU-aware workload orchestration. Allocate GPUs, manage schedulers, run training jobs, and meter consumption — across every vendor.
  • Multi-scheduler orchestration — SLURM, Run:ai, and KAI from one API
  • GPU allocation with full-device and MIG slice granularity
  • NVLink and UALoE topology-aware scheduling
  • Real-time GPU telemetry — utilization, temperature, power, throttling
  • Multi-tenant GPU quotas with N:N scheduler-tenant mapping
  • FinOps cost metering per tenant, per workload, per GPU-hour
  • Inference endpoint management — vLLM, Triton, KServe
Supported GPU Vendors
NVIDIA Blackwell, NVIDIA Hopper, AMD MI450, AMD MI300X, Intel Gaudi

From Rack-Scale to Fleet-Scale

C3 manages GPU infrastructure in two complementary modes for maximum flexibility and density.

Rack Management

The rack is the compute unit

A single rack BMC is the entry point. All GPUs are interconnected within the rack via NVLink Switch Trays or UALoE fabric — no inter-rack GPU networking required. Ideal for maximum-density AI training with tightly coupled GPU interconnects.

  • 72 GPUs per Rack
  • 130 TB/s NVLink
  • 1 BMC Entry Point
  • Liquid Cooling Ready

Scalable Units

Compose clusters from distributed GPU servers

Multiple discrete GPU servers are networked together into composable platforms. Each server's BMC is discovered individually or through a K8s control plane. Inter-node fabric (InfiniBand, RoCEv2, Ethernet) is provisioned as part of the platform.

  • N Servers per Unit
  • 800G Inter-Node Fabric
  • K8s Native Orchestration
  • Multi-Scheduler Support

Built Different

Purpose-built for the GPU datacenter era — a ground-up platform for multi-vendor, multi-scheduler GPU infrastructure.

🌐

Multi-Vendor GPU

Manage NVIDIA Blackwell, AMD MI450, and Intel accelerators from the same control plane. Your GPU vendor wins on silicon merit, not management lock-in.

🔄

Multi-Scheduler

SLURM, Run:ai, and KAI orchestrated through a unified abstraction layer. Mix schedulers across partitions within the same platform.
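
One way to picture a unified abstraction layer is an adapter per scheduler, selected by partition. The sketch below is an assumption about the general pattern, not C3's API: `JobSpec`, `SlurmAdapter`, `PlaceholderAdapter`, `submit`, and the partition names are all hypothetical. Only the `#SBATCH --job-name`, `--partition`, and `--gres` directives are standard SLURM; the Run:ai and KAI request formats are deliberately left as placeholders.

```python
from dataclasses import dataclass
from typing import Dict, Protocol


@dataclass(frozen=True)
class JobSpec:
    name: str
    partition: str
    gpus: int
    command: str


class SchedulerAdapter(Protocol):
    def render(self, job: JobSpec) -> str: ...


class SlurmAdapter:
    """Renders a batch script using standard sbatch directives."""

    def render(self, job: JobSpec) -> str:
        return "\n".join([
            "#!/bin/bash",
            f"#SBATCH --job-name={job.name}",
            f"#SBATCH --partition={job.partition}",
            f"#SBATCH --gres=gpu:{job.gpus}",
            job.command,
        ])


class PlaceholderAdapter:
    """Stand-in for a Run:ai or KAI backend; the real request format is not modeled."""

    def __init__(self, scheduler: str) -> None:
        self.scheduler = scheduler

    def render(self, job: JobSpec) -> str:
        return f"[{self.scheduler}] {job.name}: {job.gpus} GPU(s) -> {job.command}"


def submit(job: JobSpec, routes: Dict[str, SchedulerAdapter]) -> str:
    """Pick the adapter that owns the job's partition and render the request."""
    if job.partition not in routes:
        raise KeyError(f"no scheduler mapped to partition {job.partition!r}")
    return routes[job.partition].render(job)


# Hypothetical partition-to-scheduler mapping within one platform.
ROUTES: Dict[str, SchedulerAdapter] = {
    "train": SlurmAdapter(),
    "inference": PlaceholderAdapter("kai"),
}

if __name__ == "__main__":
    print(submit(JobSpec("llm", "train", 8, "srun python train.py"), ROUTES))
```

Callers describe the job once; the partition decides which scheduler receives it.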

💰

55–72% Cost Savings

Break free from proprietary networking lock-in. Open Ethernet + C3 delivers equivalent performance at a fraction of InfiniBand-only BOM cost.

🧬

Topology Aware

NVLink domain and UALoE fabric topology drives GPU-aware scheduling. Jobs land on physically optimal GPU groups, not random allocations.
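
As a rough illustration of topology-aware placement, the sketch below keeps a job inside a single interconnect domain and prefers the tightest fit. It is a simplified assumption, not C3's scheduler: `place_job`, the domain names, and the best-fit policy are hypothetical, and jobs that would have to span domains are simply rejected.

```python
from typing import Dict, List, Optional


def place_job(
    free_by_domain: Dict[str, List[int]], gpus_needed: int
) -> Optional[Dict[str, List[int]]]:
    """Best fit: pick the domain with the fewest free GPUs that still fits the job."""
    candidates = [
        (len(gpus), domain)
        for domain, gpus in free_by_domain.items()
        if len(gpus) >= gpus_needed
    ]
    if not candidates:
        return None  # would need to span domains; out of scope for this sketch
    _, domain = min(candidates)
    return {domain: sorted(free_by_domain[domain])[:gpus_needed]}


if __name__ == "__main__":
    # Free GPU indices per NVLink (or UALoE) domain; illustrative values only.
    free = {"rack-a": [0, 1, 2, 3, 4, 5], "rack-b": [0, 1, 2, 3]}
    print(place_job(free, 4))
```

Choosing the smallest domain that fits leaves larger contiguous groups available for bigger jobs.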

📐

MIG Slicing

Full MIG partition lifecycle — create, delete, query profiles. Run inference on GPU slices while training owns the rest.
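
To show what sharing one GPU between inference slices and training can involve, here is a simplified capacity check over MIG profile names. It assumes NVIDIA's `<N>g.<M>gb` naming (for example `1g.10gb`) and a device with 7 compute slices and 80 GB of memory; both defaults are assumptions for illustration. Real MIG placement has additional rules about valid slice positions that this sketch ignores, and `parse_profile` and `fits` are hypothetical helpers, not part of any NVIDIA or C3 API.

```python
import re
from typing import Iterable, Tuple

# Matches profile names such as "1g.10gb" or "3g.40gb".
PROFILE_RE = re.compile(r"^(\d+)g\.(\d+)gb$")


def parse_profile(name: str) -> Tuple[int, int]:
    """Split a profile name into (compute slices, memory in GB)."""
    match = PROFILE_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized MIG profile: {name!r}")
    return int(match.group(1)), int(match.group(2))


def fits(
    profiles: Iterable[str],
    total_compute_slices: int = 7,  # assumed device limit
    total_memory_gb: int = 80,      # assumed device memory
) -> bool:
    """Check only that summed compute and memory stay within the device totals."""
    compute = 0
    memory = 0
    for name in profiles:
        slices, gigabytes = parse_profile(name)
        compute += slices
        memory += gigabytes
    return compute <= total_compute_slices and memory <= total_memory_gb


if __name__ == "__main__":
    print(fits(["3g.40gb", "4g.40gb"]))
    print(fits(["4g.40gb", "4g.40gb"]))
```

A check like this is only a first filter; the driver remains the authority on whether a given combination can actually be created.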

📊

FinOps Built In

Per-tenant, per-workload, per-GPU-hour cost metering. Know exactly what each team consumes and allocate GPU spend with precision.
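
The arithmetic behind per-GPU-hour metering can be sketched in a few lines: cost is GPUs multiplied by hours multiplied by a rate, summed per tenant. The `UsageRecord` fields, the `cost_by_tenant` function, the flat rate, and the sample numbers are all hypothetical; C3's actual metering model is not described here.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class UsageRecord:
    tenant: str
    workload: str
    gpus: int
    hours: float


def cost_by_tenant(
    records: Iterable[UsageRecord], rate_per_gpu_hour: float
) -> Dict[str, float]:
    """Sum GPU-hours per tenant and price them at a single flat rate."""
    totals: Dict[str, float] = defaultdict(float)
    for record in records:
        totals[record.tenant] += record.gpus * record.hours * rate_per_gpu_hour
    return dict(totals)


# Illustrative usage data only.
RECORDS = [
    UsageRecord("team-a", "train-llm", gpus=8, hours=10.0),
    UsageRecord("team-a", "serve-api", gpus=1, hours=24.0),
    UsageRecord("team-b", "finetune", gpus=4, hours=5.0),
]

if __name__ == "__main__":
    print(cost_by_tenant(RECORDS, rate_per_gpu_hour=2.0))
```

Grouping by `(tenant, workload)` instead of `tenant` would give the per-workload view with the same formula.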

  • 6 Pipeline Stages Automated
  • 3 GPU Vendors Supported
  • 3 Schedulers Unified
  • 9 Autonomous Agents

Human in the Middle.
Not in the Weeds.

Meet Ask Ada — your GPU infrastructure co-pilot.

Nine autonomous agents manage every stage of the Metal-to-Workload pipeline. They discover racks, form clusters, provision platforms, allocate GPUs, and submit workloads — continuously and without scripts.

Humans don't push buttons. They supervise. Every critical decision surfaces for approval. Agents learn from supervisory feedback through adaptive memory, getting smarter with every deployment cycle.

  • 🔍 Discovery
  • 📦 Inventory
  • 🔗 Cluster
  • ⚙️ Platform
  • 🧠 Orchestrator
  • 📋 Scheduler
  • 🚀 Workload
  • 📊 Telemetry
  • 🛡️ Safety

See C3 Running Live

Walk through the full Metal-to-Workload pipeline on real GPU infrastructure — Aivres NVL72 racks, Celestica Helios with 72 AMD MI450 GPUs, and Supermicro HGX B300 clusters. Multi-vendor, multi-scheduler, one platform.