AI Infrastructure Management Built for GPUaaS and NeoCloud environments

From Design and Deployment to Observability and Management

Unlock the full potential of your AI infrastructure with dynamic deployment of composed platforms.

 

Cruz Open Compute Orchestrator (COCO) by Dorado revolutionizes AI infrastructure management, setting a new standard for efficiency, scalability, and ROI. Designed to help organizations deploy and operate GPU-as-a-Service (GPUaaS) platforms and NeoCloud environments, COCO leverages open compute standards and UEC/UALink readiness to deliver a simple framework for designing and deploying Compute Clusters and Workloads. COCO breaks down operational silos between NetOps, CloudOps, and AIOps, and integrates AI Tenancy with Cruz's well-established multi-tenant features to best support business needs. With real-time visibility into job status and resource utilization, seamless integration with leading job schedulers such as Slurm and Kubernetes, and AI-powered recommendations for optimizing performance, COCO maximizes GPU utilization rates, empowering organizations to unlock the full potential of their AI investments.

A comprehensive platform for managing AI infrastructure, from bare metal to orchestrated AI Compute Clusters, Tenancy, and Interconnect: initial orchestration and deployment through ongoing management operations and observability.

 

Purpose-built for GPUaaS and NeoCloud environments.

Deployment

Rapid infrastructure provisioning and configuration — deploy complete AI infrastructure in hours

Management & Operations

Automated management throughout infrastructure lifetime

Observability

Continuous monitoring, scaling, and optimization 

Transform GPU infrastructure into a shared service platform with AI multi-tenant support. Enable multiple teams and tenants to use GPU resources efficiently while maintaining isolation, security, and performance.

 

Maximize ROI.

Resource Isolation

Secure tenant separation with namespace isolation, network policies, and quota management
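Quota management of this kind can be sketched in a few lines of Python. This is a minimal illustration only; the tenant names, quota values, and `admit` helper below are hypothetical and not part of COCO's actual API:

```python
# Hypothetical per-tenant GPU quotas and current allocations -- illustrative
# values, not COCO configuration.
quotas = {"team-a": 8, "team-b": 4}
allocated = {"team-a": 6, "team-b": 0}

def admit(tenant: str, requested_gpus: int) -> bool:
    """Admission check: reject any request that would push a tenant
    past its GPU quota; otherwise record the allocation."""
    if allocated.get(tenant, 0) + requested_gpus > quotas.get(tenant, 0):
        return False
    allocated[tenant] = allocated.get(tenant, 0) + requested_gpus
    return True

print(admit("team-a", 4))  # False: 6 + 4 would exceed the 8-GPU quota
print(admit("team-a", 2))  # True: exactly fills the quota
```

In a real deployment this check would sit behind the scheduler's admission path, with quotas enforced alongside namespace isolation and network policies.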

Usage Metering

Detailed tracking of GPU hours, memory usage, and compute cycles per tenant
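Per-tenant metering reduces to aggregating GPU-hours (GPUs reserved times wall-clock time) over job records. The sketch below assumes a simplified, hypothetical job-record shape; it is not COCO's actual metering schema:

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical job records: (tenant, gpus, start, end) -- illustrative only.
jobs = [
    ("team-a", 4, datetime(2024, 1, 1, 8, 0), datetime(2024, 1, 1, 20, 0)),
    ("team-b", 8, datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 12, 0)),
    ("team-a", 2, datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 11, 30)),
]

def gpu_hours_per_tenant(jobs):
    """Aggregate GPU-hours (gpus x wall-clock hours) per tenant."""
    usage = defaultdict(float)
    for tenant, gpus, start, end in jobs:
        usage[tenant] += gpus * (end - start).total_seconds() / 3600
    return dict(usage)

print(gpu_hours_per_tenant(jobs))  # {'team-a': 51.0, 'team-b': 24.0}
```

The same aggregation extends naturally to memory usage and compute cycles by carrying additional fields per record.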

Fair Scheduling

Intelligent workload distribution to maximize utilization and minimize idle time
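One common fair-share policy is to run the tenant furthest below its entitled share, measured as the ratio of consumed GPU-hours to relative entitlement. The following is a sketch of that idea under assumed inputs, not COCO's actual scheduler:

```python
def pick_next_tenant(usage, shares):
    """Fair-share choice: select the tenant with the lowest
    usage-to-entitlement ratio.

    usage  -- GPU-hours consumed so far, per tenant
    shares -- relative entitlement, per tenant (must be > 0)
    """
    return min(shares, key=lambda t: usage.get(t, 0.0) / shares[t])

usage = {"team-a": 51.0, "team-b": 24.0}
shares = {"team-a": 2.0, "team-b": 1.0}  # team-a entitled to twice team-b's share

print(pick_next_tenant(usage, shares))  # team-b (24/1 = 24 < 51/2 = 25.5)
```

Production schedulers such as Slurm's multifactor priority plugin layer decay, job age, and size on top of this basic ratio.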

Built on open standards with support for next-generation AI networking technologies

 

Empowers limitless growth.

UEC Ready/Aware

Ultra Ethernet Consortium standards support for high-performance networking. 400G/800G Ethernet with ultra-low latency and lossless RDMA fabric.

UALink Ready/Aware

Ultra Accelerator Link technology for optimized GPU-to-GPU communication. Open standard supporting multi-vendor GPU ecosystems.  

Key Features

Automated lifecycle management from Day 0 to Day N

Design, deploy, and manage AI infrastructure

Rapidly orchestrate and deploy compute clusters  

Multi-tenant GPUaaS with secure isolation 

GPU partitioning support for multi-tenancy

Intelligent job scheduling via Slurm and Run:AI integrations

UEC- and UALink-ready network fabrics

Key Benefits

Reduce Complexity

Open and scalable tool supporting management of bare-metal and virtualized Compute, Networking, and Storage from Day 0 to Day N

 

Rapid Deployment

Day 0 to production in hours with automated provisioning, configuration, and validation of compute infrastructure

Reduce OPEX

Hyperscale efficiency with enterprise-grade trust through proactive monitoring and intelligent resource allocation

AI-Optimized Performance

Purpose-built for GPU workloads with optimized networking, storage, and compute configurations for training and inference

Maximize ROI

GPUaaS multi-tenancy and intelligent scheduling achieve improved utilization, maximizing return on GPU infrastructure investment

Cruz Open Compute Orchestrator Architecture
 

More Cruz Solutions

Cruz Integrated Products: Resource Management/NMS, Advanced Monitoring, Orchestration and Control
Learn more
Integrated Solutions for SONiC-based Networking: HW | SONiC | Cruz Management Platform + Support / Services
Learn more