Three-Command CLI Workflow for Model Deployment

By InfoForTech | March 13, 2026 | Artificial Intelligence
    This blog post focuses on new features and improvements. For a comprehensive list, including bug fixes, please see the release notes.

    Getting models from development to production typically involves multiple tools, configuration files, and deployment steps. You scaffold a model locally, test it in isolation, configure infrastructure, write deployment scripts, and then push to production. Each step requires context switching and manual coordination.

    With Clarifai 12.2, we’ve streamlined this into a 3-command workflow: model init, model serve, and model deploy. These commands handle scaffolding, local testing, and production deployment with automatic infrastructure provisioning, GPU selection, and health checks built in.

    This isn’t just faster. It removes the friction between building a model and running it at scale. The CLI handles dependency management, runtime configuration, and deployment orchestration, so you can focus on model logic instead of infrastructure setup.

    This release also introduces Training on Pipelines, allowing you to train models directly within pipeline workflows using dedicated compute resources. We’ve added Video Intelligence support through the UI, improved artifact lifecycle management, and expanded deployment capabilities with dynamic nodepool routing and new cloud provider support.

    Let’s walk through what’s new and how to get started.

    Streamlined Model Deployment: 3 Commands to Production

    The typical model deployment workflow involves multiple steps: scaffold a project structure, install dependencies, write configuration files, test locally, containerize, provision infrastructure, and deploy. Each step requires switching contexts and managing configuration across different tools.

    Clarifai’s CLI consolidates this into three commands that handle the entire lifecycle from scaffolding to production deployment.

    How It Works

    1. Initialize a model project

    clarifai model init --toolkit vllm --model-name Qwen/Qwen3-0.6B 

    This scaffolds a complete model directory with the structure Clarifai expects: config.yaml, requirements.txt, and model.py. You can use built-in toolkits (HuggingFace, vLLM, LMStudio, Ollama) or start from scratch with a base template.

    The generated config.yaml includes smart defaults for runtime settings, compute requirements, and deployment configuration. You can modify these or leave them as-is for basic deployments.

    2. Test locally

    clarifai model serve 

    This starts a local inference server that behaves exactly like the production deployment. You can test your model with real requests, verify behavior, and iterate quickly without deploying to the cloud.

    The serve command supports multiple modes:

    • Environment mode: Runs directly in your local Python environment
    • Docker mode: Builds and runs in a container for production parity
    • Standalone gRPC mode: Exposes a gRPC endpoint for integration testing

    3. Deploy to production

    clarifai model deploy 

    This command handles everything: validates your config, builds the container, provisions infrastructure (cluster, nodepool, deployment), and monitors until the model is ready.

    The CLI shows structured deployment phases with progress indicators, so you know exactly what’s happening at each step. Once deployed, you get a public API endpoint that’s ready to handle inference requests.

    Intelligent Infrastructure Provisioning

    The CLI now handles GPU selection automatically during model initialization. GPU auto-selection analyzes your model’s memory requirements and toolkit specifications, then selects appropriate GPU instances.

    Multi-cloud instance discovery works across cloud providers. You can use GPU shorthands like h100 or legacy instance names, and the CLI normalizes them across AWS, Azure, DigitalOcean, and other supported providers.
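The normalization idea can be illustrated with a small lookup table. The sketch below is hypothetical: the instance names are real provider SKUs, but the mapping and function are an illustration, not Clarifai's actual implementation.

```python
# Illustrative sketch of GPU-shorthand normalization across providers.
# The mapping is hypothetical; the CLI's real table is far more complete.
SHORTHAND_TO_INSTANCE = {
    ("h100", "aws"): "p5.48xlarge",
    ("h100", "azure"): "Standard_ND96isr_H100_v5",
    ("a100", "aws"): "p4d.24xlarge",
}

def resolve_instance(gpu: str, provider: str) -> str:
    """Map a GPU shorthand like 'h100' to a provider instance type."""
    try:
        return SHORTHAND_TO_INSTANCE[(gpu.lower(), provider.lower())]
    except KeyError:
        raise ValueError(f"no {gpu} instance known for provider {provider}")

print(resolve_instance("H100", "aws"))  # -> p5.48xlarge
```

Case-insensitive lookup is what lets both the shorthand `h100` and a legacy spelling resolve to the same instance on each cloud.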

    Custom Docker base images let you optimize build times. If you have a pre-built image with common dependencies, the CLI can use it as a base layer for faster toolkit builds.

    Deployment Lifecycle Management

    Once deployed, you need visibility into how models are running and the ability to control them. The CLI provides commands for the full deployment lifecycle:

    Check deployment status:

clarifai model status --deployment <deployment-id>

    View logs:

clarifai model logs --deployment <deployment-id>

    Undeploy:

clarifai model undeploy --deployment <deployment-id>

    The CLI also supports managing deployments directly by ID, which is useful for scripting or CI/CD pipelines.
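In a CI/CD pipeline, a common pattern is to poll status until the deployment is ready. Here is a minimal, generic polling loop; the `get_status` callable is a stand-in for however you invoke the status command (for example, via subprocess), not part of Clarifai's SDK.

```python
import time
from typing import Callable

def wait_until_ready(get_status: Callable[[], str],
                     timeout_s: float = 600.0,
                     poll_s: float = 1.0) -> bool:
    """Poll a status callable until it reports 'ready' or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = get_status()
        if status == "ready":
            return True
        if status == "failed":
            raise RuntimeError("deployment failed")
        time.sleep(poll_s)
    return False

# Demo with a fake status source that becomes ready on the third poll:
states = iter(["deploying", "deploying", "ready"])
print(wait_until_ready(lambda: next(states), poll_s=0.01))  # True
```

The same loop works for scripted undeploys: swap the terminal state you wait for.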

    Enhanced Local Development

    Local testing is critical for fast iteration, but it often diverges from production behavior. The CLI bridges this gap with local runners that mirror production environments.

    The model serve command now supports:

    • Concurrency controls: Limit the number of simultaneous requests to simulate production load
    • Optional Docker image retention: Keep built images for faster restarts during development
    • Health-check configuration: Configure health-check settings using flags like --health-check-port, --disable-health-check, and --auto-find-health-check-port
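The concurrency-control idea above, capping simultaneous requests to simulate production load, can be sketched with a plain semaphore. This is a generic illustration of the mechanism, not the CLI's implementation.

```python
import threading
import time

MAX_CONCURRENT = 4  # illustrative cap on simultaneous requests
slots = threading.BoundedSemaphore(MAX_CONCURRENT)
lock = threading.Lock()
in_flight = 0
peak = 0

def handle_request() -> None:
    """Simulated request handler that respects the concurrency cap."""
    global in_flight, peak
    with slots:  # blocks while MAX_CONCURRENT requests are already active
        with lock:
            in_flight += 1
            peak = max(peak, in_flight)
        time.sleep(0.01)  # stand-in for inference work
        with lock:
            in_flight -= 1

threads = [threading.Thread(target=handle_request) for _ in range(16)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(peak <= MAX_CONCURRENT)  # True: the cap was never exceeded
```

Sixteen requests arrive, but no more than four are ever in flight, which is exactly the load shape a concurrency-limited local server produces.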

    Local runners also support the same inference modes as production (streaming, batch, multi-input), so you can test complex workflows locally before deploying.

    Simplified Configuration

    Model configuration used to require manually editing YAML files with exact field names and nested structures. The CLI now handles normalization automatically.

    When you initialize a model, config.yaml includes only the fields you need to customize. Smart defaults fill in the rest. If you add fields with slightly incorrect names or formats, the CLI normalizes them during deployment.

    This reduces configuration errors and makes it easier to migrate existing models to Clarifai.
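The kind of tolerance described, accepting field names in slightly different spellings, can be illustrated with a canonicalization pass. The field names and rules below are a generic sketch of the idea, not Clarifai's actual normalization logic.

```python
# Generic sketch: canonicalize config keys by ignoring case, dashes,
# and underscores, so near-miss spellings map to the expected field.
CANONICAL_FIELDS = {
    "numaccelerators": "num_accelerators",
    "cpumemory": "cpu_memory",
}

def canon(key: str) -> str:
    return key.lower().replace("-", "").replace("_", "")

def normalize(config: dict) -> dict:
    """Rewrite recognized near-miss keys to their canonical names."""
    out = {}
    for key, value in config.items():
        out[CANONICAL_FIELDS.get(canon(key), key)] = value
    return out

print(normalize({"Num-Accelerators": 1, "cpu_memory": "8Gi"}))
# {'num_accelerators': 1, 'cpu_memory': '8Gi'}
```

Unrecognized keys pass through unchanged, so a normalizer like this never silently drops configuration.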

    Why This Matters

    The 3-command workflow removes friction from model deployment. You go from idea to production API in minutes instead of hours or days. The CLI handles infrastructure complexity, so you don’t need to be an expert in Kubernetes, Docker, or cloud compute to deploy models at scale.

    This also standardizes deployment across teams. Everyone uses the same commands, the same configuration format, and the same testing workflow. This makes it easier to share models, reproduce deployments, and onboard new team members.

    For a complete guide on the new CLI workflow, including examples and advanced configuration options, see the Deploy Your First Model via CLI documentation.

    Training on Pipelines

    Clarifai Pipelines, introduced in 12.0, allow you to define and execute long-running, multi-step AI workflows. With 12.2, you can now train models directly within pipeline workflows using dedicated compute resources.

    Training on Pipelines integrates model training into the same orchestration layer as inference and data processing. This means training jobs run on the same infrastructure as your other workloads, with the same autoscaling, monitoring, and cost controls.

    How It Works

You can initialize training pipelines using templates via the CLI. This creates a pipeline structure with pre-configured training steps. You specify your dataset, model architecture, and training parameters in the pipeline configuration, then run it like any other pipeline.

    The platform handles:

    • Provisioning GPUs for training workloads
    • Scaling compute based on job requirements
    • Saving checkpoints as Artifacts for versioning
    • Monitoring training metrics and logs

Once training completes, the resulting model is automatically compatible with Clarifai’s Compute Orchestration platform, so you can deploy it using the same model deploy workflow.

    UI Experience

    We’ve also launched a new UI for training models within pipelines. You can configure training parameters, select datasets, and monitor progress directly from the platform without writing code or managing infrastructure.

    This makes it easier for teams without deep ML engineering expertise to train custom models and integrate them into production workflows.

    Training on Pipelines is available in Public Preview. For more details, see the Pipelines documentation.

    Artifact Lifecycle Improvements

    With 12.2, we’ve improved how Artifacts handle expiration and versioning.

    Artifacts no longer expire automatically by default. Previously, artifacts had a default retention policy that would delete them after a certain period. Now, artifacts persist indefinitely unless you explicitly set an expires_at value during upload.

    This gives you full control over artifact lifecycle management. You can set expiration dates for temporary outputs (like intermediate checkpoints during experimentation) while keeping production artifacts indefinitely.
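The new default can be summarized as: no expires_at means the artifact never expires. A small sketch of that decision logic (illustrative only, not the platform's code):

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

def is_expired(expires_at: Optional[datetime],
               now: Optional[datetime] = None) -> bool:
    """An artifact with no expires_at persists indefinitely (the new default)."""
    if expires_at is None:
        return False  # never expires unless explicitly set
    now = now or datetime.now(timezone.utc)
    return now >= expires_at

now = datetime(2026, 3, 13, tzinfo=timezone.utc)
print(is_expired(None, now))                     # False: no expiry set
print(is_expired(now - timedelta(days=1), now))  # True: already past
print(is_expired(now + timedelta(days=30), now)) # False: still valid
```

Under the old behavior, the `None` branch would instead have applied a default retention period; the change moves that decision entirely to the uploader.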

    The CLI now displays latest-version-id alongside artifact visibility, making it easier to reference the most recent version without listing all versions first.

    These changes make Artifacts more predictable and easier to manage for long-term storage of pipeline outputs.

    Video Intelligence

    Clarifai now supports video intelligence through the UI. You can connect video streams to your application and apply AI analysis to detect objects, track movement, and generate insights in real time.

    This expands Clarifai’s capabilities beyond image and text processing to handle live video feeds, enabling use cases like security monitoring, retail analytics, and automated content moderation for video platforms.

    Video Intelligence is available now.

    Deployment Enhancements

    We’ve made several improvements to how deployments work across compute infrastructure.

    Dynamic nodepool routing allows you to attach multiple nodepools to a single deployment with configurable scheduling strategies. This gives you more control over how traffic is distributed across different compute resources, which is useful for handling spillover traffic or routing to specific hardware based on request type.
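A spillover strategy like the one described can be sketched as: try nodepools in priority order and route each request to the first pool with free capacity. This is a generic illustration of the scheduling idea, not Clarifai's scheduler; the pool names and capacity model are invented for the example.

```python
from dataclasses import dataclass

@dataclass
class Nodepool:
    name: str
    capacity: int      # max concurrent requests this pool can absorb
    in_flight: int = 0

def route(pools: list) -> Nodepool:
    """Pick the first pool with spare capacity; later pools take spillover."""
    for pool in pools:
        if pool.in_flight < pool.capacity:
            pool.in_flight += 1
            return pool
    raise RuntimeError("all nodepools saturated")

pools = [Nodepool("gpu-primary", capacity=2),
         Nodepool("gpu-spillover", capacity=2)]
print([route(pools).name for _ in range(4)])
# ['gpu-primary', 'gpu-primary', 'gpu-spillover', 'gpu-spillover']
```

Other scheduling strategies slot into the same shape, for example routing by request type rather than strict priority order.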

    Deployment visibility has been improved with status chips and enhanced list views across Deployments, Nodepools, and Clusters. You can see at a glance which deployments are healthy, which are scaling, and which need attention.

    New cloud provider support: We’ve added DigitalOcean and Azure as supported instance providers, giving you more flexibility in where you deploy models.

    Start and stop deployments explicitly: You can now pause deployments without deleting them. This preserves configuration while freeing up compute resources, which is useful for dev/test environments or models with intermittent traffic.

    Redesigned Deployment details page provides expanded status visibility, including replica counts, nodepool health, and request metrics, all in one view.

    Additional Changes

    Platform Updates

    We’ve launched several UI improvements to make the platform easier to navigate and use:

    • New Model Library UI provides a streamlined experience for browsing and exploring models
    • Universal Search added to the navbar for quick access to models, datasets, and workflows
    • New account experience with improved onboarding and settings management
    • Home 3.0 interface with a refreshed design and better organization of recent activity

    Playground Improvements

    The Playground now includes major upgrades to the Universal Search experience, with multi-panel (compare mode) support, improved workspace handling, and smarter model auto-selection. Model selections are panel-aware to prevent cross-panel conflicts, and the UI can display simplified model names for a cleaner experience.

    Pipeline Step Visibility

    You can now set pipeline steps to be publicly visible during initialization through both the CLI and builder APIs. By default, pipelines and pipeline step templates are created with PRIVATE visibility, but you can override this when sharing workflows across teams or with the community.

    Modules Deprecation

    Support for Modules has been fully dropped. Modules previously extended Clarifai’s UIs and enabled customized backend processing, but they’ve been replaced by more flexible alternatives like Artifacts and Pipelines.

    Python SDK Updates

    We’ve made several improvements to the Python SDK, including:

    • Fixed ModelRunner health server starting twice, which could cause “Address already in use” errors
    • Added admission-control support for model runners
    • Improved signal handling and zombie process reaping in runner containers
    • Refactored the MCP server implementation for better logging clarity

    For a complete list of SDK updates, see the Python SDK changelog.

    Ready to Start Building?

    You can start using the new 3-command deployment workflow today. Initialize a model with clarifai model init, test it locally with clarifai model serve, and deploy to production with clarifai model deploy.

    For teams running long-running training jobs, Training on Pipelines provides a way to integrate model training into the same orchestration layer as your inference workloads, with dedicated compute and automatic checkpoint management.

    Video Intelligence support adds real-time video stream processing to the platform, and deployment improvements give you more control over how models run across different compute environments.

    The new CLI workflow is available now. Check out the Deploy Your First Model via CLI guide to get started, or explore the full 12.2 release notes for complete details.

    Sign up here to get started with Clarifai, or check out the documentation for more information.

    If you have questions or need help while building, join us on Discord. Our community and team are there to help.

     

     

     


