Favicon of Together AI

Together AI

Together AI gives developers inference, fine-tuning, and GPU clusters for open-source model apps.

Visit Together AI
Screenshot of Together AI website

Together AI is an AI infrastructure cloud for teams building with open-source models. It combines inference, fine-tuning, GPU clusters, storage, and code sandboxes in one developer platform.

Key Highlights

  • Serverless Inference for open-source models with no infrastructure to manage
  • Batch Inference for asynchronous jobs, described at up to 30 billion tokens per model
  • Dedicated Inference and containers for single-tenant model and media workloads
  • GPU Clusters with NVIDIA H100, H200, and B200 capacity
  • Fine-Tuning, Sandbox, and Managed Storage for model shaping, code execution, and storage

What Makes It Different

Together AI combines broad infrastructure with systems research. Its site claims 2x faster inference, 60% lower cost, and 90% faster pre-training through workload-specific optimization and the Together Kernel Collection. Instead of selling only an API, it lets teams move from serverless inference to dedicated endpoints or reserved clusters.

Features & Capabilities

Developers can run models on demand, submit batch jobs, deploy dedicated endpoints, or use containers for generative media. Compute spans self-serve clusters to thousands of GPUs, with object storage, parallel filesystems, and zero egress fees.

For model shaping, Together AI supports fine-tuning open-source models. The site says this can improve accuracy, reduce hallucinations, and control behavior without managing training infrastructure. Sandbox adds secure code execution and development environments.

User Ratings and Testimonials

Together AI does not publish a third-party rating, customer names, or customer reviews. The main buying caution is billing: estimates may combine token rates, GPU hours, sandbox compute, storage, and fine-tuning tokens.

Pricing & Value

The pricing page is usage-based and says teams can start free, but it does not document a full free plan. Published prices include:

  • Serverless Inference: per 1M tokens, visible rows include $0.03 input/$0.12 output and $2.10 input/$4.40 output
  • Dedicated Inference: 1x H100 80 GB at $6.49/hour, 1x HGX B200 180GB at $11.95/hour
  • GPU Clusters: on-demand H100 at $5.49/hour, H200 at $6.79/hour, B200 at $9.95/hour
  • Sandbox and Storage: $0.0446/hour per vCPU, $0.0149/hour per GiB RAM, $0.03 per 60 minute code session, and $0.16/GiB/month storage
  • Fine-Tuning: up to 16B supervised fine-tuning starts at $0.48 per 1M tokens for LoRA and $0.54 for full fine-tuning

FAQs

What do Together AI do?

It provides AI cloud infrastructure for running, fine-tuning, and scaling open-source models through inference APIs and GPU clusters.

Is Together AI free to use?

The pricing page says you can start for free, but it does not document a free plan, trial, credits, or open-source access.

Is Together AI a good company?

It is a private AI infrastructure company. Judge it by latency, model coverage, uptime, support, and total cost for your workload.

Who funds Together AI?

Public funding reports name General Catalyst and Prosperity7 as recent lead investors, with Salesforce Ventures, Nvidia, and others involved.

How does Together AI work?

You call its APIs or deploy dedicated GPU infrastructure, then choose serverless inference, batch jobs, fine-tuning, storage, or clusters.

Tags:

Share:

Chat with AI

Ask specific questions about this tool.

Ad
Favicon

 

  
 

You might also like

Favicon

 

  
  
Favicon

 

  
  
Favicon

 

  
  
Rankings:
Curated by Michał Śnieżyński. Website may contain affiliate links.

Command Menu