Favicon of Groq

Groq

Groq runs open AI models on its own LPU chips, giving developers very fast, low cost token inference through an OpenAI compatible API.

Visit Groq
Screenshot of Groq website

Groq runs open large language models on custom hardware built only for inference, so responses come back very fast at a predictable per token price. It is built for developers and teams who serve AI models in production and care about latency and cost. You reach the models through GroqCloud, an OpenAI compatible API you point existing code at in two lines.

Key Highlights

  • Custom LPU chips, first designed in 2016 specifically for inference, not general GPUs
  • GroqCloud hosts open models including GPT-OSS, Llama, Qwen3, Kimi K2, and Whisper
  • OpenAI compatible API: change the base URL and key, keep your existing code
  • Pay per token pricing in USD with no idle infrastructure charges
  • Batch API runs async workloads at 50% lower cost, plus built-in web search and code execution

What Makes It Different

Most inference providers run on GPUs alone. Groq designed its own chip, the LPU (Language Processing Unit), purpose-built for running models rather than training them. That hardware produces high token-per-second speeds, with Llama 3.1 8B Instant served at roughly 840 tokens per second. Pricing stays linear and published up front, with no surge pricing, so a model costs the same per million tokens at any volume.

Features & Capabilities

You call GroqCloud the same way you call OpenAI: set the base URL to the Groq endpoint, add your API key, and your existing client library works. You pick from a catalog of open models for chat, plus Whisper for transcription and text-to-speech voices. Compound systems route a query across models and call server-side tools (web search, code execution, browser automation) billed by usage. Groq says 3 million developers and teams build on the platform, including the McLaren Formula 1 team.

User Ratings and Testimonials

Groq is widely recognized as one of the fastest inference providers, and the 2025 Artificial Analysis AI Adoption Survey lists it among providers developers use or consider. Fintool reported chat speed up 7.41x and costs down 89% after switching to GroqCloud. The main trade-off is scope: Groq hosts open models, not proprietary ones like GPT-4 or Claude, so teams needing those must look elsewhere.

Pricing & Value

Groq uses pay-as-you-go, per token pricing (all prices in USD per million tokens):

  • Llama 3.1 8B Instant: $0.05 input and $0.08 output, the cheapest listed chat model
  • GPT-OSS 20B: $0.075 input and $0.30 output
  • GPT-OSS 120B: $0.15 input and $0.60 output
  • Llama 3.3 70B Versatile: $0.59 input and $0.79 output
  • Whisper Large v3 Turbo: $0.04 per hour of audio transcribed

New users start on a free tier before adding billing, and the Batch API plus prompt caching cut costs further for high-volume workloads. The predictable pricing is the main draw for teams that need to plan inference spend.

FAQs

Is Groq owned by Nvidia?

No. Groq is an independent, privately held company founded in 2016 by Jonathan Ross, with its own LPU chips and venture backing.

What is Groq used for?

Running open AI models fast and cheaply. Developers use GroqCloud to serve chat, speech, and transcription through an OpenAI compatible API.

Is Groq a Chinese company?

No. Groq is a US company headquartered in Mountain View, California, in Silicon Valley.

Is Groq going public?

Not yet. Groq is still a private company funded by venture investors and has not announced an IPO date.

What is inference in Groq?

Inference is running an already-trained model to generate answers. Groq runs this step on its LPU chips for high token-per-second speed.

Is Groq inference free?

There is a free tier to start, but production use is pay-as-you-go, priced per million tokens in USD. The Batch API costs 50% less.

Is Groq better than ChatGPT?

They are different things. Groq is inference infrastructure for open models, while ChatGPT is a consumer chatbot built on OpenAI's own models.

Will Nvidia buy Groq?

There is no confirmed deal. Acquisition talk is speculation, and Groq remains an independent company at the time of writing.

Share:

Chat with AI

Ask specific questions about this tool.

Ad
Favicon

 

  
 

You might also like

Favicon

 

  
  
Favicon

 

  
  
Favicon

 

  
  
Rankings:
Curated by Michał Śnieżyński. Website may contain affiliate links.

Command Menu