Chat Completions

Create conversational AI experiences with the chat completions endpoint. Learn about parameters, message formats, and advanced features.

Endpoint Overview

The chat completions endpoint allows you to have conversations with AI models using a structured message format.

POST /v1/chat/completions

Create a chat completion response for the given conversation messages.

Basic Request

Here's a simple example of how to create a chat completion:

Basic Chat Completion
curl -X POST "https://api.tav-ai.com/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "What is artificial intelligence?"
      }
    ]
  }'
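
The same request can be assembled in Python. The following is a minimal sketch using only the standard library. The helper name `build_chat_request` is hypothetical and not part of any SDK, and `YOUR_API_KEY` is a placeholder. It builds the request without sending it, so the payload can be inspected first.

```python
import json
import urllib.request

API_URL = "https://api.tav-ai.com/v1/chat/completions"


def build_chat_request(prompt, api_key, model="gpt-4o-mini"):
    """Build (but do not send) a POST request mirroring the curl example.

    Hypothetical helper for illustration only.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request("What is artificial intelligence?", "YOUR_API_KEY")
# To send it, pass the object to urllib.request.urlopen(request) and
# decode the response body with json.load().
```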

Message Roles

Messages in the conversation can have different roles that define their purpose:

user

Messages from the end user or human in the conversation

"role": "user"

assistant

Messages from the AI assistant in previous turns

"role": "assistant"

system

Instructions that guide the assistant's behavior

"role": "system"

Conversation Example

Here's how to structure a multi-turn conversation:

Multi-turn Conversation
{
  "model": "gpt-4o-mini",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant that explains complex topics simply."
    },
    {
      "role": "user",
      "content": "What is machine learning?"
    },
    {
      "role": "assistant",
      "content": "Machine learning is a type of artificial intelligence where computers learn to make predictions or decisions by finding patterns in data, rather than being explicitly programmed for every scenario."
    },
    {
      "role": "user",
      "content": "Can you give me a simple example?"
    }
  ]
}
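
One way to keep such a history in application code is to append each turn to a list and send the whole list with every request. This is a sketch under that assumption: `add_message` and `VALID_ROLES` are hypothetical names, and the assistant text is shortened for the example.

```python
# Roles taken from the "Message Roles" section.
VALID_ROLES = {"system", "user", "assistant"}


def add_message(messages, role, content):
    """Return a new message list with one more turn appended."""
    if role not in VALID_ROLES:
        raise ValueError(f"unknown role: {role!r}")
    return messages + [{"role": role, "content": content}]


messages = add_message(
    [], "system", "You are a helpful assistant that explains complex topics simply."
)
messages = add_message(messages, "user", "What is machine learning?")
# After each API call, append the assistant's reply before the next user
# turn so the model receives the full history. (Reply shortened here.)
messages = add_message(messages, "assistant", "Machine learning is a type of artificial intelligence.")
messages = add_message(messages, "user", "Can you give me a simple example?")

payload = {"model": "gpt-4o-mini", "messages": messages}
```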

Request Parameters

Customize your chat completions with these parameters:

model string Required

The model to use for completion (e.g., "gpt-4o-mini", "gpt-4o")

messages array Required

Array of message objects representing the conversation history

max_tokens integer Optional

Maximum number of tokens to generate in the response (default: the model's maximum)

temperature number Optional

Controls randomness: lower values give more focused, predictable output, higher values give more varied output (0.0 to 2.0, default: 1.0)

top_p number Optional

Nucleus sampling threshold: only the most likely tokens whose cumulative probability reaches top_p are considered (0.0 to 1.0, default: 1.0)

frequency_penalty number Optional

Penalize tokens in proportion to how often they have already appeared, which reduces verbatim repetition (-2.0 to 2.0, default: 0.0)

presence_penalty number Optional

Penalize tokens that have already appeared at least once, which encourages the model to move to new topics (-2.0 to 2.0, default: 0.0)

stop string | array Optional

One or more sequences at which generation stops

stream boolean Optional

Stream response as it's generated (default: false)
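
Checking a request against these ranges before sending it can catch mistakes early. The following is a sketch only: `validate_params` is a hypothetical helper, the ranges are copied from the list above, and the requirement that `max_tokens` be positive is an assumption the list does not state.

```python
# Ranges copied from the parameter list above.
RANGES = {
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
}


def validate_params(params):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for name in ("model", "messages"):
        if name not in params:
            problems.append(f"missing required parameter: {name}")
    for name, (low, high) in RANGES.items():
        if name in params and not low <= params[name] <= high:
            problems.append(f"{name} must be between {low} and {high}")
    max_tokens = params.get("max_tokens")
    # Assumption: a usable max_tokens is a positive integer.
    if max_tokens is not None and (not isinstance(max_tokens, int) or max_tokens < 1):
        problems.append("max_tokens must be a positive integer")
    return problems


ok = validate_params({"model": "gpt-4o-mini", "messages": [], "temperature": 0.8})
bad = validate_params({"messages": [], "temperature": 2.5})
```

Here `ok` is an empty list, while `bad` reports the missing `model` and the out-of-range `temperature`.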

Advanced Parameters Example

Here's how to use advanced parameters to fine-tune the response:

Advanced Configuration
{
  "model": "gpt-4o-mini",
  "messages": [
    {
      "role": "system",
      "content": "You are a creative writing assistant."
    },
    {
      "role": "user",
      "content": "Write a short story about a robot discovering emotions."
    }
  ],
  "max_tokens": 500,
  "temperature": 0.8,
  "top_p": 0.9,
  "frequency_penalty": 0.5,
  "presence_penalty": 0.2,
  "stop": ["\n\nTHE END"]
}
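
To build intuition for `temperature` and `top_p`, the textbook formulation of both can be sketched in a few lines. This is not the service's implementation, which is not documented here. The function names and the logits are made up for illustration, and the sketch requires a temperature greater than zero.

```python
import math


def apply_temperature(logits, temperature):
    """Softmax over logits divided by temperature (temperature > 0)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def nucleus(probs, top_p):
    """Indices of the smallest set of most-likely tokens whose mass reaches top_p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return kept


logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
sharp = apply_temperature(logits, 0.5)  # low temperature: top token dominates
flat = apply_temperature(logits, 1.5)   # high temperature: flatter distribution
```

A lower temperature concentrates probability on the highest-scoring token, and a lower `top_p` shrinks the set of tokens eligible for sampling.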

Response Format

The API returns a structured response with the completion and metadata:

Response Structure
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Artificial intelligence (AI) is a branch of computer science that aims to create machines capable of performing tasks that typically require human intelligence..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 45,
    "total_tokens": 57
  }
}

Response Fields

id

Unique identifier for the completion

choices

Array of completion choices (usually one)

finish_reason

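Why generation stopped: "stop" (the model finished or reached a stop sequence), "length" (the token limit was reached), or "content_filter" (content was withheld by a filter)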

usage

Token counts for the prompt, the completion, and their total, used for billing
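
Reading these fields from a parsed response might look like the following sketch. `summarize_completion` is a hypothetical helper, the sample dictionary is taken from the response structure shown earlier, and treating "length" as a truncated reply is an assumption about how the value is used.

```python
def summarize_completion(response):
    """Pull the reply text and key metadata out of a parsed response dict."""
    choice = response["choices"][0]
    usage = response.get("usage", {})
    return {
        "text": choice["message"]["content"],
        # Assumption: "length" means the reply was cut off by the token limit.
        "truncated": choice["finish_reason"] == "length",
        "total_tokens": usage.get("total_tokens"),
    }


sample = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "model": "gpt-4o-mini",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Artificial intelligence (AI) is a branch of computer science that aims to create machines capable of performing tasks that typically require human intelligence...",
            },
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 12, "completion_tokens": 45, "total_tokens": 57},
}

summary = summarize_completion(sample)
```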

Streaming Responses

For real-time applications, you can stream responses as they're generated:

Streaming Request
curl -X POST "https://api.tav-ai.com/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "Tell me a story"
      }
    ],
    "stream": true
  }'

Python Streaming Example
import requests
import json

response = requests.post(
    "https://api.tav-ai.com/v1/chat/completions",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json"
    },
    json={
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "Tell me a story"}],
        "stream": True
    },
    stream=True
)

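# Fail early on HTTP errors instead of parsing an error body as a stream.
response.raise_for_status()

# Each event arrives as a line of the form "data: <json>"; "[DONE]" ends the stream.
for line in response.iter_lines():
    if not line:
        continue  # skip blank separator lines
    line = line.decode('utf-8')
    if not line.startswith('data: '):
        continue
    data = line[6:]
    if data == '[DONE]':
        break
    chunk = json.loads(data)
    choices = chunk.get('choices') or []
    if choices:
        # Some chunks carry no text, so read the content defensively.
        content = choices[0].get('delta', {}).get('content')
        if content:
            print(content, end='', flush=True)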
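
The parsing step can be separated from the network call so it can be exercised offline. The following is a sketch: `iter_content` is a hypothetical helper, and the sample lines are hand-written to match the chunk shape the example above expects, not captured from the API.

```python
import json


def iter_content(lines):
    """Yield text fragments from an iterable of decoded stream lines."""
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank separator lines and anything unexpected
        data = line[len("data: "):]
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hand-written sample lines for illustration.
sample_lines = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Once"}}]}',
    'data: {"choices": [{"index": 0, "delta": {"content": " upon a time"}}]}',
    "data: [DONE]",
]
story = "".join(iter_content(sample_lines))
```

With a live response, the same generator could be fed the decoded output of `response.iter_lines()`.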
