Chat Completions - Tav-AI Documentation

Endpoint Overview

The chat completions endpoint allows you to have conversations with AI models using a structured message format.

POST /v1/chat/completions

Creates a chat completion for the given conversation messages.

Basic Request

Here's a simple example of how to create a chat completion:

Basic Chat Completion

curl -X POST "https://api.tav-ai.com/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "What is artificial intelligence?"
      }
    ]
  }'
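The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names `build_chat_request` and `send_chat_request` are hypothetical, and the 30-second timeout is an arbitrary assumption. The URL, headers, and body fields are taken from the curl example above.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.tav-ai.com/v1/chat/completions"


def build_chat_request(api_key: str, model: str, user_message: str) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the curl example."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON body.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses.
    The default timeout is an assumption, not a documented value.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `send_chat_request(build_chat_request("YOUR_API_KEY", "gpt-4o-mini", "What is artificial intelligence?"))` would perform the network call. Keeping construction and sending separate makes the request easy to inspect before it leaves the machine.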

Message Roles

Messages in the conversation can have different roles that define their purpose:

user

Messages from the end user or human in the conversation

"role": "user"

assistant

Messages from the AI assistant in previous turns

"role": "assistant"

system

Instructions that guide the assistant's behavior

"role": "system"

Conversation Example

Here's how to structure a multi-turn conversation:

Multi-turn Conversation

{
  "model": "gpt-4o-mini",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant that explains complex topics simply."
    },
    {
      "role": "user",
      "content": "What is machine learning?"
    },
    {
      "role": "assistant",
      "content": "Machine learning is a type of artificial intelligence where computers learn to make predictions or decisions by finding patterns in data, rather than being explicitly programmed for every scenario."
    },
    {
      "role": "user",
      "content": "Can you give me a simple example?"
    }
  ]
}
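Because the request carries the conversation history in `messages`, a client typically keeps a list and appends each turn before sending the next request. The sketch below shows one way to do that. The helper names (`start_conversation`, `add_turn`, `build_payload`) are hypothetical, and restricting later turns to the `user` and `assistant` roles is a design choice of this sketch, not a documented API rule.

```python
from typing import Dict, List

Message = Dict[str, str]


def start_conversation(system_prompt: str) -> List[Message]:
    """Begin a history with a single system message."""
    return [{"role": "system", "content": system_prompt}]


def add_turn(history: List[Message], role: str, content: str) -> List[Message]:
    """Return a new history with one more message appended.

    Only "user" and "assistant" are accepted here (an assumption of this
    sketch); the input list is left unmodified.
    """
    if role not in ("user", "assistant"):
        raise ValueError(f"unsupported role for a turn: {role!r}")
    return history + [{"role": role, "content": content}]


def build_payload(model: str, history: List[Message]) -> dict:
    """Wrap the history in a request body for the chat completions endpoint."""
    return {"model": model, "messages": list(history)}
```

After each response, the assistant's reply would be appended with `add_turn(history, "assistant", reply_text)` so that the next request includes it as context.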

Request Parameters

Customize your chat completions with these parameters:

model string Required

The model to use for completion (e.g., "gpt-4o-mini", "gpt-4o")

messages array Required

Array of message objects representing the conversation history

max_tokens integer Optional

Maximum number of tokens to generate in the completion (default: the model's maximum)

temperature number Optional

Controls randomness: higher values produce more varied output, lower values more deterministic output (0.0 to 2.0, default: 1.0)

top_p number Optional

Nucleus sampling parameter (0.0 to 1.0, default: 1.0)

frequency_penalty number Optional

Penalizes tokens in proportion to how often they have already appeared, which reduces verbatim repetition (-2.0 to 2.0, default: 0.0)

presence_penalty number Optional

Penalizes tokens that have already appeared in the text so far, which encourages the model to introduce new topics (-2.0 to 2.0, default: 0.0)

stop string | array Optional

One or more sequences at which generation stops

stream boolean Optional

Stream response as it's generated (default: false)
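The types and ranges listed above can be checked on the client before a request is sent. The following is a minimal sketch under stated assumptions: `validate_params` is a hypothetical helper, treating the range endpoints as inclusive is an assumption, and so is requiring `max_tokens` to be a positive integer. The server's actual validation may differ.

```python
from typing import Any, Dict, List

# Ranges as listed in the parameter reference; inclusive bounds are an assumption.
NUMERIC_RANGES = {
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
}


def validate_params(payload: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    errors = []
    for name in ("model", "messages"):
        if name not in payload:
            errors.append(f"missing required parameter: {name}")
    for name, (low, high) in NUMERIC_RANGES.items():
        if name not in payload:
            continue
        value = payload[name]
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or not (low <= value <= high):
            errors.append(f"{name} must be a number between {low} and {high}")
    max_tokens = payload.get("max_tokens")
    if max_tokens is not None:
        if isinstance(max_tokens, bool) or not isinstance(max_tokens, int) or max_tokens < 1:
            errors.append("max_tokens must be a positive integer")
    stop = payload.get("stop")
    if stop is not None:
        is_list_of_str = isinstance(stop, list) and all(isinstance(s, str) for s in stop)
        if not (isinstance(stop, str) or is_list_of_str):
            errors.append("stop must be a string or an array of strings")
    if "stream" in payload and not isinstance(payload["stream"], bool):
        errors.append("stream must be a boolean")
    return errors
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue at once.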

Advanced Parameters Example

Here's how to use advanced parameters to fine-tune the response:

Advanced Configuration

{
  "model": "gpt-4o-mini",
  "messages": [
    {
      "role": "system",
      "content": "You are a creative writing assistant."
    },
    {
      "role": "user",
      "content": "Write a short story about a robot discovering emotions."
    }
  ],
  "max_tokens": 500,
  "temperature": 0.8,
  "top_p": 0.9,
  "frequency_penalty": 0.5,
  "presence_penalty": 0.2,
  "stop": ["\n\nTHE END"]
}

Response Format

The API returns a structured response with the completion and metadata:

Response Structure

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Artificial intelligence (AI) is a branch of computer science that aims to create machines capable of performing tasks that typically require human intelligence..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 45,
    "total_tokens": 57
  }
}

Response Fields

id

Unique identifier for the completion

choices

Array of completion choices (usually one)

finish_reason

Why generation stopped: "stop", "length", or "content_filter"

usage

Token counts for the request (prompt_tokens, completion_tokens, and total_tokens), used for billing
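To make the response fields concrete, the sketch below pulls the commonly needed values out of a decoded response. `summarize_completion` is a hypothetical helper. Reading `"length"` as "the output was cut off by a token limit" is an assumption based on the usual meaning of that value in APIs of this shape.

```python
def summarize_completion(response: dict) -> dict:
    """Extract the first choice's text, its finish reason, and the token totals."""
    choice = response["choices"][0]  # the array usually holds a single choice
    finish_reason = choice["finish_reason"]
    return {
        "id": response["id"],
        "text": choice["message"]["content"],
        "finish_reason": finish_reason,
        # Assumption: "length" indicates the completion hit a token limit.
        "truncated": finish_reason == "length",
        "total_tokens": response["usage"]["total_tokens"],
    }
```

A caller might check `truncated` and retry with a larger `max_tokens`, or log `total_tokens` to track spending.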

Streaming Responses

For real-time applications, you can stream responses as they're generated:

Streaming Request

curl -X POST "https://api.tav-ai.com/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "Tell me a story"
      }
    ],
    "stream": true
  }'

Python Streaming Example

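import json

import requests

# "stream": True in the JSON body asks the API to stream the completion;
# stream=True on the requests call keeps the connection open for incremental reads.
response = requests.post(
    "https://api.tav-ai.com/v1/chat/completions",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json"
    },
    json={
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "Tell me a story"}],
        "stream": True
    },
    stream=True
)
response.raise_for_status()  # fail early on 4xx/5xx instead of parsing an error body

# Each event arrives as a line of the form "data: <json>"; "data: [DONE]" ends the stream.
for line in response.iter_lines():
    if not line:
        continue  # skip blank separator lines
    line = line.decode('utf-8')
    if not line.startswith('data: '):
        continue
    data = line[6:]
    if data == '[DONE]':
        break
    chunk = json.loads(data)
    content = chunk['choices'][0]['delta'].get('content')
    if content:
        print(content, end='', flush=True)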
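The line-parsing logic in the streaming example can also be factored into a generator, which makes it testable without a network connection. This is a sketch: `iter_stream_text` is a hypothetical helper, and the chunk shape (`choices[0].delta.content`) is taken from the example above, not from a full specification of the stream format.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from raw "data: ..." lines until "[DONE]" is seen."""
    prefix = "data: "
    for raw in lines:
        if not raw:
            continue  # blank lines separate events
        line = raw.decode("utf-8")
        if not line.startswith(prefix):
            continue  # ignore lines that are not data events
        data = line[len(prefix):]
        if data == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue  # defensive: skip chunks without choices
        text = choices[0].get("delta", {}).get("content")
        if text:
            yield text
```

With a live response, `"".join(iter_stream_text(response.iter_lines()))` would collect the full completion, while iterating and printing each fragment gives the incremental display shown above.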
