curl --request POST \
--url https://modelslab.com/api/v7/llm/v1/messages \
--header 'Content-Type: application/json' \
--header 'x-api-key: <api-key>' \
--data '
{
"model": "<string>",
"max_tokens": 2,
"messages": [
{
"content": "<string>"
}
],
"system": "<string>",
"temperature": 1,
"top_p": 0.5,
"stream": false
}
'
{
"id": "<string>",
"content": [
{
"text": "<string>"
}
],
"model": "<string>",
"usage": {
"input_tokens": 123,
"output_tokens": 123
}
}
Anthropic-compatible messages endpoint. Works with the Anthropic SDK, Claude Code, and any Anthropic-compatible client.
POST https://modelslab.com/api/v7/llm/v1/messages
Authenticate by passing your API key in the x-api-key header or as a Bearer token.
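As a minimal sketch of the two authentication styles, the helper below builds the request headers for either one. The function name `build_headers` is hypothetical, and the `Authorization: Bearer ...` form is an assumption based on the usual Bearer-token convention; this page only states that a Bearer token is accepted.

```python
def build_headers(api_key: str, use_bearer: bool = False) -> dict[str, str]:
    """Return HTTP headers for the messages endpoint (illustrative helper)."""
    headers = {"Content-Type": "application/json"}
    if use_bearer:
        # Assumption: the standard Bearer scheme in the Authorization header.
        headers["Authorization"] = f"Bearer {api_key}"
    else:
        # Header name taken from the curl examples on this page.
        headers["x-api-key"] = api_key
    return headers


print(build_headers("YOUR_MODELSLAB_API_KEY"))
print(build_headers("YOUR_MODELSLAB_API_KEY", use_bearer=True))
```

Only one of the two credentials is sent per request here, which keeps the sketch unambiguous about which scheme is in use.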
curl -X POST https://modelslab.com/api/v7/llm/v1/messages \
-H "x-api-key: $MODELSLAB_API_KEY" \
-H "Content-Type: application/json" \
-H "anthropic-version: 2023-06-01" \
-d '{
"model": "Qwen/Qwen2.5-VL-72B-Instruct-together",
"max_tokens": 1024,
"messages": [
{"role": "user", "content": "What is the capital of France?"}
]
}'
A request body that also sets system, temperature, top_p, and stream:
{
"model": "Qwen/Qwen2.5-VL-72B-Instruct-together",
"max_tokens": 1024,
"messages": [
{"role": "user", "content": "What is the capital of France?"}
],
"system": "You are a helpful assistant.",
"temperature": 0.7,
"top_p": 1,
"stream": false
}
Example response:
{
"id": "msg_abc123",
"type": "message",
"role": "assistant",
"content": [
{
"type": "text",
"text": "The capital of France is Paris."
}
],
"model": "Qwen/Qwen2.5-VL-72B-Instruct-together",
"stop_reason": "end_turn",
"usage": {
"input_tokens": 15,
"output_tokens": 8
}
}
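The response above can be unpacked with the standard library alone. The sketch below joins the text of every `text` content block and sums the token counts; `extract_text` is a hypothetical helper name, and the sample is copied from the response shown above.

```python
import json

# Sample copied from the example response on this page.
raw = """
{
  "id": "msg_abc123",
  "type": "message",
  "role": "assistant",
  "content": [
    {"type": "text", "text": "The capital of France is Paris."}
  ],
  "model": "Qwen/Qwen2.5-VL-72B-Instruct-together",
  "stop_reason": "end_turn",
  "usage": {"input_tokens": 15, "output_tokens": 8}
}
"""


def extract_text(message: dict) -> str:
    """Concatenate the text of all content blocks whose type is "text"."""
    return "".join(
        block.get("text", "")
        for block in message.get("content", [])
        if block.get("type") == "text"
    )


message = json.loads(raw)
text = extract_text(message)
usage = message["usage"]
total_tokens = usage["input_tokens"] + usage["output_tokens"]

print(text)          # The capital of France is Paris.
print(total_tokens)  # 23
```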
"stream": true to receive Server-Sent Events:
curl -X POST https://modelslab.com/api/v7/llm/v1/messages \
-H "x-api-key: $MODELSLAB_API_KEY" \
-H "Content-Type: application/json" \
-H "anthropic-version: 2023-06-01" \
-d '{
"model": "Qwen/Qwen2.5-VL-72B-Instruct-together",
"max_tokens": 1024,
"messages": [{"role": "user", "content": "Write a haiku"}],
"stream": true
}'
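This page does not show the event payloads of the stream. Assuming they follow Anthropic's streaming format, where text arrives in `content_block_delta` events carrying a `delta.text` field, a client could collect the text roughly as follows. The helper name `iter_text_deltas` and the sample lines are hypothetical and hand-written for illustration; they are not captured output from this endpoint.

```python
import json
from typing import Iterable, Iterator


def iter_text_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from Server-Sent Events lines.

    Assumption: each `data:` line holds a JSON object, and text arrives in
    objects whose "type" is "content_block_delta" (Anthropic-style events).
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip `event:` lines, comments, and blank separators
        payload = line[len("data:"):].strip()
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue  # ignore data lines that are not JSON
        if not isinstance(event, dict):
            continue
        if event.get("type") == "content_block_delta":
            text = event.get("delta", {}).get("text")
            if text:
                yield text


# Hypothetical sample stream, for illustration only.
sample = [
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, '
    '"delta": {"type": "text_delta", "text": "Hello"}}',
    "",
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, '
    '"delta": {"type": "text_delta", "text": ", world"}}',
    "",
    "event: message_stop",
    'data: {"type": "message_stop"}',
]

print("".join(iter_text_deltas(sample)))  # Hello, world
```

In practice the lines would come from a streaming HTTP response rather than a list; the official SDKs handle this parsing for you.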
Point the official Anthropic SDK at ModelsLab by setting base_url and api_key:
from anthropic import Anthropic

client = Anthropic(
    api_key="YOUR_MODELSLAB_API_KEY",
    base_url="https://modelslab.com/api/v7/llm",
)

# Non-streaming
message = client.messages.create(
    model="Qwen/Qwen2.5-VL-72B-Instruct-together",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain quantum computing"}
    ],
)
print(message.content[0].text)

# Streaming
with client.messages.stream(
    model="Qwen/Qwen2.5-VL-72B-Instruct-together",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a story"}],
) as stream:
    for text in stream.text_stream:
        print(text, end="")
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  apiKey: 'YOUR_MODELSLAB_API_KEY',
  baseURL: 'https://modelslab.com/api/v7/llm',
});

const message = await client.messages.create({
  model: 'Qwen/Qwen2.5-VL-72B-Instruct-together',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'Hello!' },
  ],
});
console.log(message.content[0].text);
ANTHROPIC_BASE_URL="https://modelslab.com/api/v7/llm" \
ANTHROPIC_AUTH_TOKEN="YOUR_MODELSLAB_API_KEY" \
claude --model "Qwen/Qwen2.5-VL-72B-Instruct-together"
x-api-key (header): API key authentication via the x-api-key header
anthropic-version (header): Anthropic API version
model: Model ID to use
max_tokens: Maximum number of tokens to generate (x >= 1)
messages: Array of input messages
system: System prompt
temperature: Sampling temperature (0 <= x <= 1)
top_p: Nucleus sampling parameter (0 <= x <= 1)
stream: Whether to stream the response
Response: Message response
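The numeric ranges listed for the request parameters can be checked on the client before sending. The sketch below is one way to do that; `validate_request` is a hypothetical helper, and it assumes the range `x >= 1` belongs to `max_tokens` and `0 <= x <= 1` to `temperature` and `top_p`, which is inferred from the order of the descriptions. It does not check which fields are required, since this page does not say.

```python
def validate_request(body: dict) -> list[str]:
    """Return a list of range problems in a request body (empty if none)."""
    problems = []

    max_tokens = body.get("max_tokens")
    if max_tokens is not None and max_tokens < 1:
        problems.append("max_tokens must be >= 1")

    # Both sampling parameters share the same documented range.
    for name in ("temperature", "top_p"):
        value = body.get(name)
        if value is not None and not 0 <= value <= 1:
            problems.append(f"{name} must be between 0 and 1")

    return problems


print(validate_request({"max_tokens": 1024, "temperature": 0.7, "top_p": 1}))
# []
print(validate_request({"max_tokens": 0, "temperature": 1.5}))
# ['max_tokens must be >= 1', 'temperature must be between 0 and 1']
```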