# Errors

> Error reference for the GGUF Cloud gateway: 401, 404, 503, and 502 responses, their JSON shape, and how to handle them.

The GGUF Cloud gateway authenticates your request, confirms the deployment is yours and ready, then proxies to your model's `llama-server`. Gateway-level errors are returned in a consistent JSON shape.

## Error Response Format

```json
{
  "error": {
    "type": "authentication_error",
    "message": "Human-readable error description"
  }
}
```

This shape applies to errors raised by the **gateway** (auth, routing, readiness). Errors raised by the model server itself (for example, an invalid sampling parameter) are passed through from `llama-server` and may use the standard OpenAI/Anthropic error shape.

## HTTP Status Codes

### 401 Unauthorized

**Cause**: The API key is missing or invalid.

**Common Issues**:

* No key sent in the `Authorization`, `x-api-key`, or `key` header.
* The key is incorrect, revoked, or not a valid ModelsLab API key.

**Example Response**:

```json
{
  "error": {
    "type": "authentication_error",
    "message": "Invalid API key."
  }
}
```

**Solution**: Send your ModelsLab API key in one of the accepted headers (see [Authentication](/gguf-cloud/authentication)) and verify it hasn't been revoked in your [dashboard](https://modelslab.com/dashboard/api-keys).

### 404 Not Found

**Cause**: There is no deployment for this `deployment_id`, or it does not belong to your account.

**Example Response**:

```json
{
  "error": {
    "type": "not_found_error",
    "message": "No deployment found for this endpoint."
  }
}
```

**Solution**: Check the `deployment_id` in your base URL against the one on your [deployment dashboard](https://modelslab.com/gguf-cloud), and confirm the API key belongs to the same account that owns the deployment.
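
The 401 and 404 responses above share the gateway error shape, so a client can pull out `type` and `message` with one small helper. The following is a minimal sketch, not part of the ModelsLab API: the function name `parse_gateway_error` is hypothetical, and the fallback for bodies that are not in the gateway shape is an assumption about how you might want to treat unexpected or passed-through responses.

```python
import json


def parse_gateway_error(body_text):
    """Return (type, message) from an error body.

    Hypothetical helper: falls back to (None, raw body) when the body is
    not JSON or has no "error" object.
    """
    try:
        body = json.loads(body_text)
    except (json.JSONDecodeError, TypeError):
        return None, body_text

    error = body.get("error") if isinstance(body, dict) else None
    if isinstance(error, dict):
        return error.get("type"), error.get("message")
    return None, body_text


# Example using the 404 body shown above
err_type, message = parse_gateway_error(
    '{"error": {"type": "not_found_error", '
    '"message": "No deployment found for this endpoint."}}'
)
print(err_type, "-", message)
```

Returning the raw body in the fallback case keeps the original text available for logging when the response does not match the expected shape.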
### 503 Service Unavailable

**Cause**: The deployment exists but is not ready: it is still deploying or has been paused.

**Example Response**:

```json
{
  "error": {
    "type": "overloaded_error",
    "message": "Endpoint is not ready (status: deploying). It may still be deploying or paused."
  }
}
```

**Solution**: Wait until the deployment shows **Ready** on the dashboard, then retry. If it is paused, resume it first.

### 502 Bad Gateway

**Cause**: The deployment is marked ready but the model pod is currently unreachable, typically because it is restarting or warming up after a config change.

**Example Response**:

```json
{
  "error": {
    "type": "api_error",
    "message": "The model endpoint is unreachable right now. It may be restarting — try again in a moment."
  }
}
```

**Solution**: This is transient. Retry shortly, ideally with exponential backoff. If it persists, check the deployment status on the dashboard.

## Handling Errors

Treat `503` and `502` as **retryable**: the endpoint is coming up or restarting. Retry with backoff, and stop on `401`/`404`, which require a fix on your side.

```python
import time

import httpx


def call_with_retry(url, headers, payload, max_retries=4):
    for attempt in range(max_retries):
        response = httpx.post(url, headers=headers, json=payload, timeout=120)

        if response.status_code == 200:
            return response.json()

        if response.status_code in (502, 503):
            # Endpoint still deploying / restarting: back off, then retry.
            # Skip the sleep after the final attempt.
            if attempt < max_retries - 1:
                time.sleep(2 ** attempt)
            continue

        # Anything else (401 / 404, or an error passed through from
        # llama-server) is not retryable: surface the error message.
        error = response.json().get("error", {})
        raise Exception(f"{error.get('type')}: {error.get('message')}")

    raise Exception("Max retries exceeded: endpoint did not become ready")
```

## Related

* Base URLs and the accepted auth headers.
* Error reference for the rest of the ModelsLab API.