Technical catalog of the OpenAI APIs and models
Executive summary
This report consolidates, as a technical catalog, the REST endpoints (plus streaming/realtime) and the officially documented models, taking as its baseline the most recent documentation available as of April 12, 2026. The API surface is described as a set of REST APIs, streaming APIs (Server-Sent Events), and realtime APIs.
Key operational points:
- Core generation endpoint: the Responses API (recommended as the unified interface for recent models; the "latest" models are available via Responses).
- Rate limiting is expressed mainly in RPM/TPM (plus RPD/TPD/IPM), with variations by model and tier; some operations have specific limits (e.g. ingest into a Vector Store).
- Error handling: HTTP 4xx/5xx status codes with error types (BadRequest, Authentication, PermissionDenied, NotFound, UnprocessableEntity, RateLimit, InternalServerError) and retry/backoff best practices.
Additional machine-readable deliverables (extracts) were generated as part of this work.
Important methodological note: part of the endpoint catalog was derived from a public OpenAPI specification (mirror) and from API reference pages; some very recent endpoint families present in the API reference (e.g. "ChatKit", "Containers", "Skills", "Videos", and some advanced Responses endpoints) are not fully represented in the analyzed OpenAPI mirror specification and would require dedicated extraction from the individual reference pages to complete all the per-endpoint details requested (full parameters + examples). This report explicitly flags where details are missing.
Common conventions and cross-cutting standards
Base URL and version:
- Base URL: `https://api.openai.com/v1` (the `/v1` prefix applies to all REST paths).
Authentication and headers:
- Standard minimum headers (nearly all REST endpoints): `Authorization: Bearer <OPENAI_API_KEY>` and `Content-Type: application/json` (for JSON payloads).
- For beta/legacy endpoints (e.g. Assistants v2 in the legacy APIs), the documentation and examples indicate the header `OpenAI-Beta: assistants=v2`.
Metadata:
- Objects such as Conversation support `metadata` with up to 16 key/value pairs; keys up to 64 characters, values up to 512 characters.
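As a sketch, these documented limits can be enforced client-side before sending a request (the function name is illustrative, not part of any SDK):

```python
def validate_metadata(metadata: dict) -> None:
    """Client-side check of the documented metadata limits:
    at most 16 key/value pairs, keys up to 64 chars, values up to 512 chars."""
    if len(metadata) > 16:
        raise ValueError(f"too many metadata pairs: {len(metadata)} > 16")
    for key, value in metadata.items():
        if len(key) > 64:
            raise ValueError(f"metadata key too long: {key!r}")
        if len(str(value)) > 512:
            raise ValueError(f"metadata value too long for key {key!r}")
```

Rejecting oversized metadata locally avoids a round trip that would end in a 400 error.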
Pagination (common pattern):
- List endpoints typically expose `limit` (1..100, default 20) and the cursors `after`/`before`, or `after`/`order`.
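A minimal cursor-pagination loop, with the HTTP call abstracted behind an injected `fetch_page` callable; the `has_more`/`last_id` envelope fields follow the common list-object shape but should be verified per endpoint (this is a sketch, not an official client):

```python
def list_all(fetch_page, limit: int = 100):
    """Drain a cursor-paginated list endpoint.

    `fetch_page(limit=..., after=...)` is expected to return a dict shaped like
    the common list envelope: {"data": [...], "has_more": bool, "last_id": "..."}.
    """
    items, after = [], None
    while True:
        page = fetch_page(limit=limit, after=after)
        items.extend(page["data"])
        if not page.get("has_more"):
            return items
        # advance the cursor to the last item of the page just consumed
        after = page["last_id"]
```

Passing the previous page's `last_id` as `after` is what makes the iteration resumable and idempotent.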
Streaming (SSE):
- Some endpoints support `stream: true` with Server-Sent Events.
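A rough sketch of parsing an SSE stream into JSON events; the `data:` field is the generic SSE framing, while the `[DONE]` sentinel is used by some streaming endpoints to terminate the stream (payload shapes vary by endpoint):

```python
import json

def iter_sse_events(lines):
    """Yield decoded JSON payloads from an iterable of SSE lines.

    SSE frames are `data: <payload>` lines separated by blank lines;
    the literal sentinel `[DONE]` ends the stream where it is used.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and event:/id:/comment fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        yield json.loads(payload)
```
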
Rate limits:
- Rate limits are typically measured in RPM/RPD/TPM/TPD and, for images, IPM; they vary by model and by organization/project tier.
- Example of a specific limit: batch file ingest into a Vector Store is capped at roughly 300 RPM per vector store on some endpoints.
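Client-side, an RPM budget translates into a minimum spacing between request starts; a minimal pacing helper (illustrative):

```python
def min_interval_seconds(rpm: int) -> float:
    """Minimum delay between request starts to stay within an RPM budget."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    return 60.0 / rpm

# e.g. a ~300 RPM ingest budget allows one request start every 0.2 s
```
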
Error handling (schema and status codes):
- Error types by status code: 400, 401, 403, 404, 422, 429, >=500; retry with backoff on 429/5xx and handle timeouts.
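The retry guidance for 429/5xx can be sketched as exponential backoff with full jitter; the base delay and cap below are illustrative choices, not documented values:

```python
import random

def should_retry(status: int) -> bool:
    """Retry on rate limiting (429) and server errors (>= 500)."""
    return status == 429 or status >= 500

def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Full-jitter exponential backoff: a random delay in
    [0, min(cap, base * 2**attempt)] for the given retry attempt."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))
```

Full jitter spreads concurrent clients' retries apart, which matters precisely when a 429 indicates contention.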
Generic JSON error example applicable to most endpoints (typical "error object" schema; exact fields may vary):
```json
{
"error": {
"message": "Invalid request",
"type": "invalid_request_error",
"param": "model",
"code": "model_not_found"
}
}
```

Endpoint catalog
Mermaid diagram: logical map of the endpoints
```mermaid
flowchart LR
A[Client] -->|HTTPS REST /v1| R[REST APIs]
A -->|SSE stream=true| S["Streaming (SSE)"]
A -->|WebRTC/WebSocket| RT[Realtime API]
R --> RESP[/responses + subresources/]
R --> CHAT[/chat/completions/]
R --> EMB[/embeddings/]
R --> IMG[/images/*/]
R --> AUD[/audio/*/]
R --> FILES[/files, uploads/]
R --> BATCH[/batches/]
R --> FT[/fine_tuning/]
R --> EVAL[/evals/]
R --> VS[/vector_stores/]
R --> ORG[/organization/* admin/]
```

Endpoint table (endpoint, method, description, rate limit)
The following table lists the endpoints extracted from the analyzed OpenAPI mirror specification; for additional endpoints present in the API Reference but not represented in that specification (e.g. POST /v1/responses/input_tokens, POST /v1/responses/{id}/cancel, POST /v1/responses/compact, /v1/conversations), see the next section.
| endpoint | method | description | rate_limit |
|---|---|---|---|
| /v1/assistants | GET | Returns a list of assistants. | depends on tier/model (see Rate limits docs) |
| /v1/assistants | POST | Create an assistant with a model and instructions. | depends on tier/model (see Rate limits docs) |
| /v1/assistants/ | GET | Retrieves an assistant. | depends on tier/model (see Rate limits docs) |
| /v1/assistants/ | POST | Modifies an assistant. | depends on tier/model (see Rate limits docs) |
| /v1/assistants/ | DELETE | Delete an assistant. | depends on tier/model (see Rate limits docs) |
| /v1/audio/speech | POST | Generates audio from the input text. | depends on tier/model (see Rate limits docs) |
| /v1/audio/transcriptions | POST | Transcribes audio into the input language. | depends on tier/model (see Rate limits docs) |
| /v1/audio/translations | POST | Translates audio into English. | depends on tier/model (see Rate limits docs) |
| /v1/batches | GET | List your organization's batches. | depends on tier/model (see Rate limits docs) |
| /v1/batches | POST | Creates and executes a batch from an uploaded file of requests. | depends on tier/model (see Rate limits docs) |
| /v1/batches/ | GET | Retrieves a batch. | depends on tier/model (see Rate limits docs) |
| /v1/batches/{batch_id}/cancel | POST | Cancels an in-progress batch. | depends on tier/model (see Rate limits docs) |
| /v1/chat/completions | GET | List stored chat completions. Only Chat Completions that have been stored with the store parameter set to true will be returned. | depends on tier/model (see Rate limits docs) |
| /v1/chat/completions | POST | Creates a model response for the given chat conversation. | depends on tier/model (see Rate limits docs) |
| /v1/chat/completions/ | GET | Get a stored chat completion. Only Chat Completions that have been stored with the store parameter set to true will be returned. | depends on tier/model (see Rate limits docs) |
| /v1/chat/completions/ | POST | Modify a stored chat completion. Only Chat Completions that have been stored with the store parameter set to true can be modified. | depends on tier/model (see Rate limits docs) |
| /v1/chat/completions/ | DELETE | Delete a stored chat completion. Only Chat Completions that have been stored with the store parameter set to true can be deleted. | depends on tier/model (see Rate limits docs) |
| /v1/chat/completions/{completion_id}/messages | GET | Get the messages in a stored chat completion. Only Chat Completions that have been stored with the store parameter set to true will be returned. | depends on tier/model (see Rate limits docs) |
| /v1/completions | POST | Creates a completion for the provided prompt and parameters. | depends on tier/model (see Rate limits docs) |
| /v1/embeddings | POST | Creates an embedding vector representing the input text. | depends on tier/model (see Rate limits docs) |
| /v1/evals | GET | List evals for a project. | depends on tier/model (see Rate limits docs) |
| /v1/evals | POST | Create an Eval. | depends on tier/model (see Rate limits docs) |
| /v1/evals/ | GET | Get an Eval by ID. | depends on tier/model (see Rate limits docs) |
| /v1/evals/ | POST | Update an Eval by ID. | depends on tier/model (see Rate limits docs) |
| /v1/evals/ | DELETE | Delete an Eval by ID. | depends on tier/model (see Rate limits docs) |
| /v1/evals/{eval_id}/runs | GET | Get a list of runs for an Eval. | depends on tier/model (see Rate limits docs) |
| /v1/evals/{eval_id}/runs | POST | Create a run for an Eval. | depends on tier/model (see Rate limits docs) |
| /v1/evals/{eval_id}/runs/ | GET | Get an Eval run by ID. | depends on tier/model (see Rate limits docs) |
| /v1/evals/{eval_id}/runs/ | DELETE | Delete an Eval run by ID. | depends on tier/model (see Rate limits docs) |
| /v1/evals/{eval_id}/runs/{run_id}/cancel | POST | Cancel an Eval run. | depends on tier/model (see Rate limits docs) |
| /v1/evals/{eval_id}/runs/{run_id}/output_items | GET | Get a list of output items for an Eval run. | depends on tier/model (see Rate limits docs) |
| /v1/evals/{eval_id}/runs/{run_id}/output_items/ | GET | Get an output item for an Eval run. | depends on tier/model (see Rate limits docs) |
| /v1/files | GET | Returns a list of files that belong to the user's organization. | depends on tier/model (see Rate limits docs) |
| /v1/files | POST | Upload a file that can be used across various endpoints/features. | depends on tier/model (see Rate limits docs) |
| /v1/files/ | GET | Returns information about a specific file. | depends on tier/model (see Rate limits docs) |
| /v1/files/ | DELETE | Delete a file. | depends on tier/model (see Rate limits docs) |
| /v1/files/{file_id}/content | GET | Returns the contents of the specified file. | depends on tier/model (see Rate limits docs) |
| /v1/fine_tuning/checkpoints/{fine_tuned_model_checkpoint}/permissions | GET | List checkpoint permissions for a fine-tuned model checkpoint. | depends on tier/model (see Rate limits docs) |
| /v1/fine_tuning/checkpoints/{fine_tuned_model_checkpoint}/permissions | POST | Create a permission for a fine-tuned model checkpoint. | depends on tier/model (see Rate limits docs) |
| /v1/fine_tuning/checkpoints/{fine_tuned_model_checkpoint}/permissions/ | GET | Retrieve a permission for a fine-tuned model checkpoint. | depends on tier/model (see Rate limits docs) |
| /v1/fine_tuning/checkpoints/{fine_tuned_model_checkpoint}/permissions/ | DELETE | Delete a permission for a fine-tuned model checkpoint. | depends on tier/model (see Rate limits docs) |
| /v1/fine_tuning/jobs | GET | List your organization's fine-tuning jobs. | depends on tier/model (see Rate limits docs) |
| /v1/fine_tuning/jobs | POST | Creates a fine-tuning job which begins the process of creating a new model from a given dataset. | depends on tier/model (see Rate limits docs) |
| /v1/fine_tuning/jobs/ | GET | Get info about a fine-tuning job. | depends on tier/model (see Rate limits docs) |
| /v1/fine_tuning/jobs/{fine_tuning_job_id}/cancel | POST | Immediately cancel a fine-tune job. | depends on tier/model (see Rate limits docs) |
| /v1/fine_tuning/jobs/{fine_tuning_job_id}/checkpoints | GET | List checkpoints for a fine-tuning job. | depends on tier/model (see Rate limits docs) |
| /v1/fine_tuning/jobs/{fine_tuning_job_id}/events | GET | Get fine-grained status updates for a fine-tuning job. | depends on tier/model (see Rate limits docs) |
| /v1/images/edits | POST | Creates an edited or extended image given an original image and a prompt. | depends on tier/model (see Rate limits docs) |
| /v1/images/generations | POST | Creates an image given a prompt. | depends on tier/model (see Rate limits docs) |
| /v1/images/variations | POST | Creates a variation of a given image. | depends on tier/model (see Rate limits docs) |
| /v1/models | GET | Lists the currently available models, and provides basic information about each one such as the owner and availability. | depends on tier/model (see Rate limits docs) |
| /v1/models/ | GET | Retrieves a model instance, providing basic information about the model such as the owner and permissioning. | depends on tier/model (see Rate limits docs) |
| /v1/moderations | POST | Classifies if text and/or image inputs are potentially harmful. | depends on tier/model (see Rate limits docs) |
| /v1/organization/admin_api_keys | GET | List organization-wide admin API keys. | depends on tier/model (see Rate limits docs) |
| /v1/organization/admin_api_keys | POST | Create an organization-wide admin API key. | depends on tier/model (see Rate limits docs) |
| /v1/organization/admin_api_keys/ | GET | Retrieve an admin API key. | depends on tier/model (see Rate limits docs) |
| /v1/organization/admin_api_keys/ | DELETE | Delete an admin API key. | depends on tier/model (see Rate limits docs) |
| /v1/organization/audit_logs | GET | List user actions and configuration changes within this organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/certificates | GET | List uploaded certificates. | depends on tier/model (see Rate limits docs) |
| /v1/organization/certificates | POST | Upload a certificate for use by the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/certificates/ | GET | Get details for a certificate. | depends on tier/model (see Rate limits docs) |
| /v1/organization/certificates/ | POST | Update a certificate. | depends on tier/model (see Rate limits docs) |
| /v1/organization/certificates/ | DELETE | Delete a certificate from the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/costs | GET | Get cost details for the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/invites | GET | Returns a list of invites in the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/invites | POST | Create an invite for a user to the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/invites/ | GET | Retrieves an invite. | depends on tier/model (see Rate limits docs) |
| /v1/organization/invites/ | DELETE | Delete an invite. If the invite has already been accepted, it cannot be deleted. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects | GET | Lists all projects for the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects | POST | Create a new project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/ | GET | Retrieves a project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/ | POST | Modifies a project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/api_keys | GET | Returns a list of API keys in the project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/api_keys/ | GET | Retrieves an API key in the project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/api_keys/ | DELETE | Deletes an API key from the project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/archive | POST | Archives a project in the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/certificates | GET | List certificates for a project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/certificates/activate | POST | Activates a certificate for a project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/certificates/deactivate | POST | Deactivates a certificate for a project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/rate_limits | GET | Returns the rate limits per model for a project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/rate_limits/ | POST | Updates a rate limit. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/service_accounts | GET | Returns a list of service accounts in the project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/service_accounts | POST | Creates a service account in the project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/service_accounts/ | GET | Retrieves a service account in the project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/service_accounts/ | DELETE | Deletes a service account from the project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/users | GET | Returns a list of users in the project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/users | POST | Adds a user to the project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/users/ | GET | Retrieves a user in the project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/users/ | POST | Modifies a user's role in the project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/users/ | DELETE | Deletes a user from the project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/users | GET | Lists all of the users in the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/users/ | GET | Retrieves a user by their identifier. | depends on tier/model (see Rate limits docs) |
| /v1/organization/users/ | POST | Modifies a user's role in the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/users/ | DELETE | Deletes a user from the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/usage/audio_speeches | GET | Get audio speeches usage details for the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/usage/audio_transcriptions | GET | Get audio transcriptions usage details for the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/usage/code_interpreter_sessions | GET | Get code interpreter sessions usage details for the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/usage/completions | GET | Get completions usage details for the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/usage/embeddings | GET | Get embeddings usage details for the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/usage/images | GET | Get images usage details for the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/usage/moderations | GET | Get moderations usage details for the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/usage/vector_stores | GET | Get vector stores usage details for the organization. | depends on tier/model (see Rate limits docs) |
| /v1/realtime/sessions | POST | Create an ephemeral API token for use in client-side applications with the Realtime API. Can be configured with the same session parameters as the session.update client event. | depends on tier/model (see Rate limits docs) |
| /v1/realtime/transcription_sessions | POST | Create an ephemeral API token for use in client-side applications with the Realtime API specifically for realtime transcriptions. Can be configured with the same session parameters as the transcription_session.update client event. | depends on tier/model (see Rate limits docs) |
| /v1/responses | POST | Creates a model response. Provide text or image inputs to generate text or JSON outputs. Have the model call your own custom code or use built-in tools like web search or file search. | depends on tier/model (see Rate limits docs) |
| /v1/responses/ | GET | Retrieves a model response with the given ID. | depends on tier/model (see Rate limits docs) |
| /v1/responses/ | DELETE | Deletes a model response with the given ID. | depends on tier/model (see Rate limits docs) |
| /v1/responses/{response_id}/input_items | GET | Returns a list of input items for a given response. | depends on tier/model (see Rate limits docs) |
| /v1/uploads | POST | Creates an intermediate Upload object that you can add Parts to. Currently, an Upload can accept at most 8 GB in total and expires an hour after you create it. | depends on tier/model (see Rate limits docs) |
| /v1/uploads/{upload_id}/cancel | POST | Cancels the Upload. No Parts may be added after an Upload is cancelled. | depends on tier/model (see Rate limits docs) |
| /v1/uploads/{upload_id}/complete | POST | Completes the Upload. After completing, the Upload is removed and a new File object is created. | depends on tier/model (see Rate limits docs) |
| /v1/uploads/{upload_id}/parts | POST | Adds a Part to an Upload object. A Part represents a chunk of bytes from the file you are uploading. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores | GET | Returns a list of vector stores. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores | POST | Create a vector store. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores/ | GET | Retrieves a vector store. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores/ | POST | Modifies a vector store. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores/ | DELETE | Delete a vector store. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores/{vector_store_id}/file_batches | POST | Create a vector store file batch. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores/{vector_store_id}/file_batches/ | GET | Retrieves a vector store file batch. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores/{vector_store_id}/file_batches/{batch_id}/cancel | POST | Cancel a vector store file batch. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores/{vector_store_id}/file_batches/{batch_id}/files | GET | Returns a list of vector store files in a batch. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores/{vector_store_id}/files | GET | Returns a list of vector store files. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores/{vector_store_id}/files | POST | Create a vector store file by attaching a File to a vector store. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores/{vector_store_id}/files/ | GET | Retrieves a vector store file. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores/{vector_store_id}/files/ | POST | Modifies a vector store file. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores/{vector_store_id}/files/ | DELETE | Delete a vector store file. This will remove the file from the vector store but the file itself will not be deleted. To delete the file, use the delete file endpoint. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores/{vector_store_id}/files/{file_id}/content | GET | Retrieve the parsed contents of a vector store file. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores/{vector_store_id}/search | POST | Search a vector store for relevant chunks based on a query and file attributes filter. | depends on tier/model (see Rate limits docs) |
Rate limits: the exact values (RPM/TPM, etc.) depend on model and tier; consult the "Rate limits" guide.
Main endpoint sheets (technical details)
Below are full technical sheets for the main runtime, ingestion, and retrieval endpoints. For the complete list, with parameters and examples, of the endpoints present in the analyzed OpenAPI mirror specification, refer to the downloadable JSON.
Responses API
POST /v1/responses
Purpose: generates a model response (text and/or JSON, with optional tool calling).
Request parameters (JSON body, top-level):
- `model` (required, string): model ID.
- `input` (required, string | array): text or a list of input items (multimodal).
- `instructions` (optional, string): developer/system instructions for the request.
- `max_output_tokens` (optional, integer): upper bound on output (includes reasoning tokens where applicable).
- `temperature`, `top_p` (optional, number): sampling parameters.
- `stream` (optional, boolean): enables SSE streaming.
- `include` (optional, array): include additional fields (e.g. web/file search results, logprobs).
- `truncation` (optional, `"auto"` | `"disabled"`): behavior when the input exceeds the context window.
- `store` (optional, boolean): persist for later retrieval (where supported).
- `metadata` (optional, object with up to 16 pairs): metadata.
Required headers:
- `Authorization: Bearer ...`
- `Content-Type: application/json`
cURL example:
```bash
curl https://api.openai.com/v1/responses \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-5.4",
"input": "Scrivi una funzione Apex che valida un IBAN."
}'
```

Example response (success, abridged):
```json
{
"id": "resp_...",
"object": "response",
"status": "completed",
"model": "gpt-4o-2024-08-06",
"output": [
{
"type": "message",
"role": "assistant",
"content": [{ "type": "output_text", "text": "..." }]
}
],
"usage": { "input_tokens": 0, "output_tokens": 0, "total_tokens": 0 }
}
```

Common errors:
- 400 invalid request (e.g. input too large with `truncation="disabled"`).
- 401/403 authentication/authorization, 429 rate limit, 5xx server error.
GET /v1/responses/{response_id}
Notable query params:
- `include[]` (optional): include extra fields.
- `stream` (optional, boolean): SSE streaming on retrieval as well.
- `include_obfuscation`, `starting_after` (advanced streaming/event-replay use).
DELETE /v1/responses/{response_id}
Returns `{ "id": "...", "object": "response", "deleted": true }`.
GET /v1/responses/{response_id}/input_items
Query params:
- `after` (cursor), `limit` (1..100, default 20), `order` (`asc`|`desc`, default `desc`), `include[]`.
Advanced Responses endpoints present in recent documentation but not fully included in the analyzed OpenAPI mirror specification:
- `POST /v1/responses/input_tokens`: input token counting.
- `POST /v1/responses/{response_id}/cancel`: cancels a background response (only if created with `background=true`).
- `POST /v1/responses/compact`: compaction of long conversations.
Example for POST /v1/responses/input_tokens:
```bash
curl -X POST https://api.openai.com/v1/responses/input_tokens \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-5",
"input": "Tell me a joke."
}'
```

```json
{ "object": "response.input_tokens", "input_tokens": 11 }
```

Conversations API
These (new) endpoints are documented in the recent API reference and provide conversational state kept separate from generation.
POST /v1/conversations
Body:
- `items` (optional, array): initial items; max 20 items per call.
- `metadata` (optional, up to 16 key/value pairs, with the key/value length constraints above).
Example:
```bash
curl https://api.openai.com/v1/conversations \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"metadata": {"topic": "demo"},
"items": [
{ "type": "message", "role": "user", "content": "Hello!" }
]
}'
```

GET /v1/conversations/{conversation_id}
Returns the conversation object with its metadata.
POST /v1/conversations/{conversation_id} (update)
Body:
- `metadata` (required in the update body): updates the metadata.
"Items" endpoints of the Conversations API: the API reference lists Create/Retrieve/Delete/List operations on items, but this report did not extract the full paths and complete examples from each method's page (which would be needed to exhaustively cover request/response bodies per endpoint).
Chat Completions API
Note: there is a "stored chat completions" surface (with store=true) offering list/retrieve/modify/delete operations and message listing.
POST /v1/chat/completions
Body (main fields):
- `model` (required, string)
- `messages` (required, array)
- `temperature`, `top_p` (optional)
- `max_completion_tokens` / `max_tokens` (optional; new vs. legacy naming)
- `response_format` (optional; Structured Outputs)
- `tools`, `tool_choice`, `parallel_tool_calls` (optional)
- `store` (optional, boolean), `stream` (optional, boolean)
Minimal cURL example:
```bash
curl https://api.openai.com/v1/chat/completions \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-5.4-mini",
"messages": [
{"role":"user","content":"Genera una regex per validare un IBAN IT."}
]
}'
```

Related operations (stored):
- `GET /v1/chat/completions`
- `GET|POST|DELETE /v1/chat/completions/{completion_id}`
- `GET /v1/chat/completions/{completion_id}/messages`
Embeddings API
POST /v1/embeddings
Body:
- `model` (required)
- `input` (required, string or array)
- `dimensions` (optional), `encoding_format` (optional), `user` (optional)
Example:
```bash
curl https://api.openai.com/v1/embeddings \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "text-embedding-3-large",
"input": "Customer churn prediction features"
}'
```

Images API
POST /v1/images/generations
Body (main fields):
- `prompt` (required)
- `model`, `n`, `size`, `quality`, `style`, `background`, `response_format` (optional)
`POST /v1/images/edits` and `POST /v1/images/variations` (multipart) are present in the mirror specification.
Deprecation note: DALL·E model snapshots are flagged as deprecated, with shutdown dates (e.g. May 2026).
Audio API
Speech synthesis:
- `POST /v1/audio/speech` (JSON): requires `model`, `input`, `voice`; optional `response_format`, `speed`, `instructions`.
Transcription/translation:
- `POST /v1/audio/transcriptions` (multipart): requires `file`, `model`; optional `language`, `prompt`, `response_format`, `temperature`, `timestamp_granularities[]`, `stream`.
- `POST /v1/audio/translations` (multipart).
The recent API reference also shows "Create a voice" and "Voice Consents" endpoints (CRUD + list); the analyzed mirror specification does not provide full paths and examples for them, so no exhaustive sheet is given for each.
Files and Uploads
Files:
- `GET /v1/files`, `POST /v1/files` (multipart: `file`, `purpose`), `GET|DELETE /v1/files/{file_id}`, `GET /v1/files/{file_id}/content`.
Uploads (multi-part uploads, up to large sizes):
- `POST /v1/uploads` (JSON: `filename`, `purpose`, `bytes`, `mime_type`)
- `POST /v1/uploads/{upload_id}/parts`
- `POST /v1/uploads/{upload_id}/complete`
- `POST /v1/uploads/{upload_id}/cancel`
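A sketch of splitting a payload into chunks to send as Upload Parts; the 8 GB total cap comes from the Uploads documentation above, while the 64 MB part size is an illustrative choice to verify against the documented per-part limit:

```python
def iter_parts(data: bytes, part_size: int = 64 * 1024 * 1024):
    """Yield consecutive chunks of `data` to send as Upload Parts,
    enforcing the 8 GB total documented for an Upload."""
    if len(data) > 8 * 1024 ** 3:
        raise ValueError("payload exceeds the 8 GB Upload limit")
    for offset in range(0, len(data), part_size):
        yield data[offset:offset + part_size]
```

Each yielded chunk would be posted to `/v1/uploads/{upload_id}/parts` before calling `complete`.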
Vector Stores
Vector stores (CRUD + search):
- `POST /v1/vector_stores`, `GET /v1/vector_stores`
- `GET|POST|DELETE /v1/vector_stores/{vector_store_id}`
- `POST /v1/vector_stores/{vector_store_id}/search`
Files on a vector store:
- `POST /v1/vector_stores/{vector_store_id}/files` (attach a file)
- `GET|POST|DELETE /v1/vector_stores/{vector_store_id}/files/{file_id}`
- `GET /v1/vector_stores/{vector_store_id}/files/{file_id}/content`
File batches (asynchronous ingest):
- `POST /v1/vector_stores/{vector_store_id}/file_batches`
- `GET /v1/vector_stores/{vector_store_id}/file_batches/{batch_id}`
- `POST /v1/vector_stores/{vector_store_id}/file_batches/{batch_id}/cancel`
- `GET /v1/vector_stores/{vector_store_id}/file_batches/{batch_id}/files`
A vector-store-specific ingest rate limit is indicated in the rate limits guide.
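Because ingest is asynchronous, clients typically poll the file batch until it leaves the in-progress state; a sketch with the HTTP call injected as `get_status` (the function names are illustrative, and the terminal status strings should be verified against the batch object schema):

```python
import time

def wait_for_batch(get_status, poll_interval: float = 1.0,
                   timeout: float = 600.0) -> str:
    """Poll `get_status()` (returning the batch's `status` string)
    until a terminal state is reached or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status()
        if status in ("completed", "failed", "cancelled"):
            return status
        time.sleep(poll_interval)
    raise TimeoutError("vector store file batch did not settle in time")
```
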
Batches
- `POST /v1/batches` (JSON: `input_file_id`, `endpoint`, `completion_window`, `metadata?`)
- `GET /v1/batches`, `GET /v1/batches/{batch_id}`, `POST /v1/batches/{batch_id}/cancel`
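The uploaded input file referenced by `input_file_id` is JSONL, one request envelope per line; a minimal builder following the Batch request envelope shape (`custom_id`, `method`, `url`, `body`), offered as a sketch:

```python
import json

def build_batch_line(custom_id: str, body: dict,
                     url: str = "/v1/chat/completions") -> str:
    """Serialize one Batch API request envelope as a JSONL line."""
    return json.dumps({
        "custom_id": custom_id,  # caller-chosen ID to match results back
        "method": "POST",
        "url": url,
        "body": body,
    })
```

Joining many such lines with `"\n"` yields the file to upload with `purpose` set for batch use before creating the batch.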
Fine-tuning and Evals
Fine-tuning:
- `POST /v1/fine_tuning/jobs` (required: `model`, `training_file`; optional `validation_file`, `hyperparameters`, `integrations`, `seed`, `method`, `metadata`, `suffix`)
- `GET /v1/fine_tuning/jobs`, `GET /v1/fine_tuning/jobs/{id}`, `POST /v1/fine_tuning/jobs/{id}/cancel`
- `GET /v1/fine_tuning/jobs/{id}/events`, `GET /v1/fine_tuning/jobs/{id}/checkpoints`
- Checkpoint permissions CRUD on `/v1/fine_tuning/checkpoints/{checkpoint}/permissions...`
Evals:
- CRUD on `/v1/evals`, run management on `/v1/evals/{eval_id}/runs...`, output_items listing/retrieval.
Administration (Organization/Projects/Usage)
Administrative endpoints include:
- Audit logs, costs, invites, users, admin API keys
- Projects CRUD, project users/service accounts, project API keys, project rate limits, project certificates
- Usage breakdown by category (images, embeddings, moderations, vector stores, etc.)
Note: these endpoints often require elevated privileges and organization/project governance; use service accounts and per-project segregation, and avoid exposing admin keys on the client.
Mermaid diagram: typical request → response flow
```mermaid
sequenceDiagram
participant C as Client
participant O as OpenAI API (/v1)
C->>O: HTTPS POST /responses (Authorization Bearer, JSON)
alt stream=false
O-->>C: 200 JSON {response}
else stream=true
O-->>C: 200 SSE events (delta, completed)
end
Note over C,O: Errors: 4xx/5xx; 429 rate limit -> retry with backoff
```

Model catalog
Premises and sources of truth
- The "Models" documentation recommends a flagship model plus mini/nano variants for cost/latency tradeoffs; it also lists prices for some models and notes availability via the Responses API.
- The actual set of available models and their snapshots can vary by account/tier and over time; the documentation also includes deprecated models/snapshots with shutdown dates.
Model table (model, type, key parameters, recommended use, notes)
The following table is a "seed catalog" derived from extracting the model identifiers present in the textual documentation (it also includes historical/deprecated models). For operational use: prefer "latest" IDs or modern snapshots and check for deprecations.
| model | type | key_parameters | recommended_use | notes |
|---|---|---|---|---|
| gpt-4o-mini-transcribe | audio/stt | file, model, language?, prompt?, response_format?, timestamp_granularities? | transcription and diarization (where supported) | |
| gpt-4o-transcribe | audio/stt | file, model, language?, prompt?, response_format?, timestamp_granularities? | transcription and diarization (where supported) | |
| gpt-4o-transcribe-diarize | audio/stt | file, model, language?, prompt?, response_format?, timestamp_granularities? | transcription and diarization (where supported) | |
| gpt-4o-transcribe-latest | audio/stt | file, model, language?, prompt?, response_format?, timestamp_granularities? | transcription and diarization (where supported) | |
| whisper-1 | audio/stt | file, model, language?, prompt?, response_format?, timestamp_granularities? | transcription and diarization (where supported) | |
| gpt-4o-mini-tts | audio/tts | model, input, voice, response_format?, speed? | speech synthesis | |
| gpt-4o-tts | audio/tts | model, input, voice, response_format?, speed? | speech synthesis | |
| text-embedding-3-large | embedding | input, model, dimensions?, encoding_format?, user? | semantic search, clustering, RAG | |
| text-embedding-3-small | embedding | input, model, dimensions?, encoding_format?, user? | semantic search, clustering, RAG | |
| text-embedding-ada-002 | embedding | input, model, dimensions?, encoding_format?, user? | semantic search, clustering, RAG | legacy model |
| omni-moderation-latest | moderation | input (text/image), model | content classification, policy enforcement | |
| text-moderation-latest | moderation | input (text/image), model | content classification, policy enforcement | possible regional limitations |
| text-moderation-stable | moderation | input (text/image), model | content classification, policy enforcement | "stable" model |
| text-moderation-007 | moderation | input (text/image), model | content classification, policy enforcement | legacy model |
| gpt-realtime | realtime | session config, audio codecs, turn detection, events | voice agents, low-latency speech-to-speech | |
| gpt-realtime-1.5 | realtime | session config, audio codecs, turn detection, events | voice agents, low-latency speech-to-speech | |
| gpt-realtime-mini | realtime | session config, audio codecs, turn detection, events | voice agents, low-latency speech-to-speech | |
| gpt-image-1 | vision/image | prompt, model?, size?, quality?, n?, response_format? | image generation/editing | |
| gpt-image-1-mini | vision/image | prompt, model?, size?, quality?, n?, response_format? | image generation/editing | cost-efficient variant |
| gpt-image-1.5 | vision/image | prompt, model?, size?, quality?, n?, response_format? | image generation/editing | |
| dall-e-2 | vision/image | prompt, model?, size?, quality?, n?, response_format? | image generation/editing | deprecated/shutdown on roadmap |
| dall-e-3 | vision/image | prompt, model?, size?, quality?, n?, response_format? | image generation/editing | deprecated/shutdown on roadmap |
| o1 | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | legacy |
| o1-mini | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | deprecated (see Deprecations) |
| o1-preview | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | deprecated (see Deprecations) |
| o1-pro | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | |
| o1-pro-2025-03-19 | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | snapshot |
| o3 | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | |
| o3-2025-04-16 | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | snapshot |
| o3-mini | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | |
| o3-mini-2025-01-31 | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | snapshot |
| o4-mini | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | |
| o4-mini-2025-04-16 | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | snapshot |
| gpt-5.4 | general/chat | messages/input, model, max_output_tokens, temperature/top_p, tools?, response_format? | chat, text analysis, coding, multimodal | |
| gpt-5.4-mini | general/chat | messages/input, model, max_output_tokens, temperature/top_p, tools?, response_format? | chat, text analysis, coding, multimodal | |
| gpt-5.4-nano | general/chat | messages/input, model, max_output_tokens, temperature/top_p, tools?, response_format? | chat, text analysis, coding, multimodal | |
| gpt-4.1 | general/chat | messages/input, model, max_output_tokens, temperature/top_p, tools?, response_format? | chat, text analysis, coding, multimodal | |
| gpt-4o | general/chat | messages/input, model, max_output_tokens, temperature/top_p, tools?, response_format? | chat, text analysis, coding, multimodal | |
| gpt-3.5-turbo | general/chat | messages/input, model, max_output_tokens, temperature/top_p, tools?, response_format? | chat, text analysis, coding, multimodal | legacy/selective deprecations |
| gpt-3.5-turbo-0125 | general/chat | messages/input, model, max_output_tokens, temperature/top_p, tools?, response_format? | chat, text analysis, coding, multimodal | snapshot |
| gpt-4-0613 | general/chat | messages/input, model, max_output_tokens, temperature/top_p, tools?, response_format? | chat, text analysis, coding, multimodal | legacy |
| computer-use-preview | general/chat | messages/input, model, max_output_tokens, temperature/top_p, tools?, response_format? | chat, text analysis, coding, multimodal | specialized tool model |
(The full list is available in the downloadable seed CSV.)
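The key parameters in the table map directly onto request bodies. For example, a minimal embeddings payload (the `dimensions` value here is chosen purely for illustration):

```python
import json

# Build the body for POST /v1/embeddings using parameters from the table.
payload = {
    "model": "text-embedding-3-small",
    "input": ["semantic search query", "candidate document"],
    "dimensions": 256,           # optional: request a truncated vector
    "encoding_format": "float",  # or "base64"
}
body = json.dumps(payload)
```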
Differences and capabilities (focus on "current" models listed in Models)
- GPT-5.4 / mini / nano: the Models page presents GPT-5.4 as the "flagship" model for reasoning and coding, with mini/nano for latency/cost; it also lists input/output prices per MTok and context window (up to 1M for GPT-5.4).
- Specialized models:
- Image: the GPT Image family (e.g. gpt-image-1.5 and its mini variant).
- Realtime: gpt-realtime-1.5 and gpt-realtime-mini for low-latency speech-to-speech.
- Speech generation / transcription: GPT-4o mini TTS, GPT-4o Transcribe, GPT-4o mini Transcribe.
Pricing:
- The Models page includes explicit prices for some frontier models (e.g. GPT-5.4, mini, nano) in $/Input MTok and $/Output MTok.
- For complete, up-to-date prices across all families (including images/audio/tools), the documentation points to the Pricing section; this report does not list a price for every model because not all were stated on the "Models" pages consulted.
Deprecations:
- The Deprecations guide lists models/snapshots and (in some cases) retired historical endpoints, with shutdown dates and recommended replacements (e.g. migrations from DALL·E to GPT Image, from o1-preview/o1-mini to o3/o4-mini, etc.).
Final best practices and references
Technical best practices (summary):
- Prefer the Responses API as the unified interface for recent models and tool calling; use SSE streaming (`stream=true`) for low perceived latency; handle replay/obfuscation correctly where needed.
- Implement retries with exponential backoff + jitter for 429 and some 5xx classes; distinguish permanent errors (400/401/403/404/422) from transient ones.
- Observe rate limits per unit (RPM/TPM/TPD/IPM) and, where present, per-resource limits (e.g. vector store ingest).
- For objects with `metadata`, respect cardinality and length constraints; use metadata for correlation (tenant, environment, trace id) and auditing.
- Avoid dependencies on models slated for deprecation; monitor the Deprecations page and migrate to the recommended equivalents.
- For "beta/legacy" endpoints (Assistants v2, etc.): isolate them in dedicated modules and include the required headers (e.g. `OpenAI-Beta`) only where necessary.
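The retry recommendation above can be sketched as "full jitter" backoff, where each delay is drawn uniformly from a capped exponential window (parameter values are illustrative, not prescribed by the documentation):

```python
import random

def backoff_delays(max_retries=5, base=0.5, cap=30.0):
    """Exponential backoff with full jitter for 429/transient 5xx retries.

    The delay for attempt n is drawn uniformly from
    [0, min(cap, base * 2**n)], which spreads concurrent retries out
    and avoids synchronized thundering-herd spikes.
    """
    return [random.uniform(0, min(cap, base * 2 ** n)) for n in range(max_retries)]

delays = backoff_delays()  # e.g. sleep(delays[n]) before attempt n+1
```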
Main references (integrated via citations in the report):
- API Overview (REST/streaming/realtime overview).
- API Reference (Responses, Conversations, etc.).
- Rate limits.
- Error codes and error-type mapping.
- Models (model selection, capabilities, and prices for some models).
- Deprecations (deprecated models/endpoints and replacements).