
Technical catalog of the OpenAI APIs and models

Executive summary

This report consolidates, as a technical catalog, the REST endpoints (plus streaming/realtime) and the officially documented models, taking as its baseline the most recent documentation available as of April 12, 2026. The API surface is described as a set of REST APIs, streaming (Server-Sent Events), and realtime APIs.

Key operational points:

  • Core generation endpoint: the Responses API (recommended as the unified interface for recent models; the “latest” models are available via Responses).
  • Rate limiting is expressed mainly in RPM/TPM (plus RPD/TPD/IPM) and varies by model and tier; some operations have specific limits (e.g. Vector Store ingest).
  • Error handling: HTTP 4xx/5xx status codes with error types (BadRequest, Authentication, PermissionDenied, NotFound, UnprocessableEntity, RateLimit, InternalServerError) and retry/backoff best practices.

Additional deliverables (machine-readable extracts) produced in this work:

Important methodological note: part of the endpoint catalog was derived from a public (mirrored) OpenAPI specification and from API reference pages; some very recent endpoint families present in the API reference (e.g. “ChatKit”, “Containers”, “Skills”, “Videos”, some advanced Responses endpoints) are not fully represented in the mirrored OpenAPI specification analyzed, and completing all the requested per-endpoint details (full parameters plus examples) would require dedicated extraction from the individual reference pages. This report explicitly flags where details are missing.

Common conventions and cross-cutting standards

Base URL and version:

  • Base URL: https://api.openai.com/v1 (the /v1 prefix applies to all REST paths).

Authentication and headers:

  • Minimum standard headers (almost all REST endpoints):
    • Authorization: Bearer <OPENAI_API_KEY>
    • Content-Type: application/json (for JSON payloads)
  • For “beta/legacy” endpoints (e.g. Assistants v2 in the legacy APIs), the documentation and examples indicate the header OpenAI-Beta: assistants=v2.

Metadata:

  • Objects such as Conversation support metadata with up to 16 key/value pairs; keys up to 64 characters, values up to 512 characters.
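
As a sketch, these documented limits can be enforced client-side before sending a request (the helper name is ours, not part of any SDK):

```python
def validate_metadata(metadata: dict) -> None:
    """Client-side check of the documented metadata limits:
    at most 16 key/value pairs, keys <= 64 chars, values <= 512 chars."""
    if len(metadata) > 16:
        raise ValueError("metadata supports at most 16 key/value pairs")
    for key, value in metadata.items():
        if len(key) > 64:
            raise ValueError(f"metadata key too long (max 64 chars): {key!r}")
        if len(str(value)) > 512:
            raise ValueError(f"metadata value too long (max 512 chars) for key {key!r}")

validate_metadata({"topic": "demo"})  # passes silently
```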

Pagination (common pattern):

  • “List” endpoints typically expose limit (1..100, default 20) and after/before cursors, or after/order.
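
The cursor pattern above can be drained with a small loop; `fetch_page` below is a stub standing in for one HTTP GET against a list endpoint (the function names and the fake data are ours), assuming the common list envelope with `data`, `has_more`, and `last_id`:

```python
def list_all(fetch_page, limit=20):
    """Drain a cursor-paginated 'list' endpoint.

    `fetch_page(limit, after)` stands in for one HTTP GET and must return
    a dict shaped like the common list envelope:
    {"data": [...], "has_more": bool, "last_id": ...}.
    """
    items, after = [], None
    while True:
        page = fetch_page(limit=limit, after=after)
        items.extend(page["data"])
        if not page.get("has_more"):
            return items
        after = page["last_id"]

# Stub standing in for the API, paging through 0..4 two at a time.
def fake_fetch(limit, after):
    data = list(range(5))
    start = 0 if after is None else after + 1
    chunk = data[start:start + limit]
    return {"data": chunk,
            "has_more": start + limit < len(data),
            "last_id": chunk[-1] if chunk else None}

print(list_all(fake_fetch, limit=2))  # [0, 1, 2, 3, 4]
```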

Streaming (SSE):

  • Some endpoints support stream: true with Server-Sent Events.
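
A minimal sketch of consuming such a stream, assuming the common `data: <json>` SSE framing and a `response.output_text.delta`-style event shape (the sample lines are fabricated for illustration):

```python
import json

def iter_sse_events(lines):
    """Yield parsed JSON payloads from a stream of Server-Sent Events lines.

    Events arrive as 'data: <json>' lines separated by blank lines; the
    literal sentinel 'data: [DONE]' (used by some streaming endpoints)
    terminates the stream.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        yield json.loads(payload)

stream = [
    'data: {"type": "response.output_text.delta", "delta": "Hel"}',
    '',
    'data: {"type": "response.output_text.delta", "delta": "lo"}',
    '',
    'data: [DONE]',
]
text = "".join(e["delta"] for e in iter_sse_events(stream))
print(text)  # Hello
```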

Rate limits:

  • Rate limits are typically measured in RPM/RPD/TPM/TPD and, for images, IPM; they vary by model and by organization/project tier.
  • Example of an operation-specific limit: batch file ingest on Vector Stores is limited to roughly 300 RPM per vector store on some endpoints.

Error handling (schema and codes):

  • Error types by status code: 400, 401, 403, 404, 422, 429, >=500; retry with backoff on 429/5xx and handle timeouts.
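
A common retry schedule for 429/5xx is exponential backoff with full jitter; a minimal sketch (the parameter values are illustrative choices, not documented defaults):

```python
import random

def backoff_delays(max_retries=5, base=0.5, cap=30.0, seed=None):
    """Exponential backoff with full jitter for 429/5xx responses:
    delay_n = uniform(0, min(cap, base * 2**n))."""
    rng = random.Random(seed)
    return [rng.uniform(0, min(cap, base * 2 ** n)) for n in range(max_retries)]

for attempt, delay in enumerate(backoff_delays(seed=42)):
    print(f"retry {attempt}: sleep {delay:.2f}s")
```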

Generic JSON error example applicable to most endpoints (typical “error object” schema; the exact fields may vary):

json
{
  "error": {
    "message": "Invalid request",
    "type": "invalid_request_error",
    "param": "model",
    "code": "model_not_found"
  }
}
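
A sketch of turning that envelope into a retry decision (the helper name is ours; the retryable-status rule simply follows the 429/5xx guidance above):

```python
import json

def classify_error(status_code, body):
    """Map an HTTP status + error body to (error_type, retryable).

    `body` is the typical error envelope shown above; 429 and 5xx are
    treated as retryable, other 4xx as permanent request errors.
    """
    err = json.loads(body).get("error", {})
    retryable = status_code == 429 or status_code >= 500
    return err.get("type", "unknown"), retryable

etype, retry = classify_error(
    429, '{"error": {"message": "Rate limit", "type": "rate_limit_error"}}')
print(etype, retry)  # rate_limit_error True
```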

Endpoint catalog

Mermaid diagram: logical map of the endpoints

mermaid
flowchart LR
  A[Client] -->|HTTPS REST /v1| R[REST APIs]
  A -->|SSE stream=true| S["Streaming (SSE)"]
  A -->|WebRTC/WebSocket| RT[Realtime API]

  R --> RESP["/responses + subresources"]
  R --> CHAT["/chat/completions"]
  R --> EMB["/embeddings"]
  R --> IMG["/images/*"]
  R --> AUD["/audio/*"]
  R --> FILES["/files, /uploads"]
  R --> BATCH["/batches"]
  R --> FT["/fine_tuning"]
  R --> EVAL["/evals"]
  R --> VS["/vector_stores"]
  R --> ORG["/organization/* (admin)"]

Endpoint table (endpoint, method, description, rate limit)

The following table lists the endpoints extracted from the mirrored OpenAPI specification analyzed; additional endpoints present in the API Reference but not represented in that specification (e.g. POST /v1/responses/input_tokens, POST /v1/responses/{id}/cancel, POST /v1/responses/compact, /v1/conversations) are covered in the next section.

Rate limit (all rows): depends on model and tier; see the “Rate limits” guide.

endpoint | method | description
/v1/assistants | GET | Returns a list of assistants.
/v1/assistants | POST | Create an assistant with a model and instructions.
/v1/assistants/{assistant_id} | GET | Retrieves an assistant.
/v1/assistants/{assistant_id} | POST | Modifies an assistant.
/v1/assistants/{assistant_id} | DELETE | Delete an assistant.
/v1/audio/speech | POST | Generates audio from the input text.
/v1/audio/transcriptions | POST | Transcribes audio into the input language.
/v1/audio/translations | POST | Translates audio into English.
/v1/batches | GET | List your organization's batches.
/v1/batches | POST | Creates and executes a batch from an uploaded file of requests.
/v1/batches/{batch_id} | GET | Retrieves a batch.
/v1/batches/{batch_id}/cancel | POST | Cancels an in-progress batch.
/v1/chat/completions | GET | List stored chat completions (only those stored with store=true are returned).
/v1/chat/completions | POST | Creates a model response for the given chat conversation.
/v1/chat/completions/{completion_id} | GET | Get a stored chat completion (store=true only).
/v1/chat/completions/{completion_id} | POST | Modify a stored chat completion (store=true only).
/v1/chat/completions/{completion_id} | DELETE | Delete a stored chat completion (store=true only).
/v1/chat/completions/{completion_id}/messages | GET | Get the messages in a stored chat completion (store=true only).
/v1/completions | POST | Creates a completion for the provided prompt and parameters.
/v1/embeddings | POST | Creates an embedding vector representing the input text.
/v1/evals | GET | List evals for a project.
/v1/evals | POST | Create an Eval.
/v1/evals/{eval_id} | GET | Get an Eval by ID.
/v1/evals/{eval_id} | POST | Update an Eval by ID.
/v1/evals/{eval_id} | DELETE | Delete an Eval by ID.
/v1/evals/{eval_id}/runs | GET | Get a list of runs for an Eval.
/v1/evals/{eval_id}/runs | POST | Create a run for an Eval.
/v1/evals/{eval_id}/runs/{run_id} | GET | Get an Eval run by ID.
/v1/evals/{eval_id}/runs/{run_id} | DELETE | Delete an Eval run by ID.
/v1/evals/{eval_id}/runs/{run_id}/cancel | POST | Cancel an Eval run.
/v1/evals/{eval_id}/runs/{run_id}/output_items | GET | Get a list of output items for an Eval run.
/v1/evals/{eval_id}/runs/{run_id}/output_items/{output_item_id} | GET | Get an output item for an Eval run.
/v1/files | GET | Returns a list of files that belong to the user's organization.
/v1/files | POST | Upload a file that can be used across various endpoints/features.
/v1/files/{file_id} | GET | Returns information about a specific file.
/v1/files/{file_id} | DELETE | Delete a file.
/v1/files/{file_id}/content | GET | Returns the contents of the specified file.
/v1/fine_tuning/checkpoints/{fine_tuned_model_checkpoint}/permissions | GET | List checkpoint permissions for a fine-tuned model checkpoint.
/v1/fine_tuning/checkpoints/{fine_tuned_model_checkpoint}/permissions | POST | Create a permission for a fine-tuned model checkpoint.
/v1/fine_tuning/checkpoints/{fine_tuned_model_checkpoint}/permissions/{permission_id} | GET | Retrieve a permission for a fine-tuned model checkpoint.
/v1/fine_tuning/checkpoints/{fine_tuned_model_checkpoint}/permissions/{permission_id} | DELETE | Delete a permission for a fine-tuned model checkpoint.
/v1/fine_tuning/jobs | GET | List your organization's fine-tuning jobs.
/v1/fine_tuning/jobs | POST | Creates a fine-tuning job which begins the process of creating a new model from a given dataset.
/v1/fine_tuning/jobs/{fine_tuning_job_id} | GET | Get info about a fine-tuning job.
/v1/fine_tuning/jobs/{fine_tuning_job_id}/cancel | POST | Immediately cancel a fine-tune job.
/v1/fine_tuning/jobs/{fine_tuning_job_id}/checkpoints | GET | List checkpoints for a fine-tuning job.
/v1/fine_tuning/jobs/{fine_tuning_job_id}/events | GET | Get fine-grained status updates for a fine-tuning job.
/v1/images/edits | POST | Creates an edited or extended image given an original image and a prompt.
/v1/images/generations | POST | Creates an image given a prompt.
/v1/images/variations | POST | Creates a variation of a given image.
/v1/models | GET | Lists the currently available models and provides basic information about each one, such as the owner and availability.
/v1/models/{model} | GET | Retrieves a model instance, providing basic information about the model such as the owner and permissioning.
/v1/moderations | POST | Classifies if text and/or image inputs are potentially harmful.
/v1/organization/admin_api_keys | GET | List organization-wide admin API keys.
/v1/organization/admin_api_keys | POST | Create an organization-wide admin API key.
/v1/organization/admin_api_keys/{key_id} | GET | Retrieve an admin API key.
/v1/organization/admin_api_keys/{key_id} | DELETE | Delete an admin API key.
/v1/organization/audit_logs | GET | List user actions and configuration changes within this organization.
/v1/organization/certificates | GET | List uploaded certificates.
/v1/organization/certificates | POST | Upload a certificate for use by the organization.
/v1/organization/certificates/{certificate_id} | GET | Get details for a certificate.
/v1/organization/certificates/{certificate_id} | POST | Update a certificate.
/v1/organization/certificates/{certificate_id} | DELETE | Delete a certificate from the organization.
/v1/organization/costs | GET | Get cost details for the organization.
/v1/organization/invites | GET | Returns a list of invites in the organization.
/v1/organization/invites | POST | Create an invite for a user to the organization.
/v1/organization/invites/{invite_id} | GET | Retrieves an invite.
/v1/organization/invites/{invite_id} | DELETE | Delete an invite. If the invite has already been accepted, it cannot be deleted.
/v1/organization/projects | GET | Lists all projects for the organization.
/v1/organization/projects | POST | Create a new project.
/v1/organization/projects/{project_id} | GET | Retrieves a project.
/v1/organization/projects/{project_id} | POST | Modifies a project.
/v1/organization/projects/{project_id}/api_keys | GET | Returns a list of API keys in the project.
/v1/organization/projects/{project_id}/api_keys/{key_id} | GET | Retrieves an API key in the project.
/v1/organization/projects/{project_id}/api_keys/{key_id} | DELETE | Deletes an API key from the project.
/v1/organization/projects/{project_id}/archive | POST | Archives a project in the organization.
/v1/organization/projects/{project_id}/certificates | GET | List certificates for a project.
/v1/organization/projects/{project_id}/certificates/activate | POST | Activates a certificate for a project.
/v1/organization/projects/{project_id}/certificates/deactivate | POST | Deactivates a certificate for a project.
/v1/organization/projects/{project_id}/rate_limits | GET | Returns the rate limits per model for a project.
/v1/organization/projects/{project_id}/rate_limits/{rate_limit_id} | POST | Updates a rate limit.
/v1/organization/projects/{project_id}/service_accounts | GET | Returns a list of service accounts in the project.
/v1/organization/projects/{project_id}/service_accounts | POST | Creates a service account in the project.
/v1/organization/projects/{project_id}/service_accounts/{service_account_id} | GET | Retrieves a service account in the project.
/v1/organization/projects/{project_id}/service_accounts/{service_account_id} | DELETE | Deletes a service account from the project.
/v1/organization/projects/{project_id}/users | GET | Returns a list of users in the project.
/v1/organization/projects/{project_id}/users | POST | Adds a user to the project.
/v1/organization/projects/{project_id}/users/{user_id} | GET | Retrieves a user in the project.
/v1/organization/projects/{project_id}/users/{user_id} | POST | Modifies a user's role in the project.
/v1/organization/projects/{project_id}/users/{user_id} | DELETE | Deletes a user from the project.
/v1/organization/users | GET | Lists all of the users in the organization.
/v1/organization/users/{user_id} | GET | Retrieves a user by their identifier.
/v1/organization/users/{user_id} | POST | Modifies a user's role in the organization.
/v1/organization/users/{user_id} | DELETE | Deletes a user from the organization.
/v1/organization/usage/audio_speeches | GET | Get audio speeches usage details for the organization.
/v1/organization/usage/audio_transcriptions | GET | Get audio transcriptions usage details for the organization.
/v1/organization/usage/code_interpreter_sessions | GET | Get code interpreter sessions usage details for the organization.
/v1/organization/usage/completions | GET | Get completions usage details for the organization.
/v1/organization/usage/embeddings | GET | Get embeddings usage details for the organization.
/v1/organization/usage/images | GET | Get images usage details for the organization.
/v1/organization/usage/moderations | GET | Get moderations usage details for the organization.
/v1/organization/usage/vector_stores | GET | Get vector stores usage details for the organization.
/v1/realtime/sessions | POST | Create an ephemeral API token for client-side use with the Realtime API. Can be configured with the same session parameters as the session.update client event.
/v1/realtime/transcription_sessions | POST | Create an ephemeral API token for client-side use with the Realtime API, specifically for realtime transcription. Can be configured with the same session parameters as the transcription_session.update client event.
/v1/responses | POST | Creates a model response. Provide text or image inputs to generate text or JSON outputs; the model can call your own custom code or built-in tools like web search or file search.
/v1/responses/{response_id} | GET | Retrieves a model response with the given ID.
/v1/responses/{response_id} | DELETE | Deletes a model response with the given ID.
/v1/responses/{response_id}/input_items | GET | Returns a list of input items for a given response.
/v1/uploads | POST | Creates an intermediate Upload object that you can add Parts to. An Upload currently accepts at most 8 GB in total and expires an hour after creation.
/v1/uploads/{upload_id}/cancel | POST | Cancels the Upload. No Parts may be added after an Upload is cancelled.
/v1/uploads/{upload_id}/complete | POST | Completes the Upload. After completing, the Upload is removed and a new File object is created.
/v1/uploads/{upload_id}/parts | POST | Adds a Part to an Upload object. A Part represents a chunk of bytes from the file you are uploading.
/v1/vector_stores | GET | Returns a list of vector stores.
/v1/vector_stores | POST | Create a vector store.
/v1/vector_stores/{vector_store_id} | GET | Retrieves a vector store.
/v1/vector_stores/{vector_store_id} | POST | Modifies a vector store.
/v1/vector_stores/{vector_store_id} | DELETE | Delete a vector store.
/v1/vector_stores/{vector_store_id}/file_batches | POST | Create a vector store file batch.
/v1/vector_stores/{vector_store_id}/file_batches/{batch_id} | GET | Retrieves a vector store file batch.
/v1/vector_stores/{vector_store_id}/file_batches/{batch_id}/cancel | POST | Cancel a vector store file batch.
/v1/vector_stores/{vector_store_id}/file_batches/{batch_id}/files | GET | Returns a list of vector store files in a batch.
/v1/vector_stores/{vector_store_id}/files | GET | Returns a list of vector store files.
/v1/vector_stores/{vector_store_id}/files | POST | Create a vector store file by attaching a File to a vector store.
/v1/vector_stores/{vector_store_id}/files/{file_id} | GET | Retrieves a vector store file.
/v1/vector_stores/{vector_store_id}/files/{file_id} | POST | Modifies a vector store file.
/v1/vector_stores/{vector_store_id}/files/{file_id} | DELETE | Delete a vector store file. This removes the file from the vector store, but the file itself is not deleted; to delete it, use the delete file endpoint.
/v1/vector_stores/{vector_store_id}/files/{file_id}/content | GET | Retrieve the parsed contents of a vector store file.
/v1/vector_stores/{vector_store_id}/search | POST | Search a vector store for relevant chunks based on a query and file attributes filter.

Rate limits: the exact values (RPM/TPM, etc.) depend on model and tier; see the “Rate limits” guide.

Main endpoint sheets (technical detail)

Below are the technical sheets for the main runtime, ingestion, and retrieval endpoints. For the complete list, with parameters and examples, of the endpoints present in the mirrored OpenAPI specification analyzed, refer to the downloadable JSON.

Responses API

POST /v1/responses
Purpose: generates a model response (text/JSON, with optional tool calling).

Request parameters (JSON body, top-level):

  • model (required, string): model ID.
  • input (required, string | array): text, or a list of input items (multimodal).
  • instructions (optional, string): “developer/system” instructions for the request.
  • max_output_tokens (optional, integer): upper bound on output tokens (includes reasoning tokens where applicable).
  • temperature, top_p (optional, number): sampling parameters.
  • stream (optional, boolean): enables SSE streaming.
  • include (optional, array): include extra fields (e.g. web/file search results, logprobs).
  • truncation (optional, "auto"|"disabled"): behavior when the input exceeds the context window.
  • store (optional, boolean): persist for later retrieval (where supported).
  • metadata (optional, object of up to 16 pairs): metadata.

Required headers:

  • Authorization: Bearer ...
  • Content-Type: application/json

cURL example:

bash
curl https://api.openai.com/v1/responses \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "input": "Scrivi una funzione Apex che valida un IBAN."
  }'

Example response (success, abridged):

json
{
  "id": "resp_...",
  "object": "response",
  "status": "completed",
  "model": "gpt-4o-2024-08-06",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [{ "type": "output_text", "text": "..." }]
    }
  ],
  "usage": { "input_tokens": 0, "output_tokens": 0, "total_tokens": 0 }
}

Common errors:

  • 400 invalid request (e.g. input too large with truncation="disabled").
  • 401/403 authentication/authorization, 429 rate limit, 5xx server errors.

GET /v1/responses/{response_id}
Notable query params:

  • include[] (optional): include extra fields.
  • stream (optional, boolean): SSE streaming on retrieve as well.
  • include_obfuscation, starting_after (advanced streaming/event-replay use).

DELETE /v1/responses/{response_id}
Returns { "id": "...", "object": "response", "deleted": true }.

GET /v1/responses/{response_id}/input_items
Query params:

  • after (cursor), limit (1..100, default 20), order (asc|desc, default desc), include[].

Advanced Responses endpoints present in recent documentation but not fully included in the mirrored OpenAPI specification analyzed:

  • POST /v1/responses/input_tokens: counts input tokens.
  • POST /v1/responses/{response_id}/cancel: cancels a background response (only if created with background=true).
  • POST /v1/responses/compact: compaction of long conversations.

POST /v1/responses/input_tokens example:

bash
curl -X POST https://api.openai.com/v1/responses/input_tokens \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5",
    "input": "Tell me a joke."
  }'

Response:

json
{ "object": "response.input_tokens", "input_tokens": 11 }

Conversations API

These (new) endpoints are documented in the recent API reference and provide conversational state separate from generation.

POST /v1/conversations
Body:

  • items (optional, array): initial items; max 20 items per call.
  • metadata (optional, up to 16 key/value pairs, with the key/value constraints above).

Example:

bash
curl https://api.openai.com/v1/conversations \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "metadata": {"topic": "demo"},
    "items": [
      { "type": "message", "role": "user", "content": "Hello!" }
    ]
  }'

GET /v1/conversations/{conversation_id}
Returns the conversation object with its metadata.

POST /v1/conversations/{conversation_id} (update)
Body:

  • metadata (required in the update body): updates the metadata.

“Items” endpoints of the Conversations API: the API reference lists Create/Retrieve/Delete/List operations on items, but this report did not extract the full paths and examples from each method's specific page (needed to cover “body/response per endpoint” exhaustively).

Chat Completions API

Note: there is a “stored chat completions” surface (with store=true) offering list/retrieve/modify/delete operations and message listing.

POST /v1/chat/completions
Body (main fields):

  • model (required, string)
  • messages (required, array)
  • temperature, top_p (optional)
  • max_completion_tokens / max_tokens (optional; new vs. legacy naming)
  • response_format (optional; Structured Outputs)
  • tools, tool_choice, parallel_tool_calls (optional)
  • store (optional, boolean), stream (optional, boolean)

Minimal cURL example:

bash
curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4-mini",
    "messages": [
      {"role":"user","content":"Genera una regex per validare un IBAN IT."}
    ]
  }'

Related operations (stored):

  • GET /v1/chat/completions
  • GET|POST|DELETE /v1/chat/completions/{completion_id}
  • GET /v1/chat/completions/{completion_id}/messages

Embeddings API

POST /v1/embeddings
Body:

  • model (required)
  • input (required, string or array)
  • dimensions (optional), encoding_format (optional), user (optional)

Example:

bash
curl https://api.openai.com/v1/embeddings \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-large",
    "input": "Customer churn prediction features"
  }'
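
Returned vectors are typically compared with cosine similarity; a minimal, dependency-free sketch (the 3-dimensional vectors are toy stand-ins for real embeddings):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy 3-dimensional stand-ins for real embedding vectors.
print(cosine_similarity([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # 0.0
```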

Images API

POST /v1/images/generations
Body (main fields):

  • prompt (required)
  • model, n, size, quality, style, background, response_format (optional)

POST /v1/images/edits and POST /v1/images/variations (multipart) are present in the mirrored specification.

Deprecation note: DALL·E snapshot models are flagged as deprecated, with shutdown dates (e.g. May 2026).

Audio API

Text-to-speech:

  • POST /v1/audio/speech (JSON): requires model, input, voice; optional response_format, speed, instructions.

Transcription/translation:

  • POST /v1/audio/transcriptions (multipart): requires file, model; optional language, prompt, response_format, temperature, timestamp_granularities[], stream.
  • POST /v1/audio/translations (multipart).

The recent API reference also shows “Create a voice” and “Voice Consents” endpoints (CRUD + list); the mirrored specification analyzed does not provide the full paths and examples for them, so no exhaustive sheet is given for each.

Files and Uploads

Files:

  • GET /v1/files, POST /v1/files (multipart: file, purpose), GET|DELETE /v1/files/{file_id}, GET /v1/files/{file_id}/content.

Uploads (multi-part upload for large files):

  • POST /v1/uploads (JSON: filename, purpose, bytes, mime_type)
  • POST /v1/uploads/{upload_id}/parts
  • POST /v1/uploads/{upload_id}/complete
  • POST /v1/uploads/{upload_id}/cancel
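
The Parts flow implies chunking the payload client-side; a sketch (the 64 MB default here is an illustrative choice, not a documented part-size limit; the documented cap is 8 GB total per Upload):

```python
def split_into_parts(data: bytes, part_size: int = 64 * 1024 * 1024):
    """Split a payload into ordered chunks for the Uploads flow
    (create upload -> add parts -> complete)."""
    return [data[i:i + part_size] for i in range(0, len(data), part_size)]

parts = split_into_parts(b"x" * 10, part_size=4)
print([len(p) for p in parts])  # [4, 4, 2]
```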

Vector Stores

Vector stores (CRUD + search):

  • POST /v1/vector_stores, GET /v1/vector_stores, GET|POST|DELETE /v1/vector_stores/{vector_store_id}, POST /v1/vector_stores/{vector_store_id}/search

Files on a vector store:

  • POST /v1/vector_stores/{vector_store_id}/files (attach a file)
  • GET|POST|DELETE /v1/vector_stores/{vector_store_id}/files/{file_id}
  • GET /v1/vector_stores/{vector_store_id}/files/{file_id}/content

File batches (asynchronous ingest):

  • POST /v1/vector_stores/{vector_store_id}/file_batches
  • GET /v1/vector_stores/{vector_store_id}/file_batches/{batch_id}
  • POST /v1/vector_stores/{vector_store_id}/file_batches/{batch_id}/cancel
  • GET /v1/vector_stores/{vector_store_id}/file_batches/{batch_id}/files

The vector-store-specific ingest rate limit is given in the rate limits guide.

Batches

  • POST /v1/batches (JSON: input_file_id, endpoint, completion_window, metadata?)
  • GET /v1/batches, GET /v1/batches/{batch_id}, POST /v1/batches/{batch_id}/cancel
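
Each line of the batch input file referenced by input_file_id is a self-contained JSONL request; a sketch of building one line (the custom_id/method/url/body envelope follows the commonly documented Batch API format; the values are illustrative):

```python
import json

def batch_request_line(custom_id, model, messages):
    """One line of the JSONL input file expected by POST /v1/batches:
    each line is a self-contained request with a unique custom_id."""
    return json.dumps({
        "custom_id": custom_id,
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {"model": model, "messages": messages},
    })

line = batch_request_line("req-1", "gpt-4o-mini",
                          [{"role": "user", "content": "Hello"}])
print(line)
```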

Fine-tuning and Evals

Fine-tuning:

  • POST /v1/fine_tuning/jobs (required: model, training_file; optional: validation_file, hyperparameters, integrations, seed, method, metadata, suffix)
  • GET /v1/fine_tuning/jobs, GET /v1/fine_tuning/jobs/{id}, POST /v1/fine_tuning/jobs/{id}/cancel
  • GET /v1/fine_tuning/jobs/{id}/events, GET /v1/fine_tuning/jobs/{id}/checkpoints
  • Checkpoint-permission CRUD under /v1/fine_tuning/checkpoints/{checkpoint}/permissions...

Evals:

  • CRUD on /v1/evals, run management under /v1/evals/{eval_id}/runs..., output_items listing/retrieval.

Administration (Organization/Projects/Usage)

Administrative endpoints include:

  • Audit logs, costs, invites, users, admin API keys
  • Projects CRUD, project users/service accounts, project API keys, project rate limits, project certificates
  • Usage breakdown by category (images, embeddings, moderations, vector stores, etc.)

Note: these endpoints often require elevated privileges and organization/project governance; use service accounts and per-project segregation, and never expose admin keys in a client.

Mermaid diagram: typical request → response flow

mermaid
sequenceDiagram
  participant C as Client
  participant O as OpenAI API (/v1)
  C->>O: HTTPS POST /responses (Authorization Bearer, JSON)
  alt stream=false
    O-->>C: 200 JSON {response}
  else stream=true
    O-->>C: 200 SSE events (delta, completed)
  end
  Note over C,O: Errors: 4xx/5xx; 429 rate limit -> retry with backoff

Model catalog

Premises and sources of truth

  • The “Models” documentation recommends a flagship model plus mini/nano variants for cost/latency tradeoffs; it also lists prices for some models and notes availability via the Responses API.
  • The actual list of available models and their snapshots may vary by account/tier and over time; the documentation also includes deprecated models/snapshots with shutdown dates.

Model table (model, type, key parameters, recommended use, notes)

The following table is a “seed catalog” derived from extracting the model identifiers present in the documentation text (it also includes historical/deprecated models). For operational use, prefer “latest” IDs or modern snapshots and check for deprecations.

| model | type | key parameters | recommended use | notes |
|---|---|---|---|---|
| gpt-4o-mini-transcribe | audio/stt | file, model, language?, prompt?, response_format?, timestamp_granularities? | transcription and diarization (if supported) | |
| gpt-4o-transcribe | audio/stt | file, model, language?, prompt?, response_format?, timestamp_granularities? | transcription and diarization (if supported) | |
| gpt-4o-transcribe-diarize | audio/stt | file, model, language?, prompt?, response_format?, timestamp_granularities? | transcription and diarization (if supported) | |
| gpt-4o-transcribe-latest | audio/stt | file, model, language?, prompt?, response_format?, timestamp_granularities? | transcription and diarization (if supported) | |
| whisper-1 | audio/stt | file, model, language?, prompt?, response_format?, timestamp_granularities? | transcription and diarization (if supported) | |
| gpt-4o-mini-tts | audio/tts | model, input, voice, response_format?, speed? | speech synthesis | |
| gpt-4o-tts | audio/tts | model, input, voice, response_format?, speed? | speech synthesis | |
| text-embedding-3-large | embedding | input, model, dimensions?, encoding_format?, user? | semantic search, clustering, RAG | |
| text-embedding-3-small | embedding | input, model, dimensions?, encoding_format?, user? | semantic search, clustering, RAG | |
| text-embedding-ada-002 | embedding | input, model, dimensions?, encoding_format?, user? | semantic search, clustering, RAG | legacy model |
| omni-moderation-latest | moderation | input (text/image), model | content classification, policy enforcement | |
| text-moderation-latest | moderation | input (text/image), model | content classification, policy enforcement | possible regional limitations |
| text-moderation-stable | moderation | input (text/image), model | content classification, policy enforcement | "stable" model |
| text-moderation-007 | moderation | input (text/image), model | content classification, policy enforcement | legacy model |
| gpt-realtime | realtime | session config, audio codecs, turn detection, events | voice agents, low-latency speech-to-speech | |
| gpt-realtime-1.5 | realtime | session config, audio codecs, turn detection, events | voice agents, low-latency speech-to-speech | |
| gpt-realtime-mini | realtime | session config, audio codecs, turn detection, events | voice agents, low-latency speech-to-speech | |
| gpt-image-1 | vision/image | prompt, model?, size?, quality?, n?, response_format? | image generation/editing | |
| gpt-image-1-mini | vision/image | prompt, model?, size?, quality?, n?, response_format? | image generation/editing | cost-efficient variant |
| gpt-image-1.5 | vision/image | prompt, model?, size?, quality?, n?, response_format? | image generation/editing | |
| dall-e-2 | vision/image | prompt, model?, size?, quality?, n?, response_format? | image generation/editing | deprecated/shutdown on roadmap |
| dall-e-3 | vision/image | prompt, model?, size?, quality?, n?, response_format? | image generation/editing | deprecated/shutdown on roadmap |
| o1 | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | legacy |
| o1-mini | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | deprecated (see Deprecations) |
| o1-preview | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | deprecated (see Deprecations) |
| o1-pro | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | |
| o1-pro-2025-03-19 | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | snapshot |
| o3 | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | |
| o3-2025-04-16 | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | snapshot |
| o3-mini | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | |
| o3-mini-2025-01-31 | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | snapshot |
| o4-mini | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | |
| o4-mini-2025-04-16 | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | snapshot |
| gpt-5.4 | general/chat | messages/input, model, max_output_tokens, temperature/top_p, tools?, response_format? | chat, text analysis, coding, multimodal | |
| gpt-5.4-mini | general/chat | messages/input, model, max_output_tokens, temperature/top_p, tools?, response_format? | chat, text analysis, coding, multimodal | |
| gpt-5.4-nano | general/chat | messages/input, model, max_output_tokens, temperature/top_p, tools?, response_format? | chat, text analysis, coding, multimodal | |
| gpt-4.1 | general/chat | messages/input, model, max_output_tokens, temperature/top_p, tools?, response_format? | chat, text analysis, coding, multimodal | |
| gpt-4o | general/chat | messages/input, model, max_output_tokens, temperature/top_p, tools?, response_format? | chat, text analysis, coding, multimodal | |
| gpt-3.5-turbo | general/chat | messages/input, model, max_output_tokens, temperature/top_p, tools?, response_format? | chat, text analysis, coding, multimodal | legacy/selective deprecations |
| gpt-3.5-turbo-0125 | general/chat | messages/input, model, max_output_tokens, temperature/top_p, tools?, response_format? | chat, text analysis, coding, multimodal | snapshot |
| gpt-4-0613 | general/chat | messages/input, model, max_output_tokens, temperature/top_p, tools?, response_format? | chat, text analysis, coding, multimodal | legacy |
| computer-use-preview | general/chat | messages/input, model, max_output_tokens, temperature/top_p, tools?, response_format? | chat, text analysis, coding, multimodal | tool-specialized |

(The full list is available in the downloadable seed CSV.)
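The embedding models above (text-embedding-3-small/large) return numeric vectors whose pairwise similarity drives the semantic-search and RAG use cases in the table; cosine similarity is the usual metric. A minimal ranking sketch over pre-computed vectors (the toy 2-dimensional vectors are illustrative, not real model output):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def rank(query_vec, docs):
    """Rank (doc_id, vector) pairs by descending similarity to the query."""
    return sorted(docs, key=lambda d: cosine_similarity(query_vec, d[1]),
                  reverse=True)

# Toy vectors; real embeddings have hundreds/thousands of dimensions
# (configurable via the optional `dimensions` parameter on text-embedding-3-*).
docs = [("a", [1.0, 0.0]), ("b", [0.7, 0.7]), ("c", [0.0, 1.0])]
best = rank([1.0, 0.1], docs)[0][0]  # → "a"
```

In production the vectors would come from the /v1/embeddings endpoint and typically live in a vector store rather than an in-memory list.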

Differences and capabilities (focus on the "current" models listed in Models)

  • GPT-5.4 / mini / nano: the Models page lists GPT-5.4 as the "flagship" model for reasoning and coding, with mini/nano for latency/cost; it also reports input/output prices per MTok and context window (up to 1M for GPT-5.4).
  • Specialized models:
    • Image: the GPT Image family (e.g. gpt-image-1.5 and its mini variant).
    • Realtime: gpt-realtime-1.5 and gpt-realtime-mini for low-latency speech-to-speech.
    • Speech generation / transcription: GPT-4o mini TTS, GPT-4o Transcribe, GPT-4o mini Transcribe.

Pricing:

  • The Models page includes explicit prices for some frontier models (e.g. GPT-5.4, mini, nano) in $/Input MTok and $/Output MTok.
  • For complete, up-to-date prices across all families (including images/audio/tools), the documentation points to the Pricing section; this report does not list prices for every model because they were not stated in the "Models" pages consulted.

Deprecations:

  • The Deprecations guide lists models/snapshots and (in some cases) legacy endpoints being retired, with shutdown dates and recommended replacements (e.g. migrations from DALL·E to GPT Image, from o1-preview/o1-mini to o3/o4-mini, etc.).
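A simple way to keep client code off retired models is a replacement lookup resolved at startup. A sketch based on the migrations named above (the mapping is illustrative and must be kept in sync with the official Deprecations page):

```python
# Illustrative replacement map based on the migrations mentioned in the
# Deprecations guide; keep in sync with the official page.
DEPRECATED_REPLACEMENTS = {
    "o1-preview": "o3",
    "o1-mini": "o4-mini",
    "dall-e-2": "gpt-image-1",
    "dall-e-3": "gpt-image-1",
}

def resolve_model(model_id: str) -> str:
    """Return the recommended replacement if the model is deprecated,
    otherwise the model id unchanged."""
    return DEPRECATED_REPLACEMENTS.get(model_id, model_id)

resolved = resolve_model("o1-mini")  # → "o4-mini"
```

Logging a warning whenever a replacement is applied makes silent migrations visible in monitoring.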

Final best practices and references

Technical best practices (summary):

  • Prefer the Responses API as the unified interface for recent models and tool calling; use SSE streaming (stream=true) for low perceived-latency UX; handle replay/obfuscation correctly where needed.
  • Implement retries with exponential backoff + jitter for 429 and some 5xx classes; distinguish permanent errors (400/401/403/404/422) from transient ones.
  • Observe rate limits per unit (RPM/TPM/TPD/IPM) and, where present, per-resource limits (e.g. vector store ingest).
  • For objects with metadata, respect cardinality and length constraints; use metadata for correlation (tenant, environment, trace id) and auditing.
  • Avoid depending on models under deprecation; monitor the Deprecations page and migrate to the recommended equivalents.
  • For "beta/legacy" endpoints (Assistants v2, etc.): isolate them in dedicated modules and include the required headers (e.g. OpenAI-Beta) only where needed.
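The retry guidance above can be sketched as a status-code classifier plus capped exponential backoff with full jitter (the base delay and cap are illustrative choices, not documented requirements):

```python
import random

# Permanent client errors: retrying will not help; fix the request or credentials.
PERMANENT = {400, 401, 403, 404, 422}

def is_retryable(status: int) -> bool:
    """429 (rate limit) and 5xx are retry candidates; other 4xx are permanent."""
    return status == 429 or status >= 500

def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Full-jitter backoff: random delay in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))
```

On a 429, also honor a Retry-After header if the response includes one, preferring it over the computed delay; full jitter spreads retries from concurrent clients so they do not re-collide.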

Main references (integrated via citations in the report):

  • API Overview (REST/streaming/realtime overview).
  • API Reference (Responses, Conversations, etc.).
  • Rate limits.
  • Error codes and error-type mapping.
  • Models (model selection, capabilities, and prices for some models).
  • Deprecations (deprecated models/endpoints and replacements).
