Technical catalog of the OpenAI APIs and models
Executive summary
This report consolidates, as a technical catalog, the REST endpoints (plus streaming/realtime) and the officially documented models, taking as its baseline the most recent documentation available as of April 12, 2026. The API surface is described as a set of REST APIs, streaming APIs (Server-Sent Events), and realtime APIs.
Key operational points:
- Core generation endpoint: the Responses API (recommended as the unified interface for recent models; the "latest" models are available via Responses).
- Rate limiting is expressed mainly in RPM/TPM (plus RPD/TPD/IPM), with variations by model and tier; some operations have specific limits (e.g. ingest into a Vector Store).
- Error handling: HTTP 4xx/5xx status codes with error types (BadRequest, Authentication, PermissionDenied, NotFound, UnprocessableEntity, RateLimit, InternalServerError) and retry/backoff best practices.
Additional machine-readable deliverables (extracts) were generated as part of this work.
Important methodological note: part of the endpoint catalog was derived from a public OpenAPI specification (mirror) and from API reference pages; some very recent endpoint families present in the API reference (e.g. "ChatKit", "Containers", "Skills", "Videos", and some advanced Responses endpoints) are not fully represented in the analyzed OpenAPI mirror specification and would require dedicated extraction from the individual reference pages to complete all the per-endpoint details requested (full parameters + examples). This report explicitly flags where details are missing.
Common conventions and cross-cutting standards
Base URL and version:
- Base URL: `https://api.openai.com/v1` (the `/v1` prefix applies to all REST paths).
Authentication and headers:
- Standard minimum headers (nearly all REST endpoints): `Authorization: Bearer <OPENAI_API_KEY>` and `Content-Type: application/json` (for JSON payloads).
- For beta/legacy endpoints (e.g. Assistants v2 in the legacy APIs), the documentation and examples indicate the header `OpenAI-Beta: assistants=v2`.
Metadata:
- Objects such as Conversation support `metadata` with up to 16 key/value pairs; keys up to 64 characters, values up to 512 characters.
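As a sketch, these documented limits can be enforced client-side before sending a request (the function name is illustrative, not part of any SDK):

```python
def validate_metadata(metadata: dict) -> None:
    """Client-side check of the documented metadata limits:
    at most 16 key/value pairs, keys up to 64 chars, values up to 512 chars."""
    if len(metadata) > 16:
        raise ValueError(f"too many metadata pairs: {len(metadata)} > 16")
    for key, value in metadata.items():
        if len(key) > 64:
            raise ValueError(f"metadata key too long: {key!r}")
        if len(str(value)) > 512:
            raise ValueError(f"metadata value too long for key {key!r}")
```

Rejecting oversized metadata locally avoids a round trip that would end in a 400 error.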
Pagination (common pattern):
- List endpoints typically expose `limit` (1..100, default 20) and the cursors `after`/`before`, or `after`/`order`.
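A minimal cursor-pagination loop, with the HTTP call abstracted behind an injected `fetch_page` callable; the `has_more`/`last_id` envelope fields follow the common list-object shape but should be verified per endpoint (this is a sketch, not an official client):

```python
def list_all(fetch_page, limit: int = 100):
    """Drain a cursor-paginated list endpoint.

    `fetch_page(limit=..., after=...)` is expected to return a dict shaped like
    the common list envelope: {"data": [...], "has_more": bool, "last_id": "..."}.
    """
    items, after = [], None
    while True:
        page = fetch_page(limit=limit, after=after)
        items.extend(page["data"])
        if not page.get("has_more"):
            return items
        # advance the cursor to the last item of the page just consumed
        after = page["last_id"]
```

Passing the previous page's `last_id` as `after` is what makes the iteration resumable and idempotent.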
Streaming (SSE):
- Some endpoints support `stream: true` with Server-Sent Events.
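A rough sketch of parsing an SSE stream into JSON events; the `data:` field is the generic SSE framing, while the `[DONE]` sentinel is used by some streaming endpoints to terminate the stream (payload shapes vary by endpoint):

```python
import json

def iter_sse_events(lines):
    """Yield decoded JSON payloads from an iterable of SSE lines.

    SSE frames are `data: <payload>` lines separated by blank lines;
    the literal sentinel `[DONE]` ends the stream where it is used.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and event:/id:/comment fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        yield json.loads(payload)
```
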
Rate limits:
- Rate limits are typically measured in RPM/RPD/TPM/TPD and, for images, IPM; they vary by model and by organization/project tier.
- Example of a specific limit: batch file ingest into a Vector Store is capped at roughly 300 RPM per vector store on some endpoints.
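Client-side, an RPM budget translates into a minimum spacing between request starts; a minimal pacing helper (illustrative):

```python
def min_interval_seconds(rpm: int) -> float:
    """Minimum delay between request starts to stay within an RPM budget."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    return 60.0 / rpm

# e.g. a ~300 RPM ingest budget allows one request start every 0.2 s
```
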
Error handling (schema and status codes):
- Error types by status code: 400, 401, 403, 404, 422, 429, >=500; retry with backoff on 429/5xx and handle timeouts.
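The retry guidance for 429/5xx can be sketched as exponential backoff with full jitter; the base delay and cap below are illustrative choices, not documented values:

```python
import random

def should_retry(status: int) -> bool:
    """Retry on rate limiting (429) and server errors (>= 500)."""
    return status == 429 or status >= 500

def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Full-jitter exponential backoff: a random delay in
    [0, min(cap, base * 2**attempt)] for the given retry attempt."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))
```

Full jitter spreads concurrent clients' retries apart, which matters precisely when a 429 indicates contention.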
Generic JSON error example applicable to most endpoints (typical "error object" schema; exact fields may vary):
```json
{
"error": {
"message": "Invalid request",
"type": "invalid_request_error",
"param": "model",
"code": "model_not_found"
}
}
```

Endpoint catalog
Mermaid diagram: logical map of the endpoints
```mermaid
flowchart LR
A[Client] -->|HTTPS REST /v1| R[REST APIs]
A -->|SSE stream=true| S["Streaming (SSE)"]
A -->|WebRTC/WebSocket| RT[Realtime API]
R --> RESP[/responses + subresources/]
R --> CHAT[/chat/completions/]
R --> EMB[/embeddings/]
R --> IMG[/images/*/]
R --> AUD[/audio/*/]
R --> FILES[/files, uploads/]
R --> BATCH[/batches/]
R --> FT[/fine_tuning/]
R --> EVAL[/evals/]
R --> VS[/vector_stores/]
R --> ORG[/organization/* admin/]
```

Endpoint table (endpoint, method, description, rate limit)
The following table lists the endpoints extracted from the analyzed OpenAPI mirror specification; for additional endpoints present in the API Reference but not represented in that specification (e.g. POST /v1/responses/input_tokens, POST /v1/responses/{id}/cancel, POST /v1/responses/compact, /v1/conversations), see the next section.
| endpoint | method | description | rate_limit |
|---|---|---|---|
| /v1/assistants | GET | Returns a list of assistants. | depends on tier/model (see Rate limits docs) |
| /v1/assistants | POST | Create an assistant with a model and instructions. | depends on tier/model (see Rate limits docs) |
| /v1/assistants/ | GET | Retrieves an assistant. | depends on tier/model (see Rate limits docs) |
| /v1/assistants/ | POST | Modifies an assistant. | depends on tier/model (see Rate limits docs) |
| /v1/assistants/ | DELETE | Delete an assistant. | depends on tier/model (see Rate limits docs) |
| /v1/audio/speech | POST | Generates audio from the input text. | depends on tier/model (see Rate limits docs) |
| /v1/audio/transcriptions | POST | Transcribes audio into the input language. | depends on tier/model (see Rate limits docs) |
| /v1/audio/translations | POST | Translates audio into English. | depends on tier/model (see Rate limits docs) |
| /v1/batches | GET | List your organization's batches. | depends on tier/model (see Rate limits docs) |
| /v1/batches | POST | Creates and executes a batch from an uploaded file of requests. | depends on tier/model (see Rate limits docs) |
| /v1/batches/ | GET | Retrieves a batch. | depends on tier/model (see Rate limits docs) |
| /v1/batches/{batch_id}/cancel | POST | Cancels an in-progress batch. | depends on tier/model (see Rate limits docs) |
| /v1/chat/completions | GET | List stored chat completions. Only Chat Completions that have been stored with the store parameter set to true will be returned. | depends on tier/model (see Rate limits docs) |
| /v1/chat/completions | POST | Creates a model response for the given chat conversation. | depends on tier/model (see Rate limits docs) |
| /v1/chat/completions/ | GET | Get a stored chat completion. Only Chat Completions that have been stored with the store parameter set to true will be returned. | depends on tier/model (see Rate limits docs) |
| /v1/chat/completions/ | POST | Modify a stored chat completion. Only Chat Completions that have been stored with the store parameter set to true can be modified. | depends on tier/model (see Rate limits docs) |
| /v1/chat/completions/ | DELETE | Delete a stored chat completion. Only Chat Completions that have been stored with the store parameter set to true can be deleted. | depends on tier/model (see Rate limits docs) |
| /v1/chat/completions/{completion_id}/messages | GET | Get the messages in a stored chat completion. Only Chat Completions that have been stored with the store parameter set to true will be returned. | depends on tier/model (see Rate limits docs) |
| /v1/completions | POST | Creates a completion for the provided prompt and parameters. | depends on tier/model (see Rate limits docs) |
| /v1/embeddings | POST | Creates an embedding vector representing the input text. | depends on tier/model (see Rate limits docs) |
| /v1/evals | GET | List evals for a project. | depends on tier/model (see Rate limits docs) |
| /v1/evals | POST | Create an Eval. | depends on tier/model (see Rate limits docs) |
| /v1/evals/ | GET | Get an Eval by ID. | depends on tier/model (see Rate limits docs) |
| /v1/evals/ | POST | Update an Eval by ID. | depends on tier/model (see Rate limits docs) |
| /v1/evals/ | DELETE | Delete an Eval by ID. | depends on tier/model (see Rate limits docs) |
| /v1/evals/{eval_id}/runs | GET | Get a list of runs for an Eval. | depends on tier/model (see Rate limits docs) |
| /v1/evals/{eval_id}/runs | POST | Create a run for an Eval. | depends on tier/model (see Rate limits docs) |
| /v1/evals/{eval_id}/runs/ | GET | Get an Eval run by ID. | depends on tier/model (see Rate limits docs) |
| /v1/evals/{eval_id}/runs/ | DELETE | Delete an Eval run by ID. | depends on tier/model (see Rate limits docs) |
| /v1/evals/{eval_id}/runs/{run_id}/cancel | POST | Cancel an Eval run. | depends on tier/model (see Rate limits docs) |
| /v1/evals/{eval_id}/runs/{run_id}/output_items | GET | Get a list of output items for an Eval run. | depends on tier/model (see Rate limits docs) |
| /v1/evals/{eval_id}/runs/{run_id}/output_items/ | GET | Get an output item for an Eval run. | depends on tier/model (see Rate limits docs) |
| /v1/files | GET | Returns a list of files that belong to the user's organization. | depends on tier/model (see Rate limits docs) |
| /v1/files | POST | Upload a file that can be used across various endpoints/features. | depends on tier/model (see Rate limits docs) |
| /v1/files/ | GET | Returns information about a specific file. | depends on tier/model (see Rate limits docs) |
| /v1/files/ | DELETE | Delete a file. | depends on tier/model (see Rate limits docs) |
| /v1/files/{file_id}/content | GET | Returns the contents of the specified file. | depends on tier/model (see Rate limits docs) |
| /v1/fine_tuning/checkpoints/{fine_tuned_model_checkpoint}/permissions | GET | List checkpoint permissions for a fine-tuned model checkpoint. | depends on tier/model (see Rate limits docs) |
| /v1/fine_tuning/checkpoints/{fine_tuned_model_checkpoint}/permissions | POST | Create a permission for a fine-tuned model checkpoint. | depends on tier/model (see Rate limits docs) |
| /v1/fine_tuning/checkpoints/{fine_tuned_model_checkpoint}/permissions/ | GET | Retrieve a permission for a fine-tuned model checkpoint. | depends on tier/model (see Rate limits docs) |
| /v1/fine_tuning/checkpoints/{fine_tuned_model_checkpoint}/permissions/ | DELETE | Delete a permission for a fine-tuned model checkpoint. | depends on tier/model (see Rate limits docs) |
| /v1/fine_tuning/jobs | GET | List your organization's fine-tuning jobs. | depends on tier/model (see Rate limits docs) |
| /v1/fine_tuning/jobs | POST | Creates a fine-tuning job which begins the process of creating a new model from a given dataset. | depends on tier/model (see Rate limits docs) |
| /v1/fine_tuning/jobs/ | GET | Get info about a fine-tuning job. | depends on tier/model (see Rate limits docs) |
| /v1/fine_tuning/jobs/{fine_tuning_job_id}/cancel | POST | Immediately cancel a fine-tune job. | depends on tier/model (see Rate limits docs) |
| /v1/fine_tuning/jobs/{fine_tuning_job_id}/checkpoints | GET | List checkpoints for a fine-tuning job. | depends on tier/model (see Rate limits docs) |
| /v1/fine_tuning/jobs/{fine_tuning_job_id}/events | GET | Get fine-grained status updates for a fine-tuning job. | depends on tier/model (see Rate limits docs) |
| /v1/images/edits | POST | Creates an edited or extended image given an original image and a prompt. | depends on tier/model (see Rate limits docs) |
| /v1/images/generations | POST | Creates an image given a prompt. | depends on tier/model (see Rate limits docs) |
| /v1/images/variations | POST | Creates a variation of a given image. | depends on tier/model (see Rate limits docs) |
| /v1/models | GET | Lists the currently available models, and provides basic information about each one such as the owner and availability. | depends on tier/model (see Rate limits docs) |
| /v1/models/ | GET | Retrieves a model instance, providing basic information about the model such as the owner and permissioning. | depends on tier/model (see Rate limits docs) |
| /v1/moderations | POST | Classifies if text and/or image inputs are potentially harmful. | depends on tier/model (see Rate limits docs) |
| /v1/organization/admin_api_keys | GET | List organization-wide admin API keys. | depends on tier/model (see Rate limits docs) |
| /v1/organization/admin_api_keys | POST | Create an organization-wide admin API key. | depends on tier/model (see Rate limits docs) |
| /v1/organization/admin_api_keys/ | GET | Retrieve an admin API key. | depends on tier/model (see Rate limits docs) |
| /v1/organization/admin_api_keys/ | DELETE | Delete an admin API key. | depends on tier/model (see Rate limits docs) |
| /v1/organization/audit_logs | GET | List user actions and configuration changes within this organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/certificates | GET | List uploaded certificates. | depends on tier/model (see Rate limits docs) |
| /v1/organization/certificates | POST | Upload a certificate for use by the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/certificates/ | GET | Get details for a certificate. | depends on tier/model (see Rate limits docs) |
| /v1/organization/certificates/ | POST | Update a certificate. | depends on tier/model (see Rate limits docs) |
| /v1/organization/certificates/ | DELETE | Delete a certificate from the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/costs | GET | Get cost details for the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/invites | GET | Returns a list of invites in the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/invites | POST | Create an invite for a user to the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/invites/ | GET | Retrieves an invite. | depends on tier/model (see Rate limits docs) |
| /v1/organization/invites/ | DELETE | Delete an invite. If the invite has already been accepted, it cannot be deleted. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects | GET | Lists all projects for the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects | POST | Create a new project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/ | GET | Retrieves a project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/ | POST | Modifies a project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/api_keys | GET | Returns a list of API keys in the project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/api_keys/ | GET | Retrieves an API key in the project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/api_keys/ | DELETE | Deletes an API key from the project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/archive | POST | Archives a project in the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/certificates | GET | List certificates for a project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/certificates/activate | POST | Activates a certificate for a project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/certificates/deactivate | POST | Deactivates a certificate for a project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/rate_limits | GET | Returns the rate limits per model for a project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/rate_limits/ | POST | Updates a rate limit. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/service_accounts | GET | Returns a list of service accounts in the project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/service_accounts | POST | Creates a service account in the project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/service_accounts/ | GET | Retrieves a service account in the project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/service_accounts/ | DELETE | Deletes a service account from the project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/users | GET | Returns a list of users in the project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/users | POST | Adds a user to the project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/users/ | GET | Retrieves a user in the project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/users/ | POST | Modifies a user's role in the project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/projects/{project_id}/users/ | DELETE | Deletes a user from the project. | depends on tier/model (see Rate limits docs) |
| /v1/organization/users | GET | Lists all of the users in the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/users/ | GET | Retrieves a user by their identifier. | depends on tier/model (see Rate limits docs) |
| /v1/organization/users/ | POST | Modifies a user's role in the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/users/ | DELETE | Deletes a user from the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/usage/audio_speeches | GET | Get audio speeches usage details for the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/usage/audio_transcriptions | GET | Get audio transcriptions usage details for the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/usage/code_interpreter_sessions | GET | Get code interpreter sessions usage details for the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/usage/completions | GET | Get completions usage details for the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/usage/embeddings | GET | Get embeddings usage details for the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/usage/images | GET | Get images usage details for the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/usage/moderations | GET | Get moderations usage details for the organization. | depends on tier/model (see Rate limits docs) |
| /v1/organization/usage/vector_stores | GET | Get vector stores usage details for the organization. | depends on tier/model (see Rate limits docs) |
| /v1/realtime/sessions | POST | Create an ephemeral API token for use in client-side applications with the Realtime API. Can be configured with the same session parameters as the session.update client event. | depends on tier/model (see Rate limits docs) |
| /v1/realtime/transcription_sessions | POST | Create an ephemeral API token for use in client-side applications with the Realtime API specifically for realtime transcriptions. Can be configured with the same session parameters as the transcription_session.update client event. | depends on tier/model (see Rate limits docs) |
| /v1/responses | POST | Creates a model response. Provide text or image inputs to generate text or JSON outputs. Have the model call your own custom code or use built-in tools like web search or file search. | depends on tier/model (see Rate limits docs) |
| /v1/responses/ | GET | Retrieves a model response with the given ID. | depends on tier/model (see Rate limits docs) |
| /v1/responses/ | DELETE | Deletes a model response with the given ID. | depends on tier/model (see Rate limits docs) |
| /v1/responses/{response_id}/input_items | GET | Returns a list of input items for a given response. | depends on tier/model (see Rate limits docs) |
| /v1/uploads | POST | Creates an intermediate Upload object that you can add Parts to. Currently, an Upload can accept at most 8 GB in total and expires an hour after you create it. | depends on tier/model (see Rate limits docs) |
| /v1/uploads/{upload_id}/cancel | POST | Cancels the Upload. No Parts may be added after an Upload is cancelled. | depends on tier/model (see Rate limits docs) |
| /v1/uploads/{upload_id}/complete | POST | Completes the Upload. After completing, the Upload is removed and a new File object is created. | depends on tier/model (see Rate limits docs) |
| /v1/uploads/{upload_id}/parts | POST | Adds a Part to an Upload object. A Part represents a chunk of bytes from the file you are uploading. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores | GET | Returns a list of vector stores. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores | POST | Create a vector store. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores/ | GET | Retrieves a vector store. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores/ | POST | Modifies a vector store. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores/ | DELETE | Delete a vector store. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores/{vector_store_id}/file_batches | POST | Create a vector store file batch. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores/{vector_store_id}/file_batches/ | GET | Retrieves a vector store file batch. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores/{vector_store_id}/file_batches/{batch_id}/cancel | POST | Cancel a vector store file batch. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores/{vector_store_id}/file_batches/{batch_id}/files | GET | Returns a list of vector store files in a batch. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores/{vector_store_id}/files | GET | Returns a list of vector store files. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores/{vector_store_id}/files | POST | Create a vector store file by attaching a File to a vector store. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores/{vector_store_id}/files/ | GET | Retrieves a vector store file. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores/{vector_store_id}/files/ | POST | Modifies a vector store file. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores/{vector_store_id}/files/ | DELETE | Delete a vector store file. This will remove the file from the vector store but the file itself will not be deleted. To delete the file, use the delete file endpoint. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores/{vector_store_id}/files/{file_id}/content | GET | Retrieve the parsed contents of a vector store file. | depends on tier/model (see Rate limits docs) |
| /v1/vector_stores/{vector_store_id}/search | POST | Search a vector store for relevant chunks based on a query and file attributes filter. | depends on tier/model (see Rate limits docs) |
Rate limits: the exact values (RPM/TPM, etc.) depend on model and tier; consult the "Rate limits" guide.
Main endpoint sheets (technical details)
Below are full technical sheets for the main runtime, ingestion, and retrieval endpoints. For the complete list, with parameters and examples, of the endpoints present in the analyzed OpenAPI mirror specification, refer to the downloadable JSON.
Responses API
POST /v1/responses
Purpose: generates a model response (text and/or JSON, with optional tool calling).
Request parameters (JSON body, top-level):
- `model` (required, string): model ID.
- `input` (required, string | array): text or a list of input items (multimodal).
- `instructions` (optional, string): developer/system instructions for the request.
- `max_output_tokens` (optional, integer): upper bound on output (includes reasoning tokens where applicable).
- `temperature`, `top_p` (optional, number): sampling parameters.
- `stream` (optional, boolean): enables SSE streaming.
- `include` (optional, array): include additional fields (e.g. web/file search results, logprobs).
- `truncation` (optional, `"auto"` | `"disabled"`): behavior when the input exceeds the context window.
- `store` (optional, boolean): persist for later retrieval (where supported).
- `metadata` (optional, object with up to 16 pairs): metadata.
Required headers:
- `Authorization: Bearer ...`
- `Content-Type: application/json`
cURL example:
```bash
curl https://api.openai.com/v1/responses \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-5.4",
"input": "Scrivi una funzione Apex che valida un IBAN."
}'
```

Example response (success, abridged):
```json
{
"id": "resp_...",
"object": "response",
"status": "completed",
"model": "gpt-4o-2024-08-06",
"output": [
{
"type": "message",
"role": "assistant",
"content": [{ "type": "output_text", "text": "..." }]
}
],
"usage": { "input_tokens": 0, "output_tokens": 0, "total_tokens": 0 }
}
```

Common errors:
- 400 invalid request (e.g. input too large with `truncation="disabled"`).
- 401/403 authentication/authorization, 429 rate limit, 5xx server error.
GET /v1/responses/{response_id}
Notable query params:
- `include[]` (optional): include extra fields.
- `stream` (optional, boolean): SSE streaming on retrieval as well.
- `include_obfuscation`, `starting_after` (advanced streaming/event-replay use).
DELETE /v1/responses/{response_id}
Returns `{ "id": "...", "object": "response", "deleted": true }`.
GET /v1/responses/{response_id}/input_items
Query params:
- `after` (cursor), `limit` (1..100, default 20), `order` (`asc`|`desc`, default `desc`), `include[]`.
Advanced Responses endpoints present in recent documentation but not fully included in the analyzed OpenAPI mirror specification:
- `POST /v1/responses/input_tokens`: input token counting.
- `POST /v1/responses/{response_id}/cancel`: cancels a background response (only if created with `background=true`).
- `POST /v1/responses/compact`: compaction of long conversations.
Example for POST /v1/responses/input_tokens:
```bash
curl -X POST https://api.openai.com/v1/responses/input_tokens \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-5",
"input": "Tell me a joke."
}'
```

```json
{ "object": "response.input_tokens", "input_tokens": 11 }
```

Conversations API
These (new) endpoints are documented in the recent API reference and provide conversational state kept separate from generation.
POST /v1/conversations
Body:
- `items` (optional, array): initial items; max 20 items per call.
- `metadata` (optional, up to 16 key/value pairs, with the key/value length constraints above).
Example:
```bash
curl https://api.openai.com/v1/conversations \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"metadata": {"topic": "demo"},
"items": [
{ "type": "message", "role": "user", "content": "Hello!" }
]
}'
```

GET /v1/conversations/{conversation_id}
Returns the conversation object with its metadata.
POST /v1/conversations/{conversation_id} (update)
Body:
- `metadata` (required in the update body): updates the metadata.
"Items" endpoints of the Conversations API: the API reference lists Create/Retrieve/Delete/List operations on items, but this report did not extract the full paths and complete examples from each method's page (which would be needed to exhaustively cover request/response bodies per endpoint).
Chat Completions API
Note: there is a "stored chat completions" surface (with store=true) offering list/retrieve/modify/delete operations and message listing.
POST /v1/chat/completions
Body (main fields):
- `model` (required, string)
- `messages` (required, array)
- `temperature`, `top_p` (optional)
- `max_completion_tokens` / `max_tokens` (optional; new vs. legacy naming)
- `response_format` (optional; Structured Outputs)
- `tools`, `tool_choice`, `parallel_tool_calls` (optional)
- `store` (optional, boolean), `stream` (optional, boolean)
Minimal cURL example:
```bash
curl https://api.openai.com/v1/chat/completions \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-5.4-mini",
"messages": [
{"role":"user","content":"Genera una regex per validare un IBAN IT."}
]
}'
```

Related operations (stored):
- `GET /v1/chat/completions`
- `GET|POST|DELETE /v1/chat/completions/{completion_id}`
- `GET /v1/chat/completions/{completion_id}/messages`
Embeddings API
POST /v1/embeddings
Body:
- `model` (required)
- `input` (required, string or array)
- `dimensions` (optional), `encoding_format` (optional), `user` (optional)
Example:
```bash
curl https://api.openai.com/v1/embeddings \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "text-embedding-3-large",
"input": "Customer churn prediction features"
}'
```

Images API
POST /v1/images/generations
Body (main fields):
- `prompt` (required)
- `model`, `n`, `size`, `quality`, `style`, `background`, `response_format` (optional)
`POST /v1/images/edits` and `POST /v1/images/variations` (multipart) are present in the mirror specification.
Deprecation note: DALL·E model snapshots are flagged as deprecated, with shutdown dates (e.g. May 2026).
Audio API
Speech synthesis:
- `POST /v1/audio/speech` (JSON): requires `model`, `input`, `voice`; optional `response_format`, `speed`, `instructions`.
Transcription/translation:
- `POST /v1/audio/transcriptions` (multipart): requires `file`, `model`; optional `language`, `prompt`, `response_format`, `temperature`, `timestamp_granularities[]`, `stream`.
- `POST /v1/audio/translations` (multipart).
The recent API reference also shows "Create a voice" and "Voice Consents" endpoints (CRUD + list); the analyzed mirror specification does not provide full paths and examples for them, so no exhaustive sheet is given for each.
Files and Uploads
Files:
- `GET /v1/files`, `POST /v1/files` (multipart: `file`, `purpose`), `GET|DELETE /v1/files/{file_id}`, `GET /v1/files/{file_id}/content`.
Uploads (multi-part uploads, up to large sizes):
- `POST /v1/uploads` (JSON: `filename`, `purpose`, `bytes`, `mime_type`)
- `POST /v1/uploads/{upload_id}/parts`
- `POST /v1/uploads/{upload_id}/complete`
- `POST /v1/uploads/{upload_id}/cancel`
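A sketch of splitting a payload into chunks to send as Upload Parts; the 8 GB total cap comes from the Uploads documentation above, while the 64 MB part size is an illustrative choice to verify against the documented per-part limit:

```python
def iter_parts(data: bytes, part_size: int = 64 * 1024 * 1024):
    """Yield consecutive chunks of `data` to send as Upload Parts,
    enforcing the 8 GB total documented for an Upload."""
    if len(data) > 8 * 1024 ** 3:
        raise ValueError("payload exceeds the 8 GB Upload limit")
    for offset in range(0, len(data), part_size):
        yield data[offset:offset + part_size]
```

Each yielded chunk would be posted to `/v1/uploads/{upload_id}/parts` before calling `complete`.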
Vector Stores
Vector stores (CRUD + search):
- `POST /v1/vector_stores`, `GET /v1/vector_stores`
- `GET|POST|DELETE /v1/vector_stores/{vector_store_id}`
- `POST /v1/vector_stores/{vector_store_id}/search`
Files on a vector store:
- `POST /v1/vector_stores/{vector_store_id}/files` (attach a file)
- `GET|POST|DELETE /v1/vector_stores/{vector_store_id}/files/{file_id}`
- `GET /v1/vector_stores/{vector_store_id}/files/{file_id}/content`
File batches (asynchronous ingest):
- `POST /v1/vector_stores/{vector_store_id}/file_batches`
- `GET /v1/vector_stores/{vector_store_id}/file_batches/{batch_id}`
- `POST /v1/vector_stores/{vector_store_id}/file_batches/{batch_id}/cancel`
- `GET /v1/vector_stores/{vector_store_id}/file_batches/{batch_id}/files`
A vector-store-specific ingest rate limit is indicated in the rate limits guide.
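Because ingest is asynchronous, clients typically poll the file batch until it leaves the in-progress state; a sketch with the HTTP call injected as `get_status` (the function names are illustrative, and the terminal status strings should be verified against the batch object schema):

```python
import time

def wait_for_batch(get_status, poll_interval: float = 1.0,
                   timeout: float = 600.0) -> str:
    """Poll `get_status()` (returning the batch's `status` string)
    until a terminal state is reached or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status()
        if status in ("completed", "failed", "cancelled"):
            return status
        time.sleep(poll_interval)
    raise TimeoutError("vector store file batch did not settle in time")
```
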
Batches
- `POST /v1/batches` (JSON: `input_file_id`, `endpoint`, `completion_window`, `metadata?`)
- `GET /v1/batches`, `GET /v1/batches/{batch_id}`, `POST /v1/batches/{batch_id}/cancel`
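The uploaded input file referenced by `input_file_id` is JSONL, one request envelope per line; a minimal builder following the Batch request envelope shape (`custom_id`, `method`, `url`, `body`), offered as a sketch:

```python
import json

def build_batch_line(custom_id: str, body: dict,
                     url: str = "/v1/chat/completions") -> str:
    """Serialize one Batch API request envelope as a JSONL line."""
    return json.dumps({
        "custom_id": custom_id,  # caller-chosen ID to match results back
        "method": "POST",
        "url": url,
        "body": body,
    })
```

Joining many such lines with `"\n"` yields the file to upload with `purpose` set for batch use before creating the batch.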
Fine-tuning and Evals
Fine-tuning:
- `POST /v1/fine_tuning/jobs` (required: `model`, `training_file`; optional `validation_file`, `hyperparameters`, `integrations`, `seed`, `method`, `metadata`, `suffix`)
- `GET /v1/fine_tuning/jobs`, `GET /v1/fine_tuning/jobs/{id}`, `POST /v1/fine_tuning/jobs/{id}/cancel`
- `GET /v1/fine_tuning/jobs/{id}/events`, `GET /v1/fine_tuning/jobs/{id}/checkpoints`
- Checkpoint permissions CRUD on `/v1/fine_tuning/checkpoints/{checkpoint}/permissions...`
Evals:
- CRUD on `/v1/evals`, run management on `/v1/evals/{eval_id}/runs...`, output_items listing/retrieval.
Administration (Organization/Projects/Usage)
Administrative endpoints include:
- Audit logs, costs, invites, users, admin API keys
- Projects CRUD, project users/service accounts, project API keys, project rate limits, project certificates
- Usage breakdown by category (images, embeddings, moderations, vector stores, etc.)
Note: these endpoints often require elevated privileges and organization/project governance; use service accounts and per-project segregation, and avoid exposing admin keys on the client.
Mermaid diagram: typical request → response flow
```mermaid
sequenceDiagram
participant C as Client
participant O as OpenAI API (/v1)
C->>O: HTTPS POST /responses (Authorization Bearer, JSON)
alt stream=false
O-->>C: 200 JSON {response}
else stream=true
O-->>C: 200 SSE events (delta, completed)
end
Note over C,O: Errors: 4xx/5xx; 429 rate limit -> retry with backoff
```

Model catalog
Premises and sources of truth
- The "Models" documentation recommends a flagship model plus mini/nano variants for cost/latency tradeoffs; it also lists prices for some models and notes availability via the Responses API.
- The actual set of available models and their snapshots can vary by account/tier and over time; the documentation also includes deprecated models/snapshots with shutdown dates.
Model table (model, type, key parameters, recommended use, notes)
The following table is a "seed catalog" derived from extracting the model identifiers present in the textual documentation (it also includes historical/deprecated models). For operational use: prefer "latest" IDs or modern snapshots and check for deprecations.
| model | type | key_parameters | recommended_use | notes |
|---|---|---|---|---|
| gpt-4o-mini-transcribe | audio/stt | file, model, language?, prompt?, response_format?, timestamp_granularities? | transcription and diarization (where supported) | |
| gpt-4o-transcribe | audio/stt | file, model, language?, prompt?, response_format?, timestamp_granularities? | transcription and diarization (where supported) | |
| gpt-4o-transcribe-diarize | audio/stt | file, model, language?, prompt?, response_format?, timestamp_granularities? | transcription and diarization (where supported) | |
| gpt-4o-transcribe-latest | audio/stt | file, model, language?, prompt?, response_format?, timestamp_granularities? | transcription and diarization (where supported) | |
| whisper-1 | audio/stt | file, model, language?, prompt?, response_format?, timestamp_granularities? | transcription and diarization (where supported) | |
| gpt-4o-mini-tts | audio/tts | model, input, voice, response_format?, speed? | speech synthesis | |
| gpt-4o-tts | audio/tts | model, input, voice, response_format?, speed? | speech synthesis | |
| text-embedding-3-large | embedding | input, model, dimensions?, encoding_format?, user? | semantic search, clustering, RAG | |
| text-embedding-3-small | embedding | input, model, dimensions?, encoding_format?, user? | semantic search, clustering, RAG | |
| text-embedding-ada-002 | embedding | input, model, dimensions?, encoding_format?, user? | semantic search, clustering, RAG | legacy model |
| omni-moderation-latest | moderation | input (text/image), model | content classification, policy enforcement | |
| text-moderation-latest | moderation | input (text/image), model | content classification, policy enforcement | possible regional limitations |
| text-moderation-stable | moderation | input (text/image), model | content classification, policy enforcement | "stable" model |
| text-moderation-007 | moderation | input (text/image), model | content classification, policy enforcement | legacy model |
| gpt-realtime | realtime | session config, audio codecs, turn detection, events | voice agents, low-latency speech-to-speech | |
| gpt-realtime-1.5 | realtime | session config, audio codecs, turn detection, events | voice agents, low-latency speech-to-speech | |
| gpt-realtime-mini | realtime | session config, audio codecs, turn detection, events | voice agents, low-latency speech-to-speech | |
| gpt-image-1 | vision/image | prompt, model?, size?, quality?, n?, response_format? | image generation/editing | |
| gpt-image-1-mini | vision/image | prompt, model?, size?, quality?, n?, response_format? | image generation/editing | cost-efficient variant |
| gpt-image-1.5 | vision/image | prompt, model?, size?, quality?, n?, response_format? | image generation/editing | |
| dall-e-2 | vision/image | prompt, model?, size?, quality?, n?, response_format? | image generation/editing | deprecated/shutdown on roadmap |
| dall-e-3 | vision/image | prompt, model?, size?, quality?, n?, response_format? | image generation/editing | deprecated/shutdown on roadmap |
| o1 | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | legacy |
| o1-mini | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | deprecated (see Deprecations) |
| o1-preview | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | deprecated (see Deprecations) |
| o1-pro | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | |
| o1-pro-2025-03-19 | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | snapshot |
| o3 | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | |
| o3-2025-04-16 | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | snapshot |
| o3-mini | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | |
| o3-mini-2025-01-31 | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | snapshot |
| o4-mini | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | |
| o4-mini-2025-04-16 | reasoning | reasoning_effort, max_output_tokens, tools?, temperature? | multi-step reasoning, planning, complex problems | snapshot |
| gpt-5.4 | general/chat | messages/input, model, max_output_tokens, temperature/top_p, tools?, response_format? | chat, text analysis, coding, multimodal | |
| gpt-5.4-mini | general/chat | messages/input, model, max_output_tokens, temperature/top_p, tools?, response_format? | chat, text analysis, coding, multimodal | |
| gpt-5.4-nano | general/chat | messages/input, model, max_output_tokens, temperature/top_p, tools?, response_format? | chat, text analysis, coding, multimodal | |
| gpt-4.1 | general/chat | messages/input, model, max_output_tokens, temperature/top_p, tools?, response_format? | chat, text analysis, coding, multimodal | |
| gpt-4o | general/chat | messages/input, model, max_output_tokens, temperature/top_p, tools?, response_format? | chat, text analysis, coding, multimodal | |
| gpt-3.5-turbo | general/chat | messages/input, model, max_output_tokens, temperature/top_p, tools?, response_format? | chat, text analysis, coding, multimodal | legacy/selective deprecations |
| gpt-3.5-turbo-0125 | general/chat | messages/input, model, max_output_tokens, temperature/top_p, tools?, response_format? | chat, text analysis, coding, multimodal | snapshot |
| gpt-4-0613 | general/chat | messages/input, model, max_output_tokens, temperature/top_p, tools?, response_format? | chat, text analysis, coding, multimodal | legacy |
| computer-use-preview | general/chat | messages/input, model, max_output_tokens, temperature/top_p, tools?, response_format? | chat, text analysis, coding, multimodal | specialized tool model |
(The full list is available in the downloadable seed CSV.)
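The key parameters in the table map directly onto request bodies. For example, a minimal embeddings payload (the `dimensions` value here is chosen purely for illustration):

```python
import json

# Build the body for POST /v1/embeddings using parameters from the table.
payload = {
    "model": "text-embedding-3-small",
    "input": ["semantic search query", "candidate document"],
    "dimensions": 256,           # optional: request a truncated vector
    "encoding_format": "float",  # or "base64"
}
body = json.dumps(payload)
```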
Differences and capabilities (focus on "current" models listed in Models)
- GPT-5.4 / mini / nano: the Models page presents GPT-5.4 as the "flagship" model for reasoning and coding, with mini/nano for latency/cost; it also lists input/output prices per MTok and context window (up to 1M for GPT-5.4).
- Specialized models:
- Image: the GPT Image family (e.g. gpt-image-1.5 and its mini variant).
- Realtime: gpt-realtime-1.5 and gpt-realtime-mini for low-latency speech-to-speech.
- Speech generation / transcription: GPT-4o mini TTS, GPT-4o Transcribe, GPT-4o mini Transcribe.
Pricing:
- The Models page includes explicit prices for some frontier models (e.g. GPT-5.4, mini, nano) in $/Input MTok and $/Output MTok.
- For complete, up-to-date prices across all families (including images/audio/tools), the documentation points to the Pricing section; this report does not list a price for every model because not all were stated on the "Models" pages consulted.
Deprecations:
- The Deprecations guide lists models/snapshots and (in some cases) retired historical endpoints, with shutdown dates and recommended replacements (e.g. migrations from DALL·E to GPT Image, from o1-preview/o1-mini to o3/o4-mini, etc.).
Final best practices and references
Technical best practices (summary):
- Prefer the Responses API as the unified interface for recent models and tool calling; use SSE streaming (`stream=true`) for low perceived latency; handle replay/obfuscation correctly where needed.
- Implement retries with exponential backoff + jitter for 429 and some 5xx classes; distinguish permanent errors (400/401/403/404/422) from transient ones.
- Observe rate limits per unit (RPM/TPM/TPD/IPM) and, where present, per-resource limits (e.g. vector store ingest).
- For objects with `metadata`, respect cardinality and length constraints; use metadata for correlation (tenant, environment, trace id) and auditing.
- Avoid dependencies on models slated for deprecation; monitor the Deprecations page and migrate to the recommended equivalents.
- For "beta/legacy" endpoints (Assistants v2, etc.): isolate them in dedicated modules and include the required headers (e.g. `OpenAI-Beta`) only where necessary.
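The retry recommendation above can be sketched as "full jitter" backoff, where each delay is drawn uniformly from a capped exponential window (parameter values are illustrative, not prescribed by the documentation):

```python
import random

def backoff_delays(max_retries=5, base=0.5, cap=30.0):
    """Exponential backoff with full jitter for 429/transient 5xx retries.

    The delay for attempt n is drawn uniformly from
    [0, min(cap, base * 2**n)], which spreads concurrent retries out
    and avoids synchronized thundering-herd spikes.
    """
    return [random.uniform(0, min(cap, base * 2 ** n)) for n in range(max_retries)]

delays = backoff_delays()  # e.g. sleep(delays[n]) before attempt n+1
```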
Main references (integrated via citations in the report):
- API Overview (REST/streaming/realtime overview).
- API Reference (Responses, Conversations, etc.).
- Rate limits.
- Error codes and error-type mapping.
- Models (model selection, capabilities, and prices for some models).
- Deprecations (deprecated models/endpoints and replacements).