OpenAI SDK

Drop-in replacements for openai.OpenAI and openai.AsyncOpenAI that add automatic telemetry to covered token-consuming endpoints (see Coverage below).

from weflayr.sdk.openai.client import OpenAI, AsyncOpenAI

Clients

Class         Replaces              Mode
OpenAI        openai.OpenAI         Synchronous
AsyncOpenAI   openai.AsyncOpenAI    Async / await

Constructor parameters

All parameters beyond api_key are optional and fall back to environment variables.

Parameter      Env var                  Description
api_key        (none)                   Your OpenAI API key
intake_url     WEFLAYR_INTAKE_URL       Weflayr intake base URL
client_id      WEFLAYR_CLIENT_ID        Your Flare client ID
bearer_token   WEFLAYR_CLIENT_SECRET    Your Flare client secret

Coverage

7 of 16 endpoints covered; 9 not yet instrumented.
Covered endpoints

client.chat.completions.create()
    Billing metrics: prompt_tokens, completion_tokens
    Streaming supported — injects `include_usage` automatically to capture token counts from the final chunk.

client.embeddings.create()
    Billing metrics: prompt_tokens, total_tokens
    Tracks both prompt and total token counts.

client.responses.create()
    Billing metrics: input_tokens, output_tokens, cached_tokens
    Stateful Responses API. Also tracks prompt cache hits via `cached_tokens`.

client.audio.speech.create()
    Billing metrics: char_count
    Billed by character count, not tokens — `char_count` is captured from the input text before the call.

client.audio.transcriptions.create()
    Billing metrics: prompt_tokens
    Supports `whisper-1` and newer transcription models. Billed by tokens or audio seconds depending on the model.

client.audio.translations.create()
    Billing metrics: prompt_tokens
    Translates audio to English. `whisper-1` only.

client.completions.create()
    Billing metrics: prompt_tokens, completion_tokens
    For `gpt-3.5-turbo-instruct` and similar legacy models. Also tracks `prompt_length`.

Not covered

client.images: generate() / edit() / create_variation()
    Not yet instrumented.

client.beta.assistants: create / list / retrieve / update / delete
    Full Assistants API not instrumented — use direct API integration if needed.

client.beta.threads: threads.* / runs.*
    Requires the Assistants API — not yet instrumented.

client.fine_tuning: jobs.create()
    Not yet instrumented.

client.moderations: create()
    Free endpoint — no billing tracking. Not instrumented.

client.batches: create()
    Async batch processing not yet instrumented.

client.beta.vector_stores: create / list / delete
    Not yet instrumented.
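
For client.audio.speech, the billed quantity is simply the length of the input string, measured before the request is sent. A minimal sketch of that capture (a hypothetical helper, not the SDK's internals):

```python
def speech_billing_metrics(input_text: str) -> dict:
    """Compute the char_count metric from the input text before the API call."""
    return {"char_count": len(input_text)}

metrics = speech_billing_metrics("Hello from Weflayr!")
# The input string is 19 characters long
```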

Examples

Chat completions — standard

from weflayr.sdk.openai.client import OpenAI

client = OpenAI(api_key="sk-...")

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Explain recursion"}],
    tags={"feature": "docs", "version": "v2"},
)
print(response.choices[0].message.content)

Chat completions — streaming

with client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True,
    tags={"feature": "streaming-demo"},
) as stream:
    for chunk in stream:
        if chunk.choices:  # the final usage-only chunk has an empty choices list
            print(chunk.choices[0].delta.content or "", end="")
# Token counts are captured from the final chunk automatically

Async client

import asyncio
from weflayr.sdk.openai.client import AsyncOpenAI

client = AsyncOpenAI(api_key="sk-...")

async def main():
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello!"}],
        tags={"env": "production"},
    )
    print(response.choices[0].message.content)

asyncio.run(main())

Embeddings

response = client.embeddings.create(
    model="text-embedding-3-small",
    input="The quick brown fox",
    tags={"pipeline": "rag", "step": "embed"},
)

Text-to-Speech

audio = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input="Hello from Weflayr!",
    tags={"locale": "en-US"},
)
# char_count is tracked automatically from `input`

Transcription

with open("audio.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
        tags={"source": "call-centre"},
    )