Google GenAI — Example Configuration
Partial support: only models.generateContent and models.generateContentStream are instrumented (models.generate_content and models.generate_content_stream in Python). Token usage is read from usageMetadata; thinking tokens (thoughtsTokenCount) are added to the completion count.
Setup
// JavaScript (Node.js, CommonJS)
const { weflayr_setup, weflayr_instrument } = require('weflayr');

weflayr_setup({
  intake_url: process.env.WEFLAYR_INTAKE_URL,
  client_id: process.env.WEFLAYR_CLIENT_ID,
  client_secret: process.env.WEFLAYR_CLIENT_SECRET,
  event_mode: 'default',
  // Only these two method paths are instrumented.
  methods: [
    { call: 'models.generateContent' },
    { call: 'models.generateContentStream' },
  ],
});

const { GoogleGenAI } = require('@google/genai');
const ai = weflayr_instrument(new GoogleGenAI({ apiKey: process.env.GOOGLE_API_KEY }));

# Python
import os

from google import genai
from weflayr import weflayr_setup, weflayr_instrument

weflayr_setup({
    "intake_url": os.environ["WEFLAYR_INTAKE_URL"],
    "client_id": os.environ["WEFLAYR_CLIENT_ID"],
    "client_secret": os.environ["WEFLAYR_CLIENT_SECRET"],
    "event_mode": "default",
    # The Python SDK uses snake_case method paths.
    "methods": [
        {"call": "models.generate_content"},
        {"call": "models.generate_content_stream"},
    ],
})

ai = weflayr_instrument(genai.Client(api_key=os.environ["GOOGLE_API_KEY"]))
Get a free API key at aistudio.google.com/app/apikey.
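The setup snippets read four environment variables. As a minimal sketch, assuming you want to fail fast with a readable message before calling weflayr_setup, a pre-flight check could look like this. The helper name missing_env_vars is hypothetical and not part of Weflayr.

```python
import os

# Environment variables read by the setup snippets in this guide.
REQUIRED_ENV_VARS = (
    "WEFLAYR_INTAKE_URL",
    "WEFLAYR_CLIENT_ID",
    "WEFLAYR_CLIENT_SECRET",
    "GOOGLE_API_KEY",
)


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; `env` defaults to os.environ
    and can be replaced by a plain dict in tests.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]
```

Calling missing_env_vars() at startup and stopping when the returned list is non-empty avoids a bare KeyError from os.environ[...] later on.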
Non-streaming call
// JavaScript
// Call this from inside an async function: top-level await is not available in CommonJS modules.
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Write a haiku about software engineering.',
  __weflayr_tags: {
    feature: 'haiku',
    provider: 'google',
    customer_id: 'customer-456',
  },
});
console.log(response.text);

# Python
response = ai.models.generate_content(
    model="gemini-2.5-flash",
    contents="Write a haiku about software engineering.",
    __weflayr_tags={
        "feature": "haiku",
        "provider": "google",
        "customer_id": "customer-456",
    },
)
print(response.text)
Streaming call
// JavaScript
const stream = await ai.models.generateContentStream({
  model: 'gemini-2.5-flash',
  contents: 'Write a short poem about observability.',
  __weflayr_tags: {
    feature: 'poem',
    mode: 'streaming',
    provider: 'google',
    customer_id: 'acme-corp',
  },
});
// Print each chunk as it arrives; chunk.text can be undefined.
for await (const chunk of stream) {
  process.stdout.write(chunk.text ?? '');
}

# Python
stream = ai.models.generate_content_stream(
    model="gemini-2.5-flash",
    contents="Write a short poem about observability.",
    __weflayr_tags={
        "feature": "poem",
        "mode": "streaming",
        "provider": "google",
        "customer_id": "acme-corp",
    },
)
# Print each chunk as it arrives; chunk.text can be None.
for chunk in stream:
    print(chunk.text or "", end="", flush=True)
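This page does not describe how usage is collected from a stream. One plausible approach, shown here only as a sketch, is to concatenate the chunk text and keep the last usageMetadata seen. It assumes chunks are plain dicts with optional "text" and "usageMetadata" keys and that the last usage block holds the final totals; the function summarize_stream is hypothetical and is not Weflayr's implementation.

```python
def summarize_stream(chunks):
    """Join chunk text and return it with the last usage metadata seen.

    Illustrative only: chunks are modeled as plain dicts, and the
    "last usage block wins" rule is an assumption.
    """
    text_parts = []
    usage = None
    for chunk in chunks:
        # A chunk may carry no text, so fall back to an empty string.
        text_parts.append(chunk.get("text") or "")
        if chunk.get("usageMetadata"):
            usage = chunk["usageMetadata"]
    return "".join(text_parts), usage
```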
Multi-turn conversations
Pass an array to contents for multi-turn exchanges. Weflayr records the number of turns as count_message.
// JavaScript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  // Three turns: count_message is recorded as 3.
  contents: [
    { role: 'user', parts: [{ text: 'What is observability?' }] },
    { role: 'model', parts: [{ text: 'Observability is...' }] },
    { role: 'user', parts: [{ text: 'How does it apply to LLMs?' }] },
  ],
  __weflayr_tags: { feature: 'chat', provider: 'google' },
});

# Python
response = ai.models.generate_content(
    model="gemini-2.5-flash",
    # Three turns: count_message is recorded as 3.
    contents=[
        {"role": "user", "parts": [{"text": "What is observability?"}]},
        {"role": "model", "parts": [{"text": "Observability is..."}]},
        {"role": "user", "parts": [{"text": "How does it apply to LLMs?"}]},
    ],
    __weflayr_tags={"feature": "chat", "provider": "google"},
)
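The count_message rule (the length of contents when it is an array, 1 when it is a plain string) can be sketched as below. This is an illustration of the documented rule, not Weflayr's code; the function name is hypothetical, and returning 1 for values that are neither a list nor a string is an assumption, since this page does not cover that case.

```python
def count_message(contents):
    """Return the number of turns in `contents`.

    Lists and tuples count one message per element; a plain string
    counts as one message. Any other value also counts as 1, which is
    an assumption and not documented behavior.
    """
    if isinstance(contents, (list, tuple)):
        return len(contents)
    return 1
```

With the three-turn conversation above this gives 3; with the single prompt string from the non-streaming example it gives 1.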
Configuration notes
| Setting | Value | Reason |
|---|---|---|
| methods | models.generateContent, models.generateContentStream | Only these two paths are instrumented |
| No middleware needed | n/a | Token data is captured from the raw response in the worker normalizer |
What Weflayr captures
| Field | Source |
|---|---|
| model | response.modelVersion → fallback args.model |
| token_text_prompt | usageMetadata.promptTokenCount |
| token_text_completions | usageMetadata.candidatesTokenCount + usageMetadata.thoughtsTokenCount |
| count_message | Length of contents if array, 1 if plain string |
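The mapping in the table can be sketched as a small function. This is a minimal illustration, assuming the raw response and the call arguments are available as plain dicts with the key names listed above. The name normalize_event and the choice to treat missing counts as 0 are assumptions, not Weflayr's actual normalizer.

```python
def normalize_event(response, args):
    """Map a raw response and call arguments to the captured fields.

    Hypothetical sketch: both inputs are plain dicts, and missing
    token counts are treated as 0.
    """
    usage = response.get("usageMetadata") or {}
    candidates = usage.get("candidatesTokenCount") or 0
    thoughts = usage.get("thoughtsTokenCount") or 0
    contents = args.get("contents")
    return {
        # Prefer the model reported by the response, else the requested one.
        "model": response.get("modelVersion") or args.get("model"),
        "token_text_prompt": usage.get("promptTokenCount") or 0,
        # Thinking tokens are added to the completion count.
        "token_text_completions": candidates + thoughts,
        "count_message": len(contents) if isinstance(contents, list) else 1,
    }
```

For example, a response with candidatesTokenCount 20 and thoughtsTokenCount 35 yields token_text_completions of 55.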