# Streaming Response

Stream LLM responses in real time with contextual ads appended at the end.

Streaming is optional. Use `offer()` for complete responses or `stream()` for real-time streaming.

## Basic Usage
```ts
import Ads from 'notpixel';

const ads = new Ads({
  publisherId: 'pub_xxx',
  model: 'openai/gpt-5.2',
  input: 'How do I optimize my database?',
});

for await (const chunk of ads.stream()) {
  if (chunk.type === 'text') {
    process.stdout.write(chunk.content);
  } else if (chunk.type === 'ad') {
    console.log('\n' + chunk.content);
  }
}
```

## Chunk Types
The stream yields `StreamChunk` objects with different types:
```ts
type StreamChunk = {
  type: 'text' | 'ad' | 'done' | 'error';
  content?: string; // Text content
  ad?: Ad;          // Ad object (for 'ad' type)
  error?: string;   // Error message (for 'error' type)
  done?: boolean;   // True when complete
};
```

| Type | Description |
|---|---|
| `text` | LLM response chunk; write to output immediately |
| `ad` | Sponsored ad block; appears after the LLM completes |
| `done` | Stream completed successfully |
| `error` | An error occurred |
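A minimal sketch of a handler that covers all four chunk types, using only the `StreamChunk` shape above (the input string is illustrative):

```ts
for await (const chunk of ads.stream({ input: 'How do I optimize my database?' })) {
  switch (chunk.type) {
    case 'text':
      process.stdout.write(chunk.content ?? ''); // Stream text as it arrives
      break;
    case 'ad':
      console.log('\n' + chunk.content); // Rendered ad block; chunk.ad holds the Ad object
      break;
    case 'error':
      console.error('Stream error:', chunk.error);
      break;
    case 'done':
      // No more chunks will arrive
      break;
  }
}
```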
## Override Input
```ts
const ads = new Ads({
  publisherId: 'pub_xxx',
  model: 'openai/gpt-5.2',
});

// Override input per call
for await (const chunk of ads.stream({ input: 'Tell me a story' })) {
  // ...
}
```

## Collecting Full Response
```ts
let fullText = '';
let ad: Ad | undefined;

for await (const chunk of ads.stream({ input: 'Hello' })) {
  if (chunk.type === 'text') {
    fullText += chunk.content;
  }
  if (chunk.type === 'ad') {
    ad = chunk.ad;
  }
}

console.log(fullText); // Complete LLM response
console.log(ad?.cta);  // Ad call-to-action
```

## With React/Next.js
### Route Handler (App Router)
```ts
// app/api/chat/route.ts
import Ads from 'notpixel';

export async function POST(req: Request) {
  const { message } = await req.json();

  const ads = new Ads({
    publisherId: process.env.NOTPIXEL_PUBLISHER_ID!,
    model: 'openai/gpt-5.2',
  });

  const encoder = new TextEncoder();
  const stream = new ReadableStream({
    async start(controller) {
      for await (const chunk of ads.stream({ input: message })) {
        if (chunk.type === 'text' || chunk.type === 'ad') {
          controller.enqueue(encoder.encode(chunk.content ?? ''));
        }
      }
      controller.close();
    },
  });

  return new Response(stream, {
    headers: { 'Content-Type': 'text/plain; charset=utf-8' },
  });
}
```

### Client Component
```tsx
'use client';

import { useState } from 'react';

export function Chat() {
  const [response, setResponse] = useState('');

  async function handleSubmit(message: string) {
    setResponse('');
    const res = await fetch('/api/chat', {
      method: 'POST',
      body: JSON.stringify({ message }),
    });

    const reader = res.body?.getReader();
    if (!reader) return;
    const decoder = new TextDecoder();

    // Append each decoded chunk to the response as it arrives
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      setResponse(prev => prev + decoder.decode(value));
    }
  }

  return <div>{response}</div>;
}
```

## Ad Callback
Get notified when an ad is fetched:
```ts
for await (const chunk of ads.stream({
  input: 'Hello',
  onAd: (ad) => {
    console.log('Ad fetched:', ad.id);
    trackImpression(ad); // e.g. your own analytics helper
  },
})) {
  // ...
}
```

## Abort Signal
Cancel streaming with an `AbortController`:
```ts
const controller = new AbortController();

// Cancel after 5 seconds
setTimeout(() => controller.abort(), 5000);

try {
  for await (const chunk of ads.stream({
    input: 'Tell me a long story',
    signal: controller.signal,
  })) {
    process.stdout.write(chunk.content || '');
  }
} catch (err) {
  if (err instanceof Error && err.name === 'AbortError') {
    console.log('Stream cancelled');
  }
}
```

## Supported Providers
Streaming works with all supported LLM providers:
| Provider | Model Format | Notes |
|---|---|---|
| OpenAI | `openai/gpt-5.2` | Full support |
| Anthropic | `anthropic/claude-opus-4.5` | Full support |
| Google | `google/gemini-pro` | Full support |
| xAI | `xai/grok-1` | Full support |
| OpenRouter | `openrouter/model` | Full support |
| DeepSeek | `deepseek/deepseek-chat` | OpenAI-compatible |
| Groq | `groq/llama-3.1-70b` | OpenAI-compatible |
| Mistral | `mistral/mistral-large` | OpenAI-compatible |
| Together | `together/meta-llama/...` | OpenAI-compatible |
| LM Studio | `lmstudio/local-model` | Local, no API key |
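Since every provider shares the `provider/model` identifier format, switching providers only changes the `model` string; a minimal sketch using identifiers from the table above:

```ts
import Ads from 'notpixel';

// Same streaming code, different provider: only the model string changes
const ads = new Ads({
  publisherId: 'pub_xxx',
  model: 'anthropic/claude-opus-4.5', // or 'groq/llama-3.1-70b', 'lmstudio/local-model', ...
});

for await (const chunk of ads.stream({ input: 'Hello' })) {
  if (chunk.type === 'text') process.stdout.write(chunk.content ?? '');
}
```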
## Streaming vs Offer
Choose the right method for your use case:
| Method | Use Case | Returns |
|---|---|---|
| `offer()` | Backend APIs, simple integrations | `Promise<{ text, ad }>` |
| `stream()` | Real-time chat UX, low latency | `AsyncGenerator<StreamChunk>` |
```ts
// Option 1: Complete response
const response = await ads.offer({ input: 'Hello' });
console.log(response.text);

// Option 2: Streaming
for await (const chunk of ads.stream({ input: 'Hello' })) {
  console.log(chunk);
}
```

## How It Works
1. `stream()` starts fetching the ad in parallel (non-blocking; see the sketch below)
2. LLM chunks are yielded as they arrive (`type: 'text'`)
3. When the LLM completes, the ad is appended (`type: 'ad'`)
4. A final `done` chunk signals completion
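Because the ad request runs in parallel with the LLM call, the `onAd` callback can fire while text is still streaming, even though the ad chunk itself is only yielded at the end. A minimal sketch that logs the interleaving (the exact timing of `onAd` is an assumption about the parallel fetch, not a guarantee):

```ts
const order: string[] = [];

for await (const chunk of ads.stream({
  input: 'Hello',
  onAd: () => order.push('ad fetched'), // may fire while text is still streaming
})) {
  order.push(chunk.type);
}

console.log(order);
// e.g. ['text', 'text', 'ad fetched', 'text', 'ad', 'done']
```

The resulting chunk order: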
```
[LLM chunk 1] → [LLM chunk 2] → ... → [LLM chunk N] → [Ad block] → [Done]
```

## Error Handling
```ts
try {
  for await (const chunk of ads.stream({ input: 'Hello' })) {
    if (chunk.type === 'error') {
      console.error('Stream error:', chunk.error);
      break;
    }
    // Process chunk...
  }
} catch (err) {
  console.error('Fatal error:', err);
}
```

## TypeScript Types
Import types for better type safety:
```ts
import Ads, { type StreamChunk, type StreamArgs } from 'notpixel';

async function processStream(args: StreamArgs): Promise<string> {
  const ads = new Ads({ publisherId: 'pub_xxx', model: 'openai/gpt-5.2' });
  let result = '';
  // Loop bindings can't carry inline type annotations; chunk is inferred as StreamChunk
  for await (const chunk of ads.stream(args)) {
    if (chunk.type === 'text') result += chunk.content ?? '';
  }
  return result;
}
```