Conversation History & Sessions
Why History Must Be Re-sent
LLMs are stateless. Every request is independent — the model remembers nothing from previous calls. To maintain a conversation, you must send the complete message history on every request:
Turn 1: [ {user: "Hello"} ]
Turn 2: [ {user: "Hello"}, {assistant: "Hi!"}, {user: "What's 2+2?"} ]
Turn 3: [ {user: "Hello"}, {assistant: "Hi!"}, {user: "What's 2+2?"}, {assistant: "4"}, {user: "And 4+4?"} ]
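The growth of the message array across turns can be sketched in a few lines. This is a minimal illustration, not a real integration: `fakeModel` is a hypothetical stand-in for an LLM call, and `sendTurn` is an illustrative helper name. The point is that the model only ever "sees" the array it is handed on that call.

```javascript
// Hypothetical stand-in for a real LLM call: it knows nothing beyond `messages`.
function fakeModel(messages) {
  return `I can see ${messages.length} message(s)`;
}

const history = [];

function sendTurn(userText) {
  history.push({ role: 'user', content: userText });
  const reply = fakeModel(history); // the full history goes out on every call
  history.push({ role: 'assistant', content: reply });
  return reply;
}

console.log(sendTurn('Hello'));       // I can see 1 message(s)
console.log(sendTurn("What's 2+2?")); // I can see 3 message(s)
console.log(sendTurn('And 4+4?'));    // I can see 5 message(s)
```

Each turn adds two entries (the user message and the reply), so the payload, and with it the token cost of every request, grows with the length of the conversation.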
In-memory Sessions (Development Only)
const sessions = new Map(); // sessionId -> array of { role, content } messages
function getHistory(sessionId) {
  return sessions.get(sessionId) || [];
}
function appendHistory(sessionId, userMessage, assistantReply) {
  const history = getHistory(sessionId);
  history.push(
    { role: 'user', content: userMessage },
    { role: 'assistant', content: assistantReply }
  );
  sessions.set(sessionId, history);
  return history;
}
This works for local development, but all history is lost whenever the process restarts, and it cannot be shared across server instances. Do not use it in production.
Redis Sessions (Production)
In production, use Redis so history survives restarts and scales across multiple server instances.
npm install ioredis uuid
import Redis from 'ioredis';
const redis = new Redis(process.env.REDIS_URL || 'redis://localhost:6379');
export default redis;
import redis from '../lib/redis.js';
const SESSION_TTL = 60 * 60 * 24; // 24 hours in seconds
// Returns the stored message array, or an empty array for a new or expired session
export async function getHistory(sessionId) {
  const data = await redis.get(`chat:${sessionId}`);
  return data ? JSON.parse(data) : [];
}
// Overwrites the stored history and resets the key's expiry to SESSION_TTL
export async function saveHistory(sessionId, messages) {
  await redis.setex(`chat:${sessionId}`, SESSION_TTL, JSON.stringify(messages));
}
export async function clearHistory(sessionId) {
  await redis.del(`chat:${sessionId}`);
}
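// Assumes an Express app with express.json() body parsing enabled
import { Router } from 'express';
import { v4 as uuidv4 } from 'uuid';
import { chat } from '../services/llm.service.js';
import { getHistory, saveHistory, clearHistory } from '../services/session.service.js';
const router = Router();
// Start or continue a conversation
router.post('/', async (req, res, next) => {
  try {
    // A missing sessionId starts a new conversation with a fresh UUID
    const { sessionId = uuidv4(), message, provider = 'openai' } = req.body;
    // Reject empty input before it is stored or sent to the model
    if (typeof message !== 'string' || message.trim() === '') {
      return res.status(400).json({ error: 'message is required' });
    }
    const history = await getHistory(sessionId);
    history.push({ role: 'user', content: message });
    const result = await chat({ provider, messages: history });
    history.push({ role: 'assistant', content: result.content });
    await saveHistory(sessionId, history);
    // Return the sessionId so the client can send it back on the next turn
    res.json({ sessionId, reply: result.content, usage: result.usage });
  } catch (err) {
    next(err);
  }
});
// Delete a session
router.delete('/:sessionId', async (req, res, next) => {
  try {
    await clearHistory(req.params.sessionId);
    res.json({ message: 'Session cleared' });
  } catch (err) {
    next(err);
  }
});
export default router;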
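From the caller's side, the only state to keep is the `sessionId` returned by the first response. The sketch below shows one way a client might do that. It rests on assumptions not stated in the text: the router is mounted at `/api/chat`, and `createChatClient` is a hypothetical helper name. The `fetchImpl` parameter exists only so the function can be exercised without a running server.

```javascript
// Hypothetical client helper. The '/api/chat' path is an assumption about where
// the chat router is mounted; adjust it to match your app.
function createChatClient(baseUrl, fetchImpl = fetch) {
  let sessionId; // undefined until the server assigns one

  return {
    async send(message) {
      const res = await fetchImpl(`${baseUrl}/api/chat`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        // JSON.stringify omits undefined fields, so the first request carries no sessionId
        body: JSON.stringify({ sessionId, message }),
      });
      if (!res.ok) throw new Error(`Chat request failed: ${res.status}`);
      const data = await res.json();
      sessionId = data.sessionId; // reuse the same session on the next turn
      return data.reply;
    },
    getSessionId: () => sessionId,
  };
}
```

Leaving `sessionId` out of the first request (rather than sending `null`) matters here: a destructuring default such as `sessionId = uuidv4()` only applies when the field is `undefined`.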
Set a reasonable TTL on session keys (e.g. 24 hours). Chat history has no value after a session ends, and it will accumulate indefinitely without a TTL.
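One consequence of saving with `SETEX` on every turn is that the expiry restarts each time, so a session lives for 24 hours after its last message rather than 24 hours after it began. The sketch below illustrates this without a Redis server. `FakeRedis` and `createSessionStore` are hypothetical names introduced for the example: `FakeRedis` is a small in-memory stand-in that mimics only the `get`, `setex`, and `del` commands used above, with an injectable clock so time can be fast-forwarded.

```javascript
// Minimal in-memory stand-in for the three Redis commands the session service
// uses. Hypothetical test helper, not part of ioredis.
class FakeRedis {
  constructor(now = () => Date.now()) {
    this.now = now;         // injectable clock so time can be fast-forwarded
    this.store = new Map(); // key -> { value, expiresAt }
  }
  async get(key) {
    const entry = this.store.get(key);
    if (!entry) return null;
    if (this.now() >= entry.expiresAt) {
      this.store.delete(key); // expired keys behave as if they never existed
      return null;
    }
    return entry.value;
  }
  async setex(key, seconds, value) {
    // Overwrites the value and replaces any previous expiry
    this.store.set(key, { value, expiresAt: this.now() + seconds * 1000 });
    return 'OK';
  }
  async del(key) {
    return this.store.delete(key) ? 1 : 0;
  }
}

const SESSION_TTL = 60 * 60 * 24; // 24 hours in seconds
const HOUR_MS = 60 * 60 * 1000;

// Same logic as the session service, with the client passed in instead of imported.
function createSessionStore(client) {
  return {
    async getHistory(id) {
      const data = await client.get(`chat:${id}`);
      return data ? JSON.parse(data) : [];
    },
    async saveHistory(id, messages) {
      await client.setex(`chat:${id}`, SESSION_TTL, JSON.stringify(messages));
    },
  };
}

async function demo() {
  let clock = 0; // fake time in milliseconds
  const store = createSessionStore(new FakeRedis(() => clock));

  await store.saveHistory('abc', [{ role: 'user', content: 'Hello' }]);
  clock += 23 * HOUR_MS;                      // 23 hours after the first save
  const before = await store.getHistory('abc');
  await store.saveHistory('abc', before);     // saving again restarts the TTL
  clock += 23 * HOUR_MS;                      // 46 hours after the first save
  const stillThere = await store.getHistory('abc');
  clock += 2 * HOUR_MS;                       // 25 hours after the last save
  const expired = await store.getHistory('abc');
  return { stillThere: stillThere.length, expired: expired.length };
}
```

If a hard cap on total session lifetime is needed instead, the expiry would have to be tracked separately, since every save resets it.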
Context Window Management
Long conversations eventually exceed the model's context window. When the history grows too large, drop the oldest messages that follow the system prompt, always keeping the system prompt (if there is one) and the most recent exchanges.
function estimateTokens(text) {
  return Math.ceil(text.length / 4); // rough approximation: 1 token ≈ 4 chars
}
function messageTokens(m) {
  // Non-string content (e.g. multi-part messages) is measured by its JSON form
  const content = typeof m.content === 'string' ? m.content : JSON.stringify(m.content);
  return estimateTokens(content);
}
// Trims `messages` in place and returns the same array
export function trimHistory(messages, maxTokens = 80000) {
  let totalTokens = messages.reduce((sum, m) => sum + messageTokens(m), 0);
  // Protect index 0 only when it really is a system message;
  // otherwise the first user message would never be trimmed
  const start = messages[0]?.role === 'system' ? 1 : 0;
  // Remove the oldest unprotected message until the estimate fits,
  // always leaving at least 4 messages
  while (totalTokens > maxTokens && messages.length > 4) {
    const [removed] = messages.splice(start, 1);
    totalTokens -= messageTokens(removed);
  }
  return messages;
}
Usage in a route:
let history = await getHistory(sessionId);
history.push({ role: 'user', content: message });
history = trimHistory(history); // trim before sending
const result = await chat({ messages: history });
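Removing one message at a time can leave an assistant reply at the front of the history without the user message that prompted it. A possible variant, sketched below, removes the oldest messages in user/assistant pairs and returns a new array instead of mutating its input. `trimHistoryByTurns` is a hypothetical name, not part of the code above, and it uses the same rough 4-characters-per-token estimate, so treat the budget as approximate.

```javascript
// Rough estimate, as in the text: 1 token ≈ 4 characters.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

function messageTokens(message) {
  const content = typeof message.content === 'string'
    ? message.content
    : JSON.stringify(message.content);
  return estimateTokens(content);
}

// Hypothetical variant of trimHistory: returns a new array and drops the oldest
// messages in user/assistant pairs so no reply is left without its question.
function trimHistoryByTurns(messages, maxTokens) {
  const hasSystem = messages.length > 0 && messages[0].role === 'system';
  const system = hasSystem ? [messages[0]] : [];
  const turns = messages.slice(system.length); // copy; the input is not modified

  let total = messages.reduce((sum, m) => sum + messageTokens(m), 0);

  // Always keep at least the most recent message.
  while (total > maxTokens && turns.length > 1) {
    total -= messageTokens(turns.shift());
    // If an assistant reply is now first, drop it together with its question
    if (turns.length > 1 && turns[0].role === 'assistant') {
      total -= messageTokens(turns.shift());
    }
  }
  return [...system, ...turns];
}

const sample = [
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user', content: 'Hello' },
  { role: 'assistant', content: 'Hi! How can I help?' },
  { role: 'user', content: "What's 2+2?" },
  { role: 'assistant', content: '4' },
  { role: 'user', content: 'And 4+4?' },
];

// With a budget of 14 estimated tokens, the first exchange is dropped as a pair.
const trimmed = trimHistoryByTurns(sample, 14);
console.log(trimmed.map((m) => m.role).join(', ')); // system, user, assistant, user
```

For accurate budgeting, the character-based estimate would need to be replaced with the tokenizer of the model actually in use.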
OpenAI Assistants API (Thread-Based History)
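The Assistants API offloads history management to OpenAI. You create a Thread once and add messages to it, and OpenAI maintains the history server-side. See the Assistants API page for a full implementation. OpenAI has announced that the Assistants API is deprecated in favor of the Responses API, so check its current status before adopting it for new work.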
| Approach | History lives | Best for |
|---|---|---|
| Manual (Redis) | Your server | Full control, any provider |
| Assistants API | OpenAI servers | Persistent assistants, file attachments |