Build a Privacy-First AI Chatbot with Ollama, TanStack AI, and React: The Complete 2026 Guide

Building AI applications doesn't mean you have to send every conversation to OpenAI's servers or rack up massive API bills. In 2026, running powerful language models locally is not just possible; it's practical, private, and often faster than cloud alternatives.
I’ll show you how to build an AI chatbot that runs entirely on your infrastructure using three cutting-edge technologies:
Ollama - Run LLMs locally with zero configuration
TanStack AI - Framework-agnostic, type-safe AI SDK with no vendor lock-in
React - Build a modern, responsive chat interface
Why Local AI Matters in 2026
The AI landscape has fundamentally shifted. What required datacenter-scale infrastructure in 2023 now runs on a laptop. Here's why developers are choosing local AI:
Privacy & Compliance
GDPR/CCPA compliance becomes trivial when data never leaves your infrastructure
Healthcare/finance apps can use AI without HIPAA/PCI concerns
No telemetry - your conversations aren't training someone else's model
Corporate secrets stay secret - code assistance without sending source to third parties
Prerequisites
Required Software
Node.js (v18.0.0 or higher)
npm, pnpm, or bun
Ollama
Checking Your Setup
# Verify Node.js
node --version # Should be v18.0.0 or higher
# Verify npm
npm --version
Part 1: Setting Up Ollama
Step 1: Install Ollama
On macOS, install via Homebrew:
brew install ollama
On Linux, use the official install script:
curl -fsSL https://ollama.com/install.sh | sh
On Windows, download the installer from ollama.com.
Step 2: Verify Installation
ollama --version
# Should output: ollama version is X.X.X
Ollama serves a local API on http://localhost:11434. The desktop app starts it automatically; with a Homebrew or script install, run ollama serve (or brew services start ollama) if it isn't already running.
Step 3: Pull a Model
We'll use DeepSeek-R1:1.5B - a small, fast model perfect for development:
ollama pull deepseek-r1:1.5b
The first pull downloads about 1.5 GB and may take several minutes depending on your connection. Subsequent pulls are instant because the model is cached locally.
Step 4: Test the Model
ollama run deepseek-r1:1.5b
You'll get an interactive prompt:
>>> Hello! How are you?
I'm doing well, thank you for asking! How can I help you today?
>>> /bye
Success! Ollama is running. Exit the prompt with /bye or Ctrl+D.
Step 5: Test the API
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:1.5b",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
You should get a JSON response with the model's answer. This confirms the API is working.
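With "stream": false, Ollama returns a single JSON object. Trimmed to the most relevant fields, the shape looks roughly like this (exact fields and wording will vary between runs):

```json
{
  "model": "deepseek-r1:1.5b",
  "response": "The sky appears blue because...",
  "done": true
}
```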
Part 2: Building the Backend with TanStack AI
Project Setup
# Create project directory
mkdir ollama-chatbot
cd ollama-chatbot
# Initialize Node.js project
npm init -y
# Install dependencies
npm install @tanstack/ai @tanstack/ai-ollama express cors dotenv
npm install -D typescript @types/node @types/express @types/cors tsx
# Initialize TypeScript
npx tsc --init
Configure TypeScript
Edit tsconfig.json:
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "ESNext",
    "moduleResolution": "bundler",
    "esModuleInterop": true,
    "strict": true,
    "skipLibCheck": true,
    "outDir": "./dist",
    "rootDir": "./src"
  },
  "include": ["src/**/*"],
  "exclude": ["node_modules"]
}
Create Environment Config
Create .env:
PORT=3001
OLLAMA_BASE_URL=http://localhost:11434
MODEL=deepseek-r1:1.5b
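The server reads these variables with fallback defaults, so the app still boots if .env is missing. If you prefer to centralize that logic, a small helper could look like this (a hypothetical sketch mirroring the defaults used in this guide, not part of TanStack AI):

```typescript
// Hypothetical config helper; defaults mirror those used in server.ts.
interface AppConfig {
  port: number;
  ollamaBaseUrl: string;
  model: string;
}

function loadConfig(env: Record<string, string | undefined> = process.env): AppConfig {
  return {
    port: Number(env.PORT ?? 3001),
    ollamaBaseUrl: env.OLLAMA_BASE_URL ?? 'http://localhost:11434',
    model: env.MODEL ?? 'deepseek-r1:1.5b',
  };
}
```

Calling loadConfig() with no arguments picks up process.env; passing an object makes the defaults easy to unit test.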
Build the Server
Create src/server.ts:
import express from 'express';
import cors from 'cors';
import dotenv from 'dotenv';
import { chat, toServerSentEventsResponse } from '@tanstack/ai';
import { ollamaText } from '@tanstack/ai-ollama';

dotenv.config();

const app = express();
const PORT = process.env.PORT || 3001;
const MODEL = process.env.MODEL || 'deepseek-r1:1.5b';

app.use(cors());
app.use(express.json());

app.get('/health', (req, res) => {
  res.json({
    status: 'ok',
    model: MODEL,
    ollamaUrl: process.env.OLLAMA_BASE_URL
  });
});

app.post('/api/chat', async (req, res) => {
  try {
    const { messages } = req.body;
    if (!messages || !Array.isArray(messages)) {
      return res.status(400).json({ error: 'Messages array required' });
    }
    const stream = chat({
      adapter: ollamaText(MODEL),
      messages,
    });
    return toServerSentEventsResponse(stream, res);
  } catch (error) {
    console.error('Chat error:', error);
    res.status(500).json({
      error: error instanceof Error ? error.message : 'Internal server error'
    });
  }
});

app.post('/api/chat/simple', async (req, res) => {
  try {
    const { messages } = req.body;
    const stream = chat({
      adapter: ollamaText(MODEL),
      messages,
    });
    // Collect the full response before replying
    let fullResponse = '';
    for await (const chunk of stream) {
      if (chunk.type === 'text-delta') {
        fullResponse += chunk.text;
      }
    }
    res.json({ response: fullResponse });
  } catch (error) {
    console.error('Chat error:', error);
    res.status(500).json({
      error: error instanceof Error ? error.message : 'Internal server error'
    });
  }
});

app.listen(PORT, () => {
  console.log(`Server running on http://localhost:${PORT}`);
});
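The collection loop in /api/chat/simple can be factored into a reusable helper. Here's a sketch, assuming chunks shaped like the text-delta events that loop consumes (the real TanStack AI stream may carry additional event types):

```typescript
// Assumed chunk shape, inferred from the /api/chat/simple loop.
interface StreamChunk {
  type: string;
  text?: string;
}

// Accumulate all text-delta chunks from an async iterable into one string.
async function collectText(stream: AsyncIterable<StreamChunk>): Promise<string> {
  let full = '';
  for await (const chunk of stream) {
    if (chunk.type === 'text-delta' && chunk.text) {
      full += chunk.text;
    }
  }
  return full;
}

// Usage with a mock stream standing in for chat(...):
async function* mockStream(): AsyncGenerator<StreamChunk> {
  yield { type: 'text-delta', text: '2 + 2 ' };
  yield { type: 'text-delta', text: 'equals 4.' };
  yield { type: 'done' };
}
```

Because collectText only depends on the AsyncIterable interface, it works the same whether the source is a real model stream or a mock like the one above.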
Add Scripts to package.json
{
  "type": "module",
  "scripts": {
    "dev": "tsx watch src/server.ts",
    "build": "tsc",
    "start": "node dist/server.js"
  }
}
Start the Server
npm run dev
Test the Backend
Test health endpoint:
curl http://localhost:3001/health
Test simple chat:
curl http://localhost:3001/api/chat/simple \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "What is 2+2?"}
    ]
  }'
You should get a response similar to:
{
  "response": "2 + 2 equals 4."
}
The exact wording will vary, and because DeepSeek-R1 is a reasoning model, the response may also include its chain of thought wrapped in <think> tags. Your backend is working. Keep the server running and open a new terminal for the frontend.
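One thing worth noting before moving on: the /api/chat handler only checks that messages is an array. If you want stricter validation, a guard like this could work (a hypothetical helper, not part of TanStack AI; the roles cover the user/assistant shape the frontend sends, plus system for completeness):

```typescript
// Hypothetical request guard for the chat endpoints.
interface ChatMessage {
  role: 'user' | 'assistant' | 'system';
  content: string;
}

function isValidMessages(value: unknown): value is ChatMessage[] {
  if (!Array.isArray(value) || value.length === 0) return false;
  return value.every((m) => {
    if (typeof m !== 'object' || m === null) return false;
    const msg = m as { role?: unknown; content?: unknown };
    return (
      typeof msg.role === 'string' &&
      ['user', 'assistant', 'system'].includes(msg.role) &&
      typeof msg.content === 'string'
    );
  });
}
```

In the Express handler, you would replace the Array.isArray check with isValidMessages(messages) and return a 400 when it fails.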
Part 3: Creating the React Frontend
Setup React with Vite
In a new terminal (keep the backend running):
cd ollama-chatbot
npm create vite@latest frontend -- --template react-ts
cd frontend
npm install
npm install @tanstack/react-query axios lucide-react
Configure Development Server
Edit frontend/vite.config.ts to proxy API requests:
import { defineConfig } from 'vite';
import react from '@vitejs/plugin-react';

export default defineConfig({
  plugins: [react()],
  server: {
    proxy: {
      '/api': {
        target: 'http://localhost:3001',
        changeOrigin: true,
      },
    },
  },
});
This lets you call /api/chat from the frontend without CORS issues.
Build the Chat Interface
Replace frontend/src/App.tsx:
import { useState, useRef, useEffect } from 'react';
import { Send, Bot, User, Loader2 } from 'lucide-react';
import './App.css';

interface Message {
  role: 'user' | 'assistant';
  content: string;
}

function App() {
  const [messages, setMessages] = useState<Message[]>([]);
  const [input, setInput] = useState('');
  const [isStreaming, setIsStreaming] = useState(false);
  const messagesEndRef = useRef<HTMLDivElement>(null);
  const abortControllerRef = useRef<AbortController | null>(null);

  const scrollToBottom = () => {
    messagesEndRef.current?.scrollIntoView({ behavior: 'smooth' });
  };

  useEffect(() => {
    scrollToBottom();
  }, [messages]);

  const sendMessage = async () => {
    if (!input.trim() || isStreaming) return;

    const userMessage: Message = { role: 'user', content: input };
    const newMessages = [...messages, userMessage];
    setMessages(newMessages);
    setInput('');
    setIsStreaming(true);
    abortControllerRef.current = new AbortController();

    try {
      const response = await fetch('/api/chat', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ messages: newMessages }),
        signal: abortControllerRef.current.signal,
      });

      if (!response.ok) {
        throw new Error(`HTTP error! status: ${response.status}`);
      }

      const reader = response.body?.getReader();
      const decoder = new TextDecoder();
      let assistantMessage = '';

      if (!reader) throw new Error('No reader available');

      while (true) {
        const { done, value } = await reader.read();
        if (done) break;

        const chunk = decoder.decode(value, { stream: true });
        const lines = chunk.split('\n');

        for (const line of lines) {
          if (line.startsWith('data: ')) {
            const data = line.slice(6);
            if (data === '[DONE]') continue;
            try {
              const parsed = JSON.parse(data);
              if (parsed.type === 'text-delta') {
                assistantMessage += parsed.text;
                // Update the UI as each chunk streams in
                setMessages([
                  ...newMessages,
                  { role: 'assistant', content: assistantMessage },
                ]);
              }
            } catch {
              // Skip lines that aren't valid JSON
            }
          }
        }
      }
    } catch (error) {
      if (error instanceof Error && error.name === 'AbortError') {
        console.log('Request cancelled');
      } else {
        console.error('Chat error:', error);
        setMessages([
          ...newMessages,
          {
            role: 'assistant',
            content: 'Sorry, an error occurred. Please try again.'
          },
        ]);
      }
    } finally {
      setIsStreaming(false);
      abortControllerRef.current = null;
    }
  };

  // onKeyPress is deprecated in React; onKeyDown handles the same keys
  const handleKeyDown = (e: React.KeyboardEvent) => {
    if (e.key === 'Enter' && !e.shiftKey) {
      e.preventDefault();
      sendMessage();
    }
  };

  return (
    <div className="app">
      <div className="chat-container">
        <div className="chat-header">
          <Bot className="icon" />
          <h1>Local AI Chatbot</h1>
          <span className="status">
            {isStreaming ? 'Thinking...' : 'Ready'}
          </span>
        </div>
        <div className="messages">
          {messages.length === 0 && (
            <div className="empty-state">
              <Bot size={64} className="empty-icon" />
              <h2>Start a conversation</h2>
              <p>This chatbot runs entirely on your local machine using Ollama</p>
            </div>
          )}
          {messages.map((msg, idx) => (
            <div key={idx} className={`message ${msg.role}`}>
              <div className="message-icon">
                {msg.role === 'user' ? <User size={20} /> : <Bot size={20} />}
              </div>
              <div className="message-content">
                {msg.content}
              </div>
            </div>
          ))}
          {isStreaming && messages[messages.length - 1]?.role !== 'assistant' && (
            <div className="message assistant">
              <div className="message-icon">
                <Loader2 size={20} className="spinner" />
              </div>
              <div className="message-content typing">
                Thinking...
              </div>
            </div>
          )}
          <div ref={messagesEndRef} />
        </div>
        <div className="input-container">
          <textarea
            value={input}
            onChange={(e) => setInput(e.target.value)}
            onKeyDown={handleKeyDown}
            placeholder="Type your message... (Enter to send)"
            disabled={isStreaming}
            rows={1}
          />
          <button
            onClick={sendMessage}
            disabled={!input.trim() || isStreaming}
            className="send-button"
          >
            {isStreaming ? (
              <Loader2 size={20} className="spinner" />
            ) : (
              <Send size={20} />
            )}
          </button>
        </div>
      </div>
    </div>
  );
}

export default App;
Start the Frontend
# In the frontend directory
npm run dev
Open http://localhost:5173 in your browser.
Congratulations! You now have a working local AI chatbot. Try asking it questions:
"What is TypeScript?"
"Write a haiku about local AI"
"Explain quantum computing in simple terms"
Part 4: Adding Streaming Responses
The code above already implements streaming! Here's how it works:
Backend: TanStack AI Streaming
// In server.ts
const stream = chat({
  adapter: ollamaText(MODEL),
  messages,
});
return toServerSentEventsResponse(stream, res);
TanStack AI's toServerSentEventsResponse automatically:
Converts the async iterator to Server-Sent Events format
Sends data:-prefixed chunks
Handles backpressure and errors
Sends [DONE] when complete
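On the wire, each event is a data: line followed by a blank line. Given the fields the frontend parser reads, a short exchange looks roughly like this (event shape inferred from the frontend code, not from TanStack AI documentation):

```
data: {"type":"text-delta","text":"Hello"}

data: {"type":"text-delta","text":" there!"}

data: [DONE]
```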
Frontend: SSE Parsing
// In App.tsx
const reader = response.body?.getReader();
const decoder = new TextDecoder();

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  const chunk = decoder.decode(value, { stream: true });
  const lines = chunk.split('\n');

  for (const line of lines) {
    if (line.startsWith('data: ')) {
      const data = line.slice(6);
      if (data === '[DONE]') continue;
      const parsed = JSON.parse(data);
      if (parsed.type === 'text-delta') {
        assistantMessage += parsed.text;
        setMessages([...newMessages, {
          role: 'assistant',
          content: assistantMessage
        }]);
      }
    }
  }
}
This creates the "typewriter effect" users expect from modern chatbots.
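The per-line parsing can also be isolated into a pure function, which makes the edge cases explicit and easy to test. A sketch (a hypothetical helper mirroring the loop above):

```typescript
// Extract the text delta from one SSE line, or null if the line
// carries no text (non-data lines, [DONE], malformed JSON).
function extractDelta(line: string): string | null {
  if (!line.startsWith('data: ')) return null;
  const data = line.slice(6);
  if (data === '[DONE]') return null;
  try {
    const parsed = JSON.parse(data);
    return parsed.type === 'text-delta' && typeof parsed.text === 'string'
      ? parsed.text
      : null;
  } catch {
    return null;
  }
}
```

One caveat worth knowing: chunk.split('\n') assumes each network read ends on a line boundary. That usually holds for local streams, but a production parser should buffer any trailing partial line and prepend it to the next chunk.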
Key Takeaways
Ollama makes local AI trivial - install, pull a model, run. That's it.
TanStack AI provides vendor freedom - switch providers by changing one adapter line
React gives you full UI control - build exactly what your users need
Performance is competitive - for small models, local inference often beats cloud round-trip latency
Privacy is built in - your data never leaves your machine
Fun Fact
I’m using this exact setup to help my wife translate medical documents and reports, and it’s hitting roughly 90–95% accuracy on the local setup. The model in use is translategemma:4b.
Resources
Ollama: ollama.com
TanStack AI: tanstack.com/ai
Model Library: ollama.com/library



