How I Built an AI-Powered Exam Question Generator with Laravel and React
I built this app to solve a real problem: creating high-quality exam questions from textbooks and course materials is tedious and time-consuming. Educators spend hours manually reading documents and crafting MCQs, short answer, and long answer questions. The app automates this by sending PDFs directly to Claude or Gemini and parsing the structured output into a database of ready-to-use questions.
This post walks through the key architectural decisions, code patterns, and lessons learned building a production SaaS with Laravel 12, Inertia.js v2, React 19, and two different AI providers.
The Stack
Backend: PHP 8.4, Laravel 12, Laravel Fortify (auth + 2FA), Laravel Horizon (queue monitoring)
Frontend: React 19, TypeScript, Inertia.js v2, Tailwind CSS v4, KaTeX (math rendering)
Infrastructure: AWS S3 (file storage), Redis (queues), Wayfinder (type-safe route helpers)
AI: Anthropic Claude API, Google Gemini API
Direct S3 Uploads to Avoid Server Load
The first decision was how to handle large PDF uploads. Routing them through the application server would tie up PHP workers and eat memory. Instead, the backend generates a presigned S3 PUT URL and the browser uploads directly to S3.
Backend: Generate the presigned URL
// app/Http/Controllers/PresignedUrlController.php
public function store(Request $request): JsonResponse
{
$request->validate([
'filename' => ['required', 'string'],
'content_type' => ['required', 'string'],
]);
$path = 'books/' . Str::uuid() . '/' . $request->filename;
// temporaryUploadUrl() returns an array with 'url' and 'headers' keys
['url' => $url] = Storage::disk('s3')->temporaryUploadUrl(
$path,
now()->addMinutes(15),
['ContentType' => $request->content_type]
);
return response()->json([
'url' => $url,
'path' => $path,
]);
}
Frontend: Upload directly to S3, then register the book
// resources/js/pages/books/create.tsx
const uploadToS3 = async (file: File, presignedUrl: string): Promise<void> => {
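const response = await fetch(presignedUrl, {
method: 'PUT',
body: file,
headers: { 'Content-Type': file.type },
});
// fetch only rejects on network errors, so surface HTTP failures explicitly
if (!response.ok) {
throw new Error(`S3 upload failed with status ${response.status}`);
}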
};
const handleSubmit = async () => {
// axios expects a URL string, so use the generated helper's .url() form
const { data } = await axios.post(PresignedUrl.store.url(), {
filename: file.name,
content_type: file.type,
});
await uploadToS3(file, data.url);
router.post(Books.store(), {
s3_path: data.path,
filename: file.name,
file_size: file.size,
model,
question_types: selectedTypes,
difficulty,
subject,
});
};
The application server never touches the file bytes. It only stores the S3 path reference and dispatches processing jobs.
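Because the bytes bypass the application server, it can help to reject obviously unsuitable files in the browser before requesting a presigned URL at all. The following is a minimal sketch of such a check, not code from the app: the `validatePdfFile` helper, the `UploadCandidate` shape, and the 50 MB limit are all assumptions made for illustration.

```typescript
// Minimal shape of the browser File fields this check needs.
interface UploadCandidate {
  name: string;
  type: string;
  size: number;
}

// Hypothetical limit for illustration; the post does not state one.
const MAX_PDF_BYTES = 50 * 1024 * 1024;

// Returns an error message, or null when the file looks uploadable.
function validatePdfFile(
  file: UploadCandidate,
  maxBytes: number = MAX_PDF_BYTES,
): string | null {
  if (file.type !== 'application/pdf') {
    return 'Only PDF files are supported.';
  }
  if (file.size === 0) {
    return 'The file is empty.';
  }
  if (file.size > maxBytes) {
    const limitMb = Math.round(maxBytes / (1024 * 1024));
    return `The file exceeds the ${limitMb} MB limit.`;
  }
  return null;
}
```

A client-side check like this only improves feedback for the user. The server still has to validate the content type and size it is willing to sign for, since the browser request can be forged.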
Dual AI Provider with Smart Routing
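The app supports both Anthropic Claude and Google Gemini. Rather than adding a provider field to the database, the model name itself identifies the provider: names starting with claude- go to Anthropic and names starting with gemini- go to Google. In the job below, anything that is not a gemini- model falls through to Anthropic.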
// app/Jobs/ProcessPdfExtractionJob.php
public function handle(AnthropicService $anthropic, GeminiService $gemini): void
{
$book = Book::findOrFail($this->bookId);
$model = $this->options['model'];
$service = str_starts_with($model, 'gemini-') ? $gemini : $anthropic;
$questions = $service->extractQuestions($book, $this->options);
foreach ($questions as $question) {
$book->questions()->create($question);
}
}
This keeps the routing logic in one place and avoids an extra database column.
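The same prefix convention can be mirrored on the frontend, for example to label models in a picker. This is a sketch under assumptions: `resolveProvider` and the `Provider` type are hypothetical names, and unlike the job above it rejects unknown prefixes instead of defaulting to Anthropic.

```typescript
type Provider = 'anthropic' | 'google';

// Prefix-to-provider table, following the naming convention described above.
const PROVIDER_PREFIXES: ReadonlyArray<[string, Provider]> = [
  ['claude-', 'anthropic'],
  ['gemini-', 'google'],
];

// Resolves a provider from a model name, failing loudly on unknown prefixes.
function resolveProvider(model: string): Provider {
  for (const [prefix, provider] of PROVIDER_PREFIXES) {
    if (model.indexOf(prefix) === 0) {
      return provider;
    }
  }
  throw new Error(`Unknown model prefix: ${model}`);
}
```

Throwing on an unrecognized prefix is a design choice: a typo in a model name surfaces immediately rather than silently being sent to the default provider.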
Sharing Parsing Logic Across Providers
Both AI services return questions in the same structured text format, so the parsing logic lives in a shared trait rather than being duplicated in each service class.
// app/Services/ParsesQuestions.php
trait ParsesQuestions
{
protected function parseQuestionsFromOutput(string $output): array
{
$blocks = preg_split('/^---$/m', trim($output));
$questions = [];
foreach ($blocks as $block) {
$block = trim($block);
if (empty($block)) {
continue;
}
$question = $this->parseQuestionBlock($block);
if ($question) {
$questions[] = $question;
}
}
return $questions;
}
protected function parseQuestionBlock(string $block): ?array
{
// Extract header: **Q1. MCQ | Easy**
preg_match('/\*\*Q\d+\.\s*(MCQ|Short Answer|Long Answer)\s*\|\s*(\w+)\*\*/', $block, $header);
if (empty($header)) {
return null;
}
$type = strtolower($header[1]);
$difficulty = strtolower($header[2]);
// Extract question text
preg_match('/\*\*Question:\*\*\s*(.+?)(?=\*\*(?:Options|Answer):\*\*)/s', $block, $questionMatch);
// Extract MCQ options
$options = [];
if ($type === 'mcq') {
preg_match_all('/\(([A-D])\)\s*(.+?)(?=\([A-D]\)|$)/s', $block, $optionMatches, PREG_SET_ORDER);
foreach ($optionMatches as $match) {
$options[$match[1]] = trim($match[2]);
}
}
// Extract answer
preg_match('/\*\*Answer:\*\*\s*(.+?)(?=\*\*Source|\z)/s', $block, $answerMatch);
return [
'type' => $type,
'difficulty' => $difficulty,
'question' => trim($questionMatch[1] ?? ''),
'options' => $options,
'answer' => trim($answerMatch[1] ?? ''),
];
}
}
Both AnthropicService and GeminiService include the trait (use ParsesQuestions) and call $this->parseQuestionsFromOutput($response) after getting the raw text back from the API.
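To make the expected text format concrete, here is a TypeScript port of the same parsing idea run against a small sample. It is a sketch, not the app's code: the sample output is inferred from the regular expressions in the trait, and the port deliberately differs in one place by scanning for options only between the Options and Answer markers, since an end-of-string lookahead alone lets the last option run on into the answer text.

```typescript
interface ParsedQuestion {
  type: string;
  difficulty: string;
  question: string;
  options: { [letter: string]: string };
  answer: string;
}

function parseQuestionBlock(block: string): ParsedQuestion | null {
  const header = /\*\*Q\d+\.\s*(MCQ|Short Answer|Long Answer)\s*\|\s*(\w+)\*\*/.exec(block);
  if (header === null) {
    return null;
  }
  const type = header[1].toLowerCase();
  const question = /\*\*Question:\*\*\s*([\s\S]+?)(?=\*\*(?:Options|Answer):\*\*)/.exec(block);
  const answer = /\*\*Answer:\*\*\s*([\s\S]+?)(?=\*\*Source|$)/.exec(block);

  const options: { [letter: string]: string } = {};
  if (type === 'mcq') {
    // Variation on the trait: only scan the text between **Options:** and
    // **Answer:**, so the answer is never mistaken for part of an option.
    const section = /\*\*Options:\*\*([\s\S]*?)(?=\*\*Answer:\*\*)/.exec(block);
    const optionText = section ? section[1] : '';
    const optionPattern = /\(([A-D])\)\s*([\s\S]+?)(?=\([A-D]\)|$)/g;
    let match: RegExpExecArray | null;
    while ((match = optionPattern.exec(optionText)) !== null) {
      options[match[1]] = match[2].trim();
    }
  }

  return {
    type,
    difficulty: header[2].toLowerCase(),
    question: question ? question[1].trim() : '',
    options,
    answer: answer ? answer[1].trim() : '',
  };
}

function parseQuestions(output: string): ParsedQuestion[] {
  const questions: ParsedQuestion[] = [];
  // Blocks are separated by a line containing only "---".
  for (const block of output.trim().split(/^---$/m)) {
    const parsed = parseQuestionBlock(block.trim());
    if (parsed !== null) {
      questions.push(parsed);
    }
  }
  return questions;
}

// Assumed sample of the model output format (not taken from the app).
const sampleOutput = [
  '**Q1. MCQ | Easy**',
  '**Question:** What is 2 + 2?',
  '**Options:**',
  '(A) 3',
  '(B) 4',
  '(C) 5',
  '(D) 6',
  '**Answer:** B',
  '**Source:** p. 12',
  '---',
  '**Q2. Short Answer | Medium**',
  '**Question:** Define osmosis.',
  '**Answer:** Movement of water across a membrane.',
].join('\n');

const parsedSample = parseQuestions(sampleOutput);
```

Blocks without a recognizable header are dropped rather than stored half-parsed, which matches the null return in the trait.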
Asynchronous Processing with Delayed Jobs
When a book is uploaded, two separate jobs run asynchronously:
ProcessPdfExtractionJob runs immediately and extracts questions
BuildDocumentIndexJob runs 5 minutes later and builds a hierarchical table of contents
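The delay usually gives the extraction job time to finish and avoids hitting the AI API with two simultaneous requests for the same document. It is a heuristic rather than a guarantee: the two jobs are dispatched independently, so a slow extraction can still overlap with indexing. Laravel's Bus::chain would enforce strict ordering if that ever matters.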
// app/Http/Controllers/BookController.php
public function store(StoreBookRequest $request): RedirectResponse
{
$book = Book::create([
'user_id' => Auth::id(),
's3_path' => $request->s3_path,
'filename' => $request->filename,
'file_size' => $request->file_size,
'status' => BookStatus::Pending,
]);
ProcessPdfExtractionJob::dispatch($book->id, $request->only([
'model', 'question_types', 'difficulty', 'subject',
]));
BuildDocumentIndexJob::dispatch($book->id)
->delay(now()->addMinutes(5));
return redirect()->route('books.show', $book);
}
The frontend polls the book status every few seconds while processing is in progress:
// resources/js/pages/books/show.tsx
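import { useEffect } from 'react';
import { usePoll } from '@inertiajs/react';
// `book` and `questions` arrive as page props; usePoll only reloads them.
const { stop } = usePoll(3000, { only: ['book', 'questions'] });
const isProcessing = book.status === 'pending' || book.status === 'processing';
useEffect(() => {
if (!isProcessing) stop(); // stop polling once the job is done or failed
}, [isProcessing, stop]);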
When the status changes to done or failed, polling stops automatically and the page updates.
Hierarchical Document Indexing
The document index feature extracts a structured table of contents from the PDF, complete with summaries and page ranges for each section. Claude Haiku processes the PDF and returns a JSON tree.
// app/Services/DocumentIndexService.php
public function buildIndex(Book $book): void
{
$prompt = $this->buildIndexPrompt();
$content = $this->getFileContent($book);
$response = $this->anthropic->messages()->create([
'model' => 'claude-haiku-4-5',
'max_tokens' => 8192,
'system' => $prompt,
'messages' => [[
'role' => 'user',
'content' => $content,
]],
]);
// Throw on malformed JSON instead of silently continuing with null
$tree = json_decode($response->content[0]->text, true, flags: JSON_THROW_ON_ERROR);
$this->persistTree($book, $tree['chapters'] ?? [], parentId: null, depth: 0);
}
private function persistTree(Book $book, array $nodes, ?int $parentId, int $depth): void
{
foreach ($nodes as $position => $node) {
$index = $book->documentIndexes()->create([
'parent_id' => $parentId,
'depth' => $depth,
'position' => $position,
'title' => $node['title'],
'summary' => $node['summary'],
'start_page' => $node['start_page'],
'end_page' => $node['end_page'],
]);
if (!empty($node['sections'])) {
$this->persistTree($book, $node['sections'], $index->id, $depth + 1);
}
}
}
The schema for the document_indexes table uses a self-referencing parent_id to represent the tree:
// database/migrations/create_document_indexes_table.php
Schema::create('document_indexes', function (Blueprint $table) {
$table->id();
$table->foreignId('book_id')->constrained()->cascadeOnDelete();
$table->foreignId('parent_id')->nullable()->constrained('document_indexes')->cascadeOnDelete();
$table->unsignedTinyInteger('depth')->default(0);
$table->unsignedSmallInteger('position')->default(0);
$table->string('title');
$table->text('summary')->nullable();
$table->unsignedSmallInteger('start_page')->nullable();
$table->unsignedSmallInteger('end_page')->nullable();
$table->timestamps();
$table->index(['book_id', 'parent_id', 'position']);
});
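On the read side, the flat rows have to be turned back into a tree before the frontend can render a nested table of contents. The sketch below shows one way to do that in a single pass over the rows. The `IndexRow` and `IndexNode` types and the `buildIndexTree` function are illustrative names, and the row shape is trimmed to the columns the algorithm needs.

```typescript
interface IndexRow {
  id: number;
  parent_id: number | null;
  position: number;
  title: string;
}

interface IndexNode extends IndexRow {
  children: IndexNode[];
}

// Rebuilds the hierarchy from flat rows that reference their parent by id.
function buildIndexTree(rows: IndexRow[]): IndexNode[] {
  const byId: { [id: number]: IndexNode } = {};
  for (const row of rows) {
    byId[row.id] = { ...row, children: [] };
  }

  const roots: IndexNode[] = [];
  for (const row of rows) {
    const node = byId[row.id];
    const parent = row.parent_id === null ? undefined : byId[row.parent_id];
    // Rows whose parent is missing are treated as roots rather than dropped.
    if (parent) {
      parent.children.push(node);
    } else {
      roots.push(node);
    }
  }

  // Order siblings by their stored position at every level.
  const sortByPosition = (nodes: IndexNode[]): void => {
    nodes.sort((a, b) => a.position - b.position);
    nodes.forEach((n) => sortByPosition(n.children));
  };
  sortByPosition(roots);
  return roots;
}
```

Because the lookup table is built first, the rows can arrive in any order, so the query does not need to sort by depth.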
LaTeX Math Support
Academic documents often include mathematical notation. Questions can contain LaTeX expressions, and the frontend renders them using KaTeX.
A React component wraps KaTeX rendering and handles both inline ($...$) and display ($$...$$) math:
// resources/js/components/MathRenderer.tsx
import { useMemo } from 'react';
import katex from 'katex';
import 'katex/dist/katex.min.css';
interface Props {
content: string;
}
export function MathRenderer({ content }: Props) {
const rendered = useMemo(() => {
return content
.replace(/\$\$(.+?)\$\$/gs, (_, math) => {
return katex.renderToString(math.trim(), { displayMode: true, throwOnError: false });
})
.replace(/\$(.+?)\$/g, (_, math) => {
return katex.renderToString(math.trim(), { displayMode: false, throwOnError: false });
});
}, [content]);
return <span dangerouslySetInnerHTML={{ __html: rendered }} />;
}
The AI prompt explicitly instructs the model to use standard LaTeX notation so the output is predictable and parseable.
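One caveat with replacing math inside a raw string and injecting the result as HTML is that the surrounding non-math text is treated as HTML too. A possible alternative, sketched here under assumptions, is to split the content into typed segments first, so that plain text can be rendered as ordinary React text nodes and only the math segments go through KaTeX. The `splitMath` function and `Segment` type are hypothetical and are not part of the component above.

```typescript
type Segment = { kind: 'text' | 'inline' | 'display'; value: string };

// Splits content into plain-text, inline-math ($...$) and display-math
// ($$...$$) segments, preserving their original order.
function splitMath(content: string): Segment[] {
  // Display math is listed first so "$$" is never read as two inline delimiters.
  const pattern = /\$\$([\s\S]+?)\$\$|\$([^$\n]+?)\$/g;
  const segments: Segment[] = [];
  let last = 0;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(content)) !== null) {
    if (match.index > last) {
      segments.push({ kind: 'text', value: content.slice(last, match.index) });
    }
    if (match[1] !== undefined) {
      segments.push({ kind: 'display', value: match[1].trim() });
    } else {
      segments.push({ kind: 'inline', value: match[2].trim() });
    }
    last = match.index + match[0].length;
  }
  if (last < content.length) {
    segments.push({ kind: 'text', value: content.slice(last) });
  }
  return segments;
}
```

A component could then map over the segments, passing only the inline and display values to katex.renderToString and rendering text segments directly.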
Per-Model Cost Tracking
Every extraction run records token usage. The admin analytics panel estimates dollar costs using hardcoded pricing constants, which is pragmatic for an internal tool where pricing changes infrequently.
// app/Http/Controllers/Admin/AnalyticsController.php
private const MODEL_PRICING = [
'claude-haiku' => ['input' => 0.25, 'output' => 1.25], // per 1M tokens
'claude-sonnet' => ['input' => 3.00, 'output' => 15.00],
'gemini-flash' => ['input' => 0.075, 'output' => 0.30],
'gemini-pro' => ['input' => 1.25, 'output' => 5.00],
];
private function estimateCost(string $model, int $inputTokens, int $outputTokens): float
{
foreach (self::MODEL_PRICING as $prefix => $pricing) {
if (str_starts_with($model, $prefix)) {
return ($inputTokens / 1_000_000 * $pricing['input'])
+ ($outputTokens / 1_000_000 * $pricing['output']);
}
}
return 0.0;
}
The extractions table records model, input_tokens, and output_tokens for every run, so historical costs can be recalculated any time the pricing constants change.
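As a worked example of the arithmetic: with the sonnet rates above, a run with 120,000 input tokens and 8,000 output tokens costs 0.12 × 3.00 + 0.008 × 15.00 = 0.48 dollars. The TypeScript sketch below reproduces that calculation with the same illustrative prices. It differs from the PHP version in two assumed ways: it matches on the name parts of the model ID rather than a strict prefix, because an ID that puts a version before the family name (for example gemini-2.5-flash) would not start with gemini-flash, and it returns null for unknown models so a missing price is distinguishable from a free run.

```typescript
interface Pricing {
  input: number; // USD per 1M input tokens
  output: number; // USD per 1M output tokens
}

// Same illustrative figures as the constants above, not current list prices.
const MODEL_PRICING: { [family: string]: Pricing } = {
  'claude-haiku': { input: 0.25, output: 1.25 },
  'claude-sonnet': { input: 3.0, output: 15.0 },
  'gemini-flash': { input: 0.075, output: 0.3 },
  'gemini-pro': { input: 1.25, output: 5.0 },
};

const TOKENS_PER_UNIT = 1_000_000;

// Finds the first family whose name parts all appear in the model ID.
function findPricing(model: string): Pricing | null {
  const modelParts = model.split('-');
  for (const family of Object.keys(MODEL_PRICING)) {
    const matches = family.split('-').every((part) => modelParts.indexOf(part) !== -1);
    if (matches) {
      return MODEL_PRICING[family];
    }
  }
  return null;
}

// Returns the estimated cost in USD, or null when the model has no known price.
function estimateCost(model: string, inputTokens: number, outputTokens: number): number | null {
  const pricing = findPricing(model);
  if (pricing === null) {
    return null;
  }
  return (inputTokens / TOKENS_PER_UNIT) * pricing.input
    + (outputTokens / TOKENS_PER_UNIT) * pricing.output;
}
```

Part-based matching is looser than a prefix check, so a variant such as a "lite" tier would pick up its parent family's price unless it gets its own entry.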
Daily Rate Limiting
To prevent runaway API costs during the early access period, each user has a configurable daily extraction limit. The check happens in the controller before dispatching any jobs.
// app/Models/User.php
public function hasReachedDailyLimit(): bool
{
if ($this->daily_extraction_limit === null) {
return false; // null means unlimited
}
$todayCount = $this->books()
->whereDate('created_at', today())
->count();
return $todayCount >= $this->daily_extraction_limit;
}
// app/Http/Controllers/BookController.php
public function store(StoreBookRequest $request): RedirectResponse
{
if (Auth::user()->hasReachedDailyLimit()) {
return back()->withErrors([
'limit' => 'You have reached your daily extraction limit.',
]);
}
// ...
}
Admins can adjust limits per user through the admin panel. Setting the limit to null grants unlimited access.
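The rule itself is small enough to state as a pure function, which makes the two edge cases (a null limit and the day boundary) easy to test in isolation. This sketch is illustrative only: the function signature is an assumption, and it compares UTC calendar days, whereas the Laravel version uses the application's configured timezone.

```typescript
// Returns true when the number of uploads made on the same UTC day as `now`
// has reached the limit. A null limit means unlimited.
function hasReachedDailyLimit(limit: number | null, uploadedAt: Date[], now: Date): boolean {
  if (limit === null) {
    return false;
  }
  const today = now.toISOString().slice(0, 10); // "YYYY-MM-DD" in UTC
  let todayCount = 0;
  for (const uploaded of uploadedAt) {
    if (uploaded.toISOString().slice(0, 10) === today) {
      todayCount++;
    }
  }
  return todayCount >= limit;
}
```

Uploads from the previous day do not count toward the total, so the allowance resets at midnight rather than on a rolling 24-hour window.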
Type-Safe Routes with Wayfinder
Wayfinder generates TypeScript functions from Laravel route definitions, so a mistyped route or a wrong parameter shows up as a compile-time error instead of a runtime 404. Instead of hardcoding strings, every navigation and form submission uses generated functions.
// Instead of:
router.get('/books/1');
axios.post('/books');
// With Wayfinder:
import { show, store } from '@/actions/BookController';
router.get(show({ id: book.id }));
router.post(store(), formData);
If a route is renamed or its parameters change, TypeScript will catch any callsite that is out of sync. This is especially useful across a large codebase where routes are referenced in many different page components.
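To see why the compiler can catch these mistakes, it helps to picture roughly what a generated helper looks like. The following is a hand-written approximation of the idea, not Wayfinder's actual output: the `RouteDefinition` type, the `BookShowArgs` shape, and the /books/{id} path are assumptions for illustration.

```typescript
// A route helper returns both the URL and the HTTP method it belongs to.
type RouteDefinition<M extends string> = { url: string; method: M };

// Hypothetical parameters for a "show book" route.
interface BookShowArgs {
  id: number;
}

// Calling the helper yields the full definition for a router call.
function show(args: BookShowArgs): RouteDefinition<'get'> {
  return { url: show.url(args), method: 'get' };
}

// The .url form returns just the string, for APIs that expect a plain URL.
show.url = (args: BookShowArgs): string => `/books/${args.id}`;
```

Because the parameters are typed, a call such as show({ slug: 'x' }) fails to compile, and renaming the parameter in the route definition would break every call site at build time.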
What I Would Do Differently
Structured AI output instead of regex parsing. The current approach uses regex to parse freeform AI text. This works, but it is fragile and requires careful prompt engineering. Claude's structured output support or a JSON schema-constrained response would be more reliable. I plan to migrate to this in a future update.
Separate queue workers per job type. Right now all jobs share the same default queue. Slow document indexing jobs can delay fast extraction jobs. Splitting into dedicated queues (extractions, indexing) with separate worker pools would improve responsiveness.
Streaming responses. The current flow is: upload, dispatch job, poll for status. Streaming the AI response in real time would feel significantly faster even if the total time is the same.
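For the structured-output idea above, the parsing side reduces to validating JSON rather than matching text. The sketch below shows what that could look like on the consuming end. It assumes a response shaped as an object with a questions array, which is an invented shape for illustration; the `isQuestion` and `parseQuestionsJson` names are hypothetical.

```typescript
interface Question {
  type: 'mcq' | 'short answer' | 'long answer';
  difficulty: string;
  question: string;
  options: { [letter: string]: string };
  answer: string;
}

// Runtime check that an unknown value has the fields a Question needs.
function isQuestion(value: unknown): value is Question {
  if (typeof value !== 'object' || value === null) {
    return false;
  }
  const candidate = value as { [key: string]: unknown };
  const type = candidate.type;
  const question = candidate.question;
  const options = candidate.options;
  const validType = type === 'mcq' || type === 'short answer' || type === 'long answer';
  return validType
    && typeof candidate.difficulty === 'string'
    && typeof question === 'string'
    && question.trim() !== ''
    && typeof candidate.answer === 'string'
    && typeof options === 'object'
    && options !== null;
}

// Parses a JSON response and keeps only well-formed questions.
function parseQuestionsJson(raw: string): Question[] {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch (e) {
    return []; // malformed JSON yields no questions
  }
  if (typeof data !== 'object' || data === null) {
    return [];
  }
  const list = (data as { questions?: unknown }).questions;
  if (!Array.isArray(list)) {
    return [];
  }
  return list.filter(isQuestion);
}
```

Even with schema-constrained output from the provider, a validation step like this is a cheap safeguard before rows are written to the database.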
Wrapping Up
This project is a good example of how Laravel's job system, S3 integration, and clean service architecture can be combined with modern AI APIs to build a useful product quickly. The dual AI provider support came essentially for free once the parsing trait was extracted, and Wayfinder made the frontend much more maintainable as the route surface grew.



