The promise of AI in education is enormous: personalized learning paths, instant feedback on essays, intelligent question generation, and tutoring that scales to thousands of students. But for enterprise clients — universities, corporate training programs, government institutions — the question isn’t “can you do AI?” It’s “can you do AI without exposing our data?”
At Cognaxa, we’ve built our AI features with a clear principle: AI augments educators, never replaces them. And your data never leaves your control.
The AI Feature Set
Before diving into the privacy architecture, here’s what our AI engine powers:
- AI Essay Grading — Evaluate long-form responses with confidence scores, rubric alignment, and detailed feedback
- Question Generation — Generate quiz questions from topic descriptions, with configurable difficulty levels
- Intelligent Tutoring — A chatbot that can answer student questions about course content
- Lesson Summarization — Auto-generate summaries from lesson content for study guides
Each of these features touches sensitive data: student responses, course IP, assessment content. Getting the privacy architecture wrong would be a dealbreaker for every enterprise customer.
Our Approach: No Training on Your Data
Let’s be explicit about what we don’t do:
- We don’t fine-tune models on customer data
- We don’t store prompts or completions in third-party systems beyond the API call lifecycle
- We don’t share data between tenants — even anonymized
Our AI pipeline uses inference-only API calls to language models. The interaction model is:
User Action → Cognaxa Server → Prompt Construction → LLM API → Response → Grading Logic → Database
The key property: the LLM never sees raw student identifiers. Our prompt construction layer strips PII before sending content for evaluation.
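A minimal sketch of that sanitization step is below. The pattern list, placeholder tokens, and `stripPII` helper are illustrative examples, not our production rule set:

```javascript
// Illustrative PII-stripping pass applied before prompt construction.
// The patterns and replacement tokens here are examples only.
const PII_PATTERNS = [
  { regex: /\b[\w.+-]+@[\w-]+\.[\w.]+\b/g, token: '[EMAIL]' },        // email addresses
  { regex: /\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b/g, token: '[PHONE]' },  // US-style phone numbers
  { regex: /\b\d{3}-\d{2}-\d{4}\b/g, token: '[SSN]' },                // SSN-shaped strings
];

function stripPII(text, studentName) {
  let sanitized = text;
  // Replace the student's own name first, if known from the session context
  if (studentName) {
    sanitized = sanitized.replaceAll(studentName, '[STUDENT]');
  }
  for (const { regex, token } of PII_PATTERNS) {
    sanitized = sanitized.replace(regex, token);
  }
  return sanitized;
}
```

In production this runs server-side, so the raw response and the student identity never appear together in an outbound API call.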
Prompt Engineering for Education
Generic LLM prompts produce generic results. We’ve invested heavily in education-specific prompt engineering:
// Simplified AI grading prompt structure
const gradingPrompt = {
  system: `You are an expert academic grader. Evaluate the following
student response against the rubric provided. Return a JSON object
with: score (0-100), confidence (0-1), feedback (string), and
rubric_alignment (object mapping each criterion to a score).`,
  user: `
Question: ${question.text}
Rubric: ${question.rubric}
Max Points: ${question.points}
Student Response: ${sanitizedResponse}
`,
};

const result = await llm.complete(gradingPrompt);
const grading = parseGradingResponse(result);

// AI grade is always a suggestion — teacher has final authority
await db.query(
  `UPDATE student_quiz_attempts
   SET ai_suggested_score = $1, ai_feedback = $2, ai_confidence = $3
   WHERE id = $4`,
  [grading.score, grading.feedback, grading.confidence, attemptId]
);
Notice that the AI score is stored as a suggestion, not a final grade. Teachers always have override authority. This is a deliberate design choice — AI should reduce grading workload, not remove human judgment from the loop.
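The parsing step in the snippet above also matters for robustness: LLMs occasionally return malformed JSON or out-of-range values. A hedged sketch of what `parseGradingResponse` might do (the validation rules here are our illustration, not the exact production logic):

```javascript
// Illustrative sketch of parseGradingResponse: defensively parse the
// model's JSON output and clamp values into the ranges the prompt requested.
function parseGradingResponse(raw) {
  let parsed;
  try {
    // Some models wrap JSON in markdown fences; strip them before parsing
    parsed = JSON.parse(raw.replace(/^```(?:json)?\s*|\s*```$/g, ''));
  } catch (err) {
    // Unparseable output gets zero confidence, which routes it to manual grading
    return { score: null, confidence: 0, feedback: '', rubric_alignment: {} };
  }
  const clamp = (value, min, max) =>
    Math.min(max, Math.max(min, Number(value) || 0));
  return {
    score: clamp(parsed.score, 0, 100),
    confidence: clamp(parsed.confidence, 0, 1),
    feedback: typeof parsed.feedback === 'string' ? parsed.feedback : '',
    rubric_alignment: parsed.rubric_alignment ?? {},
  };
}
```

Failing closed to zero confidence means a bad model response can never silently become a grade.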
Multi-Model Strategy
We don’t lock ourselves (or our customers) to a single AI provider:
| Feature | Primary Model | Fallback |
|---|---|---|
| Essay Grading | Gemini 1.5 Pro | OpenRouter (Claude) |
| Question Generation | Gemini 1.5 Flash | — |
| Tutoring Chatbot | Gemini 1.5 Pro | — |
| Summarization | Gemini 1.5 Flash | — |
This multi-model approach gives us:
- Resilience: If one provider has an outage, critical features keep working
- Cost optimization: Use faster/cheaper models for less complex tasks
- Future flexibility: New models can be evaluated and swapped without application changes
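The fallback behavior can be sketched as a small wrapper that walks providers in priority order. The `complete` interface on each provider object is an assumption for illustration:

```javascript
// Illustrative provider-fallback wrapper: try each provider in priority
// order, surfacing the last error only if all of them fail.
async function completeWithFallback(prompt, providers) {
  let lastError;
  for (const provider of providers) {
    try {
      return await provider.complete(prompt);
    } catch (err) {
      // Record the failure and fall through to the next provider
      lastError = err;
    }
  }
  throw lastError ?? new Error('No providers configured');
}
```

Because the application only sees this one interface, swapping a primary model or adding a fallback is a configuration change, not a code change.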
Data Flow Isolation
For multi-tenant deployments, AI requests are isolated at every layer:
- Tenant-scoped API keys: Each tenant can optionally bring their own AI API keys
- Request isolation: AI requests carry the tenant context — no cross-tenant prompt contamination
- Audit logging: Every AI interaction is logged with the tenant ID, feature used, and timestamp
- Rate limiting: Per-tenant AI usage limits prevent runaway costs
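The per-tenant rate limiting can be sketched as a fixed-window counter keyed by tenant ID. The class name, window size, and limits below are illustrative, not production values:

```javascript
// Illustrative per-tenant rate limiter using a fixed-window counter.
class TenantRateLimiter {
  constructor(maxRequestsPerWindow, windowMs = 60_000) {
    this.max = maxRequestsPerWindow;
    this.windowMs = windowMs;
    this.windows = new Map(); // tenantId -> { start, count }
  }

  allow(tenantId, now = Date.now()) {
    const win = this.windows.get(tenantId);
    if (!win || now - win.start >= this.windowMs) {
      // Start a fresh window for this tenant
      this.windows.set(tenantId, { start: now, count: 1 });
      return true;
    }
    if (win.count < this.max) {
      win.count += 1;
      return true;
    }
    return false; // over this tenant's limit for the current window
  }
}
```

Because the counter is keyed by tenant, one tenant exhausting its quota has no effect on any other tenant's AI features.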
The Confidence Score
One of our most valuable AI features isn’t the grading itself — it’s the confidence score. Every AI-graded response comes with a 0-1 confidence value:
- High confidence (above 0.85): The AI is fairly certain about the grade. These can be auto-accepted in bulk
- Medium confidence (0.5–0.85): Worth a quick teacher review
- Low confidence (below 0.5): Flagged for manual grading
This allows teachers to focus their time on the responses where human judgment matters most, rather than reviewing every single submission.
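The triage logic above amounts to bucketing graded attempts by the stored confidence value. A minimal sketch (the function name and attempt shape are illustrative):

```javascript
// Bucket AI-graded attempts by confidence, using the tiers described above:
// above 0.85 -> bulk auto-accept, 0.5-0.85 -> quick review, below 0.5 -> manual.
function triageByConfidence(attempts) {
  const buckets = { autoAccept: [], quickReview: [], manualGrade: [] };
  for (const attempt of attempts) {
    if (attempt.ai_confidence > 0.85) {
      buckets.autoAccept.push(attempt);
    } else if (attempt.ai_confidence >= 0.5) {
      buckets.quickReview.push(attempt);
    } else {
      buckets.manualGrade.push(attempt);
    }
  }
  return buckets;
}
```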
What We’re Building Next
- RAG-based tutoring: Grounding chatbot responses in actual course content using retrieval-augmented generation
- Plagiarism detection: AI-powered similarity analysis across submissions
- Learning path recommendations: Personalized course suggestions based on performance patterns
The AI landscape moves fast. Our architecture is designed to adopt new capabilities without compromising the privacy guarantees that enterprise customers require. Because in education, the most intelligent system is the one students and institutions can trust.
Want to see our AI grading in action? Schedule a demo and we’ll walk you through the full pipeline.