AI · Document intelligence

Knowledge Sphere

Upload documents, ask in natural language, and an AI agent answers grounded in your content with clickable inline citations on every sentence — an enterprise RAG platform combining hybrid search, an OCR document pipeline and team collaboration.

Knowledge Sphere AI chat with clickable inline source citations on each response

Knowledge Sphere: turning a cabinet of documents into a knowledge base that answers questions

Every research team and professional institution sits on a mountain of documents — reports, papers, manuals, contracts. But once knowledge is written into a PDF it is effectively sealed: finding an answer means flipping through file after file, page after page, relying on memory and keyword luck.

Knowledge Sphere, an AI-driven document-intelligence platform, was built to solve this. Users upload documents, ask in natural language, and the AI assistant answers grounded in the content, attaching a traceable source to every sentence. It turns a cabinet of static documents into a knowledge base you can have a conversation with.

Shepherd Tech delivered the full-stack build of this platform, from architecture to launch. Below are its seven core capabilities and the engineering trade-offs behind them.

1. AI chat with inline citations: trustworthy because it is traceable

At the core is an AI assistant that understands your documents:

  • Natural-language Q&A: users can ask complex questions about uploaded content; the AI extracts and organises relevant information to answer in real time.
  • Conversation memory: the AI remembers earlier questions within a session, keeping context coherent and supporting follow-ups.
  • Inline citations: every response sentence carries a citation marker pointing to a specific passage in the documents. This is the bedrock of the product’s trustworthiness — the AI’s answers are not a black box; every sentence can be verified back against the source.

Design stance: augment, don’t replace. The system surfaces the answer and its evidence chain across a large document set; the final judgement stays with the professional, and the clickable citations are what let them check every claim.
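
To make sentence-level citation concrete, here is a minimal sketch of how citation markers in an answer could be resolved back to retrieved passages. It assumes the model emits markers in a `[n]` form that index into the list of passages retrieved for that answer; the marker syntax, the `Passage` shape and the function names are illustrative assumptions, not the platform's actual format.

```typescript
// Hypothetical shapes: the platform's real citation format is not described in this article.
interface Passage {
  documentId: string;
  page: number; // 1-based page in the source PDF
  text: string;
}

interface CitedSentence {
  sentence: string;    // sentence text with the markers stripped
  passages: Passage[]; // passages the markers resolved to
}

// Split an answer into sentences and resolve each "[n]" marker (1-based)
// against the passages retrieved for this answer.
function resolveCitations(answer: string, retrieved: Passage[]): CitedSentence[] {
  return answer
    .split(/(?<=[.!?])\s+/) // break after sentence punctuation followed by whitespace
    .filter((raw) => raw.trim().length > 0)
    .map((raw) => {
      const passages: Passage[] = [];
      for (const match of raw.matchAll(/\[(\d+)\]/g)) {
        const passage = retrieved[Number(match[1]) - 1];
        // Drop markers that point outside the retrieved set rather than guess.
        if (passage !== undefined) passages.push(passage);
      }
      const sentence = raw.replace(/\s*\[\d+\]/g, "").trim();
      return { sentence, passages };
    });
}

// Sentences with no resolvable citation can be flagged instead of shown as verified.
function uncitedSentences(cited: CitedSentence[]): string[] {
  return cited.filter((c) => c.passages.length === 0).map((c) => c.sentence);
}
```

Keeping the unresolved sentences in a separate list means the interface can show them differently, which fits the stance above: the reader decides, and sees exactly which claims have a source behind them.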

2. Smart understanding and summaries: the big picture first, then the detail

Automatic document summaries — quick overview, detailed summary, section summaries
  • Automatic summaries: each uploaded document gets multi-level summaries — quick overview, detailed summary, section summaries — so users grasp the whole without reading it all.
  • Cross-document analysis: the AI can synthesise information across multiple documents at once, not just a single file.
  • Context awareness: the AI considers the full context and intent of a question, not just literal matching.
  • Question refinement: when initial retrieval is insufficient, the AI adjusts its strategy and searches again for a more complete answer.

3. The document pipeline: large files and scans, all handled

Document processing status — each file shows OCR / embedding / summary / done progress

Accurate Q&A starts with reading documents cleanly and structuring them. The platform builds an enterprise-grade document pipeline:

  • High-quality OCR: an enterprise-grade OCR engine handles scanned PDFs, recognising text in photocopied and scanned files with high accuracy.
  • Layout understanding: the system recognises headings, paragraphs and tables, preserving the logic of the content rather than flattening a page into a wall of text.
  • Structured OCR context: OCR results are kept in structured form and attached to the AI’s citations, giving the AI more context about the document and letting each citation point back to a recognised block.
  • Async background processing: large files are processed by retryable background jobs so the frontend never blocks; users can keep working and track each file’s status in real time.
  • Bulk upload: import and process many documents at once; speed and stability for multi-page documents are specifically optimised.
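
The retryable background processing described above can be sketched as follows. This is a minimal illustration, assuming each file moves through the OCR, embedding and summary stages shown in the status view; the `Stage` names, the exponential backoff, the default attempt limit and the `processFile` signature are assumptions for the example, not the platform's actual job system.

```typescript
// Stages mirror the status labels in the processing view; "failed" is an added assumption.
type Stage = "ocr" | "embedding" | "summary" | "done" | "failed";

interface FileStatus {
  fileId: string;
  stage: Stage;
  attempts: number; // total step attempts so far
}

type Step = (fileId: string) => Promise<void>;

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Run one step, retrying with exponential backoff. Returns the number of attempts used.
async function withRetry(step: Step, fileId: string, maxAttempts: number, baseDelayMs: number): Promise<number> {
  for (let attempt = 1; ; attempt++) {
    try {
      await step(fileId);
      return attempt;
    } catch (err) {
      if (attempt >= maxAttempts) throw err;
      await sleep(baseDelayMs * 2 ** (attempt - 1));
    }
  }
}

// Run the stages in order, reporting status before each one so a UI can track progress.
async function processFile(
  fileId: string,
  steps: Record<"ocr" | "embedding" | "summary", Step>,
  onStatus: (status: FileStatus) => void,
  maxAttempts = 3,
  baseDelayMs = 100,
): Promise<FileStatus> {
  let attempts = 0;
  for (const stage of ["ocr", "embedding", "summary"] as const) {
    onStatus({ fileId, stage, attempts });
    try {
      attempts += await withRetry(steps[stage], fileId, maxAttempts, baseDelayMs);
    } catch {
      const failed: FileStatus = { fileId, stage: "failed", attempts: attempts + maxAttempts };
      onStatus(failed);
      return failed;
    }
  }
  const done: FileStatus = { fileId, stage: "done", attempts };
  onStatus(done);
  return done;
}
```

Because `processFile` only reports through `onStatus`, the upload request can return immediately while a worker runs the job, and the frontend reads status updates separately.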

4. Interactive PDF: from "reading the answer" to "jumping to the source"

This is where Knowledge Sphere most embodies trust — a complete traceable chain from question to source.

Step one: ask a question, get an answer with citations. The user asks in natural language, the AI answers, and each relevant sentence carries a citation marker (e.g. 📄 2) indicating which document and passage it came from.

Clicking a citation: the PDF viewer jumps to the page and highlights the cited passage

Step two: click a citation to jump to the source and highlight it. When the user clicks a citation marker (or the matching page button), the system flips the PDF to the corresponding page and selects and highlights that passage — almost no gap between answer and source, so users verify on the spot exactly where the AI got it.

Interactive PDF — OCR text-block highlighting, an inline action menu, and copy support
  • Text-block highlighting: OCR blocks are colour-coded by extraction confidence, so users see at a glance where recognition is solid.
  • Direct copy: copy source text straight from a highlighted block, paired with an inline action menu (copy / search / explain / translate / follow-up).
  • Page navigation: jump anywhere in the document from the page bar or the passage list.
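
As a rough illustration of the two viewer behaviours above, the sketch below maps an OCR block's confidence to a highlight colour and finds the block a citation should jump to. The `OcrBlock` fields, the colour names and the 0.9 / 0.6 thresholds are assumptions chosen for the example; the article does not state the real cut-offs.

```typescript
// Hypothetical shape of one recognised text block; field names are assumptions.
interface OcrBlock {
  page: number;       // 1-based page number
  text: string;
  confidence: number; // extraction confidence in [0, 1]
}

type HighlightColour = "green" | "amber" | "red";

// Illustrative thresholds only; real cut-offs would be tuned for the OCR engine in use.
function colourFor(confidence: number, high = 0.9, low = 0.6): HighlightColour {
  if (confidence >= high) return "green";
  if (confidence >= low) return "amber";
  return "red";
}

// Collapse whitespace and case so a cited quote can match OCR text loosely.
const normalise = (s: string): string => s.replace(/\s+/g, " ").trim().toLowerCase();

// Find the block a citation should jump to: same page, text containing the quote.
function locateCitation(blocks: OcrBlock[], page: number, quote: string): OcrBlock | undefined {
  const needle = normalise(quote);
  return blocks.find((b) => b.page === page && normalise(b.text).includes(needle));
}
```

A viewer built this way would call `locateCitation` when a citation is clicked, scroll to the returned block's page, and style the block with `colourFor(block.confidence)`.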

5. Smart search: semantic + keyword, hybrid retrieval

Search settings — semantic and keyword search dual modes
  • Semantic search: retrieves on the meaning and intent of the query, not literal matching — even with different wording, close meaning is found.
  • Full-text search: keeps traditional exact keyword matching.
  • Hybrid search: combines semantic vectors and full-text search, merging their rankings with Reciprocal Rank Fusion (RRF) so that a passage ranked highly by either method surfaces in the combined results.
  • Scope filtering: search within specific documents or a Space, with results updating live as you type.
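
Reciprocal Rank Fusion itself is small enough to show in full. Each result scores 1 / (k + rank) in every list it appears in, and the scores are summed across lists. The sketch below assumes each retriever returns an ordered list of passage ids and uses k = 60, the constant commonly used with RRF; the value the platform actually uses is not stated, and the function name is illustrative.

```typescript
interface FusedResult {
  id: string;
  score: number;
}

// Merge several ranked lists of ids with Reciprocal Rank Fusion.
// rankings[i][0] is the top hit of retriever i; k dampens the weight of top ranks.
function reciprocalRankFusion(rankings: string[][], k = 60): FusedResult[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(scores, ([id, score]) => ({ id, score })).sort((a, b) => b.score - a.score);
}

// Example: one list from vector search, one from full-text search.
const semanticHits = ["a", "b", "c"];
const keywordHits = ["b", "d", "a"];
const fused = reciprocalRankFusion([semanticHits, keywordHits]);
// "b" and "a" appear in both lists, so they outrank "d" and "c", which appear in one.
```

Because RRF works on ranks rather than raw scores, it needs no normalisation between vector similarity and full-text relevance scores, which sit on different scales.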

6. Team collaboration and permissions: share knowledge, hold the line

  • Shared Spaces: group documents by topic into collaborative workspaces for the team.
  • Role permissions: tiered Owner and Viewer access.
  • Invite-only: members join securely by invite code; only invitees can enter a workspace.
  • Document-level permissions and activity tracking: fine-grained control over document access, with tracking of who accessed what.

7. Accounts and security: enterprise-grade data protection

  • Complete account system: email-verified sign-up and login, secure password reset, idle auto-logout, profile management.
  • Multi-layer access control: permissions at the user, Space and document levels.
  • Data isolation: user data is fully separated; passwords are stored hashed with Argon2.
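
A layered permission check along these lines might look like the sketch below. It assumes Owners can manage and read everything in their Space, Viewers can only read, and a document can optionally carry its own allow-list; those rules, the type names and `canAccess` are assumptions made for illustration, since the article names the layers but not their exact semantics.

```typescript
type Role = "owner" | "viewer";
type Action = "read" | "manage";

interface SpaceMembership {
  userId: string;
  spaceId: string;
  role: Role;
}

interface DocumentRecord {
  id: string;
  spaceId: string;
  restrictedTo?: string[]; // optional document-level allow-list of user ids (assumed)
}

// Check the layers in order: Space membership, then role, then document-level rules.
function canAccess(
  userId: string,
  doc: DocumentRecord,
  memberships: SpaceMembership[],
  action: Action,
): boolean {
  const membership = memberships.find((m) => m.userId === userId && m.spaceId === doc.spaceId);
  if (membership === undefined) return false; // Space layer: non-members see nothing
  if (action === "manage") return membership.role === "owner"; // role layer
  if (membership.role === "owner") return true;
  // Document layer: viewers read unless the document restricts access to named users.
  return doc.restrictedTo === undefined || doc.restrictedTo.includes(userId);
}
```

Denying by default at the Space layer keeps retrieval safe as well: a search or chat request would filter candidate documents through the same check before any passage reaches the model.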

Architecture: a rebuild that cut latency from about 20 seconds to about 3

Knowledge Sphere’s engineering depth shows most in one key rebuild of the AI agent.

The early platform used a fixed-flow state machine (a fixed seven-step pipeline). It was controllable, but response latency ran to 15–25 seconds and streaming was "fake": the whole answer was computed first and then displayed all at once.

We rebuilt it into an autonomous AI agent driven by tool calling:

  • The agent decides for itself when to call retrieval tools (passage search / summary search); the prompt offers strategy guidance rather than a forced fixed order — simple questions skip the full seven steps, complex ones get multiple retrieval rounds.
  • Response latency dropped from 15–25s to 2–5s, and fake streaming became real token streaming.
  • Added hybrid search (vector embeddings + a full-text index + RRF merge) and summary-layer embeddings for sharper retrieval.
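
The difference from a fixed pipeline is easiest to see in the control loop. In the sketch below the model, not the code, decides on each turn whether to call a retrieval tool or to answer. The model is passed in as a plain function so the example stays self-contained; the `ModelTurn` shape, the tool names `passage_search` and `summary_search`, the round limit and the fallback message are all assumptions for illustration, not the platform's actual interfaces.

```typescript
type ToolName = "passage_search" | "summary_search";

// What the model can return on one turn: a tool request or a final answer.
type ModelTurn =
  | { kind: "tool_call"; tool: ToolName; query: string }
  | { kind: "answer"; text: string };

interface Message {
  role: "user" | "tool";
  content: string;
}

// Stand-ins for an LLM client and the retrieval tools (hypothetical signatures).
type Model = (history: Message[]) => Promise<ModelTurn>;
type Tools = Record<ToolName, (query: string) => Promise<string>>;

interface AgentResult {
  answer: string;
  toolCalls: number;
}

// Let the model choose its own retrieval steps, bounded by maxRounds.
async function runAgent(question: string, model: Model, tools: Tools, maxRounds = 4): Promise<AgentResult> {
  const history: Message[] = [{ role: "user", content: question }];
  for (let round = 0; round < maxRounds; round++) {
    const turn = await model(history);
    if (turn.kind === "answer") {
      // Simple questions can end here after zero tool calls.
      return { answer: turn.text, toolCalls: round };
    }
    const result = await tools[turn.tool](turn.query);
    history.push({ role: "tool", content: `${turn.tool}: ${result}` });
  }
  // Retrieval budget exhausted without an answer: say so rather than loop forever.
  return { answer: "Not enough evidence found in the documents.", toolCalls: maxRounds };
}
```

A greeting or simple lookup costs a single model call, while a comparison across documents can take several retrieval rounds, which is consistent with the latency drop described above. Token streaming would wrap the final answer turn and is left out of this sketch.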

Architecture at a glance

  • Frontend: an end-to-end type-safe API, with the same TypeScript types shared across front and back end.
  • AI agent: true token streaming + autonomous tool calling, wired to large language models.
  • Embeddings & retrieval: vector embeddings (1536-dim) + full-text search, merged with Reciprocal Rank Fusion.
  • Document processing: enterprise-grade OCR + an async, retryable background pipeline + cloud object storage.
  • Data layer: a relational database with a type-safe ORM.
  • Identity & security: Argon2 password hashing, multi-layer access control and data isolation.

In closing

Knowledge Sphere is a production-ready AI document-intelligence platform, with every core capability implemented and running — full AI chat with inline citations, the document pipeline, hybrid search, interactive PDF, team collaboration and enterprise security.

It demonstrates Shepherd Tech’s ability to deliver complex AI systems: not just calling an LLM API, but holding the entire RAG chain steady — from retrieval architecture and document pipeline to a traceable product experience.
