Full-stack AI-powered math OCR system — from handwritten ink to symbolic solutions in a single pipeline.
HTML5 Canvas API with pen, eraser, thickness control, and an undo/redo stack. Exports transparent PNG.
Canvas API
Drag-and-drop or file picker. Accepts PNG, JPEG, PDF. Preview with crop zone before OCR.
File API
Visual WYSIWYG LaTeX input with quick-insert symbol buttons. Produces raw LaTeX string.
MathQuill 0.10
Real-time LaTeX → beautifully typeset math. Renders OCR output and solver results.
KaTeX 0.16
Interactive polynomial graph rendering with plotted roots for degree ≥ 3 equations.
Canvas 2D
Session-based equation timeline with search, type/date filters, batch delete, and export.
LocalStorage + API
Dark / Light / System with glassmorphism
Scaling, high-contrast, screen-reader
Toggle symbolic vs rounded output
Auto-generated per browser session
CORS middleware, model preloading at startup (Texify + Pix2Tex), Pydantic schema validation.
Dual-model architecture: Texify runs first, Pix2Tex validates and acts as fallback when primary parse fails.
Dual-parser strategy: latex2sympy2 handles raw LaTeX directly; regex-based manual pipeline as fallback.
Stroke-width estimation via distance transform. Auto-selects between printed/handwriting/outline branches.
Regex-powered sanitization: Unicode repair, missing backslashes, implicit multiplication, inverse trig normalization. Runs between OCR output and solver input.
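As a rough illustration of the sanitization stage described above, the sketch below applies the four named repairs in sequence using only the standard `re` module. The function name `sanitize_latex`, the `UNICODE_FIXES` table, and every pattern here are assumptions made for illustration; the project's actual rule set is not shown in this description and is likely broader.

```python
import re

# Hypothetical rule tables; the real pipeline's patterns are not documented here.
UNICODE_FIXES = {"×": r"\times ", "÷": r"\div ", "−": "-", "π": r"\pi ", "√": r"\sqrt "}
FUNCTIONS = "arcsin|arccos|arctan|sin|cos|tan|log|ln|sqrt"


def sanitize_latex(raw: str) -> str:
    """Clean OCR output before it is handed to the solver (illustrative only)."""
    text = raw

    # 1. Unicode repair: swap stray Unicode math symbols for LaTeX commands.
    for char, replacement in UNICODE_FIXES.items():
        text = text.replace(char, replacement)

    # 2. Inverse trig normalization: \sin^{-1} or sin^-1 becomes \arcsin.
    text = re.sub(
        r"(?<![A-Za-z])\\?(sin|cos|tan)\s*\^\s*(?:\{\s*-\s*1\s*\}|-\s*1)",
        r"\\arc\1",
        text,
    )

    # 3. Missing backslashes: a bare "sin" becomes "\sin" (whole words only).
    text = re.sub(rf"(?<![\\A-Za-z])({FUNCTIONS})(?![A-Za-z])", r"\\\1", text)

    # 4. Implicit multiplication: a digit followed by a variable, "(", or a
    #    known function/constant gets an explicit \cdot.
    text = re.sub(
        rf"(\d)\s*(?=[A-Za-z(]|\\(?:{FUNCTIONS}|pi|frac)\b)",
        r"\1\\cdot ",
        text,
    )

    # Collapse whitespace introduced by the substitutions.
    return re.sub(r"\s+", " ", text).strip()


print(sanitize_latex("2sin x"))
print(sanitize_latex("sin^{-1} x"))
```

The order matters in this sketch: inverse trig is rewritten before backslash repair so that `sin^{-1}` is never turned into `\sin^{-1}` first, and implicit multiplication runs last so it can see the repaired commands.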
Canvas PNG, Upload JPEG, or Camera Photo
Stroke-width detection, binarization, deskew, shadow crush
Texify ViT inference (+ Pix2Tex fallback)
Regex cleanup, Unicode fix, backslash repair
latex2sympy2 → SymPy CAS → symbolic / numeric
KaTeX display + plot data + MongoDB persist
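The OCR step of the pipeline above (primary model first, second model as fallback) could be wired roughly as follows. This is a sketch under assumptions: the two models are passed in as plain callables rather than through the real Texify or Pix2Tex APIs, and `looks_parseable` is a hypothetical stand-in for whatever check decides that the primary parse failed.

```python
from typing import Callable

# Assumed interface: a model takes image bytes and returns a LaTeX string.
OcrModel = Callable[[bytes], str]


def looks_parseable(latex: str) -> bool:
    """Cheap stand-in validity check: non-empty text with balanced braces."""
    depth = 0
    for ch in latex:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                return False
    return bool(latex.strip()) and depth == 0


def recognize(image: bytes, primary: OcrModel, fallback: OcrModel) -> tuple[str, str]:
    """Return (latex, source), where source is 'primary' or 'fallback'."""
    try:
        latex = primary(image)
        if looks_parseable(latex):
            return latex, "primary"
    except Exception:
        # Any failure in the primary model routes the image to the fallback.
        pass
    return fallback(image), "fallback"
```

Returning the source label alongside the LaTeX makes it easy to log how often the fallback path is taken, which is useful when tuning the validity check.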
equation
Dual-LaTeX document schema with legacy field normalization
| Field | Description | Type |
|---|---|---|
| _id | Auto-generated document ID | ObjectId |
| session_id | Browser session identifier | string |
| ocr_latex | Raw LaTeX from OCR engine | string |
| final_latex | User-edited LaTeX (source of truth) | string |
| solution_latex | Solver result in LaTeX | string? |
| solution | Plain-text solution value | string? |
| image_url | Path to source image | string? |
| created_at | ISO 8601 UTC timestamp | string |
All DB calls encapsulated — routes never touch the driver directly
Schema Normalization:
setdefault() ensures legacy documents get a consistent shape for frontend rendering.
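A minimal sketch of that normalization, using the field names from the schema table. The legacy field name `latex` and the specific defaults are assumptions for illustration; the description does not say what shape older documents actually have.

```python
from typing import Any

# Optional fields from the schema table; None as the default is an assumption.
OPTIONAL_DEFAULTS: dict[str, Any] = {
    "solution_latex": None,
    "solution": None,
    "image_url": None,
}


def normalize_equation(doc: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a stored document with every schema field present."""
    out = dict(doc)  # leave the caller's document untouched

    # Hypothetical legacy shape: a single "latex" field from before the
    # ocr_latex / final_latex split.
    out.setdefault("ocr_latex", out.get("latex", ""))
    # Without a user edit, the OCR result is the source of truth.
    out.setdefault("final_latex", out["ocr_latex"])

    for field, default in OPTIONAL_DEFAULTS.items():
        out.setdefault(field, default)
    return out


legacy = normalize_equation({"session_id": "s1", "latex": "x+1"})
print(legacy["final_latex"])
```

Because `setdefault()` only writes when a key is missing, documents that already follow the current schema pass through unchanged.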
Dockerfile for containerized deployment with Python 3.10 base image.
Dockerfile
Unit tests for LaTeX normalization pipeline. Edge cases as fixtures.
Pytest 9.0
Type-safe request/response schemas with model validators.
Pydantic 2.12
.env file with dotenv loader for secrets. MONGODB_URI required.
.env / dotenv
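One way the required `MONGODB_URI` check might look at startup is sketched below. The function name `load_settings` and the returned dictionary shape are hypothetical; `load_dotenv` is the python-dotenv loader, imported lazily here so the sketch still runs where that package is not installed.

```python
import os


def load_settings(env_path: str = ".env") -> dict[str, str]:
    """Load secrets from a .env file (if available) and validate them."""
    try:
        # python-dotenv; values already set in the environment are not overridden.
        from dotenv import load_dotenv

        load_dotenv(dotenv_path=env_path)
    except ImportError:
        # Assumption for this sketch: fall back to the process environment only.
        pass

    uri = os.environ.get("MONGODB_URI")
    if not uri:
        raise RuntimeError("MONGODB_URI is required but was not set")
    return {"mongodb_uri": uri}
```

Failing fast at startup keeps a missing connection string from surfacing later as an opaque database error on the first request.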