AI Video Proctoring
A browser-native AI proctoring system that detects cheating behaviour in real time during online examinations — using TensorFlow.js and MediaPipe for face detection, gaze estimation, and head-pose anomaly tracking entirely on the client device, with no video data ever leaving the browser.

What We Built
Online examinations became ubiquitous post-pandemic, but academic integrity solutions were either prohibitively expensive enterprise tools or dangerously inadequate. This project set out to demonstrate that browser-native ML inference could deliver real proctoring value without requiring server-side video processing or privacy-invasive screen recording.
We built a full MERN application where the exam session runs in a React component that simultaneously renders the webcam feed, runs MediaPipe Face Mesh at 30fps, and pipes face landmark data into a TensorFlow.js gaze estimation model — flagging anomalies in real time and writing event logs to the MongoDB-backed API.
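One way to keep the webcam feed smooth while inference runs in the same tab is to throttle the ML work to a target rate. A minimal sketch — the helper names (`shouldProcess`, `detectAnomalies`) are illustrative, not the project's actual code:

```javascript
// Throttle ML inference to ~30fps so the <video> element keeps rendering
// at full speed even when a pipeline pass occasionally runs long.
const TARGET_FPS = 30;

function shouldProcess(lastRunMs, nowMs, targetFps = TARGET_FPS) {
  // Run inference only once at least one frame interval has elapsed.
  return nowMs - lastRunMs >= 1000 / targetFps;
}

// Hypothetical loop wiring (browser-only): `video` is the webcam <video>
// element and `detectAnomalies` is the async inference pipeline.
function startProctorLoop(video, detectAnomalies) {
  let lastRun = 0;
  function tick(now) {
    if (shouldProcess(lastRun, now)) {
      lastRun = now;
      detectAnomalies(video); // fire-and-forget; never blocks rendering
    }
    requestAnimationFrame(tick);
  }
  requestAnimationFrame(tick);
}
```

Decoupling inference cadence from the render loop is what lets the video stay at native frame rate even if a pipeline pass takes longer than one frame.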
Technical Challenges
Real-Time Inference in the Browser
Running a multi-stage ML pipeline (face detection → landmark extraction → gaze estimation) at 30fps in a browser tab without causing video frame drops required aggressive model quantisation and WebGL backend configuration.
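Sequencing the stages might look like the sketch below. The stage functions here are stand-ins for the real BlazeFace, Face Mesh, and gaze-model wrappers; the commented backend setup uses the standard TF.js calls:

```javascript
// Backend setup (browser-only, assumes @tensorflow/tfjs is loaded):
//   await tf.setBackend('webgl');
//   await tf.ready();

// Chain the inference stages, aborting early when a stage finds nothing
// (e.g. no face in frame) so later stages never run on empty input.
async function runPipeline(stages, frame) {
  let out = frame;
  for (const stage of stages) {
    out = await stage(out); // each stage feeds the next
    if (out == null) return null; // early exit: nothing to analyse
  }
  return out;
}
```

Early-exit matters for the frame budget: most frames with no face (or a centred, stable face) never pay for the later, heavier stages.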
Privacy-Preserving Architecture
All inference runs client-side — no video pixels are transmitted to any server. Only structured anomaly event objects (timestamps, confidence scores, event types) are sent to the API, making the system GDPR-compliant by design.
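The event payload can be sketched as below — field names are illustrative, not the project's exact schema; the point is that nothing pixel-shaped ever enters the object:

```javascript
// Structured anomaly event sent to the API in place of any video data.
function makeAnomalyEvent(sessionId, type, confidence) {
  return {
    sessionId,             // which exam session this belongs to
    type,                  // e.g. 'GAZE_OFFSCREEN', 'MULTI_FACE', 'HEAD_POSE'
    confidence,            // model confidence in [0, 1]
    timestamp: Date.now(), // client-side event time (ms since epoch)
    // Deliberately no frame or landmark buffers: nothing in this object
    // can be used to reconstruct the video.
  };
}
```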
Ambient Light Variance
MediaPipe Face Mesh degrades significantly in poor lighting. We implemented a client-side lux estimation step that warns examinees before the session begins if webcam input quality is insufficient.
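A brightness pre-check of this kind can be done on the raw RGBA pixels of a captured frame (e.g. from `canvasCtx.getImageData(...).data`), using standard Rec. 709 luma weights. The threshold below is an illustrative value, not the project's calibrated figure:

```javascript
// Mean luma of an RGBA pixel buffer, 0 (black) .. 255 (white).
function meanLuminance(rgba) {
  let sum = 0;
  const pixels = rgba.length / 4;
  for (let i = 0; i < rgba.length; i += 4) {
    // Rec. 709 weights for R, G, B; alpha (i + 3) is ignored.
    sum += 0.2126 * rgba[i] + 0.7152 * rgba[i + 1] + 0.0722 * rgba[i + 2];
  }
  return sum / pixels;
}

// Pre-session check: warn the examinee if the scene is too dark.
function lightingAdequate(rgba, minLuma = 60) {
  return meanLuminance(rgba) >= minLuma;
}
```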
Multi-Face Detection Handling
A second person appearing in frame (a common cheating vector) needed to be caught reliably without false positives from brief reflections. We implemented a frame-count threshold — requiring 15 consecutive multi-face frames before flagging.
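The debounce reduces to a consecutive-frame counter that resets on any single-face frame, so a glint or reflection lasting a few frames never fires:

```javascript
// Flag a second face only after `threshold` consecutive multi-face frames
// (15 in the deployed system); any single-face frame resets the counter.
function createMultiFaceFlagger(threshold = 15) {
  let consecutive = 0;
  return function onFrame(faceCount) {
    consecutive = faceCount > 1 ? consecutive + 1 : 0;
    return consecutive >= threshold; // true => emit a MULTI_FACE event
  };
}
```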
ML Architecture
MediaPipe Face Mesh extracts 468 3D facial landmarks per frame using a lightweight BlazeFace detector followed by a landmark regression model. These landmarks are passed into a custom TensorFlow.js gaze model that computes eye-corner vectors and estimates horizontal/vertical gaze deviation relative to centre.
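The horizontal component of that computation can be illustrated in 2D. This is a simplification — the real model also uses the landmarks' depth — and which Face Mesh indices map to the eye corners and iris centre is an assumption left out here:

```javascript
// Signed horizontal gaze deviation from a 2D eye-corner vector.
// Points are {x, y}; returns 0 when the iris sits midway between the
// corners (looking straight), +/-0.5 at the extremes.
function horizontalGazeDeviation(innerCorner, outerCorner, irisCenter) {
  const eyeWidth = outerCorner.x - innerCorner.x;
  if (eyeWidth === 0) return 0; // degenerate landmarks: treat as centred
  const t = (irisCenter.x - innerCorner.x) / eyeWidth; // 0..1 across the eye
  return t - 0.5;
}
```

Normalising by eye width makes the estimate invariant to how far the examinee sits from the camera.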
The Node.js API receives structured proctoring event objects — not video — and stores them in MongoDB with session and user references. After the exam, the admin dashboard aggregates events into a timeline, highlighting suspicious intervals with confidence scores for human review.
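Validating events at the API boundary keeps the privacy guarantee even against a modified client. A sketch with hypothetical field names mirroring the event shape above:

```javascript
// Server-side check before persisting an event to MongoDB.
const EVENT_TYPES = new Set([
  'GAZE_OFFSCREEN', 'MULTI_FACE', 'HEAD_POSE', 'FACE_ABSENT',
]);

function isValidProctorEvent(evt) {
  return (
    evt != null &&
    typeof evt.sessionId === 'string' &&
    EVENT_TYPES.has(evt.type) &&
    typeof evt.confidence === 'number' &&
    evt.confidence >= 0 && evt.confidence <= 1 &&
    Number.isFinite(evt.timestamp) &&
    // Reject anything pixel-shaped a tampered client might try to smuggle in.
    !('frame' in evt) && !('image' in evt)
  );
}
```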
JWT tokens gate both the exam session API and the admin report endpoints — ensuring only the authenticated examiner can access session data for a specific candidate.
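An Express-style guard for those endpoints might look like the sketch below. The token verifier is injected (in practice, a wrapper around `jsonwebtoken`'s `jwt.verify` with the server secret); the `examiner` role claim is an assumption for illustration:

```javascript
// Middleware factory: reject requests without a valid examiner token.
function requireExaminer(verifyToken) {
  return function (req, res, next) {
    const header = req.headers.authorization || '';
    const token = header.startsWith('Bearer ') ? header.slice(7) : null;
    const claims = token && verifyToken(token); // null/falsy on bad token
    if (!claims || claims.role !== 'examiner') {
      res.status(401).json({ error: 'unauthorised' });
      return;
    }
    req.examiner = claims; // downstream handlers can trust these claims
    next();
  };
}
```

Injecting the verifier keeps the guard testable without signing real tokens.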
What Was Built
Real-time face detection via MediaPipe Face Mesh
Browser-native TensorFlow.js inference — no server GPU required
Gaze estimation and head-pose anomaly detection
Automated proctoring event log and session report
JWT-secured exam session management
Admin dashboard with per-session analytics
The Real Product


MediaPipe Face Mesh
468 3D landmark points extracted per frame for gaze, head-pose, and multi-face detection.
WebGL TensorFlow.js
GPU-accelerated inference on the client — zero server compute cost for video analysis.
Privacy by Design
No video pixels leave the browser — only structured event objects reach the API.
Results & Impact
94%+
Detection Accuracy
MediaPipe Face Mesh achieved over 94% face presence accuracy in standard webcam environments.
< 30ms
Inference Latency
The TensorFlow.js WebGL backend processes each frame in under 30 ms — maintaining smooth video rendering.
91%
Anomaly Recall
Head-pose deviation model correctly flagged 91% of simulated cheating scenarios in testing.