AI Video Proctoring System
A browser-native AI proctoring system built on TensorFlow.js and MediaPipe Face Mesh. Real-time face detection, gaze estimation, and head-pose anomaly detection running entirely inside the candidate's browser tab. No video data ever leaves the device. No GPU server required.
94%+ Face Detection Accuracy
&lt;30ms Per-Frame Inference
91% Anomaly Recall
0 bytes Video Transmitted

The Problem We Set Out to Solve
Online examinations exploded post-pandemic, but the proctoring solutions available in 2024 fell into one of two broken categories: enterprise platforms charging $5–$15 per session with invasive screen recording, or basic honor-system forms with zero integrity enforcement. For Indian EdTech platforms and corporate recruitment teams running hundreds of assessments daily, neither option was viable.
The challenge DarsLab accepted was to build a production-ready AI proctoring system that operates entirely inside the browser using open-source ML models, with no video data transmission, no per-session cost, and no invasive desktop agent installation. The result had to match enterprise-grade cheating detection accuracy while being deployable on any MERN stack web application.
We built a full MERN application where the exam session React component simultaneously renders the webcam feed, runs MediaPipe Face Mesh at 30fps, and feeds 468 facial landmark coordinates into a TensorFlow.js gaze estimation model, all without a single GPU server in the loop.
How the Browser-Native AI Pipeline Works
MediaPipe Face Mesh (Face Detection)
The webcam stream is captured via the browser's getUserMedia API and piped into MediaPipe Face Mesh, which runs a two-stage pipeline: the lightweight BlazeFace anchor-based face detector followed by a 468-point 3D landmark regression network. This extracts the precise 3D position of every facial feature at 30fps with sub-5ms latency per frame on modern hardware.
TensorFlow.js Gaze Estimation Model
The 468 landmark coordinates are passed into a custom TensorFlow.js model that computes normalized eye-corner vectors and pupil centroid positions. These are compared against a baseline "focus" gaze established at session start. Deviation beyond configurable thresholds (horizontal: 25 degrees, vertical: 15 degrees) triggers an anomaly event. The WebGL backend runs this inference in under 25ms on the client GPU.
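As a minimal sketch of the threshold check described above (the angle math, function names, and vector fields here are illustrative; the production model operates on the raw landmark features, not this simplified geometry):

```javascript
// Gaze-deviation check against the session-start baseline.
// Thresholds mirror the text: 25 degrees horizontal, 15 degrees vertical.
const THRESHOLDS = { horizontalDeg: 25, verticalDeg: 15 };

function gazeAngles(eyeVector) {
  // eyeVector: normalized {x, y, z} direction derived from eye-corner and
  // pupil-centroid landmarks; z points from the eye toward the screen.
  const yawDeg = Math.atan2(eyeVector.x, eyeVector.z) * (180 / Math.PI);
  const pitchDeg = Math.atan2(eyeVector.y, eyeVector.z) * (180 / Math.PI);
  return { yawDeg, pitchDeg };
}

function isGazeDeviation(current, baseline) {
  const dYaw = Math.abs(current.yawDeg - baseline.yawDeg);
  const dPitch = Math.abs(current.pitchDeg - baseline.pitchDeg);
  return dYaw > THRESHOLDS.horizontalDeg || dPitch > THRESHOLDS.verticalDeg;
}
```

A single frame crossing this boundary is not enough to flag an anomaly; the per-frame result feeds into the frame-count debouncing described below.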
Client-Side Anomaly Classification
Four anomaly types are classified locally: face_absent (no face detected for 3+ seconds), gaze_deviation (sustained eye movement off-screen), secondary_presence (two faces detected for 15+ consecutive frames), and head_pose_extreme (head rotation greater than 30 degrees). Each classification applies a frame-count threshold to suppress false positives from natural eye movements.
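The frame-count debouncing can be sketched as a small state machine. Thresholds mirror the text at an assumed 30fps; the class name, rule table, and the gaze_deviation duration are illustrative, not the production values:

```javascript
const FPS = 30;
const RULES = {
  face_absent:        { frames: 3 * FPS, minConfidence: 0 },    // no face for 3+ s
  gaze_deviation:     { frames: 2 * FPS, minConfidence: 0 },    // duration assumed
  secondary_presence: { frames: 15, minConfidence: 0.85 },      // 15 consecutive frames
  head_pose_extreme:  { frames: FPS, minConfidence: 0 },        // sustained > 30 deg
};

class AnomalyDebouncer {
  constructor(rules = RULES) {
    this.rules = rules;
    this.counts = Object.fromEntries(Object.keys(rules).map((k) => [k, 0]));
  }
  // frameSignals: { [type]: confidence } for conditions raw-detected this frame.
  // Returns the anomaly types whose frame-count threshold was just crossed.
  update(frameSignals) {
    const fired = [];
    for (const [type, rule] of Object.entries(this.rules)) {
      const conf = frameSignals[type];
      if (conf !== undefined && conf >= rule.minConfidence) {
        this.counts[type] += 1;
        if (this.counts[type] === rule.frames) fired.push(type);
      } else {
        this.counts[type] = 0; // any clean frame resets the counter
      }
    }
    return fired;
  }
}
```

Any clean frame resets a counter, so a single noisy detection never produces an event, and each sustained condition fires exactly once.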
Structured Event API and MongoDB Storage
When an anomaly is classified, a structured JSON event object containing session ID, candidate ID, event type, confidence score, timestamp, and frame count is AES-256 encrypted and sent to the Node.js/Express API via a secure WebSocket. No video data is included. MongoDB stores events with session and user references for post-exam admin review.
Technical Note: Why WebGL Backend
TensorFlow.js supports three compute backends: CPU, WebGL, and WebGPU. Our performance benchmarks showed the WebGL backend outperforming CPU by 11x on the gaze estimation model (from 330ms to 28ms per inference). WebGPU offers further speedups but lacks browser support coverage for production deployment as of 2024. The WebGL backend is supported on all Chromium-based browsers, Safari 14+, and Firefox 90+.
What Made This Hard to Build
30fps Inference Without Video Frame Drops
Running a multi-stage ML pipeline (face detection, landmark extraction, gaze estimation) at 30fps inside a single React component while rendering the live webcam feed required aggressive model quantization. We used 8-bit integer quantization on the gaze model, reducing it from 4.2MB to 1.1MB with only a 1.3% accuracy degradation. We also offloaded inference scheduling to a Web Worker to keep the main thread available for UI rendering.
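The quantization idea can be illustrated with a symmetric int8 scheme. In practice this conversion is done offline by model tooling, not hand-rolled; this sketch only shows the arithmetic that trades 4 bytes per weight for 1:

```javascript
// Symmetric 8-bit quantization: map floats in [-maxAbs, maxAbs] to [-127, 127].
function quantizeInt8(weights) {
  const maxAbs = weights.reduce((m, w) => Math.max(m, Math.abs(w)), 0);
  const scale = maxAbs / 127 || 1; // guard against an all-zero tensor
  const q = Int8Array.from(weights, (w) => Math.round(w / scale));
  return { q, scale };
}

// Inference-time dequantization: recover an approximation of the original weights.
function dequantizeInt8({ q, scale }) {
  return Float32Array.from(q, (v) => v * scale);
}
```

The round-trip error is bounded by half the scale step per weight, which is the mechanism behind the small (1.3%) accuracy degradation quoted above.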
Ambient Light Variance and Webcam Quality
MediaPipe Face Mesh degrades significantly in poor lighting conditions, leading to false positive face_absent events. We implemented a client-side lux estimation step using the canvas API to analyze webcam frame brightness histograms. If mean luminance falls below a calibrated threshold, the session onboarding flow displays an explicit warning and refuses to start the exam until lighting is adequate.
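The brightness gate reduces to a mean-luma computation over the RGBA buffer returned by the canvas getImageData call. The 60/255 cutoff below is an assumed stand-in for the calibrated threshold mentioned above:

```javascript
const MIN_MEAN_LUMA = 60; // assumed threshold; calibrated empirically in production

// Mean Rec. 601 luma over an RGBA pixel buffer (4 bytes per pixel).
function meanLuma(rgba) {
  let sum = 0;
  for (let i = 0; i < rgba.length; i += 4) {
    sum += 0.299 * rgba[i] + 0.587 * rgba[i + 1] + 0.114 * rgba[i + 2];
  }
  return sum / (rgba.length / 4);
}

function lightingAdequate(rgba) {
  return meanLuma(rgba) >= MIN_MEAN_LUMA;
}
```

In the onboarding flow, the buffer would come from `ctx.getImageData(0, 0, w, h).data` on a canvas the webcam frame was drawn to.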
Multi-Face Detection False Positives
Brief reflections in glasses, posters on walls, and ambient screen light could trigger secondary_presence events incorrectly. Our solution was a 15-frame persistence threshold: the secondary face must be detected with confidence greater than 0.85 for 15 consecutive frames (0.5 seconds at 30fps) before flagging. This eliminated all false positives from reflections in our test dataset while maintaining 100% detection of genuine secondary persons.
JWT Session Security Across WebSocket and REST
The exam session requires two simultaneous authenticated connections: a REST API for exam content delivery and a WebSocket for real-time anomaly event streaming. Both connections share the same JWT token, which carries session ID, candidate ID, and exam ID as custom claims. Token refresh logic was implemented to handle sessions longer than the token expiry window without disrupting the ML inference pipeline.
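Reading the shared claims on the WebSocket side might look like the sketch below. It only decodes the payload; production code must verify the signature first (for example with a JWT library), and the claim names are illustrative:

```javascript
// Decode the custom claims from a JWT payload (base64url-encoded JSON).
// NOTE: no signature verification here -- decoding alone must never be
// treated as authentication.
function decodeClaims(token) {
  const [, payload] = token.split('.');
  if (!payload) throw new Error('malformed JWT');
  const json = Buffer.from(payload, 'base64url').toString('utf8');
  const { sessionId, candidateId, examId, exp } = JSON.parse(json);
  return { sessionId, candidateId, examId, exp };
}

// Standard `exp` claim is seconds since the Unix epoch.
function isExpired(claims, nowSeconds = Math.floor(Date.now() / 1000)) {
  return claims.exp !== undefined && claims.exp <= nowSeconds;
}
```

Because both the REST and WebSocket handlers read the same claims, the refresh logic only has to swap one token to keep both connections alive through a long exam.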
Performance Outcomes
94%+ Detection Accuracy: MediaPipe Face Mesh accuracy in standard webcam environments across varied lighting conditions.
&lt;30ms Inference Latency: WebGL-accelerated TensorFlow.js processes each video frame in under 30ms, maintaining smooth 30fps rendering.
91% Anomaly Recall: Head-pose deviation model correctly flagged 91% of simulated cheating scenarios in controlled testing.
Business Impact
For platforms running 500+ assessments per month, eliminating human proctoring fees yields a direct cost saving of roughly $2,500–$7,500/month at typical market rates. Because all ML inference runs on client devices, the system scales horizontally with no increase in server cost: 500 concurrent exams and 5,000 concurrent exams run on identical infrastructure.
The privacy-by-design architecture (zero video transmission) eliminates the legal exposure that accompanies video storage, making the platform compatible with India's DPDP Act, the EU's GDPR, and FERPA regulations in the United States without requiring separate compliance engineering.
Where This Technology Applies
EdTech and Universities
Secure remote examination for student cohorts of any size. Generate per-student integrity reports automatically. No invigilation staff required for online batches.
Technical Recruitment
Proctor live coding assessments and take-home tests for software engineering roles. Detect candidate impersonation or unauthorized resource use during the assessment window.
Certification Bodies
High-stakes professional certification exams (finance, legal, medical) that require documented integrity trails for regulatory compliance and accreditation audits.
Technology Stack Breakdown
TensorFlow.js
Client-side inference engine. Runs the quantized gaze estimation model at 30fps using GPU acceleration via WebGL. Eliminates server-side GPU compute costs entirely.
MediaPipe Face Mesh
Google's browser-native face landmark detection library. Extracts 468 3D facial keypoints per frame using BlazeFace for detection plus a lightweight landmark regression model.
React.js + Canvas API
The exam UI renders the webcam stream, overlays real-time landmark visualizations on a canvas element, and manages session state. Web Workers handle ML scheduling off-thread.
Node.js + Express.js API
RESTful API and WebSocket server. Receives structured anomaly event objects, validates JWT tokens, and persists events to MongoDB. Never processes video or audio data.
MongoDB
Document database storing candidate profiles, exam sessions, and anomaly event arrays. Schema-free design allows flexible event structure as anomaly types evolve.
JWT Authentication
Stateless authentication across both REST and WebSocket connections. Custom claims carry session context. Refresh token rotation handles long exam sessions.
Built Across Multiple Disciplines
This project combined DarsLab's machine learning integration expertise with production full-stack engineering. If your platform needs a similar AI-native feature built into an existing web product, these are the service areas that apply:
Custom AI Solutions and Integration
We integrate ML models, LLMs, and computer vision into production web apps.
Enterprise Full-Stack Web Development
MERN, Next.js, and Python-based backend systems built for scale.
Performance-Focused UI Design
Interfaces that handle real-time data streams without layout jank.
AI Chatbot Development India
Browser-native and server-side AI interfaces for Indian businesses.
AI Proctoring FAQ
Q. How does AI video proctoring ensure student privacy?
All ML inference runs locally on the student's device using TensorFlow.js. No video pixels are ever transmitted to any server. Only structured metadata (timestamps, anomaly confidence scores, event types) is sent to the API, making the system GDPR-compliant and compatible with India's DPDP Act.
Q. What is the accuracy of the gaze estimation model?
The system achieves 91% anomaly recall on simulated cheating scenarios. Face presence accuracy using MediaPipe Face Mesh exceeds 94% in standard webcam conditions. Inference latency is below 30ms per frame using the TensorFlow.js WebGL backend.
Q. Does the AI proctoring system work without a dedicated GPU server?
Yes. TensorFlow.js uses the client device's GPU via WebGL for all ML inference. This means zero server-side compute cost for video analysis. The system scales to thousands of simultaneous exam sessions with no GPU infrastructure cost increase.
Q. Which browsers are supported?
Google Chrome (recommended), Microsoft Edge, and Safari 15+. All require WebGL 2.0 support. Firefox is not recommended for production proctoring due to inconsistent WebGL performance with TensorFlow.js models.
Q. Can DarsLab build a custom AI proctoring system for our platform?
Yes. We build custom AI proctoring solutions tailored to your platform's existing authentication, exam engine, and admin stack. We integrate face detection, gaze estimation, multi-face detection, and reporting dashboards. Contact us to discuss your exact requirements.
Need a Custom AI Integration?
Build Your AI-Powered Application
From browser-native ML to LLM integrations. We engineer AI solutions that run at production scale on your existing MERN or Next.js stack.