How Goblins Cut Inference Time by 50%

Eli Mernit

January 14, 2025 · 2 min read

About Goblins

Goblins uses AI to bring students one-on-one math support. Founded by a former math teacher and AI engineer, Goblins lets students draw math on any device using a digital whiteboard, and gives them instant feedback that builds the conceptual foundation they’re missing. Today, Goblins serves thousands of students across North America.

Goblins launched in early 2024 and needed OCR for handwritten text and diagrams, a task that has historically involved a patchwork of segmented processing and third-party OCR providers.

Challenges Before Beam

Goblins has a real-time app, which means inference performance and reliability are critical to their product experience.

Before turning to Beam, Goblins ran their OCR models on another serverless GPU provider, and suffered from long boot times, reliability issues, and high costs.

Last spring, Goblins began searching for an inference provider with leading performance and reliability.

Achieving Faster Inference with Beam

Goblins was able to easily migrate their workloads to Beam.

The team replaced their prior setup with Beam for inference tasks, running their OCR model on NVIDIA RTX 4090 GPUs.

In addition, the team used Beam’s scale-to-zero functionality to maintain minimal instances during off-peak hours (e.g., nights) and spin up GPUs during high-traffic periods.

“Our product is used in schools and during the evenings for at-home practice, so we need our GPUs to spin down at night when students aren’t using it. Beam lets us scale dynamically without paying for always-on GPUs.” - Alp Karavil, CTO @ Goblins
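To make the scale-to-zero idea concrete, here is a minimal Python sketch of the arithmetic behind it. It is not Beam's API or Goblins' actual configuration: the ScalePolicy class, its fields, and the hourly traffic numbers are all made up for illustration. It only shows why spinning GPUs down during idle hours uses fewer GPU-hours than keeping peak capacity running all day.

```python
from dataclasses import dataclass


@dataclass
class ScalePolicy:
    """Hypothetical autoscaling policy (illustrative names, not Beam's API)."""
    max_containers: int = 4           # assumed cap on concurrent GPU containers
    requests_per_container: int = 10  # assumed hourly load one container can handle


def containers_needed(requests: int, policy: ScalePolicy) -> int:
    """Return how many GPU containers to run for one hour of traffic."""
    if requests <= 0:
        return 0  # scale to zero: no traffic means no running GPUs
    needed = -(-requests // policy.requests_per_container)  # ceiling division
    return min(needed, policy.max_containers)


def gpu_hours(hourly_requests: list[int], policy: ScalePolicy) -> int:
    """Total GPU-hours used in a day when capacity follows demand."""
    return sum(containers_needed(r, policy) for r in hourly_requests)


# Made-up traffic for one day, one entry per hour:
# quiet night, school hours, afternoon lull, evening practice, late night.
day = [0] * 7 + [25] * 8 + [5] * 2 + [15] * 4 + [0] * 3

policy = ScalePolicy()
scaled = gpu_hours(day, policy)
peak = max(containers_needed(r, policy) for r in day)
always_on = peak * len(day)  # peak capacity kept running for all 24 hours

print(f"scale-to-zero: {scaled} GPU-hours")   # 34
print(f"always-on:     {always_on} GPU-hours")  # 72
```

With these invented numbers, following demand uses 34 GPU-hours against 72 for always-on peak capacity. The real ratio depends entirely on the traffic pattern and on cold-start behavior, which this sketch ignores.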

And when it comes to debugging, Beam’s developer workflow helps Goblins troubleshoot their apps faster than they could with their previous provider.

Scaling OCR Inference

Since moving to Beam, Goblins is achieving 50% faster inference on their OCR models, which significantly enhances the user experience for students.

“Beam was a lot faster than we thought. We assumed it would be better [than our previous provider], but the performance blew us away.” - Alp Karavil, CTO @ Goblins

In addition, running inference on Beam cut monthly costs for certain workloads from over $1,000 to under $600, a reduction of more than 40%. These savings allowed the team to scale up their GPU utilization strategically.

Make sure to check out Goblins and give their app a try today!
