
How Goblins Cut Inference Time by 50%

Eli Mernit
January 14, 2025
Company

About Goblins

Goblins uses AI to bring students one-on-one math support. Founded by a former math teacher and AI engineer, Goblins lets students draw math on any device using a digital whiteboard, and gives them instant feedback that builds the conceptual foundation they’re missing. Today, Goblins serves thousands of students across North America.

Goblins launched in early 2024 and needed OCR for handwritten text and diagrams, a task that has historically required a patchwork of segmented processing and third-party OCR providers.


Challenges Before Beam

Goblins has a real-time app, which means inference performance and reliability are critical to their product experience.

Before turning to Beam, Goblins ran their OCR models on another serverless GPU provider, and suffered from long boot times, reliability issues, and high costs.

Last spring, Goblins began searching for an inference provider with leading performance and reliability.


Achieving Faster Inference with Beam

Goblins was able to easily migrate their workloads to Beam.

The team replaced their prior setup with Beam for inference tasks, leveraging 4090 GPUs for their OCR model.

In addition, the team used Beam’s scale-to-zero functionality to keep running instances to a minimum during off-peak hours (e.g., nights) and spin up GPUs during high-traffic periods.

“Our product is used in schools and during the evenings for at-home practice, so we need our GPUs to spin down at night when students aren’t using it. Beam lets us scale dynamically without paying for always-on GPUs.” - Alp Karavil, CTO @ Goblins

Beam’s developer workflow also helps Goblins debug their apps faster than they could on their previous provider.
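To make the scale-to-zero point above more concrete, here is a rough sketch of how paying only for active GPU hours compares with keeping a GPU on around the clock. The hourly rate and the number of active hours are made-up placeholders, not Goblins' actual usage or Beam's pricing, and the model ignores cold starts and per-request billing details.

```python
# Illustrative sketch only: the rate and usage hours below are made-up
# placeholders, not Goblins' workload or Beam's pricing.

HOURS_PER_DAY = 24


def monthly_gpu_cost(active_hours_per_day: float, hourly_rate_usd: float, days: int = 30) -> float:
    """Monthly cost of one GPU that is billed only for the hours it runs."""
    if not 0 <= active_hours_per_day <= HOURS_PER_DAY:
        raise ValueError("active_hours_per_day must be between 0 and 24")
    return active_hours_per_day * hourly_rate_usd * days


def savings_fraction(always_on_cost: float, scaled_cost: float) -> float:
    """Share of the always-on bill avoided by scaling to zero when idle."""
    if always_on_cost <= 0:
        raise ValueError("always_on_cost must be positive")
    return 1 - scaled_cost / always_on_cost


PLACEHOLDER_RATE_USD = 1.00    # hypothetical price per GPU-hour
PLACEHOLDER_ACTIVE_HOURS = 16  # hypothetical hours per day with student traffic

always_on = monthly_gpu_cost(HOURS_PER_DAY, PLACEHOLDER_RATE_USD)
scale_to_zero = monthly_gpu_cost(PLACEHOLDER_ACTIVE_HOURS, PLACEHOLDER_RATE_USD)

print(f"always-on:     ${always_on:,.2f}/month")
print(f"scale-to-zero: ${scale_to_zero:,.2f}/month")
print(f"saved:         {savings_fraction(always_on, scale_to_zero):.0%}")
```

Swapping in a real hourly rate and a realistic traffic window gives a quick first estimate of what idle hours cost on an always-on setup.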

Scaling OCR Inference

Since moving to Beam, Goblins has seen 50% faster inference on their OCR models, which significantly improves the experience for students.

“Beam was a lot faster than we thought. We assumed it would be better [than our previous provider], but the performance blew us away.” - Alp Karavil, CTO @ Goblins

In addition, running inference on Beam cut monthly costs for certain workloads from over $1,000 to under $600, a reduction of more than 40%. These savings allowed the team to scale up their GPU usage strategically.

Make sure to check out Goblins and give their app a try today!
