Skip to content
VIRUCHITH.COM
Back to articles
Systems Design Systems DesignDistributed SystemsReminder SystemsMessage QueuesUTC

From Local Alarms to Global Scale: How Reminder Systems Work

Explore the engineering behind reminder notification systems, from local OS hardware timers to distributed multi-tiered architectures. Learn how to scale from single-user alarms to millions of concurrent events using message queues, partitioning, and UTC synchronization.

Author

Viruchith Ganesan

Apr 12, 2026
2 min read
An illustration of a clock with interconnected nodes representing a distributed reminder system.

Whether it’s a simple “Drink Water” alert on your phone or a “Flash Sale” notification sent to 50 million users, the engineering behind reminder notifications is a fascinating journey from hardware-level timers to massive distributed systems.

1. The Humble Beginning: Local Device Logic

In a small, offline Todo app, the app doesn’t “stay awake” all night waiting for your 7:00 AM alarm. That would kill your battery. Instead, the app delegates the job to the Operating System (OS).

Registration: The app tells the OS (Android/iOS), “Wake me up at exactly Unix Timestamp 1712574000.” The OS Kernel: The OS maintains a sorted list of all apps requesting wake-ups. It uses low-level hardware timers to trigger a “system interrupt.” The Wake-up: When the time hits, the OS “broadcasts” an event. Your app’s Receiver catches it, executes a tiny bit of code to show the notification, and then goes back to sleep.

2. Moving to the Cloud: The Polling System

Once your app syncs data to a server (so you can see reminders on both your phone and laptop), the logic shifts. The server becomes the “Source of Truth.”

In a medium-scale system, a Scheduler Service runs a continuous loop (often every minute). It queries a database: SELECT * FROM reminders WHERE status = 'PENDING' AND due_time <= NOW()

3. The Enterprise Scale: Millions of Events

At the scale of apps like WhatsApp or Uber, a simple database query fails because there are too many rows to scan. Architects use a Distributed Multi-Tiered Architecture:

Precision Indexing: Instead of one big table, reminders are partitioned by time (e.g., one “bucket” for every hour). The Message Bus: Instead of sending notifications directly, the scheduler pushes “Due” tasks into a Message Queue (like Kafka or RabbitMQ). This decouples “finding the task” from “sending the alert.” Worker Fleets: Thousands of “Workers” pull from these queues in parallel. If 1 million people have an alarm at 9:00 AM, the system spins up more workers to handle the “thundering herd” of data.

Frequently Asked Questions

Does every insertion need a database reindexing?

Technically, yes. Whenever you add a new reminder, the database must update its index (usually a B-Tree) so it knows where the new timestamp fits. To handle millions of writes per second, developers use Log-Structured Merge-Trees (LSM-Trees), which are optimized for high-speed writes, or they “shard” the data so the load is spread across multiple database servers.

How are different time zones handled?

The Golden Rule of system design: Store in UTC, Display in Local.

  1. The user sets a reminder for “9:00 AM in Chennai.”
  2. The app converts that to UTC (e.g., 3:30 AM UTC) and saves it.
  3. The server’s internal clock always runs in UTC. When the server hits 3:30 AM, it triggers the notification regardless of where the user is currently standing.

What happens if millions of notifications are due at the exact same minute?

This is handled via Queue Sharding and Prioritization.

  • Sharding: The system splits the “due” notifications into hundreds of smaller sub-queues.
  • Prioritization: High-priority alerts (like a “Stock Price Crash”) go to a “Fast Track” queue, while low-priority ones (like a “Weekly Summary”) might be delayed by a few minutes to save system resources.

Continue reading

Related articles