Back-of-the-envelope calculations are essential tools for developers to estimate the feasibility and performance of systems in production environments. These quick calculations help in understanding how data types, memory access times (from registers to caches to RAM), and payload sizes affect performance, especially at high request rates, measured in requests per second (RPS). This paper presents a structured approach to performing such calculations for different data types and memory hierarchies, providing practical examples for developers to use in production scenarios.
Modern cloud architectures, particularly those leveraging platforms like AWS, often need to scale quickly while handling high volumes of traffic. Understanding how data is stored, accessed, and transmitted within these systems can help developers optimize for both performance and cost. This paper focuses on the relationship between data types, memory access times, and how these factors impact system performance on AWS. We provide realistic examples to help developers estimate the impact of request rates and data payload sizes on their applications, ultimately enabling better decision-making for production environments.
Each data type consumes a specific amount of memory, which directly impacts storage, transmission, and memory access speeds. Understanding these fundamental sizes allows developers to estimate the memory footprint of a system.
The following table summarizes the bit-size and memory footprint of commonly used data types:
Data Type | Bit Size | Byte Size | Description |
---|---|---|---|
Boolean | 1 bit | 1 byte | A simple true/false value |
Character (ASCII) | 8 bits | 1 byte | A single character in ASCII |
Integer (32-bit signed) | 32 bits | 4 bytes | Standard 32-bit integer |
Integer (64-bit signed) | 64 bits | 8 bytes | Standard 64-bit integer |
Float (32-bit floating-point) | 32 bits | 4 bytes | Single precision floating-point |
Double (64-bit floating-point) | 64 bits | 8 bytes | Double precision floating-point |
String (n characters, ASCII) | 8n bits | n bytes | ASCII string with n characters |
String (UTF-16, n characters) | 16n bits | 2n bytes | UTF-16 encoded string with n characters |
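As a quick sanity check, these sizes are easy to verify in code. The following is a minimal Python sketch, assuming CPython and the standard `struct` module; `struct.calcsize` with the `=` prefix reports packed sizes without platform-specific alignment padding (note that Python objects themselves carry extra overhead, so `sys.getsizeof` would report larger numbers):

```python
import struct

# Packed sizes of common data types; the "=" prefix selects standard sizes
# with no platform-specific alignment padding.
print(struct.calcsize("=?"))  # Boolean           -> 1 byte
print(struct.calcsize("=c"))  # ASCII character   -> 1 byte
print(struct.calcsize("=i"))  # 32-bit signed int -> 4 bytes
print(struct.calcsize("=q"))  # 64-bit signed int -> 8 bytes
print(struct.calcsize("=f"))  # 32-bit float      -> 4 bytes
print(struct.calcsize("=d"))  # 64-bit double     -> 8 bytes

# Strings scale with length and encoding.
s = "hello"
print(len(s.encode("ascii")))      # n bytes for ASCII   -> 5
print(len(s.encode("utf-16-le")))  # 2n bytes for UTF-16 -> 10
```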
Let’s consider an application deployed on an m5.large AWS instance that handles a fixed payload for each API request. The total payload size per request is 62 bytes.
Now, let’s calculate the data load for 1,000 requests per second (RPS):

\[
62 \,\text{bytes/request} \times 1{,}000 \,\text{requests/second} = 62{,}000 \,\text{bytes/second} = 62 \,\text{KB/second}
\]

For 10,000 RPS, the total data processed per second is:

\[
62 \,\text{bytes/request} \times 10{,}000 \,\text{requests/second} = 620{,}000 \,\text{bytes/second} = 620 \,\text{KB/second}
\]

For 100,000 RPS, the throughput becomes:

\[
62 \,\text{bytes/request} \times 100{,}000 \,\text{requests/second} = 6{,}200{,}000 \,\text{bytes/second} = 6.2 \,\text{MB/second}
\]
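The same arithmetic is easy to script when you want to sweep over request rates. Below is a minimal sketch using the 62-byte payload from the example above (decimal units, matching the calculations):

```python
PAYLOAD_BYTES = 62  # total payload size per request, from the example above

for rps in (1_000, 10_000, 100_000):
    bytes_per_second = PAYLOAD_BYTES * rps
    # Report in KB/s or MB/s depending on magnitude.
    if bytes_per_second >= 1_000_000:
        print(f"{rps:>7,} RPS -> {bytes_per_second / 1_000_000:g} MB/second")
    else:
        print(f"{rps:>7,} RPS -> {bytes_per_second / 1_000:g} KB/second")
```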
The time it takes to access data depends on where the data is stored. The CPU accesses data from registers, caches (L1, L2, L3), RAM, or even disk storage. Understanding these access times helps developers predict how efficiently their system will handle memory-bound operations.
Memory Location | Typical Access Time (ns) | Description |
---|---|---|
CPU Register | ~1 ns | Fastest access, stored directly in the CPU |
L1 Cache | ~1-2 ns | Closest cache to CPU, very fast but small |
L2 Cache | ~3-14 ns | Larger than L1 but slower |
L3 Cache | ~10-50 ns | Shared between CPU cores, larger than L2 |
RAM | ~60-100 ns | Main memory, significantly slower than CPU caches |
SSD (EBS optimized) | ~50,000 ns (50 µs) | Persistent storage, faster than HDDs |
HDD (EBS) | ~10,000,000 ns (10 ms) | Slowest, mechanical disk drives |
Assume you need to access data stored in different memory locations. If a single operation requires the CPU to fetch data from L1 cache, you can estimate the latency at roughly 1-2 ns per access. For comparison: the same fetch from RAM takes ~60-100 ns (roughly 50-100x slower), and a read from an EBS-backed SSD takes ~50 µs, tens of thousands of times slower than L1.
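To make these comparisons quickly, the table can be folded into a small estimator. This is a rough sketch under stated assumptions: the latencies are representative single-access values chosen from the ranges above, and each operation is modeled as exactly one access:

```python
# Representative single-access latencies in nanoseconds, drawn from the
# ranges in the table above.
ACCESS_NS = {
    "register": 1,
    "L1": 2,
    "L2": 10,
    "L3": 30,
    "RAM": 100,
    "SSD": 50_000,
    "HDD": 10_000_000,
}

def total_access_time_us(tier: str, accesses: int) -> float:
    """Estimated total access time in microseconds, one fetch per access."""
    return ACCESS_NS[tier] * accesses / 1_000

# 1,000 fetches from L1 vs. RAM vs. SSD:
for tier in ("L1", "RAM", "SSD"):
    print(f"{tier:>4}: {total_access_time_us(tier, 1_000):g} µs")
```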
Consider a system handling 1,000 requests per second, each containing the following:

* A 32-bit integer (4 bytes)
* A 100-character ASCII string (100 bytes)
* A 32-bit float (4 bytes)
The payload size per request would be:

\[
4 \,\text{bytes (int)} + 100 \,\text{bytes (string)} + 4 \,\text{bytes (float)} = 108 \,\text{bytes/request}
\]
If this data is accessed from L2 cache (with an average access time of ~10 ns), and we assume one access per request, the total access time is:

\[
1{,}000 \,\text{requests/second} \times 10 \,\text{ns/access} = 10{,}000 \,\text{ns} = 10 \,\text{µs of access time per second}
\]
If the data were in RAM instead (~100 ns per access):

\[
1{,}000 \,\text{requests/second} \times 100 \,\text{ns/access} = 100{,}000 \,\text{ns} = 100 \,\text{µs of access time per second}
\]
This shows how the location of data (whether in cache or RAM) can significantly impact the total time it takes for the system to handle requests.
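Putting payload size and access latency together, here is a minimal sketch of the example above (assuming, as above, one memory access per request; the latency values come from the table):

```python
REQUESTS_PER_SECOND = 1_000
PAYLOAD_BYTES = 4 + 100 + 4  # int + 100-char ASCII string + float = 108 bytes

L2_NS, RAM_NS = 10, 100      # average access times from the table above

throughput_kb = PAYLOAD_BYTES * REQUESTS_PER_SECOND / 1_000  # 108 KB/second
l2_us = L2_NS * REQUESTS_PER_SECOND / 1_000    # 10 µs of access time/second
ram_us = RAM_NS * REQUESTS_PER_SECOND / 1_000  # 100 µs of access time/second

print(f"Throughput: {throughput_kb:g} KB/second")
print(f"Access time per second of requests: L2 = {l2_us:g} µs, RAM = {ram_us:g} µs")
```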
If your system needs to handle 1,000 requests per second with an average payload size of 100 bytes, the sustained throughput is:

\[
100 \,\text{bytes/request} \times 1{,}000 \,\text{requests/second} = 100{,}000 \,\text{bytes/second} = 100 \,\text{KB/second}
\]
Knowing the latency of accessing data from different memory tiers helps optimize the system's performance, especially when processing large amounts of data.
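One way to use these latencies is as an upper bound: if every request required a single serialized access to a given tier, that tier's latency caps the request rate one core could sustain. The sketch below derives that ceiling as 1/latency, a deliberately simplified model that ignores pipelining, parallelism, and caching:

```python
# Representative single-access latencies in nanoseconds (see table above).
ACCESS_NS = {"L1": 2, "RAM": 100, "SSD": 50_000, "HDD": 10_000_000}

for tier, ns in ACCESS_NS.items():
    # Max serialized accesses per second = 1 second / latency per access.
    max_per_second = 1_000_000_000 / ns
    print(f"{tier:>4}: ~{max_per_second:,.0f} accesses/second")
```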
Back-of-the-envelope calculations are invaluable for developers who need to estimate system performance and memory usage without deep profiling tools. By understanding data type sizes, memory access latencies, and combining them with expected request rates, developers can make informed decisions about how to architect their systems for optimal performance.