Scaling & Performance in Backend Systems (Part 2)

N + 1 Query Problem (ORM Perspective)

  • The N + 1 problem occurs at the backend (server → database) level.
  • Frontend example is just for intuition; actual issue is in server-side data fetching.

ORM Solutions to N + 1

  • Modern ORMs provide bulk-fetching primitives:

Examples

  • Django ORM:

    • select_related → For foreign key and one-to-one relationships (single JOIN)
    • prefetch_related → For many-to-many and reverse foreign key relationships (separate bulk query)
  • Ruby on Rails:

    • includes
  • TypeScript ORMs (Prisma, Drizzle, TypeORM):

    • Relation-loading options (e.g., Prisma's include, TypeORM's relations) or explicit joins
  • Raw SQL:

    • Use JOIN, LEFT JOIN, INNER JOIN

Key Fix for N + 1

  • Avoid:

    • Fetching related data inside loops
  • Instead:

    • Fetch all required data in bulk using joins or ORM primitives
  • Result:

    • Reduces queries from O(N) to O(1) (a constant number of queries)
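The contrast above can be sketched with the stdlib sqlite3 module (the authors/books schema and all names here are made up for illustration): counting the SELECT statements shows the loop issuing 1 + N queries, while a single JOIN issues one.

```python
import sqlite3

# Hypothetical schema: authors and their books.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO books VALUES (1, 1, 'A1'), (2, 1, 'A2'), (3, 2, 'B1');
""")

select_count = 0

def trace(sql):
    # Called once per executed statement; count only the SELECTs.
    global select_count
    if sql.lstrip().upper().startswith("SELECT"):
        select_count += 1

conn.set_trace_callback(trace)

# N + 1: one query for the authors, then one more per author inside the loop.
for author_id, _name in conn.execute("SELECT id, name FROM authors").fetchall():
    conn.execute("SELECT title FROM books WHERE author_id = ?", (author_id,)).fetchall()
n_plus_one_queries = select_count    # 1 + 2 authors = 3 SELECTs

# Bulk fix: a single JOIN fetches the same data in one round trip.
select_count = 0
rows = conn.execute(
    "SELECT a.name, b.title FROM authors a JOIN books b ON b.author_id = a.id"
).fetchall()
joined_queries = select_count        # 1 SELECT
```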

Why N + 1 Happens with ORMs

  • ORM code looks like normal programming constructs:

    • Arrays, loops, objects
  • Developers forget underlying SQL execution

  • Each loop iteration → triggers a separate DB query


Debugging ORM Behavior

  • Many ORMs provide:

    • Option to log generated SQL queries
  • Helps:

    • Identify inefficient query patterns
    • Optimize data fetching strategy
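In Django, for example, query logging is enabled through the standard LOGGING setting (a minimal settings-file sketch; requires DEBUG = True, and the handler/formatter choices here are just one option):

```python
# Django settings sketch: echo every generated SQL query to the console.
LOGGING = {
    "version": 1,
    "handlers": {"console": {"class": "logging.StreamHandler"}},
    "loggers": {
        # The django.db.backends logger emits each SQL statement at DEBUG level.
        "django.db.backends": {"handlers": ["console"], "level": "DEBUG"},
    },
}
```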

Database Indexes

What are Indexes?

  • Index = Data structure (usually B-tree) that:

    • Stores sorted column values
    • Points to actual rows

Analogy: Library Without Index

  • No catalog → Must scan every book → Slow

  • Equivalent to:

    • Full table scan / Sequential scan

With Index

  • Catalog maps:

    • Author → Book locations
  • Database:

    • Uses index to directly locate rows
  • Result:

    • Query time reduces drastically

Performance Impact

  • Without index:

    • Full table scan → ~seconds
  • With index:

    • Lookup → milliseconds

Cost of Indexes

1. Storage Overhead

  • Index stores:

    • Sorted values + pointers
  • Larger tables → Larger index size


2. Write Overhead (Critical)

  • Every:

    • Insert
    • Update
    • Delete
  • Each of these writes must also update every index on the table
  • Too many indexes:

    • Slows down write operations significantly

Indexing Strategy

  • Default:

    • Primary key (ID) is already indexed
  • Add indexes for:

    • Frequently queried columns
    • Common join fields (e.g., author_id)

When to Add Indexes

  • Two approaches:

1. Predictive (Early)

  • Add indexes for:

    • Obvious access patterns (e.g., foreign keys)

2. Observational (Preferred)

  • Use:

    • Logs
    • Distributed tracing
  • Identify:

    • Slow queries
  • Add indexes based on real usage


Composite Index

  • Index on multiple columns:

  • Example:

    • (user_id, created_at)

Key Insight

  • Order matters:

    • Works for:

      • user_id
      • user_id + created_at
    • Does NOT work for:

      • created_at alone
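SQLite's planner makes this ordering rule easy to check (a sketch with a hypothetical events table; EXPLAIN QUERY PLAN reports whether the composite index is used for each WHERE clause):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (user_id INTEGER, created_at TEXT, payload TEXT);
    CREATE INDEX idx_user_created ON events (user_id, created_at);
""")

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is the readable plan text.
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

leading = plan("SELECT payload FROM events WHERE user_id = 7")
both = plan("SELECT payload FROM events WHERE user_id = 7 AND created_at > '2024'")
trailing = plan("SELECT payload FROM events WHERE created_at > '2024'")

# leading and both mention "USING INDEX idx_user_created";
# trailing cannot use the index and falls back to a full scan.
```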

Covering Index

  • Index contains all required query columns

Benefit

  • Query served directly from index
  • No need to access table

Tradeoff

  • Larger index size

Finding Missing Indexes

Tool: EXPLAIN ANALYZE

  • Shows:

    • Query execution plan
    • Index usage
    • Scan type

Indicators

  • Sequential scan → Missing index
  • Index scan → Optimized query
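The scan-type indicator can be seen directly by comparing the plan before and after adding an index (a sketch with a made-up users table; SQLite's EXPLAIN QUERY PLAN stands in for PostgreSQL's EXPLAIN ANALYZE, and the exact plan wording varies by version):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")

def plan(sql):
    # Last column of the plan row is the human-readable description.
    return conn.execute("EXPLAIN QUERY PLAN " + sql).fetchone()[-1]

query = "SELECT id FROM users WHERE email = 'ann@example.com'"
before = plan(query)   # mentions SCAN -- the sequential-scan red flag
conn.execute("CREATE INDEX idx_users_email ON users (email)")
after = plan(query)    # mentions USING ... INDEX -- optimized lookup
```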

Database Connections Overhead

What Happens in a Connection?

  • TCP handshake (3-way)
  • Authentication
  • Encryption setup
  • Session creation
  • Memory allocation

Problem

  • Creating connection per query:

    • Adds latency
    • Wastes resources
  • Database limits:

    • Max concurrent connections are capped (PostgreSQL defaults to 100; a few hundred is typical)
  • High traffic → Exhaust connections → Crash


Connection Pooling

Concept

  • Maintain reusable connections

Flow

  • Request → Borrow connection from pool
  • Execute query
  • Return connection to pool

Benefits

  • Avoids repeated connection setup
  • Prevents connection exhaustion
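The borrow/return flow can be sketched with a queue of pre-opened connections (an illustrative internal pool; sqlite3 stands in for the expensive network connection setup, and the class and parameter names are made up):

```python
import queue
import sqlite3

class ConnectionPool:
    """Minimal sketch of internal pooling: pre-open N connections, then reuse."""

    def __init__(self, size, dsn):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            # A real pool pays the TCP handshake / auth / TLS cost here, once
            # per connection, instead of once per query.
            self._idle.put(sqlite3.connect(dsn, check_same_thread=False))

    def acquire(self, timeout=5.0):
        # Blocks until a connection is free rather than opening a new one,
        # which caps the total number of connections the database ever sees.
        return self._idle.get(timeout=timeout)

    def release(self, conn):
        self._idle.put(conn)

pool = ConnectionPool(size=2, dsn=":memory:")
a = pool.acquire()
b = pool.acquire()      # pool is now empty; a third acquire would block
pool.release(a)
c = pool.acquire()      # reuses the connection object released above
```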

Types of Pooling

1. Internal Pooling

  • Each server instance has its own pool

  • Problem:

    • Total connections = sum of all instances
    • Can exceed DB limits

2. External Pooling

  • Central pool (e.g., PgBouncer)
  • All servers share it

Benefits

  • Global connection control
  • Prevents overload during scaling

Caching

Concept

  • Store results of expensive operations (e.g., DB queries)
  • Serve from cache instead of recomputing

Performance Gain

  • DB query: ~800 ms
  • Cache: ~50 ms

Cache Invalidation (Hard Problem)

  • Need to keep:

    • Cache ↔ Database in sync

Invalidation Strategies

1. Time-based (TTL)

  • Cache expires after fixed time

Pros:

  • Simple

Cons:

  • Hard to choose optimal TTL
  • Risk of stale data

2. Event-based

  • Invalidate cache on data update

Pros:

  • Accurate

Cons:

  • Must handle in all update paths
  • Easy to miss → stale data risk

Cache Storage Options

1. Local Cache

  • Stored in server memory

Pros:

  • Very fast

Cons:

  • Inconsistency across instances

2. Distributed Cache (Redis, Memcached)

  • Shared cache across servers

Pros:

  • Consistent

Cons:

  • Network latency (~50 ms)

Tiered Caching

  • Combine:

    • Local cache (hot data)
    • Distributed cache

Caching Patterns

1. Cache Aside (Lazy Loading)

  • Check cache → Miss → Fetch DB → Store in cache

Most common pattern
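A minimal cache-aside sketch, with TTL expiry and hit-rate counters folded in (the dicts stand in for the real database and cache; all names are illustrative):

```python
import time

db = {"user:1": {"name": "Ann"}}   # stands in for the ~800 ms database
cache = {}                         # key -> (value, expires_at)
TTL_SECONDS = 60
hits = misses = 0

def get(key):
    """Cache-aside: check the cache first; on a miss, load from the DB and fill it."""
    global hits, misses
    entry = cache.get(key)
    if entry is not None and entry[1] > time.monotonic():
        hits += 1
        return entry[0]
    misses += 1                                       # miss, or TTL expired
    value = db[key]                                   # the expensive DB query
    cache[key] = (value, time.monotonic() + TTL_SECONDS)
    return value

first = get("user:1")    # miss: goes to the DB and populates the cache
second = get("user:1")   # hit: served from the cache
hit_rate = hits / (hits + misses)   # 0.5 after these two reads
```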


2. Write Through

  • Update:

    • DB + Cache simultaneously

Pros:

  • No cache miss

Cons:

  • Slower writes
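A write-through sketch under the same stand-in dicts (illustrative only): every write updates both stores in one code path, which is why reads never miss but writes pay for two updates.

```python
db = {}      # stands in for the durable database
cache = {}   # stands in for the cache

def write_through(key, value):
    # Write-through: update the database and the cache together, so
    # subsequent reads never miss on freshly written data.
    db[key] = value        # the slow, durable write
    cache[key] = value     # kept in sync in the same operation

def read(key):
    # Reads can trust the cache because every write kept it current.
    return cache.get(key, db.get(key))

write_through("user:1", {"name": "Ann"})
```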

3. Write Behind

  • Update cache first
  • Async update DB

Pros:

  • Fast writes

Cons:

  • Risk of inconsistency

Cache Hit Rate

  • % of requests served from cache

Example

  • 90% hit rate → Good
  • 20% hit rate → Ineffective caching

Factors Affecting Cache Hit Rate

  • TTL (expiry time)
  • Cache size
  • User access patterns

Scaling

Why Scaling?

  • Traffic ↑ → Need more capacity

Types of Scaling


Vertical Scaling (Scaling Up)

  • Increase power of single server:

    • CPU
    • RAM
    • Storage
    • Network

Advantages

  • Simple
  • No architecture changes
  • Cost-efficient initially

Disadvantages

  • Hardware limits
  • Single point of failure
  • No geographic distribution

Horizontal Scaling (Scaling Out)

  • Add more servers

Advantages

  • No hard limit
  • Fault tolerance
  • Geographic distribution
  • Better scalability

Disadvantages

  • Complex system design

Challenges in Horizontal Scaling

  • Load balancing
  • State synchronization
  • Network failures
  • Failure detection
  • Consistency issues

Load Balancing

  • Distributes requests across servers

  • Requires:

    • Algorithms
    • Additional infrastructure

Distributed Systems Reality

  • Horizontal scaling introduces:

    • New problems instead of removing old ones
  • Trade-offs:

    • Simplicity vs scalability
    • Consistency vs availability

Final Takeaways

  • Avoid N + 1 → Use joins / ORM bulk fetch

  • Index wisely → Not too many, not too few

  • Use connection pooling → Prefer external pooling at scale

  • Caching is powerful but complex (invalidation is hard)

  • Monitor:

    • Cache hit rate
    • Query performance
  • Vertical scaling:

    • Simple but limited
  • Horizontal scaling:

    • Powerful but complex