Roadmap for backend from first principles
Backend engineering is a very wide scope, and when I say backend engineering, it is much more than building a set of CRUD APIs. The way I think of it, backend engineering is about building reliable, scalable, fault-tolerant, and maintainable codebases and efficient systems. If one were to start learning backend development today, there are a lot of resources, and
at least 1,00 resources. But how do you decide what to learn? How do you prioritize? How do you see the big picture of when and how all these different concepts come together? This is the reason it takes people years to get their heads around a lot of these concepts and principles: people primarily start with a limited scope of training, whether it is from a college, a boot camp, or a simple course, and they eventually build on top of that with trial and error and with
the help of other developers over time. Now, I am a backend engineer, and I faced this struggle when I started out. I had to constantly search for resources and learn from other developers. I have read a lot of books on backend development, and I have studied hundreds of open-source codebases to see how people in the industry are building stuff. It was a very time-consuming procedure. The second problem is that people start with backend development from a particular language's or framework's
point of view. It could be Express, Spring Boot, or Ruby on Rails, and the problem with that is that you look at the problems you solve through the lens of your particular language and ecosystem, and there are blind spots in that. Let's imagine you have to switch to a different language. Let's say you were working with Ruby on Rails for years, and one day your company decided to migrate to Golang for performance reasons. In that situation, how much of your knowledge can you transfer if you don't understand the underlying systems?
Here I have decided to put together a comprehensive list of videos based on the foundational concepts of a backend system, drawn from the various books I've read over the years and from open-source codebases.

We'll start with a very high-level understanding of how backend systems work behind the scenes. We'll look at how a request from the browser flows through different hops, the network, and firewalls over the internet, how it is routed to our backend server situated on a remote AWS server, and how the server responds to that request. We'll look
at what the response looks like, and that should give us a pretty vivid idea of how systems communicate: how a client communicates with a server and how the server responds.

From there we'll move to understanding the HTTP protocol: what role it plays, how communication is established through HTTP, what raw HTTP messages look like, what HTTP headers are and what role they play, and the different types of headers, like request headers, representation
headers, general headers, and security headers. We'll look at the different HTTP methods, like GET, POST, PUT, and DELETE, when to use them, and the semantics and principles behind them. We'll look at what the CORS flow is and how it works, how a simple request differs from a preflight request, and what the preflight request flow looks like from our browser to the server and back to our browser. We'll look at HTTP responses, their structure, the different status codes a server returns, when to return which type of code, and what are
the most commonly used HTTP status codes. Then we'll look at HTTP caching and the different caching techniques using HTTP: we have ETags, and we have the Cache-Control max-age directive. Then we'll look at the differences between HTTP/1.1, HTTP/2, and HTTP/3. We'll look at what content negotiation between client and server looks like using different headers. We'll see how persistent connections work in HTTP. We'll look at HTTP compression and different compression techniques like gzip, deflate, and br (Brotli), and which is the
commonly used technique. We'll see the security aspect of it: SSL, TLS, and HTTPS.

Then we'll move on to routing: how routing maps URLs to server-side logic, the connection between routing and HTTP methods, the different components of routes, like path parameters and query parameters, and the different types of routes: static routes, dynamic routes, nested routes, hierarchical routes, catch-all (wildcard) routes, and regular-expression-based routes. We'll look at how to do API versioning using HTTP and the different versioning techniques. We'll see what is the best
way to deprecate a route and what the best practices in the industry are. We'll look at the benefits of route grouping and how it helps with versioning, permissions, and shared middleware. We'll see how to secure routes and how to optimize route-matching performance.

Then we'll move on to serialization and deserialization. This basically means how, before sending data over the network, our server translates it into a particular format, which is serialization, and how it translates the data received from the
client over the internet back into its own native format, which is called deserialization. We'll see why it is needed and how it helps with interoperability. We'll look at the different formats used in serialization and deserialization: text-based formats such as JSON and XML, and binary formats such as Protobuf, what the performance differences between them are, and when to use which one. We'll look at how different programming languages implement serialization and deserialization. We will explore a popular text-based format
for serialization and deserialization, which is JSON: the structure of JSON, the different data types, like strings, numbers, booleans, arrays, and objects, how serialization of nested objects and collections is handled in JSON, and how deserializing into native data structures works, for example into a Python dictionary, Go structs, or a JavaScript object. We'll look at the common errors when dealing with JSON, like handling missing or extra fields, dealing with null values, and date serialization and time zone issues, and how to implement custom serialization before sending or
serializing data into JSON. We'll look at error handling in serialization and deserialization, for example invalid data, data conversion errors, and unknown fields. We'll look at the security concerns, like injection attacks, why to validate before deserialization, and validating data against JSON schemas before processing it. We'll see the performance aspect of it, like reducing the size of serialized data through compression and eliminating unnecessary fields, and serialization performance between text-based and binary formats, like JSON versus Protobuf: the trade-offs
between readability and performance, because with a text-based format you can easily inspect the payload, which does not work the same way with binary formats, but binary formats are faster. So when to use a binary format and when to use a text-based format is a real trade-off.

Then we'll move to authentication and authorization: why we use them, and the different types of authentication, like stateful and stateless. We have basic authentication and bearer token authentication. We'll look at sessions, JWTs, and cookies. We'll deep dive on the OAuth protocol and OpenID Connect. We'll
see how API keys work, how multi-factor authentication works, and what salting, hashing, and the different cryptographic techniques used in authentication and authorization are. We'll explore ABAC, RBAC, and ReBAC. We'll look at the best practices in security, like securing cookies and avoiding CSRF, XSS, and MITM attacks, and audit logging, which basically means recording authentication and authorization events for audits and monitoring failed login attempts, privilege escalation, and access to sensitive resources. We'll look at obfuscating authentication-related error messages, preventing information leakage to attackers through detailed error messages, and handling edge cases, for example consistency in responses across different failure modes, like rate limiting and account lockout. We'll see how to avoid timing attacks: attackers can exploit time differences in error responses to infer valid credentials. For example, an error for a wrong password might take longer than an error for an invalid username, because to check a password you have to do some kind of hashing or use some
kind of cryptographic technique, which takes a little bit of time. So attackers can measure the tiny timing difference between them and use it to guess credentials. Even though that's very difficult, we don't want to leave any security holes.

The next topic we'll explore is validation and transformation: the different types of validation, like syntactic validation, for example checking whether a string is an email or not, whether it is a valid phone number or not, or whether it is a valid date format or not. There is semantic validation, for example a date of birth
cannot be in the future, or the age of a person should be between 1 and 120. These are called semantic validations. Then we have type validation, for example checking that input values match the expected types: whether it is a string or not, an integer or not, an array or not, an object or not. These checks are called type validation. We'll see the best practices for validation, the difference between client-side validation and server-side validation, and the importance of server-side validation even if client-side validation is already implemented, because client-side validation improves user experience by providing instant feedback, but server-side validation is the true security implementation, because that is the gateway to your business logic. We'll look at the importance of failing fast, reducing unnecessary processing by returning early, and why to keep front-end validation and backend validation consistent. Then there are transformations, for example type casting: converting a string to a number or a number to a string. In query parameters or path
parameters, whatever we receive is a string, but let's say we are expecting an ID field that is a number. Before we send it to our handlers, we have to convert that string into a number. That step is called a transformation, which we have to take care of in our validation pipeline. The same goes for date formats: for example, the front end might send a different format, or we might be expecting a timestamp, so that also has to be taken care of in the validation pipeline. Then there is normalization, for example converting an email to lowercase, trimming whitespace from a
string, or adding a country code to a phone number. These are called normalizations. Then there is sanitization for security, for example we have to sanitize a string submitted by the user to prevent attacks like SQL injection. Then there is complex validation logic, for example relationships. Let's say a user submitted a form with two fields, one is password and the other is confirm password, so we have to check whether the two strings are the same. That is a relationship-based validation. Then there is conditional validation. Let's
say in the form there are two fields: one is partner name, and the second is married, which is a Boolean, true or false. The partner name field might only be required if married is true. That is a conditional validation. These are the kinds of checks we have to do. Then there is chained validation, like converting a string to lowercase, then removing special characters, and then checking its length. Then we'll look at error handling in validation, like sending meaningful error messages to the front end so that the user can fix them, and aggregating all
validation errors in one response for client-side display, or obfuscating error messages, so instead of saying "invalid password" we say "invalid credentials" to prevent different types of attacks. Then we will see how to gracefully handle failed transformations, for example invalid JSON or a failed date conversion, and how to let the user know with a meaningful message. Then we'll look at the performance trade-offs of validation and how to optimize it by returning early and avoiding redundant validations.

Our next topic is going to be middleware. We'll look at what is a
middleware and when to use one, the common use cases of middleware, and the role of middleware in the request cycle, for example pre-request middleware or post-response middleware. We'll look at the flow of middleware, for example chaining: middleware is executed in sequence, each passing control to the next until the request reaches its final handler. We'll look at how to order middleware appropriately. For example, we go in this order: we log the request, we check whether the user is authenticated, we do validation, we do route
handling, and then we do error handling. This order matters in the middleware flow. We'll see how the next function works in middleware, and exiting middleware early: how middleware can short-circuit the request pipeline, for example by handling 404 errors. We'll look at some common middleware, like security middleware that adds security headers such as X-Content-Type-Options, Strict-Transport-Security, or Content-Security-Policy, middleware that adds the appropriate CORS headers to every single request or response, and middleware to avoid CSRF
attacks, and middleware to rate limit. Then we have authentication middleware to reuse route-protecting logic across our apps. Then we have logging and monitoring middleware for request logging, or structured logging for observability and easier debugging in production. Then we have error-handling middleware, which catches and formats application-level errors for consistent API responses. Then we have compression or performance-related middleware, which compresses response bodies to reduce the size of data sent over the network.
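
To make the chaining, ordering, and early exit described above a bit more concrete, here is a minimal sketch in plain Python. It is not tied to any framework: the dict-shaped request and response, the function names, and the "secret-token" check are all hypothetical and only there for illustration. Error-handling middleware is left out to keep it short.

```python
from typing import Any, Callable, Dict, List

Request = Dict[str, Any]
Response = Dict[str, Any]
Handler = Callable[[Request], Response]
Middleware = Callable[[Request, Handler], Response]


def logging_middleware(request: Request, next_handler: Handler) -> Response:
    # First in the chain: record the step, then pass control onward.
    request.setdefault("trace", []).append("log")
    return next_handler(request)


def auth_middleware(request: Request, next_handler: Handler) -> Response:
    request.setdefault("trace", []).append("auth")
    if request.get("token") != "secret-token":  # hypothetical credential check
        # Exit early: later middleware and the handler never run.
        return {"status": 401, "body": "invalid credentials"}
    return next_handler(request)


def validation_middleware(request: Request, next_handler: Handler) -> Response:
    request.setdefault("trace", []).append("validate")
    if "name" not in request.get("body", {}):
        return {"status": 400, "body": "name is required"}
    return next_handler(request)


def create_user_handler(request: Request) -> Response:
    # Final handler: only reached if every middleware called next_handler.
    request.setdefault("trace", []).append("handler")
    return {"status": 201, "body": "created " + request["body"]["name"]}


def build_chain(middlewares: List[Middleware], final_handler: Handler) -> Handler:
    # Wrap from the inside out so the first middleware in the list runs first.
    handler = final_handler
    for middleware in reversed(middlewares):
        def wrapped(request: Request, mw: Middleware = middleware,
                    nxt: Handler = handler) -> Response:
            return mw(request, nxt)
        handler = wrapped
    return handler


# Order matters: log, then authenticate, then validate, then handle the route.
app = build_chain(
    [logging_middleware, auth_middleware, validation_middleware],
    create_user_handler,
)
```

Calling app with a request that has a wrong token returns the 401 response from the authentication step, and the request's trace list shows that validation and the handler were never reached, which is the short-circuit behavior mentioned above.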
We also have data-parsing middleware, which parses incoming request bodies like JSON, URL-encoded forms, and file uploads, and handles multipart data for file uploads. Then we'll look at the performance and scalability aspect of middleware, like the best practices to keep middleware lightweight and efficient, ensuring middleware is applied in the correct order, and how middleware order can affect the performance and security of the application.

The next topic is going to be request context. Request context basically means the metadata that is often passed through
application middleware, controllers, and services. It is a kind of request-scoped state: the state is only valid for that request. Here we'll explore the life cycle of a request, maintaining state for the duration of a request, sharing data across different layers of the application without coupling, and how context provides a temporary request-scoped state. We'll look at the different components of a request context: the request metadata, for example the HTTP method, the URL, headers, the
query parameters, and the body. There is the session and user information: for example, in the authentication middleware we fetch the user information and add it to the request context, so for that request's scope the user's information is injected into the context. Then we have tracking and logging information, like unique request IDs or trace IDs. Then we have request-specific data, like custom data injected during the request life cycle, such as cached data and permission checks. We'll look at the use cases, for example authentication, rate limiting, tracing, and logging. We'll explore the connection between
middleware and request context. We'll see the different types of timeouts: request timeouts, custom timeouts, and cancellation signals. We'll see the best practices, like keeping the context lightweight to prevent memory overhead, ensuring context data is cleaned up after the request life cycle to prevent memory leaks, and avoiding tightly coupling components through context or over-relying on it for passing data.

Then we'll move to handlers and controllers: the MVC pattern, what handlers, controllers, and services are, the responsibilities of each of them, and reducing code with middleware. Then we have centralized error handling in
handlers, consistent success and error message formats, and how to implement them in controllers. Then we'll look at the different CRUD operations: how CRUD operations map to HTTP methods and the common APIs associated with each method. For example, the POST method is usually used for creation and submissions, and the status code is usually 201 Created, or 400 if it's a bad request. GET requests are usually associated with fetching a list of resources or fetching a single resource, and we have PUT and PATCH to
update resources and DELETE to delete resources. We'll look at how to implement pagination, how to implement a search API, how to do sorting, and how to do filtering. We'll see the best practices, for example strict validation, consistent response formatting, limiting payload size, redacting sensitive fields, error handling, and authentication and authorization.

Then we'll explore what the RESTful architecture is and the best practices for implementing REST APIs: the principle of designing APIs around resources, sticking to HTTP semantics,
and best practices for filtering and pagination. We'll look at the different types of versioning, like URI versioning, header versioning, query string versioning, and media type versioning. We'll see how to design APIs with the OpenAPI spec in mind. We'll see content negotiation, capturing exceptions and providing meaningful messages, supporting client-side caching with ETags, and optimizing large requests and responses.

After that we'll move on to a very important topic, which is databases. In databases we'll look at relational and non-relational databases, what the differences are, and when to use which. We'll look at some
of the theoretical concepts, like ACID and the CAP theorem. We'll take a look at basic querying and joins, and database design best practices like schema design and indexing. We'll look at different optimization methods like query optimization, caching, and connection pooling. Then we have data integrity, like constraints and validations, and transactions and concurrency. We'll see how ORMs work, whether to use an ORM, and what the trade-offs are, and we'll look at what database migrations are.

Then we'll move on to the business logic layer, which is also
called BLL: what its role is and what the different layers of a request cycle are. For example, we have the validation layer, we have routing, we have middleware, and we have handlers and controllers, all of which fall under the presentation layer because they deal with the user's data, whether that is accepting the user's data or sending data to the user. After that we have the business logic layer, the middle one, which deals with our core business logic. After that we have the data access layer,
which deals with databases, performing queries, inserts, or deletions, and the business logic layer uses the data access layer behind the scenes. We'll look at different design principles like separation of concerns, single responsibility, open/closed, and dependency inversion. We'll look at the components of a business logic layer: for example, we have services, we have domain models, which represent core entities like a user or an order, then we have business rules, then we have business validation logic. We'll look at service layer design best practices. We'll look at how to handle
errors properly and how to propagate those errors from our service layer to our presentation layer.

After that we have caching. We'll discuss why caching is needed and how it differs from database persistence, and the different types of caching: memory caching, browser caching, and database caching. We'll look at the need for client-side caching and server-side caching, and the different caching strategies, for example cache-aside, write-through, write-behind (or write-back), and read-through. We'll look at the different cache eviction strategies, for example LRU,
LFU, TTL, and FIFO, and the need for cache invalidation, like manual cache invalidation, time-to-live invalidation, or event-based invalidation. We'll look at the different levels of caching: level one, which is in-memory, level two, which is network-distributed, and hierarchical caching, which combines level one and level two, where frequently used data is stored in a fast, small cache, which is the level one cache, and less frequently used data is stored in a slower, larger cache, which is the level two cache. We'll see what caching for web apps looks like,
that is, caching static assets or caching API responses using headers. We'll see how to cache with databases, for example query caching, like storing the results of heavy joins in Redis, and we'll look at the cache hit and cache miss ratios and how to optimize them.

After that we'll move on to transactional emails: what they are used for, the common use cases, the anatomy of a transactional email (the subject, the preheader, the body header, main content, CTA, and footer), and how to personalize emails with different dynamic
parameters.

Then we have task queuing and scheduling and their common use cases. For example, queuing might be used for sending emails, processing image files, third-party API integrations like payment processing or webhooks, or offloading heavy computation like batch processing. For example, a user clicks a button to clear all their data. To do that we have to execute different queries on different tables, and that might take some time, so
instead of blocking the request, we return the response instantly and trigger a background job by pushing it into the task queue. We'll look at scheduling and its use cases, for example running database backups, recurring notifications and reminders, data synchronization, or maintenance tasks such as clearing logs or caches. We'll look at the different components of a task queue: the producer, queue, consumer, broker, and backend. We'll look at the flow of task dependencies: for example, it might be a chain dependency, or it might have a parent-child
relationship. We'll look at task groups: executing multiple tasks concurrently and waiting for all of them to complete. We'll look at how to handle errors and implement retries in task queues. We'll look at task prioritization and rate limiting, for example giving priority to tasks like payment processing over tasks like sending notifications.

After that we'll move on to Elasticsearch: why we use Elasticsearch and how it works behind the scenes, including the techniques it uses, for example the inverted index, term frequency and inverse document frequency, segments, and
shards. We'll look at the use cases of Elasticsearch, for example providing a type-ahead experience, log analytics, or social media search, such as full-text search over user profiles, posts, and comments. We'll see how to create and manage indexes. We'll see how to search and query, and the different types of searching: basic search, full-text search, and relevance scoring. We'll see how to optimize search performance by tuning text versus keyword fields, understanding analyzers, boosting, and pagination. We'll take a look at some of the advanced search patterns, for example
filtering, aggregation, and fuzzy search. Then we'll see how Kibana works and how to use Elasticsearch in a user-friendly way, and different best practices, for example defining field mappings explicitly, optimizing the number of shards, indexing data in batches, and avoiding wildcards.

Then we have error handling. The different types of errors in our apps could be syntax errors, runtime errors, and logical errors. There are different error handling strategies, for example fail-safe or fail-fast, graceful degradation, or prevention of errors. There are different practices for error handling,
for example catching errors early, not swallowing errors, custom error types, failing gracefully, logging errors, and using stack traces. We'll look at how global error handlers work. We'll see how to appropriately handle user-facing errors, like providing friendly error messages and actionable feedback. We'll see the importance of monitoring and logging in error handling, different tools like Sentry or the ELK stack, and different error alerts like email-based alerts and Slack-based alerts.

After that we have config management: what is
exactly config management, how it helps with flexibility by decoupling environment-specific settings from application logic, and what its use cases are, for example handling different environments, safely managing sensitive data such as API keys, database passwords, and private certificates, and dynamically enabling and disabling features without changing the codebase. We'll look at the best practices of config management and the different types of configs, for example static configs like DB credentials and API endpoints,
dynamic configs like feature flags and rate limits, and sensitive configs like credentials, tokens, and secrets. We'll look at the different sources of configs, for example a .env file, JSON, or YAML, and the differences between using environment variables, command-line flags, and static files.

After that we have logging, monitoring, and observability, a very important topic. We'll see the differences between logging, tracing, monitoring, and observability, and the different types of logs, like system logs, application logs, access logs, and security logs. We'll look at the different
levels of logs, like debug, info, warn, error, and fatal, and we'll see the difference between structured logging and unstructured logging, and the best practices for logging, like centralized logging, log rotation and retention, contextual and meaningful logs, and avoiding sensitive data like passwords and API keys. Then we will look at monitoring: the different types of monitoring, like infrastructure monitoring, application performance monitoring, and uptime monitoring, the different tools used in monitoring, like Prometheus and Grafana, and how to manage alerts and notifications
by defining thresholds, creating alerts, and avoiding alert fatigue by only creating actionable alerts and ensuring that alerts are meaningful and necessary. Then we'll take a look at observability: the three pillars of observability, which are logs, metrics, and traces, the best practices around it, and the security and compliance of log management.

After that we'll move on to graceful shutdown: why we need graceful shutdown and how it works behind the scenes. We'll look at the different use cases: for example, you might need it for server restarts, scaling in cloud environments or
microservices, or long-running jobs. We'll look at how it works, like signal handling with the SIGTERM, SIGINT, and SIGKILL signals, and the different steps of a graceful shutdown: it starts with capturing a signal, then it stops accepting new requests, then it completes the in-flight requests, then it closes external resources like database connections or any open files, and at last it terminates the app.

After that we'll move on to security: the different aspects of security in a backend codebase, and avoiding different security attacks like SQL injection, NoSQL injection, XSS, CSRF,
broken authentication, and insecure deserialization, and the principles of secure software design, for example least privilege, defense in depth, fail-secure defaults, separation of duties, and security by design. Then we'll look at the importance of input validation and sanitization, rate limits, Content Security Policy, CORS, SameSite cookies, and monitoring events.

After that we have scaling and performance: the different metrics of performance, like response time and resource utilization, identifying bottlenecks, and caching and database optimization, for
example avoiding N+1 query problems, ensuring proper use of joins, and using lazy loading where appropriate. Then we have using database indexes to speed up read operations on frequently queried fields, like indexing foreign keys or search fields, how to process data in batches to minimize database load and improve performance for large data sets, how to avoid memory leaks, like closing file handles and database connections or cleaning up after a long process, and minimizing network overhead by reducing payload size and using compression. We'll
look at how to do performance testing and profiling. We'll look at some of the best practices for writing performant code, like focusing on writing clear and maintainable code first without premature optimization, writing modular code to make it easier to optimize individual components without affecting the entire system, ensuring that if particular resources are under load or unavailable the system degrades gracefully without crashing, and how to offload non-critical tasks like sending emails or logging to background processes or task queues to free up
resources for more critical operations.

Then we have concurrency and parallelism: what the difference between concurrency and parallelism is, how concurrency helps with I/O-bound tasks, and how parallelism helps with CPU-bound tasks.

Then we have object storage and large files. We'll look at some of the common use cases for object storage like AWS S3, how to manage large files with chunking and streaming, and multipart file uploads.

Then we have real-time backend systems, where we take a look at WebSockets, server-sent events, and pub/sub architecture.

After that we have testing and code quality. Here we take a look at the different types of testing: unit testing, integration testing, end-to-end testing, functional testing, regression testing, performance testing, load and stress testing, user acceptance testing, and security testing. We'll take a look at what test-driven development is, how to automate tests in CI/CD environments, how to manage code quality with external linting and formatting tools, and the measures of code quality and coverage, like quality
metrics like cyclomatic complexity, which measures the complexity of a function by counting the number of linearly independent paths through the code, and the maintainability index, which quantifies how easy it is to maintain code based on complexity, lines of code, and other factors.

Then we'll take a look at a very interesting set of principles called the Twelve-Factor App.

After that we'll move on to OpenAPI standards: why we need these standards, why we should stick to them, what the benefits are, the use cases, like documentation automation,
and the ecosystem surrounding it, like Swagger, Codegen, and Postman, the history, that is, the Swagger to OpenAPI transition, the different versions that are currently active, and the key concepts of OpenAPI documents: for example, there are API paths, the request and response definitions, parameters, and schemas. We'll look at the structure of an OpenAPI document: there is metadata, paths, components, security definitions, and responses. We'll see the new features of OpenAPI 3.0 and 3.1, what
are the tools surrounding OpenAPI, for example Swagger UI, Codegen, and Postman, and the best practices, like avoiding duplication and sticking to standards. We'll look at a very interesting development method, API-first development, where you write your OpenAPI spec first and then you start creating the APIs.

After that we'll move on to webhooks: the use cases of webhooks, like sending notifications and third-party integrations, and the differences between an API and a webhook for the same use case. For example, for
an API we might have to use polling, which is client-initiated, compared to webhooks, where the push is server-initiated. We'll look at the key components of webhooks, for example the webhook URL, event triggers, payload, HTTP method, and response handling, and the best practices surrounding webhooks, like webhook signature verification, using HTTPS, quick responses, retry logic, and logging. We'll see how to test webhooks with ngrok, and real-world use cases like Stripe payment processing, GitHub webhooks, Slack, Discord, Twilio, etc.

At last, we'll take a look at
what are some DevOps concepts a backend engineer should be familiar with, for example some of the core concepts like continuous integration, continuous delivery, and continuous deployment, DevOps practices like infrastructure as code, config management, and version control, the different tools surrounding DevOps, for example creating containers with Docker or orchestrating containers with Kubernetes, CI/CD pipelines, how to scale your service with horizontal scaling versus vertical scaling, and different
deployment strategies like blue-green deployment, rolling deployment, etc.

And that's about it. These are all the concepts that we are going to cover in the next 30 or 40 videos, so stay tuned.