Validations and transformations for backend engineers

Validations and transformations: this video is just about keeping a set of rules or guidelines in mind while you are designing your APIs. These are mostly related to data integrity and security, and that's where validations and transformations come into play. So let's understand where exactly we use them. As we have already discussed, in a typical backend architecture we have different layers of execution. The bottom layer

is mostly called the repository layer, which deals with database connections, query executions, insertions, deletions, and everything related to data. It could be a traditional relational database, or Redis, or any other kind of persistent storage. Above that we have our service layer, and this layer typically deals with executing

your business logic: the different things that you might do in a typical API execution. It could be calling one or two repository methods, sending a notification to different devices, sending an email to users, storing data, making a webhook call, and so on. The service basically defines all the functionality that an API call usually has. A typical service method calls one or more methods from

the repository layer to interact with the databases. Above that we have the controller layer. This layer calls whatever method is defined in the service layer, which executes all the business logic: it calls the repository methods, it sends email notifications, everything that an API is expected to do and all the data that the API is expected to return. All of that responsibility falls upon the service layer, and the controller

layer calls whatever service method is associated with that particular API, and whatever data is returned from that service method is returned to the user over that HTTP connection. The reason we separate the controller layer from the service layer is that we want to keep the HTTP-related stuff in a different layer. HTTP-related stuff as in what error code to return, or what success

code to return to the client, what format the data is expected to be returned in for that API, and what validations we need to do, which we'll see soon. All this data-related stuff and all the HTTP-related logic we want to separate out into a different layer, and that's why we keep the controller layer. It deals with all the data that comes in from clients and all the data that goes out to the client, and internally it calls the service layer.
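
The three layers described so far can be sketched roughly like this in Python. This is only an illustration under assumptions: the class names, the in-memory list standing in for a database, the book entity, and the 201 and 400 status codes are all made up for the sketch and do not come from any particular framework.

```python
from dataclasses import dataclass


class BookRepository:
    """Repository layer: the only place that talks to storage (an in-memory list here)."""

    def __init__(self) -> None:
        self._rows: list[dict] = []

    def insert(self, name: str) -> dict:
        row = {"id": len(self._rows) + 1, "name": name}
        self._rows.append(row)
        return row


class BookService:
    """Service layer: business logic, built on top of repository methods."""

    def __init__(self, repo: BookRepository) -> None:
        self._repo = repo

    def create_book(self, name: str) -> dict:
        # A real service might also send emails, notifications, webhooks, etc.
        return self._repo.insert(name)


@dataclass
class HttpResponse:
    status: int
    body: dict


class BookController:
    """Controller layer: HTTP concerns only (status codes, input and output shape)."""

    def __init__(self, service: BookService) -> None:
        self._service = service

    def post_book(self, json_body: dict) -> HttpResponse:
        name = json_body.get("name")
        if not isinstance(name, str):
            # Bad input never reaches the service layer.
            return HttpResponse(400, {"error": "name must be a string"})
        book = self._service.create_book(name)
        return HttpResponse(201, {"book": book})


controller = BookController(BookService(BookRepository()))
print(controller.post_book({"name": "Dune"}))
```

The point of the split is that only the controller knows about status codes and payload shapes, while the service and repository can be called from anywhere.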

And the service layer internally might call the repository layer, or it might not, depending on the requirements. That's what a typical API call looks like: it reaches the controller layer, the controller layer interacts with the service layer, the service layer interacts with the repository layer, and whatever data is returned from the service layer is returned to whoever is making the API call. This is a very simplified execution flow. Now, where exactly do we have validations and

transformations? Ideally, validations and transformations happen at this point: the point where the data sent by the client, typically a JSON payload, reaches the server, which is the controller. First it goes through the route matching algorithm, and whatever route is matched, it calls the respective controller method of that route, whatever

controller method is assigned to that route matching pattern. After the route is matched, and before we start executing whatever business logic we have in the controller layer, the first step is the validations and transformations. At exactly this point, before any significant logic is run in the controller layer, before calling the service methods or executing the business logic, this is where we do validations and

transformations. What is the idea behind validations and transformations, and why are we talking about them? Imagine this is our server, and we have different clients all over the world, different users, and all of them make HTTP connections and API calls to the server. Now the idea behind validations and transformations is that

whatever data these clients are sending, whatever JSON payload, whatever query parameters, whatever path parameters, any kind of data (it can be a JSON payload, query parameters, path parameters, headers, whatever), before it enters our server at the first entry point, which as you know comes after the route is matched and it calls the associated controller

method, before anything else happens, we want to make sure all this data, the JSON, the query parameters, the path parameters, is in the expected format. That means we want to make sure that whatever data this particular API needs, the user is sending it in the exact format that the API expects. Let's say it's a JSON payload and we want a field called name.
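
As a rough sketch of what "expected format" can mean for a single field, here is a small Python check for the name field. The function name and the error wording are hypothetical; the rules (the field is required, it must be a string, and its length must be between 5 and 100 characters) are the ones this example goes on to describe.

```python
def validate_name(payload: dict) -> list[str]:
    """Return a list of error messages; an empty list means the payload is valid."""
    # Step 1: is the field present at all?
    if "name" not in payload:
        return ["name is required"]
    value = payload["name"]
    # Step 2: is it the right type?
    if not isinstance(value, str):
        return [f"name must be a string, received {type(value).__name__}"]
    # Step 3: does it satisfy the length restriction?
    if not 5 <= len(value) <= 100:
        return ["name must be between 5 and 100 characters long"]
    return []


print(validate_name({}))
print(validate_name({"name": 0}))
print(validate_name({"name": "A Tale of Two Cities"}))
```

Each check only runs if the previous one passed, so the client always gets the most basic problem reported first.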

For this API, we are expecting a JSON payload in the request body, and inside the JSON we are expecting a field called name, which is a string. There could be other restrictions also: we want a string whose length is between 5 and 100 characters. We can be as specific as we like, but that's the basic idea. Now, if we have an API that has a

requirement like this, we want to make sure that when the data reaches the server, before we go into the service layer, before we start calling the database methods, before we do anything significant, the structure of the data is in the exact format that we expect. So when the JSON reaches this entry point, we execute our validation and transformation pipeline, which is basically a middleware function or some kind of utility function that we reuse in

every API. We provide it a particular schema that we want to validate against, and it goes through all the fields of that JSON. We are saying we are expecting a field called name, so it checks whether there is a field called name in that JSON payload. If there isn't, it sends an error right away to the client saying that the name field is required and was not provided in the JSON payload. In the same way, if the name field is present, it goes one level deeper and checks the value: it checks whether the name field is a string or not, whether we are

sending an array in the name field when the API is expecting a string. It checks the type of the data: if it is a string, it continues, and if it is not a string, it sends an error message to the client saying that the API is expecting a string but you are sending an array, or a boolean, or whatever the client is sending. Then, if the value of the name field is a string, it goes further and checks the length of the

value. The restriction that we have put on this field is that the length should be 5 to 100 characters. If the condition is not satisfied, if the client is sending a huge value, a whole paragraph, then we immediately send an error that the value is too long. Similarly, if the client is sending a two-character string, we also send an error that the

expected length is not matched, whatever the customized error message is. Now the whole idea with this example is that we just made sure the payload, the data structure that this API needs in order to do whatever database operation or business logic it has to execute, is at least in the expected format. And the reason we do

it at the entry point is so that we don't end up in an unexpected state and our system does not break under unexpected circumstances. By unexpected I mean this: let's imagine we did not have this validation pipeline, and the client sent this JSON with the name as the value 0 instead of a string. Now what happens? As in our previous

demo, as you know, this is where we do validations. Let's say we did not do validations here. So the data, the JSON with the name as 0, reached the controller layer, and since we are not doing validations, the controller layer in turn calls the service layer, which executes whatever the business logic is, and the service layer in turn calls a repository method. We are creating a new document, we are creating a new row

in the database for some kind of entity. Let's say it's a book, so we are expecting the name of the book, and depending on that we are creating a new row in the database. Because we did not do validation at the entry point of our server, the data reaches all the way to the repository method, and the repository method executes the database query. It's some kind of insert query, and it inserts this name into the books table, because we are creating

a new book with the name that the user has provided. Now, usually in relational databases we have data type constraints: when we are creating a table, we tell the database what the type of the data is going to be for each field. When creating the books table, we must have done something like this: we created a column, and the name of the column is name. It's a bit confusing, but for the sake of this example the name of

the column is the string name. For the data type, let's imagine it's a Postgres database, so we can say the data type is text and it is not null. This is the database-level constraint for the data type: it is expecting a string, or in Postgres nomenclature, a text data type. Now, since we are inserting a number, since the client provided us with a number value, we executed this

insertion operation with the value 0. Now what happens? The database call can fail, because Postgres checks data types before doing any kind of operation (in some setups the value is silently coerced to text instead, which is its own data integrity problem). When the call fails, depending on the error handling of the server, the client will get a 500, which means internal server error, which basically says something unexpected happened in the server. That is a very poor user experience to

have in a basic form API. This is why we want to make sure, at the entry point, that whatever data the API needs, whatever the service method needs, has the exact structure and satisfies the exact restrictions that our business logic needs, before we do anything significant with it. Then, if the data does not satisfy those constraints, we can send an error code of

400, which means bad request by the client. We can say the data that you have sent does not satisfy our constraints, so you have to retry with the appropriate data format. That is the need for validations. To show a simple example of what validations look like: we have a server running at this address on localhost, and we are making an API call. This is Insomnia; we can use this tool to

make API requests. We are making a POST request to this endpoint, /api/valid/syntactic, and in the body we are sending an empty JSON payload. If we execute this, we get this error, which is a set of values in an array. It says the API is expecting three fields, one email, one phone, and one date, and all of them are required. Since we are sending an empty JSON payload, the API is throwing us an error right away that

these three fields are required and you are not providing them. Now let's go ahead and provide these fields. We write email, and let's provide some random string for the email part. Then we have phone, and for the phone also let's put some random number. Then we have a date, and for the date let's provide a date, let's say 2025

11 5 01 11, and if we execute this, we get two errors: the email is invalid, invalid email format, and for the phone field it's saying it's expecting a string but it received a number. So let's go ahead and satisfy these constraints. For the email structure we can just do test.com. Let's go ahead, and now the email field is satisfied, and it just says

something for the phone now. For the phone, let's make this a string, and if we run this, the success code is 200. Now, there are different types of validations. These are not strictly the only types; there could of course be more depending on the requirement, but overall these are the three types that you will encounter most of the time. The first one is syntactic. What is syntactic validation? Here we have validations like validating whether a

provided string is an email or not. That means the validation algorithm, depending on whatever language and whatever library you are using, will check whether the provided string satisfies a particular structure. Emails have this particular structure: there is an initial value, let's call it the first part, then there is this special character, which

is the at sign (@), and then there is a domain, which is typically some name followed by some top-level domain, which can be .org, .co, .in, anything; there are over a thousand different top-level domains. An email follows a typical structure like this, so the validation algorithm will check whether the provided string follows this structure or not, and if it doesn't, it throws an error. Similarly, validating phone numbers also falls under syntactic validation, because that also has to

follow a particular structure. It can be the country code followed by a 10-digit number, and it can be any country code from the list of country codes all over the world, and depending on the country code, a certain number of digits follow it. Those types of validation fall under syntactic validation. In the same way we have, let's say, dates. Dates also have a particular structure, and they also fall under syntactic validation: whether a provided string falls into

this structure, whatever structure the API is expecting. Let's say it expects this format: the year, then the month, then the day. So the check is whether the string follows this structure or not. These are some examples of syntactic validation. Then we have semantic validation. Semantic validation basically means checking whether the provided data makes sense or not. For example, let's say we have a form for date of birth in the frontend, and the user can provide the date of birth, and the backend will save that in the database

through the API call. Now we want to make sure that whatever date the user provides makes sense in a semantic way, which means the user cannot provide a date which is in the future. Let's say today's date is 11-01-2025, and the user provides a date like 13-01-2025, which does not make sense, right? Your date of birth cannot be in the future.
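
To make the difference concrete, here is a minimal Python sketch that runs a syntactic check (is this a year-month-day date at all?) followed by a semantic check (is it in the future?). The function name, the field name date_of_birth, and the error wording are assumptions for illustration, and the input is assumed to already be a string; the today parameter exists only so the example is reproducible.

```python
from datetime import date
from typing import Optional


def validate_date_of_birth(raw: str, today: Optional[date] = None) -> list[str]:
    """Return a list of error messages; an empty list means the value is acceptable."""
    today = today or date.today()
    try:
        # Syntactic validation: the string must parse as YYYY-MM-DD.
        dob = date.fromisoformat(raw)
    except ValueError:
        return ["date_of_birth must be in YYYY-MM-DD format"]
    # Semantic validation: a well-formed date can still make no sense.
    if dob > today:
        return ["date_of_birth cannot be in the future"]
    return []


pretend_today = date(2025, 1, 11)
print(validate_date_of_birth("2025-01-13", today=pretend_today))
print(validate_date_of_birth("1995-06-12", today=pretend_today))
```

The string "2025-01-13" passes the syntactic step, since it is a perfectly well-formed date, and is only rejected by the semantic step.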

So validating these kinds of patterns in data falls under semantic validation: we want to make sure that whatever data is provided by the user makes sense in a semantic way. In the same way, if we have a field called age, which takes the age of the user, and the user provides an age of, let's say, 365, our validation pipeline should reject this, because an age of 365 does not make sense, at least not yet. It

might in the future, but as of now this age does not make sense in a semantic way. So these kinds of validations fall under semantic validation. Now the third type is type validation. Type validation is our basic validation of whether the data type matches or not: whether a particular field is a string or not, whether it is a number, or a boolean, or an array, or a nested JSON object, and that JSON can have

other primitive data types inside it, like string, number, boolean, array, and so on. These kinds of validations are called type validation. So overall these are the different kinds of validation that you will encounter while you are designing APIs. And as I said, these are not things that you need to memorize; these are things that you just have to keep in mind while you're designing your APIs, like how strict you

want to be with your data, how strict you want to be with your validation pipeline, and how specific you want to be. And since the title of the video also contains the term transformation, let's talk about transformation now. As I said, before reaching our controller layer, whatever data is coming from the user needs to go through this pipeline, which is called the validation and

transformation pipeline. Validation we have already understood: it basically means confirming that the user's data satisfies whatever structure the API is expecting, and validations can be as specific or as loose as you like, depending on the requirement. That is validation. Transformation means we want to execute some operations on the data that is provided by the user,

depending on the success or failure of the validation pipeline, or as default steps. That is a little complicated to get our heads around, so let's see an example. In a typical paginated API, we send different query parameters in the query string. So we have this API, the endpoint is, let's say, /bookmarks, and we want to send these parameters: page as 2 and limit as

let's say, 20. These are called query parameters. And since at the start of the video we said that we should validate everything that is coming from the client before it reaches our service layer, before we execute any significant logic, we have to validate our query parameters too. This is how we can define our validation logic: we want to say that page is a field which will be coming in the query parameters, which is a number

that should be greater than 0 and smaller than some bound depending on your requirement; let's say in this case it should be smaller than 500. In the same way, limit should be greater than 0 and smaller than, say, 10,000, so at most 10,000 items can be returned from the server at a time. These are our requirements. So, coming back, we are saying page is a number which has to satisfy this constraint, and in the same way limit

is a number which has to satisfy this constraint. But we already know that page and limit are query parameters, and all query parameters are strings by default. When they reach our server from the user's browser or from any client, they are strings, because all the values in the query parameters are strings; they don't have any other data type by default. They

can satisfy the structure of any other data type, but their value will always be a string when they reach our server. So what happens in this case? This is how we receive our data: we receive page as the string "2" and limit as the string "20". Now, because of our validation requirements, this fails. It fails because we are saying we are expecting a number: the first requirement is that it should be a number, and then we check the value,

whether it is greater than 0, whether it is smaller than 500, and so on. But it fails on the first requirement, because it is a string, and the API fails, and the user will get an error that page should be a number but we are receiving a string, or something like that, depending on your error message format. Now in this case there was no error on the client's part: because it is a query parameter, of course all the

values of the fields will be in string format by default. It is the server's responsibility to transform this data into our expected type. You can also say it is the server's responsibility to cast the data type. Cast basically means forcing one data type to get converted into another data type. So in this example, we have to cast a string data type into a number, and then we can execute our validation

requirements. This phase is called transformation, because the data that we are getting from the client has to go through some kind of change before we can execute our validations. It can go the other way also: after validation, we can execute some operations to transform the data into a structure which is expected by the server. So this process of converting data into a desirable format is called transformation, and usually we pair

validations and transformations in a single pipeline, so that all the logic of the input data layer stays in the same place and we don't have to look in different places to find out what all the requirements are and what operations we are executing on the data before doing any business logic. All the requirements of our validations and all the operations that we are doing through transformation

stay in one place, which is called the validation and transformation pipeline. Now let's look at a few APIs to get an idea of the different flavors of validation and transformation that we just talked about. We have already seen syntactic validation in the demo earlier in this video: we executed this API with different values, we got errors, we kept fixing those errors, and we got a successful response of 200. Basically, we had to provide a string which satisfies

an email structure, we had to provide a phone number, and we had to provide a date which has a valid date structure, and after that the API was successful. This is an example of syntactic validation, because we had to stick to particular structures of data. It was email, phone, and date in this case, but it can be other kinds of data. In the same way, let's see what semantic validation is. Let's execute this. We are getting this validation error, and validations

also have this advantage that even if the client hits the API without any values, they get some kind of documentation from the errors: these are the fields that are expected by this API which the client is not providing. That is a very nice way of figuring it out if the API documentation is not available in the first place; the validation error messages are a very nice way to figure out what are the expected fields and what are the

expected values in those fields that the API is demanding from you. In this case we sent an empty payload and we got these errors. The API is expecting two fields: one is date of birth, which it says is required, and the second one is age, for which it says a number is expected and none was provided. Now let's provide these values to this API. The first field is date of birth, and

for this let's provide a date: the year 1995, the month is June, and some random day, the 12th. Let's execute this, and the date of birth is satisfied, and now it says age is required. So let's also provide an age, let's say 43. When we send this, the API is successful, and we are getting back the same values that we sent. Now, since this demo is about semantic validation, let's change

these values. Instead of 1995, let's make the year of the date of birth 2026 and send this. Now we get this error that date of birth cannot be in the future. This is what we talked about earlier: semantic validation checks whether the value of a field makes sense or not. Since the date of birth of a person cannot be in the future, we are throwing this error for this field, that

date of birth cannot be in the future. In the same way, let's revert this and make the age 430 instead of 43. Then we get this: number must be less than or equal to 120. A logical age for a person at this time is from 1 to 120, and even though a one-year-old won't be hitting our API, it still accepts ages from 1 to 120. That's why it's called semantic validation: we are checking whether the

values provided in different fields make sense in a semantic way or not. Okay, in this next one, let's hit this. This API is expecting three fields: one is password, another is password confirmation, and the third one is married or not. Now let's provide these values: for password some random string, for password confirmation another string, and for married

let's say false. Let's hit this. Now, this is another kind of validation check that we usually do. On most platforms we have a password field and a confirm password field, just for security reasons and to check the integrity of the data, whether the user has provided valid data or not. The requirement is always that the password confirmation value should match the password value. And since we

are sending different strings in these two fields, the API is returning the error that passwords don't match for the password confirmation field. For the password field, it is returning an error that the string must contain at least eight characters; that is the requirement for the password. Now let's fix this to satisfy the constraints: we have six characters, so let's add some more, and let's copy this and paste it here so that these two fields

match. When we run this now, it succeeds; the API call is successful. Now what happens if we set married to true and run this? We get another error: the API is expecting another field called partner. If the married field is true, it says partner name is required when married is true. So if we provide the value true, we also have to provide a name in the partner field, because the user is saying they are married. Let's provide another field

called partner with some name and run this, and we get a successful response with all the fields that we have provided. This particular example is called complex because, depending on the requirements of the service layer, we can define validation constraints that span several fields. In the first one, we are saying that two fields should have the same value; that's what this constraint was about. In the second one, we are saying the partner field is

optional when married is false, but if the married field is true, the user also has to provide another field called partner with the name of the partner. So we can also build this kind of complex logic inside our validation pipeline when dealing with users' data. Okay, now moving on to our transformation example. Let's run this, and it says email, phone, date: it requires three fields. So let's provide three fields; let's just

copy these from our earlier demo and paste them here, and send this. Okay, and in order to see how the transformation works in this particular API, let's change the cases of this email, and then when we run this, we are getting this value. What I want to show here is that the user is sending data in an unexpected format within a particular structure. The structure is

of an email, but inside it has both lower case and upper case letters. In the phone number also, it does not provide the plus character before the number, and for the date it provides this particular date structure. When we send this to the server, the server does some transformation: it changes the data according to the requirements of the service layer. Because of that, even though we sent the email with mixed cases, lower case and upper

case, when we got the value back it was all in lower case, because the server, in the validation pipeline, transformed the value into lower case before doing any kind of processing on the data. That's the reason we got this email in all lower case. In the same way, for the phone field, whatever value we sent, it added the plus character before the value and sent it back to us. Same with the date field: it converted it into a different format and sent it back. This is an example of

transformation: after the client sends some data in the payload, the server can do some kind of operation on the data to transform it into a different format, for the convenience or the requirements of the service layer, before performing any kind of processing on it. In the same way, let's run the example of type validation. When we run this, the API is expecting four fields: one is a string field, one is a number field, one

is an array field, and one is a boolean field. Let's first provide all these values, then we'll see what the validation errors are. First, the string field: the value is some string. Let's just copy this and paste it: this is the number field, this is the array field, and this is the boolean field. Let's hit this API. Now this error is gone, and the number field is saying

expected number, received string; the array field is saying expected array, received string; and the same for the boolean field. This is an example of type validation: we are showing a demo of how the server can enforce different data types from the client. In order to fix these errors, we make this some valid number, let's say 10, and when we hit this, that error is gone. We still have the array and boolean errors. For the array, we can send an array with the values 1 and 2, and for the boolean field we can send false.
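
A type check of this kind can be sketched in a few lines of Python. Everything named here is an assumption for illustration (the field names, the SCHEMA dictionary, the helper functions); only the shape of the messages, "expected X, received Y", follows what the demo shows. The sketch also includes a per-element check requiring every array element to be a string.

```python
def type_name(value: object) -> str:
    """Map a decoded JSON value to its JSON type name."""
    if isinstance(value, bool):  # must come before the number check: bool is an int in Python
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if value is None:
        return "null"
    return "object"


# Hypothetical schema: field name -> expected JSON type.
SCHEMA = {
    "stringField": "string",
    "numberField": "number",
    "arrayField": "array",
    "booleanField": "boolean",
}


def validate_types(payload: dict) -> list[str]:
    errors = []
    for field, expected in SCHEMA.items():
        if field not in payload:
            errors.append(f"{field}: required")
            continue
        received = type_name(payload[field])
        if received != expected:
            errors.append(f"{field}: expected {expected}, received {received}")
        elif expected == "array":
            # Per-element rule: every element must be a string.
            for i, item in enumerate(payload[field]):
                if type_name(item) != "string":
                    errors.append(f"{field}[{i}]: expected string, received {type_name(item)}")
    return errors


print(validate_types({"stringField": "a", "numberField": "10",
                      "arrayField": "x", "booleanField": "no"}))
```

Collecting all errors instead of stopping at the first one lets the client fix every field in one round trip.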

And now when we run this, we are getting more errors, which say that inside the array field, element zero is expected to be a string but we are sending numbers. To fix this, let's make these strings. Now when we run this, we are getting all the values that we provided: the string field has a string value, the number field has a number value, and the array field has an array value. But for each element of the array we can also

enforce different validation constraints, which in this case was that each element should be of type string. And the last one is a boolean type, either true or false. The last thing that I want to bring to your attention is this concept: sometimes there is confusion, and people make the mistake of treating frontend validation as a replacement for backend validation. This mistake happens because usually in a typical

platform we have forms, and inside forms we do some kind of validation. For example, let's say we have this form, and there is a field called name. It's an input box, and the user has to type some name. If the name satisfies whatever requirements we have, the minimum or maximum number of characters, the data type, or whatever else it is, if it is successful,

then when the user clicks on the submit button we call the API. If the validation is not successful, we show some errors in the form itself, saying that this data does not satisfy our constraints. And when it does satisfy the constraints, we call the API and the backend does whatever it needs to do. Now the thing to keep in mind is that for all interactions, for all APIs, we need both the frontend validation and the server-side validation, our backend validation, because the reason

we use frontend validation is UX, which means user experience. Frontend validation is not for security or data integrity; it is for user experience. We want to give some immediate feedback to the user that something is wrong with the data, that the data we are expecting does not match the data that you have provided. It is not about security. Backend validation is what we do for security and data integrity.

That's the reason it does not matter what kind of validations your frontend does, or whether it does any at all. A typical server can have different clients: it can have a very convenient web app which does different types of validation, but as we saw, it can also have an API-client-based interface, something like Postman or Insomnia. There, there is no concept of validation, because we are directly interacting with our APIs; there is no frontend interface

that acts as a proxy to our API, acting as a user interface. In that case there is no frontend validation at all. So if the backend depends on the frontend validation for security or data integrity, our server will break when the client changes. That's why this is a very important concept when we are talking about validation and transformation: it does not matter what the client is. When you're designing your APIs, you have to

be as strict and as specific as possible in your server-side validation logic, without thinking about what the client validation will look like, because as I said, client validation is for the user experience, but server validation is for security and data integrity. Now, for all the API examples that we saw earlier in the API client, Insomnia, we also have a complementary frontend interface to work with those APIs, for all the different types:

syntactic, semantic, type, transformation, and complex. Now let's see how frontend and backend validations work together. Inside this form, let's provide some email. Let's provide wrong values first, then we can see whether it works or not. For the phone number also, let's provide something like 1 2 3 4 5, and for the date we already have a date selector, so we can choose any date. Now let's submit. In this case we did not

make any API calls yet, because the frontend detected that there is something wrong with this field. This is frontend validation, client-side validation. To satisfy it, we have to make this a proper email; we can make this an address at gmail.com. Then when we submit, we make a proper API call, and when we look at the payload, it is sending this, and in the preview it is getting this back, and it is getting a 200 response, a success response. This is how client-side

validation and server-side validation work together. The client-side validation provides a proper user experience, immediate feedback to the user, and the server-side validation performs the security checks; it is a mandatory validation whose purpose is security and data integrity, while the frontend validation has the purpose of user experience and immediate feedback. That's pretty much all there is to talk about with validations and transformations. It's not

a very complex topic or a very large topic, just a set of rules and guidelines to follow when you are designing your APIs.
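
As a recap, here is a small Python sketch that pulls together three pieces from the demos: casting query parameters before validating them (page and limit), normalizing values after validation (lower-casing the email, prefixing the phone with a plus), and cross-field rules (password confirmation, partner required when married is true). All function names, field names, and error messages are assumptions made for illustration; the bounds 500 and 10,000 are the ones used in the pagination example.

```python
def cast_int(raw: object, field: str, low: int, high: int) -> int:
    """Transformation (cast the string to an int), then validation (range check)."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        raise ValueError(f"{field} must be a number") from None
    if not low < value < high:
        raise ValueError(f"{field} must be greater than {low} and smaller than {high}")
    return value


def parse_pagination(query: dict) -> dict:
    # Query parameter values always arrive as strings, e.g. {"page": "2", "limit": "20"}.
    return {
        "page": cast_int(query.get("page"), "page", 0, 500),
        "limit": cast_int(query.get("limit"), "limit", 0, 10_000),
    }


def normalize_contact(payload: dict) -> dict:
    """Transformation after validation: lower-case the email, prefix the phone with '+'."""
    phone = payload["phone"]
    return {
        **payload,
        "email": payload["email"].lower(),
        "phone": phone if phone.startswith("+") else "+" + phone,
    }


def check_cross_fields(payload: dict) -> list[str]:
    """Rules that depend on more than one field at a time."""
    errors = []
    if payload.get("password") != payload.get("passwordConfirmation"):
        errors.append("passwordConfirmation: passwords don't match")
    if payload.get("married") is True and not payload.get("partner"):
        errors.append("partner: partner name is required when married is true")
    return errors


print(parse_pagination({"page": "2", "limit": "20"}))
print(normalize_contact({"email": "Test@Example.COM", "phone": "1234567890"}))
print(check_cross_fields({"password": "secret123", "passwordConfirmation": "secret123", "married": True}))
```

In a real project these pieces would usually live in one schema or middleware per endpoint, so that all the input rules for an API stay in a single place.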