SQL Server (also known as MS SQL or t-SQL) supports several data types, from rather standard ones (e.g. INT, CHAR), to others that are more complex (e.g. TABLE, HIERARCHYID).
Approximate Numerics
FLOAT
This data type is composed of a number that is not an integer, that is: it includes a fraction represented in decimal format. One of the main differences between FLOATs and INTEGERs is that this data type has more precision. It is defined as FLOAT[(n)], where n is the number of bits that are used to store the manitssa, with n<25 as single precission, and nā„25 as double precission.
SELECTCAST(3.141592653589793238asFLOAT) float;
float
3.141592653589793
About
Range: ā1.79E+308 to 1.79E+308
Precision: 15 digit
Memory: Depends on precision
Practical Tips
Decimal is preferred if the number stays below the maximum precision provided by the decimal, which is 38.
If storage space is main criteria, then use REAL (it only takes 4 bytes), provided the loss accuracy is ok.
FLOAT values are truncated when converted into any type of INTEGER.
Avoid using FLOAT equality checks (=), inequality (<>) checks etc, rounding of numbers, etc.
Avoid using FLOATs in application like financial applications where accuracy is important.
If you want to change a value from FLOAT to character types, it is better to use the STR function instead of CAST, for it allows better control on the format.
REAL
This data type is used for floating-point values, like FLOAT. However, REAL is a single precision floating point number, whereas FLOAT is a double precision floating point number. In short, if you double the precision, you double the bits of memory. REAL is equivalent to FLOAT(24).
SELECTCAST(3.141592653589793238asREAL) real;
real
3.1415927410125732
About
Range: ā3.40E+38 to -1.18E-38 for negative values, and 1.18E-38 to 3.40E+38 for positive values. The value 0 can also be stored.
Precision: 7 digit
Memory: 4 bytes
Practical Tips
It takes up less memory than FLOAT.
Both FLOAT and REAL are useful for scientific calculations.
REAL is less precise than FLOAT - but if precision is your concern, you should use Exact Numerics (DECIMAL is the safe choice).
If you want to change a value from REAL to character types, it is better to use the STR function instead of CAST, for it allows better control on the format.
Exact Numerics
TINYINT, SMALLINT, INT & BIGINT
These data types represent integer values. The main difference between them is the amount of bites they can be stored in.
Range: ā9,223,372,036,854,775,808 to 9,223,372,036,854,775,808
Memory: 8 bytes
Practical Tips:
All the data types above do the same job: storing integers. In this regard, however, you should choose the data type that suits your needs better, in order to save memory.
When INTEGERs are converted to character types, if the value is too big to fit into a the latter, t-SQL adds an asterisk * to indicate so.
INTEGERs larger than 2,147,483,647 are converted by default into DECIMAL instead of BIGINT - unless explicitly specified via CAST.
When using arithmetic operators (+, -, *, /, or %) to make implicit or explicit conversions from INT, SMALLINT, TINYINT, or BIGINT to FLOAT, REAL, DECIMAL, or NUMERIC, the rules applied by SQL Server to calculate the data type and precision of the results depends on whether the query includes automatic parameters:
If a query does not have automatic parameters, the value is first converted into NUMERIC(p,0).
If a query includes automatic parameters, the value is always converted into NUMERIC(10,0) before being converted to the final data type.
Therefore, similar expressions with or without automatic parameters can yield different results.
Pay particular attention with the operator /. Automatic opperators may affect not only the precision, but also the value.
NUMERIC & DECIMAL
These data types are used to store numbers that have fixed precision and scale.
The precision is the maximum total number of decimal digits that will be stored, both to the left and to the right of the decimal point.
It has a range from 1 to 38, with a default precision of 38.
TheĀ scale is the number of decimal digits that will be stored to the right of the decimal point.
It has a range from 0 to p (precision).
It can only be specified if the precision is specified.
By default, the scale is zero.
Each combination of precision and scale is considered by SQL Server as a different data type: e.g. DECIMAL(3,1) and DECIMAL(3,0)considered different.
A constant value with a decimal separator is automatically converted into NUMERIC, with the minimmum precision and scale necessary. For example, 11.141 is converted into NUMERIC(5,3).
Converting from DECIMAL or NUMERIC to FLOAT or REAL can entail a loss of precision.
Converting from INT, SMALLINT, TINYINT, FLOAT, REAL, MONEY or SMALLMONEY to DECIMAL or NUMERIC can entail an overflow.
By default, SQL Server rounds up the value if a number is converted to DECIMAL or NUMERIC with a lower precision and scale.
If the option SET ARITHABORT is ON, SQL Server flags an error if an overflow occurs.
The loss of precision and scale is not enough to trigger an error.
MONEY & SMALLMONEY
These data types are used to represent monetary values. MONEY values correspond to 8-byte DECIMAL values, rounded up to 4 decimals after the point. SMALLMONEY corresponds to MONEY, but it is stored in 4 bytes.
MONEY and SMALLMONEY are good data types to display results o
About:
MONEY:
Range: ā922,337, 203, 685,477.5808 to +922,337, 203, 685,477.5807
Memory: 8 bytes
SMALLMONEY:
Range: ā214,478.3648 to +214,478.3647
Memory: 4 bytes
Practical Tips:
You should use a dot . to separate partial currency units , e.g. 7.53 can specify 7 dollars and 53 cents.
MONEY and SMALLMONEY are good data types to display results to the user, but they are not recommended for calculations due to their lack of accuracy and intrinsic inflexibility.
DECIMAL is the recommended data type in order to store and work with currencies. MONEY and SMALLMONEY rounds up the values, and thus the results may be affected by errors accumulating over time.
It is not necessary to include the currency type between quotes '. In fact, note that SQL Server cannot store any information regarding the currency symbol - it merely stores the numeric value.
If you want to use a currency in particular, you can do so with the function FORMAT(), specifying the accuracy ('Cn' where n is an integer that represents the decimal part) and the culture argument for the currency - you can learn more about this here.
CHAR[(n)] represents a fixed-length string of single-byte characters, where n is the length of the string in bytes. It can also be written as CHARACTER(n).
VARCHAR[(n)] describes a variable-length string of single-byte characters. In opposition to CHAR, the values in this data type are stored in their actual length.
In CHAR(n) and VARCHAR(n), if n is omitted the length of the string is assumed to be 1.
If n is smaller than the length of characters inside the string, it will be truncated (as in the second example above).
For single byte character coding sets like latin, the storage size is n bytes, and the number of characters that can be stored is n as well.
For multibyte character coding sets, the storage size is n but the number of characters that can be stored is smaller.
If n is not specified, its default value when using the functions CAST and CONVERT is 30.
The objects using CHAR or VARCHAR are assigned the default collation of the database, unless a specific collation is assigned via the function COLLATE.
Use CHAR when the size of the column data entries are consistent.
Use VARCHAR when the size of the column data entries vary considerably.
Use VARCHAR(max) when the size of the column data entries vary considerably and the length of any given string can be greater than 8000 bytes.
If a character expression is converted to another character expression of a different data type or size, i.e. from VARCHAR(15) to CHAR(5), or CHAR(10) to CHAR(5), the collation of the input value is assigned to the converted value.
For any noncharacter expression converted to a character data type, the default collation of the current database is assigned to the converted value.
Character expressions that are being converted to an Approximate Numeric data type can include optional exponential notation, of the form E+/- and a number.
Character expressions that are being converted to an Exact Numeric data type must include digits, a decimal point, and an optional sign (+ / -). note, however, that leading blanks are ignored, and that comma separators - such as the thousands separator - are not allowed in the string.
Character expressions being converted to MONEY or SMALLMONEY data types can include an optional decimal point and dollar sign $. Comma separators are allowed.
When an empty string is converted to INT, its value becomes 0.
When an empty string gets converted to DATE, its value becomes the default value for date - i.e. 1900-01-01.
NCHAR & NVARCHAR
NCHAR[(n)] stores fixed-length strings of Unicode characters. The main difference with CHAR is that each character of the NCHAR data type is stored in 2 bytes. Thus, the maximmum number of characters in a column in NCHAR is 4000.
NVARCHAR[(n)] stores variable-length strings of Unicode characters. The difference between NVARCHAR and VARCHAR is exactly the same as with CHAR and NCHAR: NVARCHAR takes two bytes of storage per character, whereas VARCHAR only takes one.
SELECTCAST (25515asNCHAR) nchar;
nchar
25515
SELECTCAST (446173as NVARCHAR) nvarchar;
nvarchar
446173
About:
NCHAR
Range: 0 to 4000 chars
Memory: 2*n bytes
NVARCHAR
Range: 0 to 4000 chars
Memory: 2*n bytes + 2 bytes
Practical Tips:
In NCHAR(n) and NVARCHAR(n), if n is omitted the length of the string is assumed to be 1.
If n is not specified with the CAST function, it takes the default value of 30.
Use NCHAR when the size of the column data entries are consistent.
Use NVARCHAR when the size of the column data entries vary considerably.
Use NVARCHAR(max) when the size of the column data entries vary considerably and the length of any given string can be greater than 4000 byte-pairs.
In NCAHR(n) and NVARCHAR(n), the n does not define the number of characters, but the string length in byte-pairs.
To convert NCHAR and NVARCHAR values to other data types, see the discussion regarding CHAR and VARCHAR data types.
Temporal Data Types
DATETIME & SMALLDATETIME
DATETIME is used to specify a date and a time - plain and simple! - with each one of these values being stored separately as integers in 4 bytes. The time part has a precision of three-hundredths of a second. In contrast, SMALLDATETIME stores each value in 2 bytes and has a precision of minutes.
[H]H: One or two digit hour (valid values from 00 to 23)
[M]M: One or two digit minutes (valid values from 00 to 59)
[S]S: One or two digit seconds (valid values from 00 to 59)
[.DDD]: Up to six fractional digits (microsecond precision for SMALLDATETIME only)
Practical Tips:
This data type defines a date and a time with fractional seconds based on a 24-hour clock.
All values in DATETIME are rounded to increments of .000, .003, or .007 seconds.
If you convert a DATE value to DATETIME or SMALLDATETIME, the date part is copied and the time part is set to 00:00:00.000.
If you convert a TIME value to DATETIME, the date part is set to 01/01/1900 and the time part is copied. For SMALLDATETIME, the fractional seconds are set to 0.
If the fractional precision of TIME is greater than three digits, DATETIME truncates it.
If you convert a SMALLDATETIME value to DATETIME, the seconds and fractional seconds are set to 0.
If you convert from DATETIMEOFFSET to DATETIME or SMALLDATETIME, date and time components are copied, but the time zone is truncated. For DATETIME, if the fractional precision of the original value is greater than three digits it is truncated. For SMALLDATETIME, fractional seconds are set to 0.
DATE & TIME
To store the DATE or TIME separately, DATETIME is rather inconvenient. Thus, you can use any one of these data types to store each value separately.
DATE values can be specified by four, six, or eight digits.
A 6 or 8 digit string is always interpreted by default as ymd.
Both the month and the day must always be defined by two digits.
A 4 digit string is always interpreted by default as a year.
The default date format is set according to the language configuration. To change it, use SET LANGUAGE and SET DATEFORMAT.
If possible, try to use four-digit years.
To separate the day, month, and year values, you may use slash marks /, hyphens -, or periods ..
Acceptable formats forDATEare:
mdy
[m]m/dd/[yy]yy
[m]m-dd-[yy]yy
[m]m.dd.[yy]yy
myd
mm/[yy]yy/dd
mm-[yy]yy/dd
[m]m.[yy]yy.dd
dmy
dd/[m]m/[yy]yy
dd-[m]m-[yy]yy
dd.[m]m.[yy]yy
dym
dd/[yy]yy/[m]m
dd-[yy]yy-[m]m
dd.[yy]yy.[m]m
ymd
[yy]yy/[m]m/dd
[yy]yy-[m]m-dd
[yy]yy-[m]m-dd
The format ydm is not supported by DATE.
TIME:
An hour value of 0 represents the hour after midnight, regardless of whether AM is specified. Thus, if the hour is 0, PM cannot be specified.
By default, our values from 01 through 11 represent the hours before noon regardless of whether AM or PM is specified. If AM is specified it represents the hours before noon. If PM is specified it represents the hours after noon.
An hour value of 12 represents the hour that starts at noon. If AM is specified, the value represents the hour that starts at midnight. If PM is specified, the value represents the hour that starts at noon.
Hour values from 13 through 23 represent hours after noon. The values also represent the hours after noon when PM is specified. AM cannot be specified when the hour value is from 13 through 23.
An hour value of 24 is not valid. To represent midnight, use 12:00 AM or 00:00.
Milliseconds can be preceded by either a colon :, or a period ..
If a colon is used, the number means thousandths-of-a-second.
If a period is used:
A single digit means tenths-of-a-second
Two digits mean hundredths-of-a-second
Three digits mean thousandths-of-a-second.
Acceptable formats forTIMEare:
hh:mm[:ss][:fractional seconds][AM][PM]
hh:mm[:ss][.fractional seconds][AM][PM]
hhAM[PM]
hh AM[PM]
DATETIME2
This data type stores high precission date and time data. It can also be defined for variable lengths depending on the requirement.
It is not sensitive and, thus, does not preserve the time zone offset.
Fractional second precision can be defined by the user.
When a DATETIME2 value is CAST to a VARBINARY value, an additional byte is added to the VARBINARY value to store precision.
implicit conversions from DATETIME to DATETIME2 data types show improved accuracy by accounting for the fractional milliseconds, resulting in different converted values.
Use explicit casting to DATETIME2 datatype whenever a mixed comparison scenario between DATETIME and DATETIME2 datatypes exists.
Conversions from string literals to DATE and TIME types are permitted if all parts of the strings are in valid formats. Otherwise, a runtime error is raised.
DATETIMEOFFSET
This data type, in contrast with all the types discussed before, captures the time zone offset portion. Besides the memory space needed, it has the same properties as DATETIME2.
DATETIMEOFFSET provides time zone compatibility for those applications implemenbted globally.
A time zone offset specifies the zone offset from UTC for a TIME or DATETIME value.
This is represented as [+|-] hh:mm:
hh ranges from 00 to 14 to represent the number of hours in the time zone offset.
mm ranges from 00 to 59 to represent the number of additional minutes in the time zone offset.
+ and - are the indicate whether the time zone offset is added or subtracted from the UTC time to obtain the local time. Including them is obligatory.
The range of time zone offset is from -14:00 to +14:00.
Binary Strings
BINARY & VARBINARY
These data types allow storing binary values. BINARY works best when the values of the column data entries are consistent, with the length of such values ranging from 1 to 8000 bytes. On the contrary, VARBINARY allows considerable variations between different values in the data. They store bit strings, represented in the internal format of the system. Therefore, they are entered using hexadecimal numbers.
Memory: n bytes for BINARY and the actual length of the data entered +2 bytes for VARBINARY
Properties:
The default length is 1 when n isn't specified.
The default length is 30 when n isn't specified via CAST.
Changing from one binary type to another will result in truncation if the target type does not have enough space.
When converting string data to binary of smaller length, the data is truncated on the right
If the column data entries exceeds 8,000 bytes, it is possible to use VARBINARY(MAX).
To transform a string to a number, we need to use the CAST or CONVERT functions - implicit conversion is not allowed.
Conversions from string data types to BINARY or VARBINARY data types of unequal length, SQL Server truncates the data on the right. This happens with CHAR, VARCHAR, NCHAR, NVARCHAR, BINARY, VARBINARY, TEXT, NTEXT, and IMAGE.
With other data types, the data is truncated on the left. It may also be padded via hexadecimal zeros.
Conversion of data to BINARY or VARBINARY is recommended if binary data is the best way to move it. However, make sure the binary type has a large enough size, and to convert it back to the original data type in the same version of SQL Server.
You can convert integers to BINARY or VARBINARY. However, if you convert it back, the value will differ if truncation has occurred.
BIT
This is a data is used for specifying the Boolean data type with three possible values: 1, 0, and NULL.
SQL Server converts a string value TRUE to 1 and FALSE to 0
It optimises the storage of BIT columns, so if a table has 8 or fewer columns, it stores them as 1 byte, from 9 to 19 as 2 bytes, and so on.
The string values TRUE and FALSE can be converted to BIT values: TRUE is converted to 1 and FALSE is converted to 0.
IMAGE
Microsoft recommends using VARBINARY(MAX) instead of IMAGE for storing a large amount of data in a single column since IMAGE will be retired in a future version of MS SQL Server.
Others
TABLE
Is a special data type used to store a result set for processing at a later time. Its use is recommended for temporarily storing a set of rows that are returned as the table-valued function result set. To declare variables of the type TABLE, use DECLARE @local_ variable as TABLE
Practical Tips:
A TABLE variable behaves like a local variable with a well-defined scope that can be used in the function, stored procedure, or batch in which it's declared.
A TABLE can be used in SELECT, INSERT, UPDATE, and DELETE statements.
TABLE variables are automatically cleaned up at the end of the function, stored procedure, or batch in which they're defined.
Transactions involving TABLE variables last only for the duration of an update on the TABLE variable.
Assignment operation between TABLE variables isn't supported.
Table variables can't be altered after creation.
ROWVERSION
ROWVERSION is a data type that conveys automatically generated, unique binary numbers within a database. It is highly recommended to version-stamp table rows.
Practical Tips:
ROWVERSION tracks the relative time within a database, not the actual time.
It is just an incrementing number and, therefore, cannot preserve DATE or TIME - to record a date or time use DATETIME2.
A table can have only one ROWVERSION column.
If a row with a ROWVERSION column is modified or inserted, the incremented database rowversion value is inserted in the ROWVERSION column.
The ROWVERSION value is incremented with any update statement, even if no row values are changed.
TIMESTAMP is a deprecated synonym for ROWVERSION.
HIERARCHYID
This data type is used to store an entire hierarchy, and it is implemented asa Commin Language Runtime (CLR) user-defined type that comprises several system functions for creating and operating on hierarchy nodes. A value of the HIERARCHYID data type represents a position in a tree hierarchy
Practical Tips:
HIERARCHYID encodes information about a single node in a hierarchy tree by encoding the path from the root of the tree to the node.
Such path is logically represented as a sequence of node labels of all children visited after the root.
The representation starts with a slash.
A path that only visits the root is represented by a single slash.
For levels below the root, each label is encoded as a sequence of integers separated by dots.
Each level is followed by a slash.
Data Precedence
But what happens if an operator combines two or more expressions containing different data types? In such case, all of the data types with a lower priority are converted into the data type with the higher priority. This is also known as implicit conversion.
The data precedence in SQL Server is the following:
data types defined by the user (higher priority)
sql_variant
xml
datetimeoffset
datetime2
datetime
smalldatetime
date
time
float
real
decimal
money
smallmoney
bigint
int
smallint
tinyint
bit
ntext
text
image
timestamp
uniqueidentifier
nvarchar (including nvarchar(max))
nchar
varchar (including varchar(max))
char
varbinary (including varbinary(max))
binary (lower priority)
Note that this established priority is also used when optimising indexes of different data types.
Truly collaborative analytics platform
Count redefines how data teams plan, build and share their analysis. It's Miro, but for data.
Discover the most flexible SQL editor
Count redefines how analysts plan, build, and share their analysis. It's Miro, but for data.
Become a SQL Expert
Get five free SQL lessons from Count's Head of Data, Taylor Brownlow.