SQL Server DataTypes

Datatypes in SQL

DEFINITION : 
 

A data type is an attribute that defines the type of data that a column, variable, or parameter can store. It determines the kind of operations that can be performed on that data.For example, a column with a data type of “integer” can store whole numbers, such as 1, 2, 3, but not decimals or non-numeric values. Similarly, a column with a data type of “varchar” can store character strings of variable length, such as names or addresses. 

Types of data types :  

Numeric data types : Numeric data types are used to store numeric values such as integers, decimals, or floating-point numbers. Common numeric data types include INT, BIGINT, SMALLINT, DECIMAL, FLOAT, and REAL. 

  • INT : INT (or INTEGER) is a commonly used data type for representing whole numbers. It has a range of -2,147,483,648 to 2,147,483,647 (or -231 to 231-1). It takes up 4 bytes of storage space. INT is suitable for most situations where whole numbers are used, such as counting or indexing. 
  • BIGINT : BIGINT is a data type for representing very large whole numbers. It has a range of -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 (or -263 to 263-1). It takes up 8 bytes of storage space. BIGINT is appropriate for situations where extremely large whole numbers need to be represented, such as in financial or scientific calculations. 
  • SMALLINT : SMALLINT is a data type for representing small whole numbers. It has a range of -32,768 to 32,767 (or -215 to 215-1). It takes up 2 bytes of storage space. SMALLINT is appropriate for situations where only small whole numbers need to be represented, such as in a lookup table. 
  • DECIMAL: DECIMAL (or NUMERIC) is a data type for representing fixed-point numbers. It allows for a user-specified number of decimal places and can represent both positive and negative numbers. It takes up between 5 and 17 bytes of storage space, depending on the precision and scale of the data. DECIMAL is appropriate for situations where precise calculations are needed, such as in financial or scientific applications. 
  • FLOAT: FLOAT is a data type for representing floating-point numbers. It can represent very large or very small values, with a precision of up to 15 digits. It takes up 4 or 8 bytes of storage space, depending on the implementation. FLOAT is appropriate for situations where approximate calculations are needed, such as in scientific or engineering applications. 
  • REAL: REAL is similar to FLOAT, but it is a 4-byte data type that has a precision of only 7 digits. It is appropriate for situations where storage space is limited, and precision is not critical. 

Character data types : Character data types are used to store strings of characters, such as names or addresses. Common character data types include CHAR, VARCHAR, and TEXT. 

  • CHAR is a fixed-length character data type, meaning that it always takes up a specific amount of storage space. For example, if a column is defined as CHAR(10), it will always occupy 10 characters of storage, even if the actual value stored in the column is shorter. CHAR is useful for storing values that have a consistent length, such as postal codes or phone numbers. 
  • VARCHAR, on the other hand, is a variable-length character data type, meaning that it can take up varying amounts of storage space. For example, if a column is defined as VARCHAR(10), it can store values up to 10 characters long, but it will only use as much storage space as necessary to store the actual value. VARCHAR is useful for storing values that may vary in length, such as names or addresses. 
  • TEXT is another variable-length character data type, but it can store much larger amounts of text than VARCHAR. TEXT is useful for storing large amounts of data, such as long-form descriptions or documents. 

Date and time data types : Date and time data types are used to store dates and times. Common date and time data types include DATE, TIME, DATETIME, TIMESTAMP, and INTERVAL. 

  • DATE : This data type is used to store only the date, without any time or time zone information. For example, the date “April 3, 2023” could be stored in a DATE column as ‘2023-04-03’. 
  • TIME: This data type is used to store only the time, without any date or time zone information. For example, the time “9:30 AM” could be stored in a TIME column as ’09:30:00′. 
  • DATETIME : This data type is used to store both the date and time, without any time zone information. For example, the datetime “April 3, 2023, 9:30 AM” could be stored in a DATETIME column as ‘2023-04-03 09:30:00’. 
  • TIMESTAMP : This data type is used to store both the date and time, with time zone information. TIMESTAMP data type is used to store a precise moment in time with accuracy up to microsecond. For example, the timestamp “April 3, 2023, 9:30 AM PST” could be stored in a TIMESTAMP column as ‘2023-04-03 09:30:00.000000 -08:00’. 
  • INTERVAL: This data type is used to store a duration of time, such as a number of days or hours. For example, an INTERVAL column could store the duration of an event, such as “3 hours” or “2 days”. 

Binary data types : Binary data types are used to store binary data, such as images or audio files. Common binary data types include BLOB, BYTEA, and VARBINARY. 

  • BLOB (Binary Large Object) : This data type is used to store large amounts of binary data, such as images or audio files. For example, a BLOB column could store an image file of a person’s face, which can be retrieved and displayed later. The maximum size of a BLOB can vary depending on the database system being used. 
  • BYTEA: This is a binary data type used in PostgreSQL to store binary data as a variable-length array of bytes. For example, a BYTEA column could store a PDF document or a ZIP archive. The maximum size of a BYTEA value is 1 GB. 
  • VARBINARY: This is a binary data type used in some database systems to store variable-length binary data. For example, a VARBINARY column could store a JPEG image or a WAV audio file. The maximum size of a VARBINARY value can vary depending on the database system being used. 

Boolean data types : Boolean data types are used to store true/false or yes/no values. Common boolean data types include BOOLEAN, BIT, and BOOL. 

  • BOOLEAN : This data type is used to store true/false values. In some database systems, such as PostgreSQL and MySQL, boolean values are represented as “true” or “false”. In other database systems, such as Oracle, boolean values are represented as “Y” or “N”. For example, a BOOLEAN column could store whether a product is in stock (true) or out of stock (false). 
  • BIT: This data type is used to store binary values of 0 or 1. In some database systems, a BIT data type can be used to represent boolean values, where 0 represents false and 1 represents true. For example, a BIT column could store whether a user has opted in to receive marketing emails (1) or not (0). 
  • BOOL: This data type is used in some database systems, such as SQLite, as a shorthand for BOOLEAN. BOOL data type also stores true/false values in the database. 

Choosing correct data types for a column 

Choosing the appropriate data type is a critical step in designing a database schema. The choice of data type affects the storage and retrieval of data, and it can prevent data truncation or overflow errors. 

When selecting a data type for a column, variable, or parameter in your database, it is essential to consider the nature of the data and the operations that will be performed on it. For example, a numeric data type may be appropriate for storing quantities or financial data, while a character data type may be suitable for storing names, addresses, or descriptions. 

The choice of data type can also affect the amount of storage space required for each value. For instance, a numeric data type that uses fewer bytes may be more efficient for storing large datasets, while a character data type that allows for variable-length strings may be better suited for storing text fields of varying length. 

Another important consideration is the potential for data truncation or overflow errors. If a data value exceeds the range of the data type assigned to a column, it may be truncated, resulting in a loss of data. On the other hand, if the value is too large to be stored in the allocated memory, it may result in an overflow error. 

To prevent data truncation or overflow errors, it is essential to choose a data type that has a sufficient range to accommodate the expected values for the column. For example, if you expect a column to store values up to a maximum of 10,000, then you may want to select a data type that can accommodate values of at least that size.