Hive Data Type
Hive data types fall into two major categories, primitive and complex, which are discussed below:
Primitive Data Type in Hive
Primitive data types are grouped into four categories, listed below:
- Numeric Data Type
- Date/Time Data Type
- String Data Type
- Miscellaneous Data Type
Numeric Data Type in Hive
Numeric data types are divided into two groups:
- Integral Data Type
- Floating Data Type
Integral Data Type
The integral data types are as follows:
Type | Size | Range |
TINYINT | 1 Byte signed Integer | -128 to 127 |
SMALLINT | 2 Byte signed Integer | -32,768 to 32,767 |
INT | 4 Byte signed Integer | -2,147,483,648 to 2,147,483,647 |
BIGINT | 8 Byte signed Integer | -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 |
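As a minimal sketch of how these types appear in practice, the statements below declare one column per integral type and show the literal suffixes Hive accepts. The table name `integral_demo` and its column names are hypothetical and chosen only for illustration.

```sql
-- Hypothetical table with one column per integral type.
CREATE TABLE IF NOT EXISTS integral_demo (
  tiny_col  TINYINT,
  small_col SMALLINT,
  int_col   INT,
  big_col   BIGINT
);

-- Integer literals are INT by default; the Y, S, and L suffixes
-- produce TINYINT, SMALLINT, and BIGINT literals respectively.
SELECT 100Y AS tiny_val,
       100S AS small_val,
       100  AS int_val,
       100L AS big_val;
```

Choosing the smallest type whose range covers the data keeps storage compact, while BIGINT is the safer choice for counters and identifiers that may exceed the INT range.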
Floating Data Type
The floating data types are as follows:
Type | Size | Description |
FLOAT | 4 Byte | Single-precision floating-point number |
DOUBLE | 8 Byte | Double-precision floating-point number |
DECIMAL | 17 Byte | Arbitrary-precision signed decimal number |
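The difference between approximate and exact numeric types can be sketched as follows. The table `price_demo` and its columns are hypothetical; `DECIMAL(10, 2)` declares a precision of 10 digits with 2 digits after the decimal point.

```sql
-- Hypothetical table contrasting an approximate and an exact numeric column.
CREATE TABLE IF NOT EXISTS price_demo (
  approx_price DOUBLE,          -- double-precision floating point
  exact_price  DECIMAL(10, 2)   -- exact decimal with 2 fractional digits
);

-- Casting a string to DECIMAL keeps the value exact.
SELECT CAST('19.99' AS DECIMAL(10, 2)) AS exact_price;
```

DECIMAL is usually the better fit for monetary amounts, where the rounding behavior of FLOAT and DOUBLE is undesirable.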
Date/Time Data Type
The Hive data types in this category are:
- TimeStamp
- Date
- Interval
TimeStamp
A timestamp stores a date and time of day with up to nanosecond precision and is written in the format YYYY-MM-DD hh:mm:ss[.fffffffff].
Date
A date value identifies a particular year, month, and day in the form YYYY-MM-DD. It has no time-of-day component. The supported range of the Date type is 0000-01-01 to 9999-12-31.
Interval
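An interval represents a span of time, expressed in units such as years, months, days, or seconds.
The table below summarizes how values of the Date type are cast to and from the string and timestamp types: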
Cast Type | Result |
Cast (date as date) | The same date value. |
Cast (date as string) | The year/month/day of the date is formatted as a string in the form 'YYYY-MM-DD'. |
Cast (string as date) | If the string is in the form 'YYYY-MM-DD', the corresponding date value is returned. If the string does not match this format, NULL is returned. |
Cast (date as timestamp) | A timestamp corresponding to midnight of the year/month/day of the date value is returned. |
Cast (timestamp as date) | The year/month/day of the timestamp is returned as a date value. |
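The casts in the table above can be tried with a query along the following lines. This is only a sketch: the literal values and column aliases are made up for illustration, and it assumes a Hive version that allows SELECT without a FROM clause.

```sql
SELECT
  -- string -> date: the string matches 'YYYY-MM-DD'
  CAST('2020-01-15' AS DATE) AS string_to_date,
  -- date -> timestamp: midnight of that day
  CAST(CAST('2020-01-15' AS DATE) AS TIMESTAMP) AS date_to_timestamp,
  -- timestamp -> date: the time of day is dropped
  CAST(CAST('2020-01-15 10:30:00' AS TIMESTAMP) AS DATE) AS timestamp_to_date,
  -- date -> string: formatted as 'YYYY-MM-DD'
  CAST(CAST('2020-01-15' AS DATE) AS STRING) AS date_to_string,
  -- string in a different format: NULL according to the table above
  CAST('15/01/2020' AS DATE) AS mismatched_format;
```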
String Data Type
String Data is divided into three types:
- String
- Varchar
- Char
String
String literals can be enclosed in either single or double quotes.
Varchar
A Varchar is declared with a maximum length specifier, which can range from 1 to 65535 characters.
Char
Char is a fixed-length type with a maximum length of 255. Values shorter than the declared length are padded with spaces.
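A short sketch of the three string types in a table definition follows. The table `customer_demo`, its column names, and the chosen lengths are hypothetical.

```sql
-- Hypothetical table using each string type.
CREATE TABLE IF NOT EXISTS customer_demo (
  notes        STRING,       -- no declared length
  name         VARCHAR(50),  -- variable length, at most 50 characters
  country_code CHAR(2)       -- fixed length of 2 characters
);

-- String literals may use either quoting style.
SELECT 'single quoted' AS s1,
       "double quoted" AS s2;
```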
Miscellaneous Data Type
The miscellaneous category contains the Boolean and binary types:
- Boolean
- Binary
Boolean
A Boolean stores one of two values, true or false.
Binary
A binary value is an array of bytes.
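Both miscellaneous types can be declared and produced as sketched below. The table `flag_demo` and its column names are hypothetical.

```sql
-- Hypothetical table with a Boolean flag and a binary payload.
CREATE TABLE IF NOT EXISTS flag_demo (
  is_active BOOLEAN,
  payload   BINARY
);

-- TRUE and FALSE are Boolean literals; a string can be cast to BINARY.
SELECT TRUE AS is_active,
       CAST('abc' AS BINARY) AS payload;
```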
Complex Data Type
Complex data types are divided into four types:
- Array
- Map
- Struct
- Union
Array
An array is an ordered collection of elements of the same data type. Its elements are accessed by a zero-based integer index.
Syntax: ARRAY<data_type>
Example: array(6, 7)
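Zero-based indexing can be illustrated with a small query. The alias `nums` and the subquery alias `t` are hypothetical names used only in this sketch.

```sql
-- Build an array in a subquery, then read elements by zero-based index.
SELECT nums[0]    AS first_element,   -- 6
       nums[1]    AS second_element,  -- 7
       size(nums) AS element_count    -- 2
FROM (SELECT array(6, 7) AS nums) t;
```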
Map
A map is a collection of key-value pairs. Keys must be of a primitive type, while values can be of any data type. Within a given map, all keys share one type and all values share one type.
Syntax: MAP<primitive_type, data_type>
Example: map('m', 8, 'n', 9)
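Map values are looked up by key, as in the sketch below. The alias `m` and the subquery alias `t` are hypothetical.

```sql
-- Build a map in a subquery, then look up values by key.
SELECT m['m']      AS value_for_m,  -- 8
       m['n']      AS value_for_n,  -- 9
       map_keys(m) AS all_keys
FROM (SELECT map('m', 8, 'n', 9) AS m) t;
```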
Struct
A struct is a collection of named fields, and the fields can be of different types.
It is similar to a struct in the C language.
Syntax: STRUCT<col_name : data_type [COMMENT col_comment]…>
Example: struct('m', 11.0), named_struct('col1', 'm', 'col2', '1', 'col3', '1.0')
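Struct fields are read with dot notation. In the sketch below, the alias `s`, the subquery alias `t`, and the field names are hypothetical.

```sql
-- named_struct takes alternating field names and values.
SELECT s.col1 AS first_field,   -- 'm'
       s.col2 AS second_field
FROM (SELECT named_struct('col1', 'm', 'col2', 11.0) AS s) t;
```

With the plain `struct` constructor the fields are not named explicitly, so `named_struct` is the more readable option when fields are accessed later by name.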
Union
A union type holds, at any given time, a value of exactly one of its specified data types.
It is similar to a union in the C language.
Syntax: UNIONTYPE<data_type, data_type …>
Example: create_union(0, 'b', 60), where the first argument is a zero-based tag that selects which of the following values the union holds.
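A union column and the tag argument of `create_union` can be sketched as follows. The table `union_demo` and the column aliases are hypothetical, and support for union types in queries may be limited depending on the Hive version.

```sql
-- Hypothetical table whose column holds either an INT or a STRING.
CREATE TABLE IF NOT EXISTS union_demo (
  val UNIONTYPE<INT, STRING>
);

-- The first argument is a zero-based tag selecting one of the values.
SELECT create_union(0, 5, 'b') AS holds_int,     -- tag 0 selects 5
       create_union(1, 5, 'b') AS holds_string;  -- tag 1 selects 'b'
```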