Hive Data Type

Hive data types fall into two major categories, primitive and complex, which are discussed below:

Primitive Data Type in Hive

Primitive Data is categorized into four types, which are listed below:

  • Numeric Data Type
  • Date/Time Data Type
  • String Data Type
  • Miscellaneous Data Type

Numeric Data Type in Hive

Numeric Data is divided into two types:

  • Integral Data Type
  • Floating Data Type

Integral Data Type:-

The integral data types are as follows:

Type | Size | Range
TINYINT | 1-byte signed integer | -128 to 127
SMALLINT | 2-byte signed integer | -32,768 to 32,767
INT | 4-byte signed integer | -2,147,483,648 to 2,147,483,647
BIGINT | 8-byte signed integer | -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
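
As a minimal sketch of how these types might appear in practice, the HiveQL below declares one column of each integral type and selects one literal of each. The table and column names are hypothetical, and the FROM-less SELECT assumes a reasonably recent Hive version.

```sql
-- Hypothetical table; names are for illustration only.
CREATE TABLE sensor_readings (
  status_code TINYINT,   -- 1 byte
  port        SMALLINT,  -- 2 bytes
  reading_id  INT,       -- 4 bytes
  total_bytes BIGINT     -- 8 bytes
);

-- Integer literals are INT by default. The Y, S, and L suffixes
-- produce TINYINT, SMALLINT, and BIGINT literals respectively.
SELECT 100Y, 100S, 100, 100L;
```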
     

Floating Data Type:-

The floating data types are as follows:

Type | Size | Description
FLOAT | 4 bytes | Single-precision floating-point number
DOUBLE | 8 bytes | Double-precision floating-point number
DECIMAL | 17 bytes | Arbitrary-precision signed decimal number
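
The sketch below shows one way these types could be declared. DECIMAL takes an optional precision and scale; the table name, column names, and the chosen precision of (10, 2) are assumptions made for illustration.

```sql
-- Hypothetical table; names are for illustration only.
CREATE TABLE product_prices (
  ratio  FLOAT,           -- single precision
  weight DOUBLE,          -- double precision
  amount DECIMAL(10, 2)   -- up to 10 digits, 2 of them after the decimal point
);

-- Casting a string to DECIMAL applies the declared scale.
SELECT CAST('19.999' AS DECIMAL(10, 2));
```

DECIMAL is usually the safer choice for monetary values, since FLOAT and DOUBLE store binary approximations.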

Date/Time Data Type

The Hive data types in this category are:

  • TimeStamp
  • Date
  • Interval

TimeStamp

A timestamp stores a date and time with up to nanosecond precision and is written in the format YYYY-MM-DD hh:mm:ss[.fffffffff].

Date

A date value has the form YYYY-MM-DD and identifies a particular year, month, and day. It does not include the time of day. The range of the Date type is 0000-01-01 to 9999-12-31.
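
As a small illustration of the two formats, the statements below build a timestamp and a date from strings. The literal values are arbitrary examples, and the FROM-less SELECT assumes a reasonably recent Hive version.

```sql
-- A timestamp with fractional seconds (up to nine digits).
SELECT CAST('2024-03-15 10:30:45.123456789' AS TIMESTAMP);

-- A date carries only year, month, and day.
SELECT CAST('2024-03-15' AS DATE);
```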

Interval

An interval represents a span of time, such as a number of days or months, that can be added to or subtracted from date and timestamp values.

The table below shows how Date values are cast to and from the String and Timestamp types:

Cast Type | Result
cast(date as date) | The same date value.
cast(date as string) | The year/month/day of the date is formatted as a string in the form 'YYYY-MM-DD'.
cast(string as date) | If the string is in the form 'YYYY-MM-DD', the corresponding date value is returned. If the string does not match this format, NULL is returned.
cast(date as timestamp) | A timestamp corresponding to midnight of the year/month/day of the date value is returned.
cast(timestamp as date) | The year/month/day of the timestamp is returned as a date value.
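
The casts in the table above can be tried with a query along the following lines. This is only a sketch: the literal values and column aliases are made up for illustration.

```sql
SELECT
  CAST(CAST('2024-03-15' AS DATE) AS STRING)             AS date_to_string,
  CAST('2024-03-15' AS DATE)                             AS string_to_date,
  CAST('15/03/2024' AS DATE)                             AS bad_format,  -- not 'YYYY-MM-DD', so NULL is expected
  CAST(CAST('2024-03-15' AS DATE) AS TIMESTAMP)          AS date_to_timestamp,
  CAST(CAST('2024-03-15 10:30:45' AS TIMESTAMP) AS DATE) AS timestamp_to_date;
```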

String Data Type

String Data is divided into three types:

  • String
  • Varchar
  • Char

String

A string literal can be enclosed in either single quotes (') or double quotes (").

Varchar

A Varchar is declared with a maximum length, which can be between 1 and 65535 characters.

Char

Char is a fixed-length type with a maximum length of 255. Values shorter than the declared length are padded with spaces.
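
The three string types might be declared as in the sketch below. The table name, column names, and lengths are hypothetical choices for illustration.

```sql
-- Hypothetical table; names and lengths are for illustration only.
CREATE TABLE customers (
  notes   STRING,       -- no declared length
  name    VARCHAR(50),  -- variable length, at most 50 characters
  country CHAR(2)       -- fixed length of 2 characters
);

-- Both quoting styles produce a string literal.
SELECT 'single quoted', "double quoted";
```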

Miscellaneous Data Type:-

The miscellaneous category contains the Boolean and binary types:

  • Boolean
  • Binary

Boolean:-

A Boolean stores one of two values: true or false.

Binary:-

It is defined as an array of bytes.
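
A minimal sketch of both types in use follows. The table and column names are hypothetical.

```sql
-- Hypothetical table; names are for illustration only.
CREATE TABLE attachments (
  is_active BOOLEAN,  -- true or false
  payload   BINARY    -- raw bytes
);

-- A Boolean literal, and a string cast to its byte representation.
SELECT true, CAST('abc' AS BINARY);
```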

Complex Data Type

Complex data is divided into four types:

  • Array
  • Map
  • Struct
  • Union

Array

An array is an ordered collection of elements of the same data type. Its elements are accessed by a zero-based integer index.

Syntax: - ARRAY<data_type>
Example: - array (6, 7)
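
To illustrate zero-based indexing, the sketch below reads the first element of the example array and its length. It assumes a Hive version that allows SELECT without a FROM clause.

```sql
-- Index 0 refers to the first element, so this selects 6.
SELECT array(6, 7)[0];

-- size() returns the number of elements in the array.
SELECT size(array(6, 7));
```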

Map

 A map is a collection of key-value pairs. Keys must be of a primitive type, while values can be of any data type. Within a given map, all keys must share one type and all values must share one type.

Syntax: - MAP<primitive_type, data_type>
Example: - map('m', 8, 'n', 9)
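
A value is looked up in a map by its key. The sketch below uses the same example map; it assumes a Hive version that allows SELECT without a FROM clause.

```sql
-- Look up the value stored under key 'm'.
SELECT map('m', 8, 'n', 9)['m'];

-- map_keys() and map_values() return the keys and values as arrays.
SELECT map_keys(map('m', 8, 'n', 9)), map_values(map('m', 8, 'n', 9));
```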

Struct

 A struct is a collection of named fields, and the fields can be of different types.

A Hive struct is similar to a struct in the C language.

Syntax: - STRUCT<col_name: data_type [COMMENT col_comment], …>
Example: - struct('m', 11.0), named_struct('col1', 'm', 'col2', '1', 'col3', '1.0')
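
Struct fields are read with dot notation. In the sketch below, the aliases s and t and the field values are hypothetical; the struct is built in a subquery so that its fields can be referenced by name.

```sql
-- Build a struct with named fields, then read two of them.
SELECT s.col1, s.col3
FROM (
  SELECT named_struct('col1', 'm', 'col2', 1, 'col3', 1.0) AS s
) t;
```

With the plain struct() constructor, Hive names the fields col1, col2, and so on automatically.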

Union

 A union type holds a value of exactly one of its specified data types at any given time.

The union data type is similar to a union in C.

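Syntax: - UNIONTYPE<data_type, data_type, …>
Example: - create_union(0, 'b', 60), where the first argument is a zero-based tag that selects which of the following values the union holds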
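
As a rough sketch, a union column could be declared and populated as shown below. The table and column names are hypothetical, and the first argument of create_union is taken to be a zero-based tag that selects which of the remaining arguments the union holds.

```sql
-- Hypothetical table with a column that holds either an INT or a STRING.
CREATE TABLE union_demo (
  u UNIONTYPE<INT, STRING>
);

-- Tag 0 selects the INT value 5; tag 1 selects the STRING value 'b'.
SELECT create_union(0, 5, 'b'), create_union(1, 5, 'b');
```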