Complex Data Types in Hive
- anydataflow
- May 1, 2020
Updated: Jul 20, 2021
Hive is a data warehouse tool built on top of Hadoop HDFS (Hadoop Distributed File System). It supports most of the useful, or I should say SQL-like, data types.
Hive supports the following categories of data types:
Numeric
Strings
Date/Timestamp
Boolean
Complex
I am assuming that you already know Hive and have used all of the types listed above except the complex data types. Please comment on this blog if you have any questions about those data types or would like to see examples of them.
This post covers three complex data types that Hive supports:
Map: A collection of (key, value) pairs.
Array: An ordered collection of elements of the same type, which can be any of the data types mentioned above.
Struct: A record-like type that keeps a set of named fields, each with its own data type, for example {"id":"101", "name":"xyz", "mobile":"12341234"}. A field is accessed as columnName.fieldName.
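As a quick illustration of the three types described above, the following sketch builds one value of each with Hive's built-in constructors and reads a single element back. The aliases a, m, s and t are made up for this example, and it assumes a Hive version that accepts a SELECT without a FROM clause.

```sql
-- Build one array, one map and one struct, then read a single element from each.
-- a, m, s and t are illustrative aliases, not names from any real table.
select
  a[0]    as first_element,  -- arrays are indexed from 0
  m['k']  as value_for_k,    -- maps are looked up by key
  s.name  as name_field      -- struct fields use dot notation
from (
  select
    array('x', 'y')                           as a,
    map('k', 1)                               as m,
    named_struct('id', '101', 'name', 'xyz')  as s
) t;
```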
Here is how to use complex data types in Hive.
Hive create table statement:
hive> create table t1Complex(id int, name string, mobile array<string>, subject map<string,int>, department array<map<string,string>>, teacher struct<math:string,science:string>) row format delimited fields terminated by '\t' collection items terminated by ',';
Here,
t1Complex is the table name.
id and name are columns with primitive data types.
mobile is an Array column.
subject is a Map column.
department is a nested type: an Array of Maps.
teacher is a Struct column.
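The statement above sets the field and collection item delimiters but not the map key delimiter, so Hive falls back to its default for that one (the \003 control character). If you plan to load the table from a text file, a variant like the sketch below may be easier to work with. The table name t1ComplexText and the choice of ':' are assumptions made for this example, and the nested department column is left out because its inner levels rely on further default delimiters.

```sql
-- Hypothetical variant of the table, meant for loading from delimited text files.
create table t1ComplexText(
  id int,
  name string,
  mobile array<string>,
  subject map<string,int>,
  teacher struct<math:string,science:string>
)
row format delimited
  fields terminated by '\t'            -- separates columns
  collection items terminated by ','   -- separates array elements, map entries and struct fields
  map keys terminated by ':';          -- separates a key from its value inside a map
```

With these delimiters, one line of the input file could look like this, with the columns separated by tab characters: 101, anyone, 1234567890,1234567891, math:501, available,Not available.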
Now look at the description of the table to understand its structure clearly:
hive> desc t1Complex;
OK
id int
name string
mobile array<string>
subject map<string,int>
department array<map<string,string>>
teacher struct<math:string,science:string>
Time taken: 0.024 seconds, Fetched: 6 row(s)
Let's insert one record and see how Hive stores the data in the table. I don't have an existing table to load from, so I am building the values manually:
hive> insert into t1Complex select 101, 'anyone', array('1234567890','1234567891'), map('math',501), array(map('c1','10000'),map('c2','11000')), named_struct('math','availabe','science','Not available') from (select '123') x;
hive> select * from t1Complex;
OK
101 anyone ["1234567890","1234567891"] {"math":501} [{"c1":"10000"},{"c2":"11000"}] {"math":"availabe","science":"Not available"}
Time taken: 0.045 seconds, Fetched: 1 row(s)
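Selecting whole columns is only half of the story. A query along the following lines reads individual pieces out of each complex column of t1Complex. The column aliases are illustrative; size() is a built-in Hive function.

```sql
-- Read individual pieces of each complex column in t1Complex.
select
  id,
  mobile[0]           as first_mobile,    -- first element of the array
  size(mobile)        as mobile_count,    -- number of elements in the array
  subject['math']     as math_value,      -- map lookup by key
  department[0]['c1'] as first_dept_c1,   -- index into the array, then look up the map key
  teacher.science     as science_teacher  -- struct field by name
from t1Complex;
```

For the single row inserted above, this should return 101, 1234567890, 2, 501, 10000 and Not available.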
Let me break down the values returned above, column by column, to make them easier to read:
id (int): 101
name (string): anyone
mobile (array<string>): ["1234567890","1234567891"]
subject (map<string,int>): {"math":501}
department (array<map<string,string>>): [{"c1":"10000"},{"c2":"11000"}]
teacher (struct<math:string,science:string>): {"math":"availabe","science":"Not available"}
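If you need one row per array element or per map entry instead of the packed values shown above, Hive's explode() function combined with LATERAL VIEW is a common way to flatten them. The sketch below assumes the t1Complex table from this post; the aliases m, s, phone, subject_name and subject_value are made up for the example.

```sql
-- One output row per phone number in the mobile array.
select id, name, phone
from t1Complex
lateral view explode(mobile) m as phone;

-- One output row per (key, value) entry in the subject map.
select id, subject_name, subject_value
from t1Complex
lateral view explode(subject) s as subject_name, subject_value;
```

With the single record inserted earlier, the first query should produce two rows (one per mobile number) and the second query one row (for the math entry).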
Thanks for visiting. Please subscribe to the blog and let us know if you face any challenges in the big data ecosystem; we will try to come up with solutions.