Question:- Consider a scenario , we have table in hive containing one column as INT and one column as ARRAY . Display all values as one on one mapping/ horizontally.
Solution:
1. Below is the data set to explain the example. '\t' is the field delimiter and Control+B is the collection items delimiter.
2. Create a managed table in the database using below query on the hive shell:-
hive > CREATE TABLE arrays(emp_id INT, dep_id ARRAY<STRING>, address_id ARRAY<STRING>) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' COLLECTION ITEMS TERMINATED BY '\002';
hive > LOAD DATA LOCAL INPATH '/<pathtofile>/array.txt' overwrite INTO TABLE arrays;
4. To check the data has been loaded successfully run below query:-
hive > select * from arrays;
Below output will be produced-
So far we have successfully created our data. Now before moving forward explanation of Lateral View keyword is required.
Lateral view is used with user-defined table generating functions such as
explode()
. explode() is built in table generate function that produces multiple rows for one row as input.
Use Case 1: Query to print department id horizontally with employee id
hive > select emp_id,dep_id from arrays LATERAL VIEW EXPLODE(dep_id) arrays as dep_id;
output:-
Use Case 2: Query to print multiple columns horizontally with employee id
hive > select emp_id,dep_id,address_id from arrays LATERAL VIEW EXPLODE(dep_id) arrays as dep_id LATERAL VIEW EXPLODE(address_id) arrays as address_id;
output:-
Use Case 3: As you can see that row with emp_id 4 is eliminated in output produced this is because that was having null in address_id. To get that row as well we need to use LATERAL View with OUTER.
hive
> select emp_id,dep_id,address_id from arrays LATERAL VIEW
EXPLODE(dep_id) arrays as dep_id LATERAL VIEW OUTER EXPLODE(address_id) arrays
as address_id;
output:- As you can see that employee id with 4 is coming as well.
Input file is attached with the post.
I hope you like my explanation. Please leave a comment if you like it.