Databricks Guide
Array Functions
- Array Union: returns array without duplicates between two arrays
SELECT array_union(array(1, 2, 2, 3), array(1, 3, 5));
[1,2,3,5]
- Array Distinct: same as array union but for one array (Collect set also works)
SELECT array_distinct(array(1, 2, 3,NULL, 3));
[1,2,3,NULL]
- Array Intersect: Finds the common elements between two arrays
>SELECT array_intersect(array(1, 2, 3), array(1, 3, 3, 5));
[1,3]
- Array Compact: Removes Nulls from an array
SELECT array_compact(array(1, 2,NULL, 3,NULL, 3));
[1, 2, 3, 3]
- Flatten: combines arrays together
SELECT flatten(array(array(1, 2), array(3, 4)));
[1,2,3,4]
- Filter: Filters an array based on a lamda expression
SELECT filter(array(1, 2, 3), x -> x % 2 == 1);
[1,3]
- Transforms: Transforms each individual array element based on lamda expression
- Set Operators
- Union All: all the rows
- Union Distinct: does not contain any duplicate rows (the default)
- INTERSECt: if it appears in both
- EXCEPT SELECt or MINUS: Makes exceptions in the queries
TBLEProperties
- can be used as part of the SQL definition statement to give info on the table
- Syntax:
set TBLEProperties