Joins in a Pipeline

Joining two datasets combines the data from both based on a common field or set of fields. Before joining, make sure the common field has the same data type and format in both datasets. For a one-to-one match, the field should also be unique in each dataset: a duplicated value produces one output row for every matching pair, which can multiply rows unexpectedly. After joining, always review the result to confirm that the merge succeeded and that the data makes sense.
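The checks described above can be sketched in a few lines of Python. This is only an illustration on small in-memory tables: the sample data, the field name `customer_id`, and the helper `check_join_key` are hypothetical and not part of any pipeline tool.

```python
from collections import Counter

def check_join_key(rows, key):
    """Report the value types and the duplicated values found in a join field."""
    values = [row[key] for row in rows]
    types = {type(v).__name__ for v in values}
    duplicates = [v for v, n in Counter(values).items() if n > 1]
    return {"types": types, "duplicates": duplicates}

# Hypothetical sample data: the left table stores IDs as integers,
# the right table stores them as strings and repeats one of them.
customers = [{"customer_id": 1, "name": "Ada"}, {"customer_id": 2, "name": "Lin"}]
orders = [
    {"customer_id": "1", "total": 30},
    {"customer_id": "1", "total": 12},
    {"customer_id": "3", "total": 99},
]

left_report = check_join_key(customers, "customer_id")
right_report = check_join_key(orders, "customer_id")

# If the types differ, values such as 1 and "1" will never match in a join.
types_match = left_report["types"] == right_report["types"]
print(types_match, left_report, right_report)
```

Here the type mismatch would need to be fixed (for example, by converting both fields to the same type) before the join could match any rows, and the repeated ID in the right table shows that the relationship is one-to-many rather than one-to-one.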

Types of Joins

Term

Description

Visual

Outer Join

Outer joins keep all rows from both tables. Rows that match on the common field are combined; rows found in only the left or only the right table are also kept, with the other table's fields left empty.

Screen Shot 2018-03-21 at 1.15.53 PM.png

Inner Join

Inner joins keep only the rows that match in both tables.

Screen Shot 2018-03-21 at 1.16.01 PM.png

Left Join

Left joins keep all rows from the left table and add data from matching rows in the right table. Right-table rows with no match are excluded.

Screen Shot 2018-03-21 at 1.16.06 PM.png

Right Join

Right joins keep all rows from the right table and add data from matching rows in the left table. Left-table rows with no match are excluded.


Screen Shot 2018-03-21 at 1.16.11 PM.png
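
To make the four join types concrete, here is a minimal Python sketch that joins two small lists of records on a shared field. It illustrates standard join semantics only and is not how any particular pipeline tool implements joins. The function `join`, the sample tables, and the field names are hypothetical, and the sketch assumes the two tables share no column names other than the join field.

```python
def join(left, right, key, how="inner"):
    """Join two lists of dicts on `key`. `how`: inner, left, right, or outer."""
    if how not in ("inner", "left", "right", "outer"):
        raise ValueError(f"unknown join type: {how}")
    # Column names (other than the key) used to fill unmatched rows with None.
    left_cols = {c for row in left for c in row if c != key}
    right_cols = {c for row in right for c in row if c != key}

    result = []
    matched_right = set()  # indexes of right rows that found a partner
    for l_row in left:
        partners = [i for i, r_row in enumerate(right) if r_row[key] == l_row[key]]
        for i in partners:
            matched_right.add(i)
            result.append({**l_row, **right[i]})
        # Left and outer joins keep left rows that have no partner.
        if not partners and how in ("left", "outer"):
            result.append({**l_row, **dict.fromkeys(right_cols)})
    # Right and outer joins keep right rows that have no partner.
    if how in ("right", "outer"):
        for i, r_row in enumerate(right):
            if i not in matched_right:
                result.append({**dict.fromkeys(left_cols), **r_row})
    return result

# Hypothetical sample tables: only id 2 appears in both.
customers = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Lin"}]
orders = [{"id": 2, "total": 30}, {"id": 3, "total": 99}]

for how in ("inner", "left", "right", "outer"):
    print(how, join(customers, orders, "id", how))
```

With this sample data, the inner join returns one row (id 2), the left and right joins return two rows each, and the outer join returns three rows, with `None` standing in for the fields of the table that had no match.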