Now that we have explored querying data from a single table, we must also consider the fact that more often than not, the columns that we require can lie in different tables. To query data from multiple tables, we will need to use the JOIN function. A more detailed explanation of JOINs can be found here but essentially, the JOIN function aims to pull out data from different tables by linking the tables via a common column. A visual representation of this concept can be seen in the diagram below. As shown in the diagram, by linking Table 1 and Table 2 via the common column of ”Product ID”, we will be able to obtain columns X, Y, A and B in the same output. This will not be achievable if we did not use the “JOIN” function.
*Note : The names of the 2 “common columns ” do not have to be the same as long as the values within the columns can be matched. This is important because the same column may be named differently in different tables but they actually refer to the same thing. For example, “Product ID” may be named as “Product No.” in another table but they refer to the same thing.
There are 4 basic types of JOINS that we typically use, namely the INNER, OUTER, LEFT and RIGHT JOINs.
A practical example of using JOINs could be an online store having different tables to store logistics and product information. To find the performance of the product in terms of sales,views,clicks etc. you will query from the product information table. However, after the product generates sales, you will also need to ensure that the logistics side of things run smoothly. Thus, you will also need to query data from the logistics table for information such as dispatch time etc. To get information on both the product and logistics performance, you will need to use the JOIN function to get the full set of information needed from both tables.
In such a case, the query will be something like this :
SELECT product_name a, product_name b, product_views a, product_orders a, product_ship_time b
FROM sales_table a
LEFT JOIN logistics_table b ON a.product_name = b.product_name
In this query, what we’re trying to do is to really query product orders, product views and the product shipment time from 2 different tables that are linked via the product name column.
* note the use of variables such as a and b to define the tables. This will save you time in referencing to the tables, especially when you have more complex queries.
The general syntax for using JOIN would be as such :
SELECT column_name(s)
FROM table_name1
JOIN table_name2 — Note : The JOIN here can be INNER/OUTER/LEFT/RIGHT JOIN
ON table_name1.column_name = table_name2.column_name
For the JOIN to work, the values in the column from table1 and table2 must match (ie a value in Column x of Table A must be present in Column x of Table B). You can think of it as similar to the VLOOKUP Function in Excel, where you can try to match values in a column from multiple spreadsheets.
To test your understanding on SQL JOINs, you can attempt a quiz here.