SQL Data Retrieval

We will be using Production schema from AdventureWorks 2012 database on SQL Server 2012 for all our examples. Click on the picture for schema diagram.

Production Schema

Lets start with an example by constructing a basic Select statement from some of the table in the picture above..

SELECT  Product.Name ,
        ProductID ,
        ProductLine,
        ListPrice,
        AVG(ListPrice) AS Avg_ListPrice,
        StandardCost
FROM  AdventureWorks2012.Production.Product
WHERE ListPrice > = 100
AND StandardCost <= 300
AND ProductLine IS NOT NULL
GROUP BY ProductID,
         ListPrice,
         Product.Name,
         ProductLine,
         ListPrice,
         StandardCost
HAVING AVG(ListPrice) < 500
ORDER BY ProductID ASC,
         Product.Name ASC,
         ListPrice DESC

 

Complete Basic Select SQL Statement

The above example was for selecting the data from a single table. The tables will most likely be in normalized form when working on either RMDBS database with dimensional modeling or OLAP cubes and since the data would reside on a large number of normalized tables; while retrieving data, a SQL query needs to be constructed that will join different tables related by Primary keys and Foreign keys. I covered Normalization upto 3NF in this post.

Each table is created/designed to have at least a unique primary key (PK) for every row and the tables can be related to other tables by using Foreign Key(FK) as shown in the example below.

Primary Key and Foreign Key Relation

 

ProductID is the Primary Key in Product table and ProductSubcategoryID is one of the Foreign key. You can see that ProductSubcategory has ProductSubcategoryID as its Primary key and these two tables can be joined through the Primary Key – Foreign key join.

Let’s join Product and ProductSubcategory tables. Note how we applied an inner join on the third row. Right Outer join, Left Outer join or Outer Joins can also be applied depending on what result set you are trying to retrieve.

SELECT  Product.Name , ProductID , ProductLine , Product.ProductSubcategoryID
FROM  AdventureWorks2012.Production.Product AS Product
INNER JOIN AdventureWorks2012.Production.ProductSubcategory AS ProductSubCat
ON Product.ProductSubcategoryID = ProductSubCat.ProductSubcategoryID

 

Resultset 1 basic join

Let’s understand SubQuery concept.  Take tables “Product”, “BillofMaterials” and “UnitMeasure” for selecting products that are of subcategory code “12” and the products should be in “Each”. There are 38 unit measure codes in “BillofMaterials” table  and the Name of those unit measure codes are in the table “UnitMeasure”. So lets say you don’t want to look up the Name from the UnitMeasure table manually to find out the code(for “Each” in this case)  that you want to filter the result set for. We can write a subquery that will get the measure code for selected measure code name.

In short, we will need to add another query (subquery) in the where clause conditions that will retrieve records for “Each” and feed that into the main query’s Where clause condition.

SELECT  Product.Name ,
        ProductID ,
        ProductLine ,
        Product.ProductSubcategoryID,
BillofMaterial.UnitMeasureCode,
BillofMaterial.PerAssemblyQty
FROM    AdventureWorks2012.Production.Product AS Product
        INNER JOIN AdventureWorks2012.Production.BillOfMaterials AS BillofMaterial
ON Product.ProductID = BillofMaterial.BillOfMaterialsID
WHERE ProductLine IS NOT NULL
AND ProductSubcategoryID = 12
AND UnitMeasureCode = (SELECT UnitMeasure.UnitMeasureCode
FROM AdventureWorks2012.Production.UnitMeasure
WHERE UnitMeasure.Name = 'Each' )

Let’s learn about couple of operators -UNION, UNION ALL, EXCEPT, EXCEPT ALL, INTERSECT, INTERSECT ALL etc.

UNION operator will combine data from two or more tables and UNION ALL will retain the duplicate rows.

SELECT ProductSubcategoryID
      ,Name
  FROM AdventureWorks2012.Production.ProductSubcategory
  WHERE ProductSubcategoryID IS NOT NULL
  union
  SELECT ProductSubcategoryID
         ,[Name]
   FROM AdventureWorks2012.Production.product
   WHERE ProductSubcategoryID is NOT NULL

Union Query Example

EXCEPT operator will give results from the first half of the SQL but not from the Tables that are after EXCEPT operator.

Example:

SELECT ProductSubcategoryID
      ,Name
  FROM AdventureWorks2012.Production.ProductSubcategory
  WHERE ProductSubcategoryID IS NOT NULL
  EXCEPT
  SELECT ProductSubcategoryID
         ,[Name]
   FROM AdventureWorks2012.Production.product
   WHERE Name  IN ('Forks','Gloves','Handlebars','Headsets','Helmets')
Result for EXCEPT Operator

INTERSECT operator will give  only the matching rows from the queries.

Example.

SELECT  Product.Name ,
        ProductID ,
        ProductLine,
ListPrice,
StandardCost,
FinishedGoodsFlag
FROM  AdventureWorks2012.Production.Product
WHERE StandardCost> 1000
INTERSECT
SELECT  Product.Name ,
        ProductID ,
        ProductLine,
ListPrice,
StandardCost,
FinishedGoodsFlag
FROM  AdventureWorks2012.Production.Product
WHERE FinishedGoodsFlag = '1'
ORDER BY Name, ProductID

 

      AND ProductLine= 'M'
Intersect SQL Example

While using these operators, the select list should have equal and identical fields in terms of datatype and data length. Also note in the query above for INTERSECT, we put order by clause in the end after both the SQL statements.

 

In the above section, we saw the structure of SELECT Statement, how to join multiple tables, operators and building a subquery.

Let’s explore Nested Table expression that is very frequently used while writing Stored Procedures or Views. By using this expression we are creating a temporary view from a query that is defined in the FROM Clause. This helps when you are trying to apply some aggregations on a set of data and use the result set as a temporary table. This temporary table takes the spot of an actual table name in the FROM clause and acts like a virtual table.

We’ll use the Human Resource schema for all the examples in this post. See how department information is stored in Department table and the history for each employee’s department is stored in “EmployeeDepartmentHistory” Table. Note the usage of Start and End date that basically tells which department id an employee belonged to at any point in history. Details of employees are as you can see stored in the “Employee” table.

Human Resource Schema

Before we construct a nested query, let us take a peek at the data for each of the tables we’ll be using in our final query.

**This is partial result set for Employee Table.

SELECT
LoginID
 ,VacationHours
,SickLeaveHours
,OrganizationLevel, SalariedFlag
FROM AdventureWorks2012.HumanResources.Employee

Employee Table

Partial data for EmployeeDepartmentHistory table.

Resultset for EmpDeptHistory

Data from Department table

Department table resultset

 

In order to construct a nested query,  let’s make up a scenario where we want to find the Total of Average vacation hours by salaried and non salaried employees across the company.

We’ll also add some rules to our query.

  • Results should consist of record for only those employees who took more than 20 Sick Leave hours in the selected time frame.
  • Result should include employees from Sales, Purchasing, Marketing or Engineering department only.

 

Here is the final query which we will dissect and understand after the jump.

SELECT SUM(AVG_VacationHours) AS TotalOfAvgVacHours, AVG(TotalTimeOff) AS AVGTotalTimeOff, SalariedFlag
FROM (
SELECT
AVG(VacationHours) AS AVG_VacationHours
,SUM(SickLeaveHours+VacationHours) AS TotalTimeOff
,OrganizationLevel
,EmpHistory.DepartmentID AS DeptID
,Department.Name
,SalariedFlag
FROM AdventureWorks2012.HumanResources.Employee AS Employee 

INNER JOIN AdventureWorks2012.HumanResources.EmployeeDepartmentHistory AS EmpHistory
ON Employee.BusinessEntityID = EmpHistory.BusinessEntityID
INNER JOIN AdventureWorks2012.HumanResources.Department AS Department
ON Department.DepartmentID = EmpHistory.DepartmentID
WHERE HireDate between '01-05-2005' AND GETDATE()
AND SickLeaveHours >20
AND Department.Name IN ('Sales','Purchasing','Marketing','Engineering')
GROUP BY OrganizationLevel, SalariedFlag, EmpHistory.DepartmentID, Name
)
AS
Temp_Table
GROUP BY Temp_Table.SalariedFlag

Result:
Final Result for Nested Query

We start by selecting the items we want for forming our virtual table, the result of which will be used in the From Clause of the main query. The SQL in the nested expression is used in place of an actual table and meets our rules that I mentioned earlier.

In the nested query, we’ll do an Average of  vacation hours and Sum up Sick Leave days with Vacation days for employees hired between 01-05-2005 to date. We’ll also pick the departments in this query and only those employees who took more than 20 hours sick leave.

Here is the result set for the nested query.

Nested Query Result set

Now this is the data of a virtual table that will be used in the FROM clause of main query. In the main query, we will sum up all the averages for vacation hours and group it by Salaried Flag.

The end result tells us how many average vacation hours salaried employees and unsalaried employees(contractors) took. The query also tells us the average time off for both categories of salaried and non salaried employees.

Here is the result set again.

Final Result for Nested Query

 

 

Leave a Reply