Wikibooks SQL Exercise: Employee management

Load the Libraries

Create Tables

Pandas and Spark dataframes

data: Departments

data: Employees

data: combined data

Merge two dataframes using PySpark

Spark dataframe attributes

Create SQL table from Spark dataframes

Global Temporary View

Temporary views in Spark SQL are session-scoped and disappear when the session that created them terminates. If you want a temporary view that is shared among all sessions and kept alive until the Spark application terminates, you can create a global temporary view. A global temporary view is tied to the system-preserved database global_temp, and we must use the qualified name to refer to it, e.g. SELECT * FROM global_temp.view1.
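The difference between the two kinds of view can be sketched as follows. This is a minimal illustration, assuming a local PySpark installation; the view names, the two sample rows, and the helper functions `qualified_global_name` and `demo_global_temp_view` are hypothetical and not part of the original notebook.

```python
def qualified_global_name(view_name: str) -> str:
    """Return the qualified name needed to query a global temporary view."""
    return f"global_temp.{view_name}"


def demo_global_temp_view():
    """Register both kinds of view and read the global one from a new session."""
    # Imported here so the helper above works even without Spark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]")
        .appName("global-temp-view-sketch")
        .getOrCreate()
    )
    # Two illustrative rows; column names are assumptions.
    df = spark.createDataFrame([(14, "IT"), (37, "Accounting")], ["Code", "Name"])

    # Session-scoped: visible only through `spark`.
    df.createOrReplaceTempView("departments_tmp")
    # Application-scoped: visible from every session, under global_temp.
    df.createOrReplaceGlobalTempView("departments_global")

    # A second session of the same application can still read the global view.
    other_session = spark.newSession()
    query = f"SELECT * FROM {qualified_global_name('departments_global')}"
    return other_session.sql(query).collect()
```

Calling `demo_global_temp_view()` should return the rows through the second session, whereas querying `departments_tmp` from that session would be expected to fail because the view belongs to the first session only.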

Exercise Questions

Select the last name of all employees.

Select the last name of all employees, without duplicates.

Select all the data of employees whose last name is "Smith".

Select all the data of employees whose last name is "Smith" or "Doe".

Select all the data of employees that work in department 14.

Select all the data of employees that work in department 37 or department 77.

Select all the data of employees whose last name begins with an "S".
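The selection exercises above rely on DISTINCT, IN, and LIKE. The sketch below runs them with Python's built-in sqlite3 module instead of Spark, so that it works without a Spark session. The column names (Code, Name, Budget, SSN, LastName, Department) and the sample rows are assumptions chosen for illustration, not the notebook's actual data.

```python
import sqlite3


def make_db():
    """Build an in-memory database with a few illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Departments (Code INTEGER PRIMARY KEY, Name TEXT, Budget INTEGER);
        CREATE TABLE Employees (SSN INTEGER PRIMARY KEY, Name TEXT, LastName TEXT,
                                Department INTEGER REFERENCES Departments(Code));
        INSERT INTO Departments VALUES
            (14, 'IT', 65000), (37, 'Accounting', 15000),
            (59, 'Human Resources', 240000), (77, 'Research', 55000);
        INSERT INTO Employees VALUES
            (101, 'Michael', 'Rogers', 14), (102, 'Carol', 'Smith', 37),
            (103, 'Joe', 'Stevens', 37), (104, 'John', 'Doe', 59),
            (105, 'David', 'Smith', 77), (106, 'Elizabeth', 'Doe', 14),
            (107, 'Kumar', 'Swamy', 14);
    """)
    return conn


conn = make_db()

# Last names without duplicates.
distinct_last_names = [row[0] for row in conn.execute(
    "SELECT DISTINCT LastName FROM Employees ORDER BY LastName")]

# Employees whose last name is Smith or Doe.
smith_or_doe = conn.execute(
    "SELECT * FROM Employees WHERE LastName IN ('Smith', 'Doe') ORDER BY SSN"
).fetchall()

# Employees in department 37 or 77.
dept_37_or_77 = conn.execute(
    "SELECT * FROM Employees WHERE Department IN (37, 77) ORDER BY SSN"
).fetchall()

# Employees whose last name begins with S.
starts_with_s = conn.execute(
    "SELECT * FROM Employees WHERE LastName LIKE 'S%' ORDER BY SSN"
).fetchall()
```

The same SQL strings could be passed to `spark.sql(...)` once the dataframes are registered as views. One caveat: SQLite's LIKE ignores ASCII case by default, so results may differ on data with lower-case last names.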

Select the sum of all the departments' budgets.

Select the number of employees in each department (you only need to show the department code and the number of employees).
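The two aggregation exercises above come down to SUM and GROUP BY with COUNT. The sketch below uses Python's built-in sqlite3 as a stand-in for Spark SQL; the schema and the sample rows are illustrative assumptions.

```python
import sqlite3


def make_db():
    """Build an in-memory database with a few illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Departments (Code INTEGER PRIMARY KEY, Name TEXT, Budget INTEGER);
        CREATE TABLE Employees (SSN INTEGER PRIMARY KEY, Name TEXT, LastName TEXT,
                                Department INTEGER REFERENCES Departments(Code));
        INSERT INTO Departments VALUES
            (14, 'IT', 65000), (37, 'Accounting', 15000),
            (59, 'Human Resources', 240000), (77, 'Research', 55000);
        INSERT INTO Employees VALUES
            (101, 'Michael', 'Rogers', 14), (102, 'Carol', 'Smith', 37),
            (103, 'Joe', 'Stevens', 37), (104, 'John', 'Doe', 59),
            (105, 'David', 'Smith', 77), (106, 'Elizabeth', 'Doe', 14),
            (107, 'Kumar', 'Swamy', 14);
    """)
    return conn


conn = make_db()

# Sum of all department budgets.
total_budget = conn.execute("SELECT SUM(Budget) FROM Departments").fetchone()[0]

# Number of employees per department code.
headcount = conn.execute("""
    SELECT Department, COUNT(*) AS NumEmployees
    FROM Employees
    GROUP BY Department
    ORDER BY Department
""").fetchall()
```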

Select all the data of employees, including each employee's department's data.

Select the name and last name of each employee, along with the name and budget of the employee's department.

Select the name and last name of employees working for departments with a budget greater than 60,000.
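The three join exercises above share one pattern: match Employees.Department against Departments.Code, then project or filter. A minimal sketch follows, again using Python's built-in sqlite3 in place of Spark SQL, with an assumed schema and illustrative rows.

```python
import sqlite3


def make_db():
    """Build an in-memory database with a few illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Departments (Code INTEGER PRIMARY KEY, Name TEXT, Budget INTEGER);
        CREATE TABLE Employees (SSN INTEGER PRIMARY KEY, Name TEXT, LastName TEXT,
                                Department INTEGER REFERENCES Departments(Code));
        INSERT INTO Departments VALUES
            (14, 'IT', 65000), (37, 'Accounting', 15000),
            (59, 'Human Resources', 240000), (77, 'Research', 55000);
        INSERT INTO Employees VALUES
            (101, 'Michael', 'Rogers', 14), (102, 'Carol', 'Smith', 37),
            (103, 'Joe', 'Stevens', 37), (104, 'John', 'Doe', 59),
            (105, 'David', 'Smith', 77), (106, 'Elizabeth', 'Doe', 14),
            (107, 'Kumar', 'Swamy', 14);
    """)
    return conn


conn = make_db()

# Employee names together with their department's name and budget.
joined = conn.execute("""
    SELECT E.Name, E.LastName, D.Name, D.Budget
    FROM Employees AS E
    INNER JOIN Departments AS D ON E.Department = D.Code
    ORDER BY E.SSN
""").fetchall()

# Employees whose department has a budget greater than 60,000.
well_funded = conn.execute("""
    SELECT E.Name, E.LastName
    FROM Employees AS E
    INNER JOIN Departments AS D ON E.Department = D.Code
    WHERE D.Budget > 60000
    ORDER BY E.SSN
""").fetchall()
```

Replacing the select list of the first query with `SELECT *` gives the "all the data" variant of the exercise.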

Select the departments with a budget larger than the average budget of all the departments.

Select the names of departments with more than two employees.

Select the name and last name of employees working for the department with the second lowest budget.
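The last three exercises need subqueries or HAVING. One possible formulation is sketched below with Python's built-in sqlite3 rather than Spark; the schema and rows are illustrative assumptions, and the second-lowest budget is found as the smallest budget strictly greater than the minimum.

```python
import sqlite3


def make_db():
    """Build an in-memory database with a few illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Departments (Code INTEGER PRIMARY KEY, Name TEXT, Budget INTEGER);
        CREATE TABLE Employees (SSN INTEGER PRIMARY KEY, Name TEXT, LastName TEXT,
                                Department INTEGER REFERENCES Departments(Code));
        INSERT INTO Departments VALUES
            (14, 'IT', 65000), (37, 'Accounting', 15000),
            (59, 'Human Resources', 240000), (77, 'Research', 55000);
        INSERT INTO Employees VALUES
            (101, 'Michael', 'Rogers', 14), (102, 'Carol', 'Smith', 37),
            (103, 'Joe', 'Stevens', 37), (104, 'John', 'Doe', 59),
            (105, 'David', 'Smith', 77), (106, 'Elizabeth', 'Doe', 14),
            (107, 'Kumar', 'Swamy', 14);
    """)
    return conn


conn = make_db()

# Departments with a budget above the average budget.
above_average = [row[0] for row in conn.execute("""
    SELECT Name FROM Departments
    WHERE Budget > (SELECT AVG(Budget) FROM Departments)
""")]

# Departments with more than two employees.
more_than_two = [row[0] for row in conn.execute("""
    SELECT D.Name
    FROM Departments AS D
    INNER JOIN Employees AS E ON E.Department = D.Code
    GROUP BY D.Code, D.Name
    HAVING COUNT(*) > 2
""")]

# Employees in the department with the second lowest budget.
second_lowest = conn.execute("""
    SELECT E.Name, E.LastName
    FROM Employees AS E
    WHERE E.Department IN (
        SELECT Code FROM Departments
        WHERE Budget = (
            SELECT MIN(Budget) FROM Departments
            WHERE Budget > (SELECT MIN(Budget) FROM Departments)
        )
    )
""").fetchall()
```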

Add a new department called "Quality Assurance", with a budget of 40,000 and departmental code 11. Add an employee called "Mary Moore" in that department, with SSN 847-21-9811.

Reduce the budget of all departments by 10%.

Reassign all employees from the Research department (code 77) to the IT department (code 14).
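The three modification exercises above map onto INSERT and UPDATE statements. The sketch below runs them in order with Python's built-in sqlite3; the schema and the pre-existing rows are illustrative assumptions, while the inserted values come from the exercise text. Whether the same statements run unchanged in Spark SQL depends on the table format, so treat this as a statement of intent rather than Spark-specific syntax.

```python
import sqlite3


def make_db():
    """Build an in-memory database with a few illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Departments (Code INTEGER PRIMARY KEY, Name TEXT, Budget INTEGER);
        CREATE TABLE Employees (SSN INTEGER PRIMARY KEY, Name TEXT, LastName TEXT,
                                Department INTEGER REFERENCES Departments(Code));
        INSERT INTO Departments VALUES
            (14, 'IT', 65000), (37, 'Accounting', 15000),
            (59, 'Human Resources', 240000), (77, 'Research', 55000);
        INSERT INTO Employees VALUES
            (101, 'Michael', 'Rogers', 14), (102, 'Carol', 'Smith', 37),
            (103, 'Joe', 'Stevens', 37), (104, 'John', 'Doe', 59),
            (105, 'David', 'Smith', 77), (106, 'Elizabeth', 'Doe', 14),
            (107, 'Kumar', 'Swamy', 14);
    """)
    return conn


conn = make_db()

# Add the Quality Assurance department and Mary Moore (SSN 847-21-9811).
conn.execute(
    "INSERT INTO Departments (Code, Name, Budget) VALUES (11, 'Quality Assurance', 40000)")
conn.execute(
    "INSERT INTO Employees (SSN, Name, LastName, Department) "
    "VALUES (847219811, 'Mary', 'Moore', 11)")

# Reduce every department's budget by 10%.
conn.execute("UPDATE Departments SET Budget = Budget * 0.9")

# Move everyone from Research (77) to IT (14).
conn.execute("UPDATE Employees SET Department = 14 WHERE Department = 77")

budgets = dict(conn.execute("SELECT Code, Budget FROM Departments"))
research_count = conn.execute(
    "SELECT COUNT(*) FROM Employees WHERE Department = 77").fetchone()[0]
it_count = conn.execute(
    "SELECT COUNT(*) FROM Employees WHERE Department = 14").fetchone()[0]
mary = conn.execute(
    "SELECT Name, LastName, Department FROM Employees WHERE SSN = 847219811").fetchone()
```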

Delete from the table all employees in the IT department (code 14).

Delete from the table all employees who work in departments with a budget greater than or equal to 60,000.

Delete from the table all employees.
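The three deletion exercises above can be run one after another, checking how many employees remain after each step. This is a sketch with Python's built-in sqlite3 on an assumed schema and illustrative rows, not the notebook's Spark code.

```python
import sqlite3


def make_db():
    """Build an in-memory database with a few illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Departments (Code INTEGER PRIMARY KEY, Name TEXT, Budget INTEGER);
        CREATE TABLE Employees (SSN INTEGER PRIMARY KEY, Name TEXT, LastName TEXT,
                                Department INTEGER REFERENCES Departments(Code));
        INSERT INTO Departments VALUES
            (14, 'IT', 65000), (37, 'Accounting', 15000),
            (59, 'Human Resources', 240000), (77, 'Research', 55000);
        INSERT INTO Employees VALUES
            (101, 'Michael', 'Rogers', 14), (102, 'Carol', 'Smith', 37),
            (103, 'Joe', 'Stevens', 37), (104, 'John', 'Doe', 59),
            (105, 'David', 'Smith', 77), (106, 'Elizabeth', 'Doe', 14),
            (107, 'Kumar', 'Swamy', 14);
    """)
    return conn


def count_employees(conn):
    """Return the number of rows currently in Employees."""
    return conn.execute("SELECT COUNT(*) FROM Employees").fetchone()[0]


conn = make_db()
remaining = []

# 1. Delete all employees in the IT department (code 14).
conn.execute("DELETE FROM Employees WHERE Department = 14")
remaining.append(count_employees(conn))

# 2. Delete employees whose department has a budget of at least 60,000.
conn.execute("""
    DELETE FROM Employees
    WHERE Department IN (SELECT Code FROM Departments WHERE Budget >= 60000)
""")
remaining.append(count_employees(conn))

# 3. Delete all remaining employees.
conn.execute("DELETE FROM Employees")
remaining.append(count_employees(conn))
```

The second step uses a subquery because the budget lives in Departments, not Employees.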

Time taken