ETF2121/5912: Individual Assignment
Due Wednesday, 4 September 2024, 11.55pm via Moodle
In this assignment, you will learnβ¦
able to use SQL effectively to
Write SQL queries to extract data from a database
Use SQL to perform data cleaning and transformation
Use SQL to perform data analysis and derive insights
able to use Power BI effectively to
Import and clean data in Power BI
Create calculated columns and measures
Design various types of charts and graphs
Build an interactive dashboard
Derive and present meaningful insights from data
This assignment constitute 20% of the total mark of this course.
Section 1 (10%): SQL π
Objective
Demonstrate your proficiency in SQL by writing queries to extract, clean, and analyze data from a given database.
Dataset: Student Accomodation
You are given the dataset here. Please upload the sql script into oracle live sql and continue to work from there. Please watch the video below on how to upload the script. You can proceed to answer the related question once all the tables are created successfully.
How to upload the dataset
The detail of the dataset as below:
Tables and Columns:
OWNER
- Columns:
OWNER_NUM (PK)
,LAST_NAME
,FIRST_NAME
,ADDRESS
,CITY
,STATE
,ZIP_CODE
- Columns:
PROPERTY
- Columns:
PROPERTY_ID (PK)
,OFFICE_NUM
,ADDRESS
,SQR_FT
,BDRMS
,FLOORS
,MONTHLY_RENT
,OWNER_NUM (FK)
- Columns:
OFFICE
- Columns:
OFFICE_NUM (PK)
,OFFICE_NAME
,ADDRESS
,AREA
,CITY
,STATE
,ZIP_CODE
- Columns:
SERVICE_CATEGORY
- Columns:
CATEGORY_NUM (PK)
,CATEGORY_DESCRIPTION
- Columns:
SERVICE_REQUEST
- Columns:
SERVICE_ID (PK)
,PROPERTY_ID (FK)
,CATEGORY_NUMBER (FK)
,OFFICE_ID (FK)
,DESCRIPTION
,STATUS
,EST_HOURS
,SPENT_HOURS
,NEXT_SERVICE_DATE
- Columns:
RESIDENTS
- Columns:
RESIDENT_ID (PK)
,FIRST_NAME
,SURNAME
,PROPERTY_ID (FK)
- Columns:
Notes: FK
stands for foreign key and PK
stands for primary key in the corresponding table.
Tasks:
- Based on the information given above, draft a schema diagram to show the relationship between all tables.
- Which properties have a monthly rent between $1000 and $2500?
- How many properties are there in each state?
- What is the total number of bedrooms in all properties?
- What is the average estimated and spent hours on service requests?
- How many properties are managed by each office?
- What is the average monthly rent of properties in each city?
- Which properties have service requests where the spent hours exceed the estimated hours?
- What is the percentage distribution of service requests by their status?
- Which owners own more than one property. List their owner number, last name, first name, and the total number of count.
Show your sql queries for each question and submit. Paste your sql command and the screenshot of your result in a .docx file.
Section 2 (10%): Power BI πΌ
π This is a competition assignment, the best top 3 get a mystery gift! π
You will be provided with a CSV file containing sales data for a hotel booking. The dataset includes information on sales transactions, products, customers, and locations.
Objective
Demonstrate your proficiency in data wrangling and visualization using Power BI by creating a comprehensive dashboard from this dataset. (Source: Hotel Booking Demand)
Create a comprehensive dashboard using Power BI to assist hotel management in making data-driven decisions. Your dashboard should provide insights into key performance indicators (KPIs) and trends that can help improve hotel operations, guest satisfaction, and revenue management.
Dataset:
You will be using the provided dataset which contains information on hotel bookings. The dataset includes the following columns:
hotel
: Type of hotel (Resort Hotel or City Hotel)is_canceled
: Booking cancellation status (0: Not Canceled, 1: Canceled)lead_time
: Number of days before the actual arrival date when the booking was madearrival_date_year
,arrival_date_month
,arrival_date_week_number
,arrival_date_day_of_month
: Information on the arrival datestays_in_weekend_nights
: Number of weekend nights (Saturday or Sunday) the guest stayed or booked to stay at the hotelstays_in_week_nights
: Number of week nights (Monday to Friday) the guest stayed or booked to stay at the hoteladults
,children
,babies
: Number of adults, children, and babiesmeal
: Type of meal bookedcountry
: Country of origin of the guestmarket_segment
: Market segment designationdistribution_channel
: Booking distribution channelis_repeated_guest
: Whether the guest is a repeated guest (0: No, 1: Yes)previous_cancellations
,previous_bookings_not_canceled
: Number of previous bookings that were canceled or not canceledreserved_room_type
,assigned_room_type
: Code of room type reserved and assignedbooking_changes
: Number of changes made to the bookingdeposit_type
: Type of deposit madeagent
,company
: ID of the travel agent and company that made the bookingdays_in_waiting_list
: Number of days the booking was in the waiting listcustomer_type
: Type of customer (Transient, Contract, Group, Transient-Party)adr
: Average Daily Raterequired_car_parking_spaces
: Number of car parking spaces requiredtotal_of_special_requests
: Total number of special requests made by the guestreservation_status
: Reservation last status (Canceled, Check-Out, No-Show)reservation_status_date
: Date at which the last status was set
Tasks:
Data Wrangling (40%)
- Import the CSV file into Power BI.
- Clean the data, addressing any inconsistencies, missing values, or errors.
- Create at least 3 calculated measures. Ensure that your calculated measures are meaningful and correctly implemented.
- Document your data wrangling process within the Power BI file.
Dashboard Creation (50%).
Create a dashboard with at least 5 different types of visualizations. Use appropriate chart types (bar charts, line charts, pie charts, etc.) to display the data effectively. Include slicers and filters to allow users to interact with the data (e.g., filter by hotel type, date range, market segment). Ensure that your visualizations are interactive and provide meaningful insights. For example, you can identify key performance indicators (KPIs) relevant to hotel management. Examples include: Occupancy Rate, Cancellation Rate, Average Daily Rate (ADR), Revenue Per Available Room (RevPAR), Number of Repeat Guests, Average Lead Time
Presentation and Usability (10%)
- Organize your dashboard logically and aesthetically.
- Provide clear titles and labels for all visualizations.
- Based on the visualized data, provide insights and actionable recommendations for hotel management. Add this to the last page of your dashboard that should serve as a report to highlight any trends, anomalies, or areas that require attention.
Submit one Power BI file (.pbix) showing your data preparation, calculated measures, and final dashboard.
Notes: This assignment requires you to submit 2 files: 1 .docx file (with sql and the result screenshot), 1 .pbix file.