Introduction

I have started reading The Data Warehouse Toolkit (3rd Edition) by Ralph Kimball to learn more about data warehouses. Being a long-time lurker on Reddit, I have seen many people recommending this book. Personally, I read and take notes blindly. By blindly, I mean highlighting long texts in the book without fully understanding them. To understand what I have read, I will apply the concepts which I have learnt.

Star Schema

Star schemas consist of a fact table and dimension tables. A fact table has two or more foreign keys that connect to the dimension tables’ primary keys. Having all the keys in the fact table that match their respective primary keys in the corresponding dimensional tables satisfy referential integrity.

Letterboxd

To illustrate this, I intend to do up a basic Entity-Relationship Diagram of a star schema. To do that, I will take reference of my “Films of A24” viz to build a star schema. I will also want to come up with an ERD diagram of an app that I frequently use: Letterboxd.

As I am a huge fan of the movies, I have been using Letterboxd to log all the films that I have watched. I have come up with a short case and an ERD diagram of how users review films on Letterboxd. You can follow me here.

My Letterboxd account

My Letterboxd account

Simulation Case

We are on a data team for Letterboxd that needs some insights of review data. The business process is the reviews made by users.

Users: A table that stores one record per use. Users are unique and contain extra information such as membership status, city, etc.

Reviews: Users can only write one review for a single film. Each film can have more than 1 review.

Film: This table stores all films available in Letterboxd. There are some extra information like the release date of the film.

Granularity: Number of film reviews by a user per day

Fact_table: Fact_review

Dimension_table: