01: Connect to Kaggle to download data¶
- Download kaggle.json from your Kaggle account settings (Create New API Token) and upload it to this notebook
# Install Kaggle API
!pip install -q kaggle
from google.colab import files
# files.upload()  # uncomment on first run to upload kaggle.json
!mkdir -p ~/.kaggle
!mv kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json
!kaggle datasets download -d zygmunt/goodbooks-10k --unzip
!ls
# we use books.csv and ratings.csv
!cat ratings.csv | head -n 4
!cat books.csv | head -n 4
!cat tags.csv | head -n 4
!cat book_tags.csv | head -n 4
- Data Loading and Preparation
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import os
import warnings
from tensorflow.keras.layers import Input, Embedding, Flatten, Dot, Dense, Concatenate
from tensorflow.keras.models import Model
from tensorflow.keras.models import load_model
import tensorflow as tf
# warnings.filterwarnings('ignore')
ratings = pd.read_csv('ratings.csv')
books = pd.read_csv('books.csv')
from sklearn.model_selection import train_test_split
train, test = train_test_split(ratings, test_size=0.2, random_state=42)
n_users = len(ratings.user_id.unique())
n_books = len(ratings.book_id.unique())
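The +1 sizing used for the embedding layers below assumes the IDs are contiguous and start at 1; a quick check on the loaded data before building the model:
# If IDs run 1..N with no gaps, the max ID equals the number of unique IDs,
# and n_users+1 / n_books+1 are safe input dimensions for the embeddings
print(ratings.user_id.min(), ratings.user_id.max(), n_users)
print(ratings.book_id.min(), ratings.book_id.max(), n_books)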
02: Model Architecture¶
Creates two input layers, one for book IDs and one for user IDs.
shape=[1] means each input is a single number (either book ID or user ID).
book_input = Input(shape=[1], name="Book-Input")
user_input = Input(shape=[1], name="User-Input")
Embedding Layers¶
Embedding Layers: Convert integer IDs into dense vectors of fixed size.
Why +1? An Embedding layer's input dimension must be at least max_id + 1, because valid indices run from 0 up to input_dim - 1. Since the IDs here start at 1 and go up to n_books/n_users, we need n + 1 rows so the largest ID still has an embedding slot.
5: The second parameter is the embedding dimension (5 in this case), meaning each book/user will be represented by a 5-dimensional vector.
book_embedding = Embedding(n_books+1, 5, name="Book-Embedding")(book_input)
user_embedding = Embedding(n_users+1, 5, name="User-Embedding")(user_input)
Flatten Layers
Flatten: Converts the 2D embedding output (batch_size, 1, 5) to 1D (batch_size, 5) for the dot product operation.
book_vec = Flatten(name="Flatten-Books")(book_embedding)
user_vec = Flatten(name="Flatten-Users")(user_embedding)
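A quick sanity check of the shapes described above (assuming the layers just defined; these print as symbolic Keras tensor shapes):
# Before flattening: one 5-dimensional vector per input ID, plus a length-1 axis
print(book_embedding.shape)  # (None, 1, 5)
print(book_vec.shape)        # (None, 5)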
Dot Product
Dot Product: Computes the similarity between user and book embeddings.
axes=1 means the dot product is taken along the embedding dimension (size 5).
prod = Dot(name="Dot-Product", axes=1)([book_vec, user_vec])
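What Dot(axes=1) computes can be illustrated with plain NumPy: multiply the two vectors element-wise and sum over the embedding dimension, giving one score per row of the batch (toy numbers below):
# Toy example: two "user" vectors and two "book" vectors, 5 dimensions each
u = np.array([[1., 0., 2., 0., 1.],
              [0., 1., 1., 1., 0.]])
b = np.array([[2., 1., 0., 0., 3.],
              [1., 1., 0., 2., 0.]])
# Dot along axis 1: element-wise product summed over the 5 embedding dims
print((u * b).sum(axis=1))  # [5. 3.] -- one predicted score per user/book pair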
Model Compilation
Creates the model with two inputs and one output (the predicted rating).
Uses Adam optimizer and mean squared error loss (common for regression tasks).
model = Model([user_input, book_input], prod)
model.compile('adam', 'mean_squared_error')
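A quick way to confirm the wiring matches the description above:
# Inspect the architecture: two (None, 1) inputs, two (None, 1, 5) embeddings,
# two (None, 5) flattened vectors, and a single (None, 1) dot-product output
model.summary()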
03: Training and Saving¶
Checks if a trained model exists, loads it if yes, otherwise trains a new one.
Saves the trained model to disk for future use.
if os.path.exists('regression_model.h5'):
    model = load_model('regression_model.h5')
else:
    history = model.fit([train.user_id, train.book_id], train.rating,
                        batch_size=64, epochs=5, verbose=1)
    model.save('regression_model.h5')
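The 20% test split created earlier is never used above; a minimal sketch of evaluating on it (same input order as in Model([user_input, book_input], prod)):
# Mean squared error on the held-out ratings (the model was compiled with MSE
# as its only metric, so evaluate() returns a single scalar loss)
test_mse = model.evaluate([test.user_id, test.book_id], test.rating, verbose=0)
print(f"Test MSE: {test_mse:.4f}")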
04: Making Recommendations¶
- Gets all unique book IDs
- Creates an array with the same user ID repeated for all books
- Converts to tensors (required by TensorFlow)
- Gets predictions for all books
- Sorts predictions to get the top recommendations
def make_recommendations(user_id=1, model=model, n_recommendations=5):
    # 1. Get all unique book IDs from the ratings data
    book_data = np.array(list(set(ratings.book_id)))
    # 2. Create array with the target user_id repeated for each book
    user = np.array([user_id for _ in range(len(book_data))])
    # 3. Convert to TensorFlow tensors (required for model input)
    user_tensor = tf.convert_to_tensor(user, dtype=tf.int32)
    book_tensor = tf.convert_to_tensor(book_data, dtype=tf.int32)
    # 4. Get predicted ratings for all books
    predictions = model.predict([user_tensor, book_tensor], batch_size=128)
    predictions = np.array([a[0] for a in predictions])  # Flatten predictions
    # 5. Sort books by predicted rating (highest first) and map the top
    #    positions back to actual book IDs
    top_indices = (-predictions).argsort()[:n_recommendations]
    recommended_book_ids = book_data[top_indices]
    # 6. Get book details from books.csv
    recommended_books = books[books['id'].isin(recommended_book_ids)]
    return recommended_books
ratings.head(2)
books.head(2)
# Example usage
recommended_books = make_recommendations(user_id=1)
print(recommended_books)
user_id = ratings.user_id.iloc[0]
# Get recommendations for user 314 (the first user in ratings.csv)
user_314_recs = make_recommendations(user_id=user_id)
# Display results nicely
print(f"\nTop 5 Recommendations for User {user_id}:")
print(user_314_recs[['id', 'title', 'authors']].to_string(index=False))
Q: Explain why we need Adam and MSE for this dot-product code? Can't we just do something like np.dot?¶
Great question! Let me explain why we use the Adam optimizer and Mean Squared Error (MSE) in this recommendation system, and why we can't just use np.dot directly.
Why Adam Optimizer?¶
Adaptive Learning Rates:
- Adam automatically adjusts learning rates for each parameter
- In our embedding layers (user and book vectors), different dimensions may need different update rates
- Classic SGD would use the same learning rate for all parameters
Momentum-like Behavior:
- Adam combines the benefits of two other optimizers (AdaGrad and RMSProp)
- Helps navigate the loss landscape more efficiently than plain SGD
Well-suited for Embeddings:
- Embedding layers often benefit from adaptive optimization
- Different users/books may need different update magnitudes
Default Choice:
- Adam works well out-of-the-box for most deep learning tasks
- Requires less tuning than SGD with momentum
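The 'adam' string passed to model.compile above uses the Keras default configuration; the same compile call written with an explicit optimizer object, so the learning rate is visible and tunable, looks like this:
from tensorflow.keras.optimizers import Adam
# Equivalent to model.compile('adam', 'mean_squared_error'), with the default
# learning rate (0.001) written out so it can be adjusted
model.compile(optimizer=Adam(learning_rate=0.001), loss='mean_squared_error')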
Why Mean Squared Error (MSE)?¶
Regression Task:
- We're predicting ratings (continuous values), not classifications
- MSE is the standard loss function for regression problems
Punishes Large Errors:
- MSE squares the errors, so bad predictions are penalized more heavily
- Important for recommendation quality - we really don't want to recommend books the user would hate
Aligns with Rating Prediction:
- When we say "this user would rate this book 4.5 stars", we want to minimize how wrong we are
- Directly matches our objective
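For concreteness, MSE is just the mean of the squared differences between true and predicted ratings; a tiny worked example with made-up numbers:
y_true = np.array([4.0, 5.0, 3.0])   # actual ratings
y_pred = np.array([3.5, 4.0, 3.0])   # model predictions
mse = np.mean((y_true - y_pred) ** 2)
print(mse)  # (0.25 + 1.0 + 0.0) / 3 ≈ 0.4167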
Why Not Just Use np.dot?¶
Learning vs Calculation:
- np.dot would just calculate a static dot product
- Our goal is to LEARN good embeddings that make the dot product predict ratings well
- The magic happens in the embedding layers that are trained via backpropagation
End-to-End Training:
- The Dot layer in Keras is differentiable
- Allows gradients to flow back through the network to update embeddings
- np.dot wouldn't connect properly to the training process
Model Abstraction:
- Keras layers manage all the underlying tensor operations
- Handles batching, GPU acceleration, etc. automatically
- Would have to reimplement all this manually with np.dot
The Big Picture¶
What makes this system work isn't the dot product itself, but the trained embeddings:
The model learns:
- 5-dimensional user preferences (what genres/themes they like)
- 5-dimensional book characteristics (what genres/themes they contain)
During training:
- Adam optimizer adjusts these embeddings
- MSE measures how well their dot products match actual ratings
- Over time, similar users and books cluster in embedding space
For prediction:
- High dot product = good match between user preferences and book characteristics
- The dot product is just the mathematical operation that combines them
Alternative Approach¶
You could theoretically do something like:
# Initialize random embeddings
user_embeddings = np.random.rand(n_users + 1, 5)
book_embeddings = np.random.rand(n_books + 1, 5)

# Manual dot product prediction
def predict(user_id, book_id):
    return np.dot(user_embeddings[user_id], book_embeddings[book_id])

# Manual training loop (plain per-sample updates with a fixed learning rate)
lr = 0.01
for epoch in range(5):
    for user_id, book_id, rating in zip(ratings.user_id, ratings.book_id, ratings.rating):
        prediction = predict(user_id, book_id)
        error = rating - prediction
        # Update both embeddings, using the user vector from before this step's update
        user_vec = user_embeddings[user_id].copy()
        user_embeddings[user_id] += lr * error * book_embeddings[book_id]
        book_embeddings[book_id] += lr * error * user_vec
But this would be:
- Much slower (no GPU acceleration)
- Harder to implement correctly
- Less flexible for model changes
- Missing many optimizations that Adam provides
The Keras implementation gives us all this functionality in just a few lines of clean, maintainable code.
This recommendation system is a Collaborative Filtering model, specifically a neural network-based implementation of collaborative filtering. Here's why:
Key Characteristics of This System¶
Pure Interaction Data:
- Only uses user-book ratings (interaction data)
- Doesn't use any content features about books (like genre, author) or users (like demographics)
Embedding-Based Approach:
- Learns latent factors (embeddings) for users and books
- These embeddings represent learned patterns from the rating data
Dot Product Architecture:
- Predicts ratings based on similarity between user and book embeddings
- Classic collaborative filtering approach
Why Not Content-Based?¶
A content-based system would:
- Use features about the items (book content, metadata)
- Create user profiles based on the content of items they've liked
- Recommend items similar to what the user has liked based on content features
This system does none of those things - it only looks at who liked what, not why they might have liked it.
Collaborative Filtering Variant¶
This is specifically a neural collaborative filtering system with these properties:
Characteristic | This System | Traditional CF
---|---|---
Uses embeddings | ✓ | Sometimes
Neural network | ✓ | ✗
Matrix factorization | Neural version | Classic SVD
Handles cold start | ✗ | ✗
How It Works¶
User Embedding: Represents user's preferences in latent space
- Learned from their rating history
- Similar users end up close in embedding space
Book Embedding: Represents book characteristics in same latent space
- Learned from how users have rated it
- Similar books end up close together
Recommendation: Finds books whose embeddings are close to the user's embedding
- Measured by the dot product (like cosine similarity, but without normalizing the vectors)
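To make the "close in embedding space" claim concrete, here is a sketch that pulls the learned book vectors out of the trained model (layer name as defined above) and ranks books by cosine similarity:
from numpy.linalg import norm
# The learned book-embedding matrix, one row per book ID
book_matrix = model.get_layer("Book-Embedding").get_weights()[0]  # (n_books+1, 5)

def most_similar_books(book_id, top_n=5):
    # Cosine similarity between one book's vector and every book vector
    v = book_matrix[book_id]
    sims = book_matrix @ v / (norm(book_matrix, axis=1) * norm(v) + 1e-9)
    sims[0] = -np.inf        # row 0 is unused (book IDs start at 1)
    sims[book_id] = -np.inf  # exclude the query book itself
    return (-sims).argsort()[:top_n]  # row index doubles as book ID here

print(most_similar_books(1))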
Advantages Over Traditional CF¶
- Non-Linear Relationships: Neural nets can learn more complex patterns
- Flexibility: Easy to extend (e.g., add more layers)
- Feature Combination: Can later add content features if needed
When This Works Best¶
- When you have lots of interaction data
- When you want to discover latent relationships
- When item content is hard to quantify
The system would become a hybrid model if you added content features (like book genres or descriptions) to the neural network. But in its current form, it's pure collaborative filtering.
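For illustration, that hybrid direction could reuse the Concatenate and Dense layers already imported at the top; the extra genre input and layer sizes below are made up for the sketch:
# Hypothetical extension: concatenate the two embedding vectors with a
# content-feature vector (e.g. a one-hot genre encoding) and learn a
# non-linear scoring head instead of a plain dot product
n_genres = 10  # made-up feature size for this sketch
genre_input = Input(shape=[n_genres], name="Genre-Input")

concat = Concatenate()([user_vec, book_vec, genre_input])
hidden = Dense(32, activation="relu")(concat)
rating_output = Dense(1)(hidden)

hybrid_model = Model([user_input, book_input, genre_input], rating_output)
hybrid_model.compile('adam', 'mean_squared_error')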