marchel@sys:~$ exec ./projects/healthy-lifestyle-rec --model=lightfm,svd --data=usda-api
// PROJECT_DETAIL — RECOMMENDER_SYSTEM · COLLABORATIVE_FILTERING · FOOD_AI
Healthy Lifestyle
Recommendation System
LightFM WARP + Surprise SVD
LightFM  ·  Surprise
Python  ·  Pandas
USDA Food Data Central
2024
Collaborative Filtering · LightFM · WARP Loss · Surprise SVD · USDA API · Interaction Matrix · Matrix Factorization · Python · Pandas · NumPy · RMSE: 0.7184
PROJECT_OVERVIEW.md
A healthy-food recommender system based on Collaborative Filtering, using nutrition data from the USDA Food Data Central API. Two modeling approaches are compared: LightFM with WARP loss (a hybrid matrix factorization model that can combine user-item interactions with content features) and Surprise SVD (classic Singular Value Decomposition for collaborative filtering). The dataset consists of user-food interactions with ratings from 1 to 5, focused on diabetes-friendly foods from the USDA database.
SURPRISE SVD RMSE
0.7184
LIGHTFM EPOCHS
30
RATING SCALE
1 – 5
DATA SOURCE
USDA API
── SYSTEM PIPELINE ──
PIPELINE.arch
01
USDA API
Fetch food data: query "diabetes-friendly", pageSize=20, Foundation + SR Legacy datasets
02
Interaction Matrix
Simulate user-food ratings (1–5) across 10 users × N food items via random sampling
03
LightFM
WARP loss, 30 epochs, 2 threads. Maps users and items into a shared latent space
04
Surprise SVD
Train/test split 80/20, SVD matrix factorization, RMSE evaluation
05
Top-K Recs
Rank all food items per user and return the Top-5 recommendations with their score or predicted rating
LOSS FUNCTION
WARP
Weighted Approx. Rank Pairwise
SVD ALGORITHM
Funk SVD
via Surprise library
USERS × FOODS
10 × N
simulated interaction data
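Step 02 of the pipeline can be sketched in plain Python as follows. This is a minimal illustration rather than the project's own code: the food IDs, the number of sampled ratings (`n_ratings`), and the seed are assumptions, and a repeated (user, food) pair simply keeps the most recent rating.

```python
import random


def simulate_interactions(user_ids, food_ids, n_ratings, seed=0):
    """Sample (user, food, rating) triples with ratings in 1..5."""
    rng = random.Random(seed)
    return [
        (rng.choice(user_ids), rng.choice(food_ids), rng.randint(1, 5))
        for _ in range(n_ratings)
    ]


def to_matrix(triples, user_ids, food_ids):
    """Build a dense user x food matrix; 0 marks 'no rating'."""
    row = {u: r for r, u in enumerate(user_ids)}
    col = {f: c for c, f in enumerate(food_ids)}
    matrix = [[0] * len(food_ids) for _ in user_ids]
    for user, food, rating in triples:
        matrix[row[user]][col[food]] = rating  # later duplicates overwrite
    return matrix


user_ids = [f"user_{i}" for i in range(1, 11)]  # 10 simulated users
food_ids = [f"food_{j}" for j in range(1, 21)]  # hypothetical IDs, N = 20
triples = simulate_interactions(user_ids, food_ids, n_ratings=100)
matrix = to_matrix(triples, user_ids, food_ids)
```

With 100 samples over 200 possible cells, most of the matrix stays at 0, which is the sparsity a collaborative filtering model has to work around.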
── INTERACTIVE RECOMMENDATION DEMO ──
RECOMMENDER_ENGINE.demo
Simulation of the recommender system: select a user profile and a model, then generate healthy-food recommendations.
Visualization of the interaction matrix (user × food). Values are ratings from 1 to 5; lighter colors indicate higher ratings.
Comparison of the recommendations from both models for the same user.
LIGHTFM (WARP)
SURPRISE (SVD)
Nutrient profile of the top recommended foods, based on data from USDA Food Data Central.
Protein Carbs Fat Fiber
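The four-value nutrient profile above could be derived from a food record roughly like this. The sketch assumes each record carries a `foodNutrients` list whose entries have `nutrientName` and `value` fields, and that the four macros use the names in `MACRO_NAMES`. The helper `macro_profile` and the sample record are hypothetical; the numbers are not real USDA data.

```python
# Assumed mapping from nutrient names in the food record to chart labels.
MACRO_NAMES = {
    "Protein": "protein",
    "Carbohydrate, by difference": "carbs",
    "Total lipid (fat)": "fat",
    "Fiber, total dietary": "fiber",
}


def macro_profile(food):
    """Return the protein/carbs/fat/fiber values for one food record."""
    profile = {label: 0.0 for label in MACRO_NAMES.values()}
    for nutrient in food.get("foodNutrients", []):
        label = MACRO_NAMES.get(nutrient.get("nutrientName"))
        if label is not None:
            profile[label] = float(nutrient.get("value", 0.0))
    return profile


# Hypothetical record for illustration only.
sample_food = {
    "description": "Example food",
    "foodNutrients": [
        {"nutrientName": "Protein", "unitName": "G", "value": 9.0},
        {"nutrientName": "Carbohydrate, by difference", "unitName": "G", "value": 20.0},
        {"nutrientName": "Total lipid (fat)", "unitName": "G", "value": 0.5},
        {"nutrientName": "Fiber, total dietary", "unitName": "G", "value": 8.0},
        {"nutrientName": "Energy", "unitName": "KCAL", "value": 116.0},
    ],
}
profile = macro_profile(sample_food)
```

Nutrients outside the four tracked macros (here, energy) are ignored, and a macro missing from the record defaults to 0.0.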
── MODEL ARCHITECTURE ──
LIGHTFM.arch
Prediction Score
$\hat{y}_{ui} = q_u \cdot p_i + b_u + b_i$
WARP Loss (Weighted Approx. Rank Pairwise)
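$L = \sum_{u} \mathrm{rank}(u, i) \cdot \mathrm{hinge}(1 - \hat{y}_{ui} + \hat{y}_{uj})$
(i: an item the user interacted with; j: a sampled negative item; hinge(x) = max(0, x))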
Loss: WARP
Epochs: 30
Threads: 2
Latent dim: 10 (default)
Output: Ranking score (unbounded)
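To make the two formulas above concrete, here is a small numeric sketch for a single (user, positive item, negative item) triple. It is not LightFM's implementation. The embeddings and biases are made-up numbers, and the rank weight is passed in as a given value, whereas WARP estimates it from how many negatives had to be sampled before one violated the margin.

```python
def score(q_u, p_i, b_u, b_i):
    """Prediction score: dot product of the embeddings plus both biases."""
    return sum(q * p for q, p in zip(q_u, p_i)) + b_u + b_i


def hinge(x):
    """hinge(x) = max(0, x)."""
    return max(0.0, x)


def warp_term(y_pos, y_neg, rank_weight):
    """One summand of the loss: rank weight times the margin violation."""
    return rank_weight * hinge(1.0 - y_pos + y_neg)


# Made-up 2-dimensional embeddings and biases (illustrative only).
q_u, b_u = [0.5, -0.2], 0.1          # user
p_pos, b_pos = [0.4, 0.1], 0.05      # item the user interacted with
p_neg, b_neg = [0.3, 0.6], 0.0       # sampled negative item

y_pos = score(q_u, p_pos, b_u, b_pos)   # 0.33
y_neg = score(q_u, p_neg, b_u, b_neg)   # 0.13
loss = warp_term(y_pos, y_neg, rank_weight=2.0)  # 2 * 0.8 = 1.6
```

The positive item already outscores the negative one here, but not by the full margin of 1, so the term is still nonzero.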
SURPRISE_SVD.arch
Rating Prediction (Funk SVD)
$\hat{r}_{ui} = \mu + b_u + b_i + q_i^{T} p_u$
Objective (SGD Minimization)
$L = \sum_{(u,i)} \left[ (r_{ui} - \hat{r}_{ui})^2 + \lambda \left( \lVert q_i \rVert^2 + \lVert p_u \rVert^2 \right) \right]$
Algorithm: Funk SVD
Train/Test: 80% / 20%
RMSE: 0.7184
Rating scale: 1 – 5
Output: Predicted rating (1–5)
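A single SGD update for the objective above can be sketched as follows. This is an illustration of the technique, not Surprise's source code: the learning rate and regularization values are assumptions, the constant factor 2 from the gradient is absorbed into the learning rate, and only the factor vectors are regularized, as in the objective shown here (Surprise additionally regularizes the bias terms).

```python
def predict(mu, b_u, b_i, p_u, q_i):
    """Rating prediction: global mean + biases + factor dot product."""
    return mu + b_u + b_i + sum(p * q for p, q in zip(p_u, q_i))


def sgd_step(r_ui, mu, b_u, b_i, p_u, q_i, lr=0.005, reg=0.02):
    """One stochastic gradient step on a single observed rating r_ui."""
    err = r_ui - predict(mu, b_u, b_i, p_u, q_i)
    new_b_u = b_u + lr * err
    new_b_i = b_i + lr * err
    # Both factor updates use the old values of p_u and q_i.
    new_p_u = [p + lr * (err * q - reg * p) for p, q in zip(p_u, q_i)]
    new_q_i = [q + lr * (err * p - reg * q) for p, q in zip(p_u, q_i)]
    return new_b_u, new_b_i, new_p_u, new_q_i


# Made-up starting point: one user, one item, 2 latent factors.
mu, b_u, b_i = 3.0, 0.0, 0.0
p_u, q_i = [0.1, 0.1], [0.1, 0.1]
r_ui = 4.0

err_before = r_ui - predict(mu, b_u, b_i, p_u, q_i)
b_u, b_i, p_u, q_i = sgd_step(r_ui, mu, b_u, b_i, p_u, q_i)
err_after = r_ui - predict(mu, b_u, b_i, p_u, q_i)
```

After one step the prediction moves slightly toward the observed rating; training repeats this over every rating in the training set for several epochs.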
── CODE IMPLEMENTATION ──
IMPLEMENTATION.py
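import random

import pandas as pd
import requests

# Assumed setup (not shown in the original snippet): the FoodData Central
# search endpoint and a placeholder API key.
base_url = "https://api.nal.usda.gov/fdc/v1/foods/search"
api_key = "YOUR_API_KEY"


# Fetch food data from USDA API
def fetch_food_data(query, page_size=20):
    params = {
        "api_key": api_key,
        "query": query,
        "dataType": ["Foundation", "SR Legacy"],
        "pageSize": page_size,
    }
    response = requests.get(base_url, params=params)
    response.raise_for_status()  # fail early on HTTP errors
    return response.json()["foods"]


food_data = fetch_food_data("diabetes-friendly", page_size=20)

# Simulated inputs (assumed; the original snippet does not define them)
food_ids = [food["fdcId"] for food in food_data]
user_ids = [f"user_{i}" for i in range(1, 11)]  # 10 simulated users
ratings = [random.randint(1, 5) for _ in range(100)]  # sample size is an assumption

# Build interaction matrix (long format: one row per simulated rating)
interactions = pd.DataFrame({
    "user_id": [random.choice(user_ids) for _ in ratings],
    "food_id": [random.choice(food_ids) for _ in ratings],
    "rating": ratings,
})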
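import numpy as np
from lightfm import LightFM
from lightfm.data import Dataset

# Prepare LightFM dataset: register every known user and item ID
dataset = Dataset()
dataset.fit(
    interactions["user_id"].unique(),
    interactions["food_id"].unique(),
)

# Build interaction matrix (ratings are stored as weights)
interactions_matrix, weights_matrix = dataset.build_interactions(
    [(row["user_id"], row["food_id"], row["rating"])
     for _, row in interactions.iterrows()]
)

# Train with WARP loss
model = LightFM(loss="warp")
model.fit(interactions_matrix, epochs=30, num_threads=2)

# Predict top-K for one user. LightFM works on internal indices,
# so map the external IDs through the dataset first.
user_id_map, _, item_id_map, _ = dataset.mapping()
target_user = interactions["user_id"].iloc[0]  # any known user
user_idx = user_id_map[target_user]
item_ids = list(item_id_map.keys())
item_indices = np.array(list(item_id_map.values()), dtype=np.int32)

scores = model.predict(user_idx, item_indices)
top_k = sorted(zip(item_ids, scores), key=lambda x: -x[1])[:5]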
from surprise import SVD, Dataset, Reader, accuracy
from surprise.model_selection import train_test_split

# Prepare Surprise dataset
reader = Reader(rating_scale=(1, 5))
data = Dataset.load_from_df(
    interactions[["user_id", "food_id", "rating"]],
    reader,
)

# Train / test split & fit SVD
trainset, testset = train_test_split(data, test_size=0.2)
algo = SVD()
algo.fit(trainset)

# Evaluate
predictions = algo.test(testset)
accuracy.rmse(predictions)  # → RMSE: 0.7184

# Get Top-K for one user
user_id = interactions["user_id"].iloc[0]  # any known user
items_to_predict = interactions["food_id"].unique()
recs = [(item, algo.predict(user_id, item).est)
        for item in items_to_predict]
top_k = sorted(recs, key=lambda x: -x[1])[:5]
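For reference, the RMSE figure reported by the evaluation step is the square root of the mean squared difference between true and predicted ratings. A minimal standalone sketch, using made-up (true, predicted) pairs rather than the project's test set:

```python
import math


def rmse(pairs):
    """Root mean squared error over (true_rating, predicted_rating) pairs."""
    squared_errors = [(true - est) ** 2 for true, est in pairs]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up pairs for illustration; squared errors are 0.25, 0.25, 1.0, 0.0.
pairs = [(4, 3.5), (2, 2.5), (5, 4.0), (1, 1.0)]
value = rmse(pairs)  # sqrt(1.5 / 4) ≈ 0.612
```

Because the errors are squared before averaging, a single prediction that is off by a full star contributes as much as four predictions that are off by half a star.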
── LIGHTFM vs SURPRISE SVD ──
MODEL_COMPARISON.tbl
ASPECT         | LIGHTFM (WARP)                              | SURPRISE (SVD)
Paradigm       | Hybrid (CF + Content)                       | Pure Collaborative Filtering
Loss Function  | WARP (ranking-based)                        | MSE (rating prediction)
Output Type    | Unbounded score (ranking)                   | Predicted rating 1–5
Cold Start     | Handled when item features are supplied     | Problematic
Evaluation     | AUC, Precision@K                            | RMSE: 0.7184
Training Speed | Fast (multi-threaded)                       | Fast (SGD)
Best For       | Ranking tasks, sparse data                  | Rating prediction, dense data
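The Precision@K metric listed for the ranking model can be sketched in a few lines. This is a generic illustration, not LightFM's evaluation code. What counts as "relevant" is an assumption (for example, foods a user rated 4 or higher), and the item IDs are hypothetical.

```python
def precision_at_k(ranked_items, relevant_items, k=5):
    """Fraction of the top-k ranked items that are relevant."""
    top_k = ranked_items[:k]
    hits = sum(1 for item in top_k if item in relevant_items)
    return hits / k


# Hypothetical ranking for one user, best first, and that user's relevant set.
ranked = ["f3", "f7", "f1", "f9", "f2", "f5"]
relevant = {"f7", "f2", "f8"}

p_at_5 = precision_at_k(ranked, relevant, k=5)  # 2 hits in the top 5 -> 0.4
```

Unlike RMSE, this metric ignores the predicted values themselves and looks only at the order, which is why it suits a model that outputs unbounded ranking scores.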