Table of Contents

Description

Question 01

Hints:

  1. Remove subjects with existing diabetes for this analysis.
  2. You can treat incident diabetes as a binary outcome and use logistic regression; or, if you want to try survival analysis, you can use Cox regression with the provided time-to-event information.
  3. Investigate each individual feature and biomarker, and use p-values to justify your findings.
  4. For blood biomarkers, use a transformation that is robust to outliers.
  5. Demonstrate your findings with visualization.
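
One transformation that is robust to outliers, as hint 4 asks for, is the rank-based inverse normal transformation. The sketch below is only an illustration of that idea, not the method the assignment prescribes: the function name `rank_inverse_normal`, the Blom offset `c = 3/8`, and the sample values are all assumptions, and it uses only the Python standard library.

```python
from statistics import NormalDist


def rank_inverse_normal(values, c=3.0 / 8.0):
    """Rank-based inverse normal transform (Blom offset by default).

    Tied values receive the average of their ranks.
    Assumes the input has no missing values.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # 1-based average rank of the run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    nd = NormalDist()
    # Map each rank to a quantile of the standard normal distribution.
    return [nd.inv_cdf((r - c) / (n - 2.0 * c + 1.0)) for r in ranks]


# Hypothetical biomarker values with one extreme outlier (250.0).
raw = [4.1, 5.0, 4.7, 5.3, 250.0]
transformed = rank_inverse_normal(raw)
```

Because only the ranks enter the transform, the outlier ends up at the largest normal quantile for a sample of this size instead of dominating the scale. The transformed values can then be used as predictors in the logistic or Cox regression.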

Question 02

Hints:

  1. Select the relevant blood biomarkers as features for your classifier.
  2. Select and train a ML model to make predictions.
  3. Evaluate your predictive model with ROC AUC.
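
To make the evaluation metric in hint 3 concrete, here is a minimal sketch of what ROC AUC measures: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are assumptions for illustration. In practice a library routine such as scikit-learn's `roc_auc_score` would normally be used on the classifier's predicted probabilities.

```python
def roc_auc(y_true, y_score):
    """ROC AUC via pairwise comparison of positive and negative scores.

    y_true holds 0/1 labels; y_score holds the model's predicted scores.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive ranked above negative
            elif p == q:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy labels (1 = developed incident diabetes) and hypothetical scores.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(y_true, y_score)
```

The pairwise loop is quadratic in the number of subjects, so it is meant for understanding the metric rather than for large datasets.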

Question 03

Hints:

  1. Use the subset of subjects who developed incident diabetes for your unsupervised learning.
  2. You can choose to use all or only relevant biomarkers for clustering.
  3. Select one approach to identify clusters of these subjects.
  4. Identify top blood biomarkers that contributed to the clustering.
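
The hints leave the clustering approach open. As one possible sketch, the code below runs a plain k-means on standardized biomarker values and then ranks biomarkers by the share of their variance that lies between clusters. Everything here is an assumption for illustration: the choice of k-means, the deterministic initialization from the first `k` rows, the between-cluster variance ratio as a contribution score, and the names `marker_a` and `marker_b`.

```python
def kmeans(rows, k, n_iter=50):
    """Lloyd's k-means with a deterministic start (first k rows as centroids)."""
    centroids = [list(r) for r in rows[:k]]
    labels = None
    for _ in range(n_iter):
        # Assign each row to its nearest centroid (squared Euclidean distance).
        new_labels = [
            min(range(k),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(r, centroids[c])))
            for r in rows
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Recompute each centroid as the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def feature_contribution(rows, labels, names):
    """Rank features by between-cluster / total sum of squares (0 to 1)."""
    scores = []
    for j, name in enumerate(names):
        col = [r[j] for r in rows]
        grand = sum(col) / len(col)
        total = sum((x - grand) ** 2 for x in col)
        between = 0.0
        for c in set(labels):
            grp = [x for x, lab in zip(col, labels) if lab == c]
            between += len(grp) * (sum(grp) / len(grp) - grand) ** 2
        scores.append((name, between / total if total > 0 else 0.0))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical standardized values for two biomarkers in six subjects.
names = ["marker_a", "marker_b"]
rows = [
    [-1.0, 0.2], [1.1, 0.1], [-1.2, -0.1],
    [0.9, -0.2], [-0.9, 0.0], [1.0, 0.1],
]
labels, centroids = kmeans(rows, k=2)
ranking = feature_contribution(rows, labels, names)
```

In this toy data `marker_a` separates the two groups while `marker_b` is close to noise, so `marker_a` comes first in `ranking`. On real data the number of clusters and the initialization would need to be chosen and checked with more care.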

Import the modules

Load the data

Modelling logistic regression using statsmodels

Get p-values

Model Evaluation

Time Taken