COMS 4771, Machine Learning

COMS 4771 is a graduate-level introduction to the statistical principles and algorithmic paradigms of machine learning (ML). Broadly speaking, ML is concerned with the tasks of learning models from data, generalizing to unseen scenarios, and solving problems without explicit instructions. We will focus mostly on supervised learning, including both classical and deep learning methods. Throughout the course, we will also see how ML is used in applications such as natural language processing (NLP), computer vision, and robotics.

Course Objectives

  • Identify, describe, and formulate a typical machine learning problem.
  • Become proficient in the mathematical language of machine learning: linear algebra, optimization, and probability and statistics.
  • Perform assessment and selection among different models for a given problem.
  • Formulate and analyze the solutions for linear regression and its generalizations.
  • Implement classical algorithms for linear classification, e.g., logistic regression and discriminant analysis.
  • Implement classical algorithms for nonlinear classification, e.g., decision trees and kernelized methods.
  • Implement unsupervised learning algorithms for cluster analysis and dimensionality reduction.
  • Understand and implement the general framework for deep learning.
  • Understand modern applications of deep learning, e.g., in computer vision and natural language processing.
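As a taste of the implementation work the objectives describe, here is a minimal sketch of training a linear classifier (logistic regression via gradient descent) in NumPy. The toy data, hyperparameters, and variable names are illustrative assumptions, not course materials.

```python
import numpy as np

# Illustrative sketch only: logistic regression trained by gradient descent
# on synthetic two-class data. Not an official course assignment.

rng = np.random.default_rng(0)

# Two Gaussian blobs: class 0 centered at (-2, -2), class 1 at (+2, +2).
X0 = rng.normal(loc=-2.0, size=(50, 2))
X1 = rng.normal(loc=+2.0, size=(50, 2))
X = np.vstack([X0, X1])
y = np.concatenate([np.zeros(50), np.ones(50)])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Gradient descent on the average logistic (cross-entropy) loss.
w = np.zeros(2)
b = 0.0
lr = 0.1
for _ in range(500):
    p = sigmoid(X @ w + b)            # predicted P(y = 1 | x)
    grad_w = X.T @ (p - y) / len(y)   # gradient w.r.t. weights
    grad_b = np.mean(p - y)           # gradient w.r.t. bias
    w -= lr * grad_w
    b -= lr * grad_b

preds = (sigmoid(X @ w + b) >= 0.5).astype(float)
accuracy = np.mean(preds == y)
```

On well-separated data like this, the learned linear boundary classifies nearly every training point correctly; assignments in the course would add a held-out test set to measure generalization rather than training accuracy.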

Prerequisites

  • Python proficiency
  • Linear algebra
  • Multivariable calculus
  • Probability and/or statistics

General List of Topics

  1. Optimization and statistical foundations
  2. Nearest neighbors
  3. Linear regression
  4. Shrinkage methods
  5. Basis expansions
  6. Kernel smoothing methods
  7. Model selection
  8. Logistic regression
  9. Discriminant analysis
  10. Support vector machines
  11. Decision trees
  12. Ensemble methods
  13. Gradient boosting
  14. Clustering
  15. Principal components analysis
  16. Dimensionality reduction
  17. Neural networks
  18. Training neural networks
  19. Convolutional neural networks
  20. Attention and transformers
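To preview the flavor of the early topics, here is a hedged sketch of topic 3 (linear regression): ordinary least squares minimizes ||Xw - y||², and the fit can be computed with a standard least-squares solver. The synthetic data and the choice of `np.linalg.lstsq` are assumptions for illustration, not prescribed course code.

```python
import numpy as np

# Illustrative sketch of ordinary least squares on synthetic data.
rng = np.random.default_rng(1)
n = 200
# Design matrix with an intercept column and one feature.
X = np.column_stack([np.ones(n), rng.uniform(-1, 1, size=n)])
true_w = np.array([0.5, 2.0])              # ground-truth coefficients
y = X @ true_w + 0.1 * rng.normal(size=n)  # targets with small Gaussian noise

# Solve min_w ||Xw - y||^2; lstsq is numerically preferable to
# explicitly forming the normal equations (X^T X) w = X^T y.
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
```

With this much data and little noise, `w_hat` lands close to the true coefficients; the course's later topics (shrinkage, basis expansions, model selection) build directly on this least-squares setup.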