Machine learning

This course is listed in USOS as: Machine Learning WFAIS.IF-N019.0 (Uczenie Maszynowe), 60 hours, 6 ECTS.


The course will start in the summer semester of 2017/2018.


Will be specified later


Students of 2nd level studies


Introduction to Data Science

Course outcomes

After finishing this course, students should understand the different types of machine learning methods (supervised vs. unsupervised) and the different machine learning tasks: regression, classification, and clustering. They should also know the most popular methods used for each task, and be able to choose an appropriate model, train it, validate it, and deploy it. Specifically, we expect students who finish this course to be able to:

  • Choose the appropriate model for the task at hand. Those models will include:
    • Regression (Linear regression, Ridge & Lasso Regression)
    • Classification (SVM, Naïve Bayes, Logistic Regression, Decision Trees, Random Forests, Boosted Decision Trees, Nearest neighbors)
    • Clustering (K-Means, DBSCAN, Hierarchical)
    • Recommender systems (Collaborative filtering)
    • Deep feed-forward neural networks
  • Clean and validate the data.
  • Prepare the data for training, validation and testing.
  • Transform the data if necessary (feature engineering).
  • Reduce the dimensionality of the data using PCA or LDA.
  • Train the model.
  • Validate the model (cross validation).
  • Evaluate and tune model performance, e.g. using grid search.
  • Use distributed ML frameworks such as Apache Spark.
Additionally, students will obtain working knowledge of numerical optimization methods such as gradient descent.
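
As an illustration of what gradient descent looks like in practice, here is a minimal sketch that fits a linear model by repeatedly stepping against the gradient of the mean squared error. The function name `gradient_descent`, the learning rate, the iteration count and the synthetic data are all assumptions made for this example; they are not taken from the course materials.

```python
import numpy as np


def gradient_descent(X, y, lr=0.1, n_iters=2000):
    """Fit y ~ X @ w + b by minimising the mean squared error.

    Hypothetical helper for illustration; lr and n_iters are arbitrary choices.
    """
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_iters):
        residual = X @ w + b - y                      # prediction error per sample
        grad_w = 2.0 / n_samples * (X.T @ residual)   # d(MSE)/dw
        grad_b = 2.0 * residual.mean()                # d(MSE)/db
        w -= lr * grad_w                              # step against the gradient
        b -= lr * grad_b
    return w, b


# Synthetic, noise-free data generated from known coefficients (assumed values).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
true_w, true_b = np.array([3.0, -2.0]), 1.0
y = X @ true_w + true_b

w, b = gradient_descent(X, y)
```

Because the data are noise-free, the recovered `w` and `b` should end up close to the coefficients used to generate `y`. With too large a learning rate the same loop diverges, which is one reason step size is usually treated as a tuning parameter.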


We will continue to use the Python tool chain, notably the scikit-learn library and PySpark.
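
To give a feel for how several of the outcomes listed above (splitting the data, transforming it, reducing dimensionality, training, cross-validation and grid search) fit together in scikit-learn, here is a rough sketch. The choice of the Iris dataset, logistic regression as the classifier, and the particular parameter grid are assumptions made for this example only.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Small built-in dataset, used here purely as a stand-in (assumption).
X, y = load_iris(return_X_y=True)

# Hold out a test set that is never seen during training or tuning.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Chain scaling, dimensionality reduction and the classifier into one estimator.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Grid search evaluates every combination with 5-fold cross-validation.
param_grid = {
    "pca__n_components": [2, 3, 4],
    "clf__C": [0.1, 1.0, 10.0],
}
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

# Accuracy of the best configuration on the held-out test set.
test_accuracy = search.score(X_test, y_test)
```

Putting the preprocessing steps inside the `Pipeline` means they are refitted on each cross-validation fold, so information from the validation folds does not leak into scaling or PCA.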


During the course, students will be required to carry out at least two machine learning projects and to present and justify their results for assessment.