11-10-2017 The Data Ninja competition: the official launch of the competition (Friday, Oct. 13, 13.30, room 122 BT)
11-10-2017 The challenge has begun :)

The aim and the scope of the challenge

The aim of the course: To learn how to deal with real-life massive data (or to win a data mining competition).

The scope of the course: Application of data mining algorithms to real-life massive data:

Main information about the course

Time and place


Schedule of lectures

11-10-2017 The mining massive data sets challenge [pdf]
13-10-2017 Launching of the "Data Ninja" competition (a talk by Tomasz Gramza from OLX) [pdf]
Introduction to Artificial Intelligence by Amazon Web Services [pdf]
18-10-2017 Some words about the "Data Ninja" competition (a talk by Tomasz Gramza and Arkadiusz Robiński from OLX)
25-10-2017 Performance measures in the Data Ninja Challenge [pdf]
08-11-2017 Student's talk: Multi-label classification (Krzysztof Martyn) [pdf]
Student's talk: Word2Vec (Magdalena Dzięcielska) [pdf]
15-11-2017 Student's talk: Some simple approaches to face recognition (Piotr Majorczyk)
22-11-2017 The 2nd-place solution from last year's Data Ninja competition (Michał Kempka, Marek Wydmuch) [pdf]
29-11-2017 Student's talk: Deep reinforcement learning (Aleksandra Kobus and Maciej Uniejewski) [pdf]
Student's talk: Face detection algorithms (Zofia Kulus)
06-12-2017 Presentation of students' reports: [two-1]
Presentation of students' reports: [iswd-2]
13-12-2017 Presentation of students' reports: [iswd-1]
20-12-2017 Student's talk: Open AI Gym (Bartosz Prusak, Grzegorz Latosiński) [pdf]
Student's talk: Learning from imbalanced data (Małgorzata Janicka) [pdf]
03-01-2018 Discretization and calibration [pdf]
10-01-2018 NIPS 2017 Test-of-time award talk [link]
17-01-2018 Presentation of students' reports
24-01-2018 Presentation of students' reports

The Challenge

The Data Ninja competition

For more information check:


Project's meetings

Schedule of the project's meetings

The schedule of the project's meetings can be found here.


Report: data organization/processing: 45 points (min. 50%)
Report: predictive models: 45 points (min. 50%)
Students' talk: 10 points
Top score in the challenge: 10 points


90% 5.0
80% 4.5
70% 4.0
60% 3.5
50% 3.0