Krzysztof Dembczyński

My research interests span the fields of machine learning and decision support. In particular, I have worked on decision rule models, boosting, and preference learning. Currently, my main research activity concerns multi-label classification and structured output prediction.

I am an assistant professor at Poznań University of Technology (Poland), in the laboratory of Intelligent Decision Support Systems headed by Prof. Roman Słowiński.

My Curriculum Vitae can be found here.

Some News

An invited talk at the UGent Data Science Seminar in Ghent, Belgium, June 13, 2019:

The slides from the talk Label tree algorithms for extreme classification can be found here.
The NIPS paper on extreme classification:

Our recent paper with Marek Wydmuch, Kalina Jasinska, Michail Kuznetsov, and Robert Busa-Fekete has been accepted for NIPS 2018. The arXiv version of the paper can be found here.
Talk about cross-device identification of users at IT Research Workshop at IFIP WCC, 21 September 2018:

The slides from the talk can be found here.
Tutorial on multi-target prediction at ECMLPKDD 2018 in Dublin:

With Willem Waegeman and Eyke Hüllermeier we gave a tutorial on multi-target prediction. The slides from the tutorial can be found here.
The Dagstuhl seminar on Extreme Classification, 15-20 July 2018:

With Sami Bengio, Thorsten Joachims, Marius Kloft, and Manik Varma we organized the Dagstuhl Seminar on Extreme Classification.
The WebConf workshop on Extreme Multi-Label Classification for Social Media, 23 April 2018:

With Akshay Soni, Aasish Pappu, and Robert Busa-Fekete from Yahoo! Research we organized a workshop on Extreme Multi-Label Classification at the prestigious Web Conference. For more details check the workshop webpage.
Tutorial on Extreme Multi-Label Classification at European Conference on Information Retrieval:

With Rohit Babbar we gave a tutorial at ECIR: [tutorial homepage].
Presentation at the Seminars of the Intelligent Decision Support Systems Lab:

With Kalina Jasińska and Marek Wydmuch we gave a talk for the IDSS lab about probabilistic label trees: [pdf].
A lecture given during the Pre-doc Summer School on Learning Systems on Monday, July 3, 2017, at ETH Zürich.

Slides from the lecture: [pdf].
A lecture at Adam Mickiewicz University in Poznań

Slides from my talk on extreme multi-label classification given in a series of open lectures on "Multivariate statistical methods for engineering" [pdf].
The 3-year project "Consistent and scalable learning algorithms for structured output prediction" financed by the National Science Centre (NCN) has been finished.

The main achievements of the project are:
- A new extreme classification algorithm for multi-label problems, referred to as Probabilistic Label Trees [ICML paper] [code].
- A theoretical analysis of complex performance measures [MLJ paper]
- An analysis of Fagin's threshold algorithm in a wide spectrum of machine learning problems [DAMI paper]
- A theoretical analysis of probabilistic classifier trees [ECML PKDD paper]
- An online algorithm for F-measure maximization [NIPS paper]
An invited talk at the TFML 2017 conference

Slides from my talk on Extreme Zero-Shot Learning [pdf].
The Best Paper Award for a paper with Wojciech Kotłowski at the Asian Conference on Machine Learning 2015:

The paper, entitled Surrogate regret bounds for generalized classification performance metrics, can be found here.

Tutorials

Tutorial on multi-target prediction at ECMLPKDD 2018 in Dublin:

With Willem Waegeman and Eyke Hüllermeier we gave a tutorial on multi-target prediction. The slides from the tutorial can be found here.
Tutorial on Extreme Multi-Label Classification at European Conference on Information Retrieval 2018:

With Rohit Babbar we gave a tutorial at ECIR: [tutorial homepage].
A lecture given during the Pre-doc Summer School on Learning Systems on Monday, July 3, 2017, at ETH Zürich.

Slides from the lecture: [pdf].
Tutorial on Multi-Target Prediction at ALT/DS 2013:

I gave a tutorial on Multi-Target Prediction at the Discovery Science 2013 conference (co-located with Algorithmic Learning Theory 2013). The slides from the talk are available here.
Tutorial on Multi-Target Prediction at ICML 2013:

We gave a tutorial on Multi-Target Prediction at ICML 2013. The videos from the tutorial are available on techtalks.tv.

Selected publications

Extreme F-Measure Maximization using Sparse Probability Estimates
Kalina Jasinska, Krzysztof Dembczyński, Robert Busa-Fekete, Karlson Pfannschmidt, Timo Klerx, Eyke Hüllermeier
Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA, 2016. JMLR: W&CP, 48 1435-1444, 2016
On the Bayes-optimality of F-measure maximizers
Krzysztof Dembczyński, Willem Waegeman, Arkadiusz Jachnik, Weiwei Cheng, Eyke Hüllermeier
Journal of Machine Learning Research, 15 3333-3388, 2014
On label dependence and loss minimization in multi-label classification
Willem Waegeman, Krzysztof Dembczyński, Weiwei Cheng, Eyke Hüllermeier
Machine Learning, 88 5-45, 2012
Listed in Notable Computing Books and Articles 2012 by ACM Computing Reviews
Learning monotone nonlinear models using the Choquet integral
Ali Fallah Tehrani, Weiwei Cheng, Krzysztof Dembczyński, Eyke Hüllermeier
Machine Learning, 89 183-211, 2013
ENDER: a statistical framework for boosting decision rules
Krzysztof Dembczyński, Wojciech Kotłowski, Roman Słowiński
Data Mining and Knowledge Discovery, 21 52-90, 2010
Optimizing the F-measure in multi-label classification: Plug-in rule approach versus structured loss minimization
Krzysztof Dembczyński, Arkadiusz Jachnik, Wojciech Kotłowski, Willem Waegeman, Eyke Hüllermeier
International Conference on Machine Learning (ICML 2013) 2013
An analysis of chaining in multi-label classification
Krzysztof Dembczyński, Willem Waegeman, Eyke Hüllermeier
European Conference on Artificial Intelligence (ECAI 2012) 294-299, 2012
Best Paper Award
Consistent multilabel ranking through univariate losses
Krzysztof Dembczyński, Wojciech Kotłowski, Eyke Hüllermeier
International Conference on Machine Learning (ICML 2012) 2012
An exact algorithm for F-measure maximization
Krzysztof Dembczyński, Willem Waegeman, Weiwei Cheng, Eyke Hüllermeier
Advances in Neural Information Processing Systems 24 (NIPS 2011) 2011
Bipartite ranking through minimization of univariate loss
Krzysztof Dembczyński, Wojciech Kotłowski, Eyke Hüllermeier
International Conference on Machine Learning (ICML 2011) 2011
Bayes optimal multilabel classification via probabilistic classifier chains
Krzysztof Dembczyński, Weiwei Cheng, Eyke Hüllermeier
International Conference on Machine Learning (ICML 2010) 2010

You can find more of my publications on Google Scholar.

Teaching

Decision-theoretic machine learning (for graduate students, summer 2018/2019): http://www.cs.put.poznan.pl/kdembczynski/lectures/dtml/
Processing of Massive Datasets - Short Course (summer 2018/2019): http://www.cs.put.poznan.pl/kdembczynski/lectures/pmds-sc/
Mining of Massive Datasets (winter 2018/2019): http://www.cs.put.poznan.pl/kdembczynski/lectures/mmds/
The Mining of Massive Datasets Challenge (winter 2018/2019): http://www.cs.put.poznan.pl/kdembczynski/lectures/mmds-challenge/
Processing of Massive Datasets (winter 2018/2019): http://www.cs.put.poznan.pl/kdembczynski/lectures/pmds/

Software

extremeText (XT)

extremeText is our new implementation of probabilistic label trees built upon the fastText package. It significantly improves on fastText for multi-label problems. The GitHub repository: https://github.com/mwydmuch/extremeText.
Extreme Multi-Label Classification (XMLC)

Extreme Multi-Label Classification (XMLC) deals with multi-label problems with hundreds of thousands of labels. An implementation of Probabilistic Label Trees (PLT) suited to this kind of problem can be found here: https://github.com/busarobi/XMLC.
Probabilistic Classifier Chains:

Probabilistic Classifier Chains (PCC) are a learning method for multi-label classification problems. You can find the code in the GitHub repository: https://github.com/multi-label-classification/PCC.
Ensembles of Decision Rules:

There are two fast and accurate boosting algorithms for learning decision rule models. For multi-class problems use MLRules (Maximum Likelihood Rule Ensembles), while for regression problems use RegEnder (regression ensemble of decision rules).
