Daryna Dementieva

Ph.D.

Room: FMI 01.05.60
E-Mail: [email protected]
Address: TUM - Fakultät für Informatik, Boltzmannstr. 3, 85748 Garching

Research Interests

During my PhD, my research focused on different aspects of NLP for Social Good (NLP4SG). Specifically, I worked on fake news detection and text detoxification. Moreover, I explored multilingual and cross-lingual possibilities for these tasks. You can check my whole thesis "Methods for Fighting Harmful Multilingual Textual Content" here. The main ideas of the projects are presented below:

Multiverse: Fake News Detection with Multilingual Evidence. We propose a new method for fake news detection that compares a piece of news with others published all across the world in different languages. The hypothesis is that if the news is fake, it will be less widespread across media, or the reported facts will differ and contradict each other. True news, on the other hand, will be repeated with the same key facts all across the world. Such cross-lingual news comparison can help users see different views on the situation and think about the piece of news more critically. Right now, we are developing a solution for the case of the Russian invasion of Ukraine. The main publication about this technology can be found here.

Multiverse Graphical Abstract

Text Style Transfer: Text Detoxification Case. Text style transfer is the task of transferring the style of a text from one style into another. Specifically for the toxicity case, we work on text detoxification: transferring the style of a text from toxic to non-toxic. Previously, we collected a parallel corpus to address this problem with seq2seq models. Still, there are open questions, such as how to deal with more severe types of toxicity, how to estimate the uncertainty of the model on unknown samples, and how to generate counter speech to hate and toxic speech. The main publication about this technology can be found here.


Also, together with PhD students, I participate in the development of eXplainable AI (XAI) and Legal NLP projects.

As a result, if you want to do a Guided Research or a Thesis, we can talk about the following topics:

fake news detection and all relevant subtasks;
detection and mitigation of toxic, hateful, sexist, and biased speech;
explainable NLP;
Ukrainian NLP;
multilingual and cross-lingual language modeling;
you can also suggest your topic of interest for NLP4SG!

Open Source Contributions

Previously, I contributed to the s-nlp group. More specifically:

ParaDetox: English parallel corpus for text detoxification;
roberta_toxicity_classifier: English toxicity classification model;
bart_detox: English SOTA model for text detoxification;
RuParaDetox: Russian parallel corpus for text detoxification;
ruT5_detox: Russian SOTA model for text detoxification;
xlmr_formality_classifier: Multilingual model for text formality classification;
Multiverse: Code for Multiverse feature extraction.

Now, I am one of the admins at tum-nlp.

Talks

Check out the recent talk about my research at MunichNLP!

Supervision

I supervise the following students:

Adam Rydelek (Masters) "Sexism in Social Media -- Explainable Detection Using Deep Learning Models" (ongoing)
Daniel Schroter (Masters) "Discovering Human Values Behind Arguments -- A Case for Transformer-based Models and Explainable AI" (ongoing)
Yen-Yu Chang (Masters) "Exploration of Approaches to Counter Hate Speech: The Case of Sexist Speech" (ongoing)
Yuliana Poliakova (Bachelors) "Detecting the Persuasion Techniques in Online News" (ongoing)

Publications

To my Google Scholar profile

Dementieva, D., Moskovskiy, D., Logacheva, V., Dale, D., Kozlova, O., Semenov, N., & Panchenko, A. (2021). Methods for detoxification of texts for the Russian language. Multimodal Technologies and Interaction, 5(9), 54.

Logacheva, V.*, Dementieva, D.*, Ustyantsev, S., Moskovskiy, D., Dale, D., Krotova, I., ... & Panchenko, A. (2022, May). ParaDetox: Detoxification with Parallel Data. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 6804-6818). (* equal contribution)

Moskovskiy, D., Dementieva, D., & Panchenko, A. (2022, May). Exploring Cross-lingual Text Detoxification with Large Multilingual Language Models. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop (pp. 346-354).

Dementieva, D., Kuimov, M., & Panchenko, A. (2022). Multiverse: Multilingual Evidence for Fake News Detection. arXiv preprint arXiv:2211.14279.

Dementieva, D., & Panchenko, A. (2021, August). Cross-lingual evidence improves monolingual fake news detection. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: Student Research Workshop (pp. 310-320).