Sofya (Sofi) Dymchenko 🐭

Hi, salut, привет! I am a last-year Ph.D. student at University Grenoble Alpes and INRIA (France), I work with Bruno Raffin in Datamove team. I am a member of Melissa project which develops a framework for efficient large-scale deep surrogate training on HPC systems.

My thesis research question is:

“How to make training of data-driven surrogates more data-efficient, i.e., obtaining accurate surrogate with fewer generated simulations?”

The cost of generating training data from physical solvers is a major bottleneck in scientific deep learning, it especially limits the scaling of deep surrogate models application to complex real-world problems. Compared to classical deep learning, with solvers we have the power to create data, but with such power comes the question: “Which data to generate?”. Current standard practice is to create the data by uniformly sampling input parameters of a solver. Is it really the best way? It is my thesis’s goal to close this gap. I developed an active learning method that reduces the number of training data simulations required by choosing more informative input parameters based on surrogate training loss. For more details, see publications in the list below and/or the recording of my presentation at SC 2024 [youtube-link].

After defending my thesis (February-March 2026), I plan to continue working in the field of AI for Science. I want:

to work back-to-back in a team of passionate researchers and engineers,
to apply my skills and knowledge to solving real-world scientific cases,
to stay connected to the scientific community by publishing and presenting the results, and
to keep learning and developing myself in this fast-evolving field.

Do not hesitate sending me a message to: sofya(dot)dymchenko(at)inria(dot)fr.

#openscience #inclusivescience #phdwithadhd

my publications

Loss-driven sampling within hard-to-learn areas for simulation-based neural network training

Sofya Dymchenko and Bruno Raffin

In MLPS 2023 - Machine Learning and the Physical Sciences Workshop at NeurIPS 2023 - 37th conference on Neural Information Processing Systems, Dec 2023

PDF
MelissaDL x Breed: Towards Data-Efficient On-line Supervised Training of Multi-parametric Surrogates with Active Learning

Sofya Dymchenko, Abhishek Purandare, and Bruno Raffin

In AI4S 2024 - 5th Workshop on artificial intelligence and machine learning for scientific applications, Nov 2024

PDF Video Slides
Poster: Loss-driven sampling for online neural network training with large scale simulations

Sofya Dymchenko and Bruno Raffin

Sep 2023

Poster

PDF