Sofya (Sofi) Dymchenko 🐭

prof_pic.jpg

Hi, salut, привет! I am a last-year Ph.D. student at University Grenoble Alpes and INRIA (France), I work with Bruno Raffin in Datamove team. I am a member of Melissa project which develops a framework for efficient large-scale deep surrogate training on HPC systems.


My thesis research question is:

“How to make training of data-driven surrogates more data-efficient, i.e., obtaining accurate surrogate with fewer generated simulations?”

The cost of generating training data from physical solvers is a major bottleneck in scientific deep learning, it especially limits the scaling of deep surrogate models application to complex real-world problems. Compared to classical deep learning, with solvers we have the power to create data, but with such power comes the question: “Which data to generate?”. Current standard practice is to create the data by uniformly sampling input parameters of a solver. Is it really the best way? It is my thesis’s goal to close this gap. I developed an active learning method that reduces the number of training data simulations required by choosing more informative input parameters based on surrogate training loss. For more details, see publications in the list below and/or the recording of my presentation at SC 2024 [youtube-link].


After defending my thesis (February-March 2026), I plan to continue working in the field of AI for Science. I want:

  • to work back-to-back in a team of passionate researchers and engineers,
  • to apply my skills and knowledge to solving real-world scientific cases,
  • to stay connected to the scientific community by publishing and presenting the results, and
  • to keep learning and developing myself in this fast-evolving field.

Do not hesitate sending me a message to: sofya(dot)dymchenko(at)inria(dot)fr.

#openscience #inclusivescience #phdwithadhd

my publications

  1. nips23.png
    Loss-driven sampling within hard-to-learn areas for simulation-based neural network training
    Sofya Dymchenko and Bruno Raffin
    In MLPS 2023 - Machine Learning and the Physical Sciences Workshop at NeurIPS 2023 - 37th conference on Neural Information Processing Systems, Dec 2023
  2. sc24.png
    MelissaDL x Breed: Towards Data-Efficient On-line Supervised Training of Multi-parametric Surrogates with Active Learning
    Sofya Dymchenko, Abhishek Purandare, and Bruno Raffin
    In AI4S 2024 - 5th Workshop on artificial intelligence and machine learning for scientific applications, Nov 2024
  3. poster.png
    Poster: Loss-driven sampling for online neural network training with large scale simulations
    Sofya Dymchenko and Bruno Raffin
    Sep 2023
    Poster