End-to-End Optimization Keynote at DeepLearn 2022

The DeepLearn School Series, now reaching its seventh edition, offers insight into artificial intelligence and applications in a week-long course, featuring a significant number of top instructors. This edition, currently being held at the Technical University of Lulea in northern Sweden, features the following:

– Sean Benson, Netherlands Cancer Institute
– Thomas Breuel, Nvidia
– Hao Chen, Hong Kong University of Science and Technology
– Janlin Chen, University of Missouri
– Nadya Chernyavskaya, CERN
– Efstratios Gavves, University of Amsterdam
– Quanquan Gu, University of California, Los Angeles
– Jiawei Han, University of Illinois Urbana-Champaign
– Awni Hannun, Zoom
– Tin Kam Ho, IBM Thomas J. Watson Research Center
– Timothy Hospedales, University of Edinburgh
– Shih-Chieh Hsu, University of Washington
– Tatiana Likhomanenko, Apple
– Othmane Rifki, Spectrum Labs
– Mayank Vatsa, Indian Institute of Technology Jodhpur
– Yao Wang, New York University
– Zichen Wang, Amazon Web Services
– Alper Yilmaz, Ohio State University

In addition to the talks offered by the above academics and industry specialists, the organizers have invited two keynote speakers to discuss some cutting-edge developments: Elaine O. Nsoesie of Boston University, and yours truly. So I’m spending a few days near the Arctic Circle, and this afternoon I’ll be giving my main lecture. What am I going to talk about?

The title of my lecture is easy to guess: “Designing Experiments Optimized for Deep Learning: Challenges and Opportunities”. Below I will give some highlights of the material, and hopefully I can later link a recorded version for the three readers who are interested (to you three: if I fail to link it, please tell me about it by e-mail).

My lecture

As I will be talking to students from quite different backgrounds, who come to learn about deep learning for many different disciplines, I will have to introduce the topic of particle physics. I have some introductory slides where I explain what the Standard Model is, what we do at the LHC, how we detect particles and how we make sense of them. The slide below summarizes this last element. It’s supposed to be animated (a bit at a time) but here you can see it all at once:

Next, I discuss how deep learning is currently being used in particle physics, and how the topic I want to discuss raises the bar a bit higher:

I motivate this new area of research by explaining that building successful detectors is very complex. To do it we rely on experience and on some long-standing paradigms that have served us well over the past 50 years or so, but which do not aim for optimality; rather, they seek robustness and redundancy. By moving away from these paradigms, we can imagine gaining a lot from a continuous scan of the very high-dimensional parameter space of possible design choices. An important part of this is to define all aspects of the problem with differentiable functions. It may sound naïve, but it is a great exercise in forcing yourself to state what your goals are, something that, if left unaddressed, will not produce the desired results.
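To make the idea of a differentiable objective concrete, here is a minimal sketch in Python. It is a toy and not the formalism from my slides: the single design parameter `x` (think of a detector thickness in arbitrary units), the loss shape `a / x + c * x`, and the constants `a` and `c` are all assumptions chosen for illustration. The first term mimics a resolution that improves with more material, the second a cost that grows with it.

```python
def loss(x, a=4.0, c=1.0):
    """Toy objective (hypothetical): a / x is a resolution-like term, c * x a cost term."""
    return a / x + c * x


def grad(x, a=4.0, c=1.0):
    """Analytic derivative of the toy loss with respect to the design parameter x."""
    return -a / x**2 + c


def optimize(x0=0.5, lr=0.05, steps=500):
    """Plain gradient descent on the single design parameter."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x


best = optimize()
print(f"optimal design parameter: {best:.4f}, loss: {loss(best):.4f}")
```

For this toy the optimum is known in closed form, `sqrt(a / c)`, so the descent can be checked by hand. In a real detector the loss has no closed form and the gradient has to come from automatic differentiation of the whole simulation and reconstruction chain, which is exactly why every piece needs to be differentiable.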

I then digress to discuss an optimization task I undertook on my own two years ago, when I was able to prove that an experiment then in its design phase could be designed better. To do this, I had to create a fast simulation of the whole experiment and its physics, and then study the different choices. I proved that the relevant figure of merit could be improved by a factor of two without increasing the cost, and the collaboration has since adopted many of my suggestions.

I then discuss in detail how the problem can be formalized in general, with some mathematics. I’ll spare you that, and just tease you with one of the slides where I explain how optimality can be searched for in the configuration space.

I then show an example of how the above recipe can be implemented. It is excellent work by some members of MODE who work for the LHCb experiment (Ratnikov, Derkach, Boldyrev, Ustyuzhanin). They created a generative adversarial network emulation of electromagnetic (EM) showers in the calorimeter, in order to insert it into a differentiable pipeline that optimizes the geometry of the device.

Then I discuss a project that I initiated with some colleagues from Padua, Louvain-la-Neuve and elsewhere. The topic is the inference of density maps of unknown volumes by muon scattering tomography. You follow a muon through the volume and infer the density of the matter it encountered from the amount of scattering it underwent. Optimizing “simple” detectors that do this is less complex than doing it for a collider, but the challenges are similar, and solving easy use cases allows us to focus on the real obstacles, gain confidence and build a library of solutions.
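For readers who want a feel for the numbers, the link between scattering and material can be sketched with the standard Highland parametrization of the width of the multiple-scattering angle. This is only an illustration and not the code of our project: the function name `highland_theta0`, the 3 GeV muon momentum, the 10 cm thickness and the approximate radiation lengths below are all assumptions picked for the example.

```python
import math


def highland_theta0(p_mev, x_cm, x0_cm, beta=1.0, z=1):
    """Width (in radians) of the projected multiple-scattering angle.

    Highland parametrization: p_mev is the momentum in MeV/c, x_cm the
    thickness traversed, x0_cm the radiation length of the material.
    """
    t = x_cm / x0_cm  # thickness in units of radiation lengths
    log_term = 1.0 + 0.038 * math.log(t * z**2 / beta**2)
    return 13.6 / (beta * p_mev) * z * math.sqrt(t) * log_term


# Approximate radiation lengths in cm (illustrative values)
RADIATION_LENGTH_CM = {"water": 36.08, "iron": 1.757, "lead": 0.5612}

# Scattering of a 3 GeV muon crossing 10 cm of each material
angles = {
    name: highland_theta0(p_mev=3000.0, x_cm=10.0, x0_cm=x0)
    for name, x0 in RADIATION_LENGTH_CM.items()
}
for name, theta in angles.items():
    print(f"{name:>5}: {1000 * theta:.2f} mrad")
```

Materials with a shorter radiation length scatter the muon more, which is what lets the measured deflections be turned into a map of the material inside the volume.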

Finally, I look at the future of particle physics and point out some challenges ahead.

That’s it. My lecture is due in one hour, so I have to hurry.

Angela C. Hale