Dr Onkar Dabeer, Tata Institute of Fundamental Research, India
The talk has two parts, both involving the pooling of data from different sources to improve the estimation task at hand. In both parts I will emphasize the modeling aspect, which may be of interest in the physical sciences.
1. In e-commerce, we often have access to ratings given by users for many of the items they have bought/experienced. In collaborative filtering, we pool together rating data from different users about different items and use it to make item recommendations for users. We propose a mathematical model to study this problem, identify fundamental performance limits for the model, exhibit schemes that achieve these limits, and test their performance on real data.
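The abstract does not spell out the proposed model, but the pooling idea behind collaborative filtering can be illustrated with a minimal, generic neighbourhood-based sketch (the toy data, function name, and similarity choice below are illustrative assumptions, not the speaker's scheme): a user's missing rating is predicted by pooling other users' ratings, weighted by how similarly they rated common items.

```python
import math

def predict_rating(ratings, user, item):
    """Predict a user's rating for an unseen item by pooling ratings
    from other users, weighted by cosine similarity over co-rated items
    (a basic user-based neighbourhood method)."""
    def cosine(u, v):
        common = set(ratings[u]) & set(ratings[v])
        if not common:
            return 0.0
        num = sum(ratings[u][i] * ratings[v][i] for i in common)
        du = math.sqrt(sum(ratings[u][i] ** 2 for i in common))
        dv = math.sqrt(sum(ratings[v][i] ** 2 for i in common))
        return num / (du * dv)

    num = den = 0.0
    for other in ratings:
        if other == user or item not in ratings[other]:
            continue
        w = cosine(user, other)
        num += w * ratings[other][item]
        den += abs(w)
    return num / den if den else None

# Toy rating data: user -> {item: rating on a 1-5 scale}
ratings = {
    "alice": {"book": 5, "film": 3},
    "bob":   {"book": 5, "film": 3, "album": 4},
    "carol": {"book": 1, "film": 5, "album": 2},
}
print(round(predict_rating(ratings, "alice", "album"), 2))
```

Because alice's tastes align closely with bob's, bob's rating of the album dominates the pooled prediction.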
2. We consider a collection of prediction experiments, where several experiments may share the same regression parameters (but we do not know which experiments are similar). By pooling data across experiments, we hope to do better. In this talk, I will show an application of this framework and discuss some methods to solve the problem.
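As a toy illustration of why pooling helps when experiments share regression parameters (a generic sketch under assumed data, not the methods of the talk): fitting a single slope to the stacked data of two experiments uses twice the sample size of either experiment alone, reducing the variance of the estimate.

```python
import random

def fit_slope(xs, ys):
    """Least-squares slope through the origin: sum(x*y) / sum(x*x)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

random.seed(0)
true_slope = 2.0
# Two experiments assumed to share the same slope, each with noisy responses.
exp1 = [(x, true_slope * x + random.gauss(0, 1)) for x in range(1, 11)]
exp2 = [(x, true_slope * x + random.gauss(0, 1)) for x in range(1, 11)]

b1 = fit_slope(*zip(*exp1))   # experiment 1 fit on its own
bp = fit_slope(*zip(*(exp1 + exp2)))   # pooled, shared-parameter fit
print(b1, bp)   # the pooled estimate averages noise over both experiments
```

The hard part in practice, as the abstract notes, is that the grouping of experiments into shared-parameter clusters is unknown and must itself be inferred.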
Extra-solar planets via a Bayesian multi-planet periodogram
Professor Phil Gregory, University of British Columbia, Canada
A remarkable array of new ground-based and space-based astronomical tools is providing astronomers with access to other solar systems. More than 700 planets have been discovered to date, including several super-Earths in the habitable zone. These successes on the part of the observers have spurred a significant effort to improve the statistical tools for analyzing data in this field.
I will describe a Bayesian multi-planet Kepler periodogram based on a new fusion Markov chain Monte Carlo (MCMC) algorithm that incorporates parallel tempering, simulated annealing, and genetic crossover operations. Each of these features facilitates the detection of a global minimum in chi-squared in a multi-modal environment; by combining all three, the algorithm greatly increases the probability of realizing this goal.
The fusion MCMC is controlled by a unique two-stage adaptive control system that automates the tuning of the proposal distributions for efficient exploration of the model parameter space, even when the parameters are highly correlated. This controlled fusion MCMC algorithm is implemented in Mathematica using parallelized code and runs on an 8-core PC. It is designed to be a very general tool for nonlinear model fitting. The performance of the algorithm will be illustrated with some recent successes in the exoplanet field, where it has facilitated the detection of a number of new planets.
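The full fusion MCMC algorithm is not detailed in the abstract; as a hedged sketch of just the parallel-tempering ingredient (the toy bimodal target, temperature ladder, and all names below are illustrative assumptions), one can run coupled Metropolis chains at several temperatures and occasionally propose swaps between adjacent chains, so the "hot" chains carry the "cold" chain across probability barriers between modes:

```python
import math, random

def log_target(x):
    """Toy bimodal target: mixture of two well-separated Gaussians."""
    return math.log(math.exp(-0.5 * (x - 4) ** 2) +
                    math.exp(-0.5 * (x + 4) ** 2))

def parallel_tempering(n_steps=20000, betas=(1.0, 0.3, 0.1), seed=1):
    rng = random.Random(seed)
    xs = [0.0] * len(betas)          # one chain per inverse temperature
    cold_samples = []
    for _ in range(n_steps):
        # Metropolis update within each tempered chain (target ~ pi^beta).
        for i, beta in enumerate(betas):
            prop = xs[i] + rng.gauss(0, 1.5)
            if math.log(rng.random()) < beta * (log_target(prop) - log_target(xs[i])):
                xs[i] = prop
        # Propose a state swap between a random adjacent pair of chains.
        i = rng.randrange(len(betas) - 1)
        delta = (betas[i] - betas[i + 1]) * (log_target(xs[i + 1]) - log_target(xs[i]))
        if math.log(rng.random()) < delta:
            xs[i], xs[i + 1] = xs[i + 1], xs[i]
        cold_samples.append(xs[0])   # keep only the beta = 1 chain
    return cold_samples

samples = parallel_tempering()
# With the swaps, the cold chain visits both modes at +4 and -4.
print(sum(1 for x in samples if x > 0) / len(samples))
```

A single Metropolis chain with the same proposal would typically stay trapped in one of the two modes; the tempered ladder is what restores global exploration.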
From astrophysics to fusion plasmas: signal processing and system optimization analysis for ITER
Dr Duccio Testa, Ecole Polytechnique Fédérale de Lausanne, Switzerland
Efficient, real-time and unsupervised data analysis is one of the key elements for achieving scientific success in complex engineering and physical systems; three examples are the currently operating Joint European Torus (JET), the soon-to-be-built International Thermonuclear Experimental Reactor (ITER), and the Square Kilometre Array (SKA) telescope.
There is a wealth of signal processing techniques being applied to data analysis in such complex systems, and here we present some examples of the synergies that can be exploited by combining ideas and methods from different fields, such as astronomy and astrophysics and thermonuclear fusion plasmas.
One problem common to these fields is the determination of pulsation modes from irregularly sampled time series. We have used recent signal processing techniques from astronomy and astrophysics, based on the Sparse Representations of Signals, to address current questions arising in thermonuclear fusion plasmas. Two examples are the detection of magneto-hydrodynamic instabilities, now performed routinely at JET in real time on a sub-millisecond time scale, and the studies leading to the optimization of the magnetic diagnostic system in ITER. These questions were solved by formulating them as inverse problems, despite the fact that these application frameworks differ markedly from the classical uses of Sparse Representations, from both theoretical and computational points of view.
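The JET/ITER work uses more elaborate sparse solvers than can be shown here, but the core idea of recovering modes from irregularly sampled data can be sketched with greedy matching pursuit over a dictionary of sinusoids at candidate frequencies (all data, grids, and names below are illustrative assumptions): the atom most correlated with the signal is selected and subtracted, directly yielding the dominant mode frequency without requiring uniform sampling.

```python
import math, random

def matching_pursuit(times, signal, freqs, n_atoms=1):
    """Greedy sparse decomposition of an irregularly sampled series
    onto unit-norm cosine/sine atoms at candidate frequencies."""
    atoms = []
    for f in freqs:
        for phase in (math.cos, math.sin):
            a = [phase(2 * math.pi * f * t) for t in times]
            norm = math.sqrt(sum(v * v for v in a))
            atoms.append((f, [v / norm for v in a]))
    residual = list(signal)
    picked = []
    for _ in range(n_atoms):
        # Select the atom most correlated with the current residual.
        f, a = max(atoms,
                   key=lambda fa: abs(sum(r * v for r, v in zip(residual, fa[1]))))
        c = sum(r * v for r, v in zip(residual, a))
        residual = [r - c * v for r, v in zip(residual, a)]
        picked.append(f)
    return picked

random.seed(2)
times = sorted(random.uniform(0, 10) for _ in range(200))  # irregular sampling
signal = [math.cos(2 * math.pi * 1.3 * t) for t in times]  # a single 1.3 Hz mode
freqs = [0.1 * k for k in range(1, 31)]                    # candidate grid
print(matching_pursuit(times, signal, freqs))              # recovers ~1.3 Hz
```

Because only the sampling instants enter the dictionary, the same sketch applies unchanged to uniformly or arbitrarily spaced measurements, which is what makes sparse formulations attractive for irregularly sampled diagnostics.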
Requirements, prospects and ideas for the signal processing and real-time data analysis applications of this method to routine operation of ITER and of the SKA telescope will be discussed.
Finally, we will conclude with an example of a potential application of the Sparse Representation method to the analysis of electrical prospecting surveys (using the so-called Schlumberger diagram) in an Etruscan necropolis and an Etruscan fortress town located close to Rome, both sites dating from around the fifth century BC.
P Blanchard, A Fasoli, J B Lister, Ecole Polytechnique Fédérale de Lausanne, Switzerland.
S Bourguignon, Institut de Recherche en Communications et Cybernétique, France
H Carfantan, Université de Toulouse, France
A Goodyear, Culham Centre for Fusion Energy, UK
G Vayakis, ITER organization, France
P Blanchard, Ecole Polytechnique Fédérale de Lausanne, Switzerland and JET-EFDA Close Support Unit, Culham Science Centre, UK
A Klein, formerly Massachusetts Institute of Technology, USA.
T Panis, formerly Ecole Polytechnique Fédérale de Lausanne, Switzerland
JET-EFDA contributors; see the Appendix of F Romanelli et al, Nuclear Fusion 51 (2011) 094008 (Proceedings of the 23rd IAEA Fusion Energy Conference 2010, Daejeon, Korea)
The Gruppo Archeologico Romano, Rome section of the Gruppi Archeologici d’Italia