ICIC Data Analysis Workshop

ICIC Data Analysis Workshop, September 11-13 2013 

Principled statistical methods for researchers

 

Venue: 

Imperial Centre for Inference and Cosmology (ICIC), Imperial College, South Kensington, London.  Workshop in Huxley Building, Room 311.

Dates:  11-13 September 2013

LATEST NEWS

We will have a wine reception on Wednesday after the afternoon session.  Details at the workshop.

Otherwise each day will finish at 5 p.m.

Summary

We will run a 3-day course/workshop on statistical methods and tools for data analysis, aimed at PhD students, postdocs and any staff interested in understanding Bayesian statistics and numerical techniques of data analysis.  The course plan is to combine morning lectures with problem sets and practical work in the afternoons. It will concentrate on setting down firm foundations of principled data analysis, but a feature of the workshop will be a substantial element of hands-on classes where participants will learn how to apply the ideas in practice.  It will be hosted by the Imperial Centre for Inference and Cosmology at Imperial College.

 

Background

Most researchers will at some point be required to perform some form of data analysis. This may be anything from simple line-fitting, through parameter estimation, to complex and computationally demanding sampling for model selection on large datasets. Anecdotal evidence suggests that many researchers are not well prepared for this, often doing the right thing incorrectly, or picking up an inappropriate statistical tool. The purpose of this course is to provide an understanding of principled data analysis, and experience of applying appropriate methods to data.

 

Preparation

We expect all participants to bring their own laptop, and to do a simple computational exercise in advance (in whatever language suits) to ensure they have appropriate software in place before the workshop starts. The instructions are here: Preliminary Exercise.pdf. The data file of supernovae is here: SN.txt

Costs

There will be a small registration fee of £20 to cover refreshments. Participants will be responsible for all other travel and subsistence costs.  Inexpensive lunch is readily available on campus.

 

Registration

The number of places is limited, and places are offered on a first-come, first-served basis. Please register here by 30 August 2013. If you are unable to register because the workshop is full, please email Rachel Groom (r.groom@imperial.ac.uk) to be put on the waiting list, in case of dropouts or an increase in quota.

Draft Programme

Day 1  (Weds 11 September 2013)

  • 9.15 a.m. Registration, coffee and pastries. 311 Huxley Building
  • Start of Workshop 9.45 a.m.
  • Bayesian Foundations:
    • What is probability?
    • The Laws of Probability and Bayes’ Theorem
    • Priors
    • Parameter inference
    • Marginalisation
    • Confidence intervals, credible intervals
  • Problem class: Simple problems
  • Tutorial: day summary
  • End: 5 p.m.

 

Day 2 (Thurs 12 September 2013)

  • Bayesian Computation: Parameter Estimation and Sampling
    • Grid-based methods
    • Markov Chain Monte Carlo
    • Metropolis-Hastings algorithm
    • Convergence tests – Gelman-Rubin
    • Gibbs Sampling
    • Hamiltonian Monte Carlo
  • Case Study: Cosmic Microwave Background
  • Hands on: MCMC code from scratch.  Cosmology from the Supernova Hubble Diagram.
  • Tutorial: day summary
  • End: 5 p.m.

 

Day 3 (Fri 13 September 2013)

  • Why not p-values and reduced chi-squared?
  • Model Comparison with Bayesian Evidence
  • Case study: is the Universe flat?
  • Hands on: model comparison calculations and computations, and/or complete MCMC codes.
  • Tutorial: wrap up the workshop
  • 5 p.m. End of Workshop

 

Learning outcomes

 

At the end of the Workshop, the participants should be able to (non-exhaustive list):

  • Express stochastic problems in terms of fundamental probability and Bayes’ theorem.
  • Demonstrate, by application to real data, an understanding of probability, inference, priors, posteriors, marginalisation, parameter estimation, hypothesis testing, model selection and sampling.
  • Code and apply a simple MCMC program to physical data.
  • Formulate model selection problems in a principled statistical framework, and be capable of executing some methods of solution.

 

Course Team

Prof Alan Heavens, Prof Andrew Jaffe, Dr Roberto Trotta (ICIC Physics); Dr Daniel Mortlock (ICIC Physics and Mathematics).

 

Point of contact: Professor Alan Heavens, Director, Imperial Centre for Inference and Cosmology, Blackett Laboratory, Prince Consort Road, London SW7 2AZ. Email: a.heavens@imperial.ac.uk, Tel. 0207 594 2930; or Rachel Groom (r.groom@imperial.ac.uk), Tel. 0207 594 7770.

 

List of Participants

Currently registered participants are listed here

Practical info

Local map 

Local restaurant suggestions

sthkencampus.pdf

 

Tuesday evening: we will be in the Queen's Arms, 30 Queens Gate Mews, Kensington, London, SW7 5QL, close to Imperial. See the Local map. People will be there from 6.30 to 9 p.m. at least, and possibly later.

 

Travel

The Huxley Building is very close to the Royal Albert Hall in South Kensington. You can enter through Blackett (No. 6 on map: sthkencampus.pdf) or Huxley (No. 13; most straightforward to enter from the Queen's Gate side). We will have signs from both entrances on Wednesday. The nearest tube stops are South Kensington and Gloucester Road (10 mins walk), and there are buses which pass close by; see the map and the TfL website for details (http://www.tfl.gov.uk/). You may find the Journey Planner facility useful. Note that it is worth buying a pay-as-you-go Oyster card if you are going to use the system at all, as the fares are much cheaper than buying individual tickets.

From South Kensington Underground station: if it's raining, turn right after the ticket barriers and take the long tunnel from the station, which stays below ground and emerges near the bottom of the main map. Otherwise, go straight ahead after the barriers, exit to street level immediately and turn right into Thurloe St (now mostly pedestrianised); the walk above ground is more pleasant.

 

Lecture Handouts:

 

Alan Heavens:

Presentation: ICIC Data Analysis lectures.pdf

Case study: PopulationMean.pdf

Andrew Jaffe:

Presentation 1: Probability—MoreExamplesAndConcepts.pdf

Presentation 2: CMB_CaseStudy.pdf

Daniel Mortlock:

Presentation 1: param_est.pdf

Presentation 2: hypothesis_tests.pdf

Roberto Trotta:

Summary notes Day 3

Slides Day 3

 

Hands-On Handouts:

 

Day 0:

Preliminary Exercise.pdf

Data: SN.txt

Day 1:

Problems_Day1.pdf

Solutions_Day1.pdf

Day 2:

SN MCMC project.pdf

For general Universes:

Here LumDistRows.txt is a file with DL*(z) for different pairs of Om, Ov (matter, vacuum energy).

Schematically:

for Om=0 to 1 in steps of 0.01 (101 points)
  for Ov=0 to 1 in steps of 0.01 (101 points)
     A line of DL(z) for z=0 to 1.8 in steps of 0.01 (181 points), separated by spaces
  end
end
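
As an illustration only (this is not part of the course materials), the Python sketch below shows one way a file with this layout might be read. It assumes NumPy is available, that the file holds exactly 101 x 101 rows of 181 space-separated numbers with Om as the outer loop and Ov as the inner loop, and that the spacing in z is 0.01 (181 points from 0 to 1.8). All function and variable names are hypothetical.

```python
import numpy as np

# Assumed grid sizes, read off the schematic layout above.
N_OM, N_OV, N_Z = 101, 101, 181
STEP = 0.01  # assumed spacing in Om, Ov and z

# Redshift belonging to each column of a row: 0, 0.01, ..., 1.8
z_grid = np.arange(N_Z) * STEP


def reshape_table(rows):
    """Reshape the (101*101, 181) block of rows into [i_Om, i_Ov, i_z]."""
    rows = np.asarray(rows, dtype=float)
    if rows.shape != (N_OM * N_OV, N_Z):
        raise ValueError(f"unexpected table shape {rows.shape}")
    # Om is the outer loop and Ov the inner one, so the row number is
    # i_Om * N_OV + i_Ov; a C-order reshape recovers the two indices.
    return rows.reshape(N_OM, N_OV, N_Z)


def load_table(path="LumDistRows.txt"):
    """Read the whitespace-separated file and reshape it."""
    return reshape_table(np.loadtxt(path))


def grid_index(value, n):
    """Index of the grid point nearest to value, checked against the grid size."""
    i = int(round(value / STEP))
    if not 0 <= i < n:
        raise ValueError(f"{value} lies outside the tabulated range")
    return i


def dl_curve(table, om, ov):
    """Return the 181 tabulated values for the grid point nearest (om, ov)."""
    return table[grid_index(om, N_OM), grid_index(ov, N_OV)]
```

With the file in the working directory, `dl_curve(load_table(), 0.3, 0.7)` would give the tabulated curve for Om = 0.3, Ov = 0.7, to be paired with the redshifts in `z_grid`. Values of Om or Ov between grid points are simply snapped to the nearest tabulated point here; interpolation is left out to keep the sketch short.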

 

Day 3:

Model comparison exercises