**Statistics and Probability Session**

*Talks will be in MR4.*

**Thursday**

- 2.30-3.15: Tomas Juskevicius - Domination inequalities for unimodal distributions
- 3.15-4.00: Parkpoom Phetpradap - Large deviations for the range of a simple random walk
- *Tea/Coffee*
- 4.30-5.15: Jonathan Forster (keynote speaker) - Statistics: The Mathematics of Society (accessible talk)
- 5.15-6.00: Saverio Giuliani - A stochastic programming approach for strategic planning with activity based costing

**Friday**

- *Plenary lecture, Tea/Coffee*
- 11.00-11.45: Tamara Broderick - Treed Gaussian process models for classification
- 11.45-12.30: Helen Thornewell - Measuring the Vulnerability of Incomplete Block Designs to Observation Loss: Applications to Design Selection and Construction
- *Lunch, Panel discussion, Tea/Coffee*
- 4.00-4.45: Ben Roberts - Incentivizing Participation in Resource-sharing Networks

*Domination inequalities for unimodal distributions* - Tomas Juskevicius

We consider the problem of finding optimal upper bounds for the expectation of a convex function of a sum of independent unimodal random variables with known modes. The conditions are on the means and ranges of the summands. Maximizing with respect to admissible choices for the modes leads to optimal bounds when the modes are unknown. The results for the convex functions are then applied to obtain sharp inequalities for the tail probabilities. These inequalities are optimal up to some constant type factors and thus are sharper than the mainstream exponential inequalities. A reduction to the i.i.d. case is also provided. In particular, our general approach yields bounds for variances of unimodal distributions obtained earlier in a number of papers using ad hoc methods. All results extend to maximal inequalities and (super)martingale type dependence.

*Large deviations for the range of a simple random walk* - Parkpoom Phetpradap

Consider a d-dimensional simple random walk and let R_n be the number of distinct points visited by the walk up to time n. We prove that R_n/n satisfies a large deviation principle with speed n^{\frac{d-2}{d}} and an explicitly given rate function.
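The quantity R_n is easy to illustrate by simulation. A minimal Monte Carlo sketch (function name and parameters are ours, purely for illustration, not part of the speaker's result):

```python
import random

def range_of_walk(n, d=2, seed=0):
    """Count distinct sites visited by a d-dimensional simple random
    walk up to time n (the random variable R_n in the abstract)."""
    rng = random.Random(seed)
    pos = [0] * d
    visited = {tuple(pos)}               # the origin counts as visited
    for _ in range(n):
        axis = rng.randrange(d)          # pick a coordinate direction
        pos[axis] += rng.choice((-1, 1)) # step +1 or -1 along it
        visited.add(tuple(pos))
    return len(visited)

n = 10_000
r_n = range_of_walk(n)   # r_n / n estimates the fraction of "fresh" sites
```

Since each step visits at most one new site, R_n is at most n + 1, so R_n/n lies in (0, 1] and the large deviation principle quantifies how unlikely atypical values of this ratio are.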

*Statistics: The Mathematics of Society* (accessible talk) - Jonathan Forster (keynote speaker)

The field of Statistics has expanded hugely in scope and range of application from its original definition as pertaining to the condition of a state. Nevertheless, there is a continuing requirement for effective statistical models for social processes. I will show how Statistics provides a bridge between Mathematical Sciences and the Social Sciences, giving illustrations of the way in which Mathematics such as graph theory, hypergeometric functions and representation theory arises in the study of, for example, cohabitation, social survey data disclosure and migration.

*A stochastic programming approach for strategic planning with activity based costing* - Saverio Giuliani

Activity Based Costing (ABC), stochastic programming and the Resource Based View of the Firm (RBV) have each been studied in depth by different authors, and a large literature review is possible for each discipline considered separately. But what happens if we study the connections between them, pairwise or even all together? Very few authors address such issues: an analysis of the connections among ABC, RBV and mathematical programming has been developed in the literature [5]; we extend this approach to stochastic programming. The traditional ABC model [1] provides the resources-activities and activities-services matrices. From the RBV we inherit the taxonomy of resources [5] and the evaluation of their features [4]. The proposed model involves a stochastic demand and a relative minimum level to satisfy. If the decision maker fixes a priori the feasibility level p, we propose a chance-constrained formulation. If we instead choose resources as first-stage variables with budget constraints, and activities and products as second-stage variables, we obtain a two-stage formulation: penalties are necessary to evaluate a posteriori the probability of feasibility. Possible nonlinearities of the sustaining function can be handled linearly, but general nonlinearities are more complex [3]. A real implementation of the models has been developed in an Italian hospital and solved by means of the Risk Solver Platform software, assuming a normal distribution for demand. The Value of the Stochastic Solution [2] has shown a positive value where penalties are treated in a separate way.
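As a hedged sketch in generic notation (the symbols are ours, not the speaker's): with resource decision variables x, resource costs c, a matrix A mapping resources through activities to delivered services, stochastic demand \tilde{d}, and feasibility level p, a chance-constrained formulation of this kind typically reads:

```latex
\min_{x \ge 0} \; c^{\top} x
\quad \text{subject to} \quad
\Pr\!\bigl( A x \ge \tilde{d} \bigr) \ge p .
```

In the two-stage alternative described in the abstract, the probabilistic constraint is replaced by second-stage recourse variables with penalty costs for unmet demand, and feasibility is assessed a posteriori.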

*Treed Gaussian process models for classification* - Tamara Broderick

Recognizing the successes of treed Gaussian process (TGP) models as an interpretable and thrifty model for nonstationary regression, we seek to extend the model to classification. Both treed models and Gaussian processes (GPs) have, separately, enjoyed great success in application to classification problems. An example of the former is Bayesian CART. In the latter, real-valued GP output may be utilized for classification via latent variables, with *M-1* regression GP priors for *M* classes providing classification rules via a softmax function. This leads to two ways of combining trees with GPs for classification. We can partition the data set once and associate *M-1* GPs with each region of the partition, or we can form *M-1* separate full TGPs. We take the latter route in the interests of faster mixing and use a Bayesian model averaging scheme to traverse the full space of classification TGPs (CTGPs) via joint proposals for the tree topology *and* the GP parameters at the leaves. We explore schemes for efficiently sampling the latent variables, which is important for obtaining good mixing in the expanded parameter space. Our proposed CTGP methodology is illustrated on a collection of synthetic and real data sets. We assess performance relative to existing methods and thereby show that CTGP is highly flexible, offers tractable inference, produces rules that are easy to interpret, and performs well out of sample.

This is joint work with Robert Gramacy.
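The softmax rule mentioned in the abstract maps the M-1 real-valued latent GP outputs to M class probabilities. A minimal sketch (the function name is ours; we assume the common identifiability convention that the reference class gets latent value 0):

```python
import math

def softmax_class_probs(latents):
    """Map M-1 latent values (one per non-reference class) to M class
    probabilities via the softmax function; the reference class is
    assigned latent value 0, a common identifiability choice."""
    z = list(latents) + [0.0]            # M values: M-1 latents + reference
    m = max(z)                           # subtract max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax_class_probs([1.2, -0.3])   # M = 3 classes
```

With all latents equal to zero the rule returns the uniform distribution over the M classes, and larger latent values push probability mass toward the corresponding class.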

*Measuring the Vulnerability of Incomplete Block Designs to Observation Loss: Applications to Design Selection and Construction* - Helen Thornewell

Balanced Incomplete Block Designs (BIBDs) are optimal experimental designs. However, if observation loss occurs during the experiment, the properties of the design change and, in some cases, the eventual design can be disconnected, causing serious problems in the analysis. It is therefore important to consider the vulnerability of a design to becoming disconnected through observation loss. The vulnerability of a design concerns the minimum number of observations whose loss yields a disconnected design; such sets are defined as Minimum Rank Reducing Observation Sets (MRROSs). The vulnerability measure determines the minimum size, S, and total number, T, of MRROSs. New formulae have been derived to calculate the vulnerability measures, and a program has been written to output the vulnerability measures of BIBDs. For some parameter classes, these formulae give fixed vulnerability measures, giving rise to a pilot procedure to check that the value of S exceeds any reasonable expectation of observation loss. For other parameter classes, the formulae depend on the particular design features, and among non-isomorphic BIBDs with the same parameters some designs are less vulnerable than others. The program helps to select the least vulnerable design within a set of competing designs. The construction of larger experimental designs by replicating smaller BIBDs is also considered; e.g. a 2-replicate BIBD can be constructed from two exact replicates of a "building block" BIBD. Techniques are demonstrated for constructing the least vulnerable repeated BIBDs by applying permutations to treatments in the subsequent replicates, using knowledge of the MRROSs of the "building block" BIBD.
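The disconnection criterion can be checked directly: a design is connected exactly when the bipartite treatment-block graph of its surviving observations is connected. A minimal sketch (function names and the toy design are ours, not the speaker's program or method):

```python
from collections import defaultdict, deque

def is_connected(design, lost=frozenset()):
    """design: iterable of (treatment, block) observations; lost: the set
    of observations removed. The eventual design is connected iff the
    bipartite treatment-block graph of the surviving observations is
    connected and still touches every treatment and block."""
    treatments = {t for t, _ in design}
    blocks = {b for _, b in design}
    remaining = [obs for obs in design if obs not in lost]
    if not remaining:
        return False
    adj = defaultdict(set)
    for t, b in remaining:
        adj[("t", t)].add(("b", b))
        adj[("b", b)].add(("t", t))
    start = ("t", remaining[0][0])       # BFS from any surviving treatment
    seen, queue = {start}, deque([start])
    while queue:
        for nbr in adj[queue.popleft()]:
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return len(seen) == len(treatments) + len(blocks)

# A toy BIBD: v=4 treatments in b=6 blocks of size k=2 (all pairs).
pairs = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
design = [(t, i) for i, blk in enumerate(pairs) for t in blk]
# Losing every observation on treatment 0 disconnects the design.
lost = {(0, 0), (0, 1), (0, 2)}
```

An MRROS is then a smallest loss set for which this check fails, and a brute-force search over loss sets would recover S and T for small designs.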

*Incentivizing Participation in Resource-sharing Networks* - Ben Roberts

We investigate systems in which peers can both access and contribute to resources in a central pool. We think of ourselves as the system designer, deciding the rules by which peers can participate in the network. This should be done in such a way that peers acting for their own benefit make socially efficient decisions about the sizes of their contributions. For example, free-riding should be prevented, since it can often cause systems to collapse.