1998 Lukacs Symposium abstracts
Categorical data analysis in the twenty-first century
Alan Agresti
University of Florida, Gainesville
As we approach the millennium, the state of the art in categorical data analysis, as in all branches of statistics, is vastly different from what it was at the start of this century. The variety of options for handling any particular problem continues to increase dramatically, and the applied statistician can easily become confused in having to choose among the options and understand the pros and cons of each. This article focuses on some of the primary developments of the past quarter century in categorical data analysis, the ambiguities in selecting a procedure, and the challenge of communicating methods and results to nonstatisticians who require analysis of such data. Specific topics to which we pay special attention include small-sample versus large-sample inference, difficulties posed by discreteness, difficulties in handling repeated measurement and other forms of clustering, difficulties with model interpretation, and software availability and limitations.
Asymptotic theory for random permutations with applications to genetics
Jogesh Babu
Pennsylvania State University
In the last few decades, mathematical population geneticists have been exploring the mechanisms that maintain diversity in a population. Some geneticists believe that much of genetic diversity is due mainly to mutation and to random fluctuations inherent in the reproductive process. Ewens (1972) established an approximation to the sampling distribution of a sample of genes from a population that has evolved over several generations, by a family of measures on the set of partitions of an integer. The derivation ignores selective effects and assumes that there is no meaningful way of labelling the alleles. In this case the allelic partition contains all the information available in a sample of genes. The Ewens formula can be used to test whether the popular assumptions are consistent with data, and to estimate the parameters. The statistics that are useful in this connection will generally be expressed as functions of the sums of transforms of the allelic partition. Such statistics can be viewed as functions of a process on the permutation group of integers. A functional limit theorem for such a partial sum process will be presented using ideas and concepts from Probabilistic Number Theory. The Ewens sampling formula also arises in Bayesian statistics via mixtures of Dirichlet processes.
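For reference, the Ewens sampling formula is usually written as below. The notation is the conventional one and is an assumption here rather than the speaker's: $\theta > 0$ is the mutation parameter and $a_j$ is the number of allelic types represented exactly $j$ times in a sample of $n$ genes.

```latex
P(a_1, \dots, a_n)
  = \frac{n!}{\theta(\theta+1)\cdots(\theta+n-1)}
    \prod_{j=1}^{n} \frac{\theta^{a_j}}{j^{a_j}\, a_j!},
\qquad \sum_{j=1}^{n} j\, a_j = n .
```

With $\theta = 1$ this is the distribution of the cycle counts of a uniformly distributed random permutation of $n$ elements, which is one way to see the link to processes on the permutation group.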
Start-up demonstration testing and applications
N. Balakrishnan
McMaster University
In this talk, I shall first present the problem of ``Start-up demonstration testing'' as it was originally formulated by Hahn and Gage. Then, I shall present some statistical results associated with this testing method. Following that, some extensions and generalizations that have practical implications will be considered and the necessary statistical results will be presented. Finally, some connections of this problem to problems on ``runs, patterns and scans'' will be mentioned.
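As a rough illustration of the setting, the sketch below assumes the simplest form of the test usually attributed to Hahn and Gage: a unit is accepted as soon as k consecutive successful start-ups are observed, and attempts are treated as independent trials with a common success probability p. The function names and the independence assumption are illustrative choices, not details taken from the talk.

```python
import random


def expected_attempts(p: float, k: int) -> float:
    """Mean number of attempts until k consecutive successes (i.i.d. trials)."""
    if not 0.0 < p <= 1.0 or k < 1:
        raise ValueError("need 0 < p <= 1 and k >= 1")
    if p == 1.0:
        return float(k)
    # Classical formula for the waiting time of a success run of length k.
    return (1.0 - p**k) / ((1.0 - p) * p**k)


def simulate_attempts(p: float, k: int, rng: random.Random) -> int:
    """Simulate one test: count attempts until k successes occur in a row."""
    run = 0
    attempts = 0
    while run < k:
        attempts += 1
        # A failure resets the current run of successes to zero.
        run = run + 1 if rng.random() < p else 0
    return attempts
```

Comparing the simulated average with `expected_attempts` gives a quick check of the formula for any chosen p and k.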
Testing with misspecified models: a review of the literature and problems for the future
Anil Bera
University of Illinois
This century and the history of modern statistics started with Karl Pearson's (1900) goodness-of-fit test, one of the most important breakthroughs in science. The basic motivation behind this test was to see whether an assumed probability model adequately described the data on hand. Then, over the first half of this century, we saw the development of some general principles of testing, such as Jerzy Neyman and Egon Pearson's (1928) likelihood ratio test, Abraham Wald's test in 1943 and C.R. Rao's score test in 1947. All these tests were developed under the assumption that the underlying probability model is correctly specified. Trygve Haavelmo (1944) termed this underlying model the a priori admissible hypothesis, and he was probably the first to draw attention to the consequences of misspecification of the a priori admissible hypothesis for the standard hypothesis testing procedures. We will call this the type-III error. In this paper, we will deal with a number of ways in which an assumed probability model could be misspecified, and discuss how some of the standard tests could be modified to make them valid under various misspecifications. We will also discuss some adaptive tests which do not require explicit specification of the underlying probability model. Quite strangely, because of these developments, we may now have less use for Pearson's goodness-of-fit test. In a way, that measures the advancement of statistical testing in this century.
The importance of geometry in multivariate analysis and some applications
Carles M. Cuadras
University of Barcelona
Geometrical concepts, including distance functions between observations, geometric variabilities and proximity functions, are used to develop some new aspects of multivariate analysis. These include the influence of principal components in comparing populations, the detection of atypical observations in discrimination with mixed variables, and the construction of orthogonal expansions for a continuous random variable. Some illustrations are given using two well-known data sets.
Joint work with Josep Fortiana.
Statistical learning and visual recognition
Donald Geman
University of Massachusetts
Statistical models and methods of inference are common in pattern recognition and computer vision. Some are based on learning from examples (e.g., recursive partitioning, neural networks, risk minimization) and some on more formal models (e.g., Markov random fields, deformable templates).
We argue that ``off-the-shelf'' tools for induction and modeling are usually of limited value in visual recognition, for example detecting objects of interest in a complex scene. In particular, general theories of learning do not address the constraints and other issues - geometric and photometric invariance, spatial correlation (the ``statistics of natural images'') and computation - which make this problem special and difficult. There is simply too much to learn or too much to model.
Instead, we advocate highly constrained inductive learning, in which tradeoffs among error rates, computation and representation are taken into account in advance. The challenge is to ``hardwire'' these constraints in a manner that permits ``generalization'' from small learning sets. Experiments in object detection, such as finding faces in cluttered scenes, suggest that a binary tree of ``elementary tests'' is a transparent computational device for implementing this program.
Joint work with Yali Amit.
Probability models for objects and images of objects
Stuart Geman
Brown University
Context-free grammars are easily equipped with a probability distribution and have been used in limited image-processing applications. However, context-free grammars are too ``weak'' to capture global image structure. On the other hand, very little is known about how to put probabilities on the much stronger context-sensitive grammars. I will suggest an approach to fitting context-sensitive grammars with probabilities and I will show the results of some experiments in Bayesian image analysis based upon probabilistic context-sensitive grammars. I will also discuss the ``scaling'' property of natural images--the statistics of real scenes are very nearly invariant to changes of scale. Finally, I will point out a rather unexpected connection between the scaling property of natural images and certain invariance properties of probabilistic grammars.
The statistics of vision
Ulf Grenander
Brown University
It is well known that the development of probability/statistics has been accelerated by the demand from many application areas. For example, the needs of the insurance industry for an objective foundation for the calculation of premiums and premium reserves led to a theory of graduation of mortality tables in the setting of statistical estimation. This theory had its modest beginning in the 19th century but later led to more sophisticated probabilistic studies, initiating the fields of stochastic processes and large deviations, among others. Or, to take another example, statistical signal processing strongly influenced statistical inference in stochastic processes and gave rise to the whole new field of information theory.
We shall argue that Computer Vision presents another application area with similar potential: it appears likely that this discipline with its obvious scientific and commercial implications will be fertile ground for new developments in statistical theory. Such a development has already started but it is in its infancy and needs the expertise of statisticians and probabilists for its further development.
To make this concrete we shall take a brief look at two central problems in Computer Vision. The first is how to estimate the underlying algebraic structure that should represent the scenes we are looking at, together with the prior probability measure that describes the variability in the scene. The second concerns the statistical modeling of clutter, that is, scenes without objects of interest that serve as background to such objects.
These are only two out of many statistical problems in Computer Vision, but they are real problems, they are difficult, and they are likely to need new conceptual tools, and therefore they should be suitable for young researchers looking for serious challenges.
Minimum distance, M, and R-estimation in the linear model
T. P. Hettmansperger
Pennsylvania State University
Unweighted minimum distance estimators are considered for the linear model. They are derived from the unweighted Cramer-von Mises goodness-of-fit criterion and are shown to be M-estimators. We further show that R-estimators can be considered to be iterated M-estimators in this context. Analysis of the criterion function results in a new estimate of scale, a goodness-of-fit test, and a test for appropriateness of the linear model.
Diffusion models for neural activity: inference and computation
Satish Iyengar
University of Pittsburgh
Stochastic models of neural activity are a well developed application in biology. Diffusion models hold a prominent place because of the many synaptic inputs to a neuron, and because these models arise out of noisy versions of differential equations for the neural membrane's electrical properties. While the probabilistic aspects of such models have been well studied, inferential and computational procedures for them are not as well developed. In this paper, I outline the physiological background leading to these models. I then describe recent progress in parameter estimation and the computational problems that arise, especially for models that include a finite membrane time constant and reversal potentials.
Statistical estimation with quadratic loss: the role of dimension
Lucien Le Cam
University of California, Berkeley
Consider an experiment [...] = [...] given by probability measures on [...] and a function [...]. It is desired to estimate [...].
Assume that [...] takes its values in a Hilbert space H and that the loss function is the square of the norm. Further, let T be the convex hull of the family [...] in the space [...] generated by [...].
If unbiased estimates are desired and H is one dimensional, the minimax risk is the supremum of the risks of one dimensional sub-problems in T. If no unbiasedness requirements are imposed, the minimax risk is the supremum of risks of two-dimensional problems in T. The situation changes dramatically if the Hilbert space H is infinite dimensional. If it is the Hilbert space formed by using the Hellinger distance on [...], one can obtain lower and upper bounds on the minimax risk in terms of a metric dimension related to Kolmogorov's metric entropy. However, this has been done effectively only under independence or near independence restrictions.
Since an ``almost explicit'' formula for minimax risks is available in the case of Hilbert valued [...], the general problem is expected to be solved early in the next century.
Nonparametric and semiparametric estimating equations
Bing Li
Pennsylvania State University
In this talk I will first review the general theory of estimating equations and survey the recent nonparametric and semiparametric approaches to this subject. I will explain the issue of adaptivity, its advantages and its one-sidedness -- particularly, that adaptation, though it decreases the asymptotic variance, brings about noise that inflates the actual variance. On these grounds I will then introduce a class of semiparametric estimating equations derived not from the principle of adaptation but from the direct minimization of the asymptotic variance, subject to a roughness penalty. I will demonstrate by simulation the advantage of this method over the adaptive method for moderate sample sizes, say n=100. Finally, I will explore some other possibilities of this new principle.
Econometrics in the 21st century
G.S. Maddala
Ohio State University
The paper reviews the origins of the Econometric Society and Econometrica, the early collaboration between economists and statisticians, the subsequent rift between the two groups, the current problems of the Econometric Society, and developments in econometrics in the last two decades contrasted with the developments in the earlier years. It then speculates on the directions econometrics is going to take in the 21st century and outlines some avenues that it should take. Possible areas of collaboration between economists and statisticians are also outlined.
Stability theory for stochastic PDE's
V. Mandrekar
Michigan State University
We observe that unstable deterministic systems are stabilized by the introduction of ``large'' noise. This occurs in problems in ecology and physics. To consider more realistic ``space-time'' systems we study the stability theory for stochastic PDEs. We present recent work and problems arising from it. Finally, we raise the question regarding the ``largeness'' of noise in systems given by SDEs. A similar question for stochastic PDEs, properly formulated and solved, may allow us to preserve ecological systems for the 21st century using randomness.
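A standard one-dimensional illustration of stabilization by noise, offered as a textbook example rather than one taken from the talk, is the linear equation with multiplicative noise. The symbols $a$, $\sigma$ and the Brownian motion $W_t$ are notation introduced for this sketch, and the stochastic integral is understood in the Ito sense.

```latex
\dot{x}_t = a\, x_t \quad (a > 0)
  \quad\Longrightarrow\quad x_t = x_0\, e^{a t} \ \text{(unstable)},
\\[4pt]
dX_t = a X_t\, dt + \sigma X_t\, dW_t
  \quad\Longrightarrow\quad
  X_t = X_0 \exp\!\Big( \big(a - \tfrac{1}{2}\sigma^2\big)\, t + \sigma W_t \Big).
```

Since $W_t / t \to 0$ almost surely, $X_t \to 0$ almost surely whenever $\sigma^2 > 2a$, so noise that is ``large'' relative to the drift stabilizes the unstable deterministic system.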
Second order corrections of the sequential bootstrap
P. K. Pathak
University of New Mexico
Rao, Pathak, and Koltchinskii (1997) have recently studied a sequential approach to resampling in which resampling is carried out sequentially one-by-one (with replacement each time) until the bootstrap sample contains $m \approx (1 - e^{-1}) n \approx 0.632 n$ distinct observations from the original sample of size $n$.
In our previous work, we have established that the main empirical characteristics of the sequential bootstrap go through, in the sense of being within a distance of order $n^{-3/4}$ from those of the usual bootstrap.
However, the theoretical justification of the second order correctness of the sequential bootstrap is somewhat difficult. It is the main topic of this investigation.
Among other things, we accomplish it by approximating our sequential scheme by a resampling scheme based on the Poisson distribution with mean $\mu = 1$ and censored at X=0.
Joint work with G. J. Babu and C. R. Rao.
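The sequential resampling scheme can be sketched as follows. This is a minimal illustration under stated assumptions: the stopping count defaults to floor((1 - 1/e) n) + 1 distinct observations, which is one common way of making "about 0.632 n" precise, distinctness is tracked by index rather than by value, and the function name is hypothetical.

```python
import math
import random


def sequential_bootstrap(sample, m=None, rng=None):
    """Draw with replacement, one at a time, until m distinct items appear.

    Returns the resample and the list of drawn indices. Distinctness is
    judged by position in `sample`, so tied values count as different items.
    """
    rng = rng or random.Random()
    n = len(sample)
    if n == 0:
        raise ValueError("sample must be non-empty")
    if m is None:
        # Assumed default stopping rule: roughly (1 - 1/e) * n distinct items.
        m = min(n, math.floor((1.0 - math.exp(-1.0)) * n) + 1)
    if not 1 <= m <= n:
        raise ValueError("need 1 <= m <= len(sample)")
    seen = set()
    indices = []
    while len(seen) < m:
        i = rng.randrange(n)
        indices.append(i)
        seen.add(i)
    return [sample[i] for i in indices], indices
```

Unlike the usual bootstrap, the resample size here is random while the number of distinct observations is fixed.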
Statistical approaches to multiscale assessment of landscapes and watersheds with satellite and synoptic multivariate geospatial data
G. P. Patil
Pennsylvania State University
When a natural landscape is represented by a series of categorical raster maps of increasingly finer resolution, a multiresolution characterization of spatial pattern can be obtained in which entropy is computed at each resolution, conditional on the preceding resolution. The series of entropy values is plotted as a function of scale (resolution), resulting in a multiresolution profile of fragmentation pattern in the landscape.
When a categorical raster map is available at a single resolution only, a series of degraded maps at increasingly coarser resolutions is generated and the fragmentation profile is computed for this series. An algorithm has been developed for obtaining the profile directly from the single resolution map without having to generate and store the coarser resolution maps.
A model is also developed for generating multiresolution categorical maps and a method is presented for directly computing the fragmentation profile from the parameters of the model. These model profiles provide benchmarks for comparing results obtained from raster maps of actual landscapes that are classified from satellite imagery. Examples show that characteristic landscape types give rise to characteristic features in their fragmentation (conditional entropy) profiles.
Estimation of parameters in nonlinear regression models
Shyamal Peddada
University of Virginia
In this talk we address the problem of estimating an unknown parameter in a nonlinear regression model where the regression function is a ``smooth'' nonlinear function of the parameter. Two types of parameter spaces are considered, namely, the p-dimensional Euclidean space $\mathbb{R}^p$, and $SO(p)$, where SO(p) is the special orthogonal group of $p \times p$ orthogonal matrices with determinant +1. This talk is motivated by two important applications, growth of a boy during puberty and the motion of polar ice.
We shall review some of the existing procedures and shall introduce new ones. Limitations of the existing and the new procedures are discussed. In the process we shall suggest several open research problems which remain to be addressed.
Joint work with Juan Zhang and Alan Rogol.
A review of canonical coordinates for multivariate data reduction and an alternative to correspondence analysis
C. R. Rao
Pennsylvania State University
A general theory of canonical coordinates is developed for reduction of dimensionality of multivariate data, assessing the loss of information, and graphical display of multivariate data. Some theorems on matrix approximations which lead to a unified theory of multivariate techniques are introduced. The theory is applied to data in two-way tables with variables in one category and populations in the other. Two types of data are considered, one with continuous measurements on the variables and another with frequencies of attributes. New biplots of populations and variables are introduced.
An alternative to correspondence analysis based on Hellinger distance instead of the usual chi-square distance is suggested. The new method has some attractive features.
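To make the contrast between the two distances concrete, the sketch below computes both for row profiles of a frequency table. The function names are made up for illustration, and the definitions are the usual ones, which may differ in scaling from those used in the talk: the chi-square distance weights squared profile differences by the inverse column masses, while the Hellinger distance compares square roots of the profile entries.

```python
import math


def row_profiles(table):
    """Divide each row of a frequency table by its row total."""
    return [[x / sum(row) for x in row] for row in table]


def column_masses(table):
    """Column totals divided by the grand total."""
    total = sum(sum(row) for row in table)
    n_cols = len(table[0])
    return [sum(row[j] for row in table) / total for j in range(n_cols)]


def chi_square_distance(p, q, masses):
    """Chi-square distance between two profiles, weighted by column masses."""
    return math.sqrt(sum((a - b) ** 2 / c for a, b, c in zip(p, q, masses)))


def hellinger_distance(p, q):
    """Hellinger distance between two profiles (no column weights needed)."""
    return math.sqrt(sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)))
```

Unlike the chi-square distance, the Hellinger distance between two profiles does not involve the column masses, so it is unchanged when other rows of the table are added or removed.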
Inference from sample survey data: some current issues
J.N.K. Rao
Carleton University
Traditional sample survey theory is largely based on the design-based approach. Alternative approaches to inference from survey data have also been advanced. Relative merits of the different approaches will be discussed. A conditional design-based approach is advanced. The role of model-based methods in small area estimation is discussed. Finally, quasi-score tests for use with survey data are developed.
Higher order asymptotics: costs and benefits
Nancy Reid
University of Toronto
The theory of higher order asymptotics provides a method for very accurate approximation of p-values and confidence limits in parametric inference. These approximations are not widely used in practice, for various reasons, including lack of knowledge, lack of software, and lack of appreciation. This talk will consider the pros and cons of higher order asymptotics, with a view to the practice and the theory of statistics.
Good approximations to Dirichlet processes
J. Sethuraman
Florida State University
The Dirichlet process introduced by Ferguson (1973) has proved very useful in nonparametric Bayesian analysis. A constructive definition of this process was given by Sethuraman (1994) which makes many exact calculations very tractable. However, one wishes to use hierarchical models leading to mixtures of Dirichlets and we soon lose tractability of calculations. Approximations to Dirichlet processes have to be strong enough to lead to approximations of the resulting posterior distributions. In this talk, I describe some approximations to Dirichlet processes and show that they are strong enough to lead to good enough approximations of the resulting posterior distributions. The constructive definition of the Dirichlet process in Sethuraman (1994) was given as the infinite sum $\sum_{i=1}^{\infty} p_i \delta_{Y_i}$, where the weights $p_i$ and the random variables $Y_i$ are random. In this talk, we show that by choosing the weights in a different way and truncating the series, the finite sum will be a good approximation to the Dirichlet process.
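The idea of truncating the constructive (stick-breaking) representation can be sketched as follows. This is only the simplest truncation, in which the last atom absorbs the leftover mass; it is not necessarily the weighting scheme proposed in the talk. The concentration parameter `alpha`, the `base_sampler` callback and the function name are assumptions introduced for illustration.

```python
import random


def truncated_stick_breaking(alpha, n_atoms, base_sampler, rng=None):
    """Return (weights, atoms) of a finite-sum approximation to a
    Dirichlet process with concentration alpha and a given base measure.

    Weights follow the stick-breaking rule p_i = theta_i * prod_{j<i}(1 - theta_j)
    with theta_i ~ Beta(1, alpha); the final weight takes the remaining mass.
    """
    rng = rng or random.Random()
    if alpha <= 0 or n_atoms < 1:
        raise ValueError("need alpha > 0 and n_atoms >= 1")
    weights = []
    remaining = 1.0
    for _ in range(n_atoms - 1):
        theta = rng.betavariate(1.0, alpha)
        weights.append(remaining * theta)
        remaining *= 1.0 - theta
    weights.append(remaining)  # last atom absorbs the leftover stick
    atoms = [base_sampler(rng) for _ in range(n_atoms)]
    return weights, atoms
```

For example, `truncated_stick_breaking(2.0, 50, lambda r: r.gauss(0.0, 1.0))` yields a random discrete distribution with 50 atoms drawn from a standard normal base measure.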
Comparison of control vs. test treatments using distance optimality criterion
Kirti R. Shah
University of Waterloo
We consider the problem of comparison of one control treatment with a set of v test treatments in various settings, including the completely randomised design and one- and two-way classification designs, using the distance optimality criterion introduced by Sinha (1970). It turns out that the nature of the optimal designs for this criterion is quite different from that for the usual A-, D- and E-optimality criteria. Here, some complete classes of designs have been identified.
Joint work with Nripesh K. Mandal and Bikas K. Sinha.
Some recent developments in hardware and software reliability
Nozer Singpurwalla
George Washington University
The author will describe some recent work in reliability, both for hardware and for software systems. With respect to the former, emphasis will be on probability models for multi-component systems subject to failure under dynamic and random environments. With respect to the latter, models for software reliability based on a concatenation of failure rates will be emphasized. In both cases, the role of subjective probability is paramount, and this, it appears, is a fruitful direction for future developments in reliability.
The analysis of subject specific agreement
David Sprott
University of Waterloo
Two raters independently assign n subjects to q categories. The subject specific agreement between the two raters is defined to be the extent to which the probability of the assignment of a given subject to a category by one rater depends upon, or is determined by, the category to which the same subject is assigned by the other rater. A measure of subject specific agreement is proposed, based on conditional probabilities and related to the log odds ratio, for which there is a conditional likelihood function. This last point is of paramount importance for making quantitative statements of scientific inference. The foregoing is compared with the use of the traditional measure of agreement $\kappa$. The proposed procedures are an example of how the increasing power of computers, characteristic of the late 20th and presumably 21st century, should affect statistical methods and scientific inference.
Joint work with V. T. Farewell.
Some classes of fundamental problems in statistical experimental design, multivariate analysis, and sampling theory
J. N. Srivastava
Colorado State University
In this paper we examine the information contained in an experiment, in the context of sensitivity, variance-optimality, and revealing power. Many classes of problems in model identification and estimation are detailed. Problems in Multivariate Analysis are considered, including ``anti-meta analysis'', and building expert systems (for human health) when the number of responses is potentially infinite. Sampling Theory problems are outlined in connection with the author's class of estimators which, because of its wide applicability and accuracy, is pre-eminent in the field.
Multivariate regression with singular covariance matrix
M. S. Srivastava
University of Toronto
Classical multivariate analysis assumes that the covariance matrix is nonsingular. In many problems of practical interest, such as in medical trials, there are a large number of related response variables but fewer observations. This leads to singularity in the covariance matrix. In this paper, we propose tests and estimates when the covariance matrix is singular for multivariate regression and growth curve models.
Joint work with Dietrich von Rosen.
Degenerate, conditional and increasing order U-statistics
Gabor J. Szekely
Bowling Green State University
Many characterizations of parametric and nonparametric families of distributions lead to statistical tests involving U-statistics. In degenerate, conditional or increasing order cases the asymptotic behavior of U-statistics provides a large variety of interesting unsolved problems. In the talk some new characterizations and the corresponding test statistics will be discussed. The related open questions in many cases have surprisingly attractive forms.
Regression modelling with fixed effects - missing values and other problems
H. Toutenburg
University of Munich
The paper considers four problems in linear regression.
(i) The predictive performance of (possibly biased) restricted and mixed regression estimators with respect to a stochastic target function [...], with [...] a weight matrix, is investigated. Under MDEP-matrix superiority, an application comes from imputation for missing values, resulting in a biased mixed estimator.
(ii) The basic question of detecting a non-MCAR process is demonstrated using outlier detection methods. In particular, the power of detecting a non-MCAR process is investigated using adaptations of Cook's distance, of DRSS and DXX in a simulation study.
(iii) The Gauss-Markov estimator b is the best unbiased estimator if the vector of disturbances u is multinormally distributed. Hence, if u is not multinormally distributed, there is a potential of nonlinear unbiased estimators that improve upon b.
(iv) Consider the general linear regression model. There exist a number of conditions under which OLSE and GLSE coincide. An open question is the following: What is the explicit form of a linear unbiased estimator $\tilde{\beta}$ for $\beta$ whose efficiency lies between that of OLSE and GLSE in the sense that $\mathrm{Cov}(\mathrm{GLSE}) \le_L \mathrm{Cov}(\tilde{\beta}) \le_L \mathrm{Cov}(\mathrm{OLSE})$, where $\le_L$ denotes the Loewner ordering of n.n.d. matrices?
Joint work with A. Fieger, C. Heumann, V. K. Srivastava, and G. Trenkler.
Sketches on probabilistic dilation equation
Jacek Wesolowski
Warsaw University of Technology
The basic dilation equation (dile) has the form
[...]
where the coefficients are some real (complex) constants, and uniqueness of a non-zero solution is ensured by the condition [...]. Then [...].
The solution f can be, in general, not a proper function but a Schwartz distribution. The considerable interest in diles in recent years is connected with the fact that they play a crucial role in wavelet constructions - see, for instance, Strang (1989), or Heil and Colella (1994). They are examples of two-scale difference equations studied thoroughly in Daubechies and Lagarias (1991, 1992).
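For orientation, the two-scale dilation equation in the form commonly used in the wavelet literature (e.g., Strang, 1989) can be written as below. The symbols $f$, $c_k$ and $N$ follow the common convention and are assumptions here; they may differ from the author's notation.

```latex
f(x) = \sum_{k=0}^{N} c_k\, f(2x - k),
\\[4pt]
\int f(x)\, dx
  = \sum_{k=0}^{N} c_k \int f(2x - k)\, dx
  = \frac{1}{2} \Big( \sum_{k=0}^{N} c_k \Big) \int f(x)\, dx
\quad\Longrightarrow\quad
\sum_{k=0}^{N} c_k = 2 \ \text{ whenever } \int f(x)\, dx \neq 0 .
```

The second line, valid for integrable $f$, shows why a normalization of the solution and the constraint that the coefficients sum to 2 go together.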
A special type of such equations was studied in a probabilistic context of infinite Bernoulli convolutions - see for instance Kershner and Wintner (1935) or Erdös (1940). Here we study a probabilistic dilation equation (prodile) which arises as a natural version of the standard dile, if written not for densities (they do not need to exist) but for the probability measures themselves:
[...]
where [...] is a probability distribution of a random variable X and [...] are positive constants and [...]. Here r is a given positive constant which equals 2 in the basic case. In the case of symmetric Bernoulli convolutions the parameters of the respective prodile are [...] and the resulting distribution is singular if only [...] (Erdös, Wintner).
Other examples of prodiles will also be presented in the talk, yielding well-known distributions (e.g., the uniform) as well as less widely used ones (for instance, the de Rham density). Some basic properties of prodile distributions (solutions of prodiles) will be revealed. The three main methods of obtaining approximate solutions of diles (the cascade algorithm, the Fourier technique, and dyadic interpolation) will be complemented by a Monte Carlo approach leading to approximate distribution functions or densities of the respective probability measures. We believe that the family of prodile distributions can be an important and interesting object of investigation. One of the reasons is that it contains some families of singular distributions which, though they appear in a natural way in many problems, are in general difficult to analyze. Here the prodile representation could be very useful. The representation is connected with a mixture property of distributions under shifting and scaling. This may also be a source of new characteristic properties of some well-known distributions. A possible application to hypothesis testing for a class of singular distributions will also be presented.
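A Monte Carlo approximation of a distribution function can be illustrated on symmetric Bernoulli convolutions. The sketch below is not the prodile-based method of the talk; it assumes the classical random-series representation X = sum over n >= 0 of e_n * lam**n with independent signs e_n = +1 or -1, truncated after `n_terms` terms. The function names and the truncation depth are illustrative choices.

```python
import random


def sample_bernoulli_convolution(lam, n_terms, rng):
    """One draw from the truncated series sum_{n < n_terms} (+/-1) * lam**n."""
    if not 0.0 < lam < 1.0:
        raise ValueError("need 0 < lam < 1")
    x = 0.0
    scale = 1.0
    for _ in range(n_terms):
        x += scale if rng.random() < 0.5 else -scale
        scale *= lam
    return x


def empirical_cdf(samples, t):
    """Fraction of samples that are <= t."""
    return sum(1 for s in samples if s <= t) / len(samples)
```

For lam = 1/2 the limit distribution is uniform on [-2, 2], which gives a simple check of the sampler; for other values of lam the same code gives an empirical picture of distributions that may be singular.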
References
- Daubechies, I., Lagarias, J. (1991), Two-scale difference equations: I.
Existence and global regularity of solutions. SIAM J. Math. Anal.
22, 1388-1410.
- Daubechies, I., Lagarias, J. (1992), Two-scale difference equations: II.
Local regularity, infinite products and fractals. SIAM J. Math. Anal.
23, 1031-1079.
- Erdös, P. (1940), On the smoothness properties of a family of symmetric
Bernoulli convolutions. Amer. J. Math. 62, 180-186.
- Heil, C., Colella, D. (1994), Dilation equations and the smoothness of
compactly supported wavelets. In: Wavelets. Mathematics and Applications
(J.J. Benedetto, M.W. Frazier, eds), CRC Press, Boca Raton.
- Kershner, R., Wintner, A. (1935), On symmetric Bernoulli convolutions.
Amer. J. Math. 57, 541-548.
- Strang, G. (1989), Wavelets and dilation equations: a brief introduction.
SIAM Rev. 31, 614-627.
Censoring and random truncation: a survey of some recent developments in nonparametric inference for incomplete data
Grace Yang
University of Maryland
Product-limit estimates of a distribution function F play an essential role in the statistical analysis of censored or randomly truncated data. They are used either directly in statistical inference about F or indirectly, as in regression analysis or other types of problems. Prior to the publication of Gill (1983) on the weak convergence of the product-limit estimate for right censored data over the entire support of F, properties of the estimate were studied and known almost exclusively on restricted intervals [0, b] for which [...]. These may be sufficient for computing survival probabilities in biostatistics. In the case of regression analysis, one typically imposes complicated assumptions to circumvent the ``tail'' problem. Not knowing the tail behavior of the product-limit estimate clearly limits its application to other important statistical functions such as sample moments and the mean residual life, whose properties depend on F over the entire support.
As an introduction, the product-limit estimate for censored and randomly truncated data will be constructed by using a model identification approach. Some recent developments on the asymptotic properties of the product-limit estimate, with special attention to the results on its tail behavior, will be highlighted. A randomization technique that unifies the treatment of a discrete F in a broad category of problems will be discussed.
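For readers unfamiliar with the estimate, the sketch below computes the product-limit (Kaplan-Meier) estimate of the survival function 1 - F for right-censored data only; random truncation is not handled. The function name and input layout are assumptions made for illustration, and ties follow the usual convention that observations censored at a time t are still at risk at t.

```python
def product_limit(times, observed):
    """Product-limit estimate of the survival function for right-censored data.

    times    -- observed times (event or censoring)
    observed -- True if the event was observed, False if censored
    Returns a list of (t, S(t)) at each distinct time with an observed event.
    """
    if len(times) != len(observed):
        raise ValueError("times and observed must have the same length")
    data = sorted(zip(times, observed))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all observations sharing the same time t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed
    return curve
```

When the largest observation is censored, the estimated survival function never reaches zero, so the estimate of F is left undetermined in the tail; this is one concrete form of the ``tail'' problem.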