
1998 Lukacs Symposium abstracts

Categorical data analysis in the twenty-first century
Alan Agresti
University of Florida, Gainesville

As we approach the millennium, the state of the art in categorical data analysis, as in all branches of statistics, is vastly different from what it was at the start of this century. The variety of options for handling any particular problem continues to increase dramatically, and the applied statistician can easily become confused in having to choose among the options and understand the pros and cons of each. This article focuses on some of the primary developments of the past quarter century in categorical data analysis, the ambiguities in selecting a procedure, and the challenge of communicating methods and results to nonstatisticians who require analysis of such data. Specific topics to which we pay special attention include small-sample versus large-sample inference, difficulties posed by discreteness, difficulties in handling repeated measurement and other forms of clustering, difficulties with model interpretation, and software availability and limitations.

Asymptotic theory for random permutations with applications to genetics
Jogesh Babu
Pennsylvania State University

In the last few decades, mathematical population geneticists have been exploring the mechanisms that maintain diversity in a population. Some geneticists believe that much genetic diversity is due mainly to mutation and to the random fluctuations inherent in the reproductive process. Ewens (1972) established an approximation to the sampling distribution of a sample of genes from a population that has evolved over several generations, by a family of measures on the set of partitions of an integer. The derivation ignores selective effects and assumes that there is no meaningful way of labelling the alleles. In this case the allelic partition contains all the information available in a sample of genes. The Ewens formula can be used to test whether the popular assumptions are consistent with data, and to estimate the parameters. The statistics that are useful in this connection will generally be expressed as functions of sums of transforms of the allelic partition. Such statistics can be viewed as functions of a process on the permutation group of integers. A functional limit theorem for such a partial sum process will be presented using ideas and concepts from Probabilistic Number Theory. The Ewens sampling formula also arises in Bayesian statistics via mixtures of Dirichlet processes.
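
As an illustration of the sampling formula mentioned above, the following is a minimal sketch of the standard Ewens probability of an allelic partition. The function names, the parameter name theta (the mutation parameter) and the list encoding of the partition are assumptions made for this example, not notation taken from the talk.

```python
from math import factorial, prod


def rising_factorial(theta, n):
    """Return theta * (theta + 1) * ... * (theta + n - 1)."""
    return prod(theta + i for i in range(n))


def ewens_probability(partition, theta):
    """Probability of an allelic partition under the Ewens sampling formula.

    partition[j - 1] holds a_j, the number of allele types that appear
    exactly j times in the sample (a hypothetical encoding for this sketch).
    """
    # Sample size implied by the partition: n = sum_j j * a_j.
    n = sum(j * a for j, a in enumerate(partition, start=1))
    weight = 1.0
    for j, a in enumerate(partition, start=1):
        weight *= theta ** a / (j ** a * factorial(a))
    return factorial(n) / rising_factorial(theta, n) * weight
```

For example, ewens_probability([2, 1, 0, 0], theta) gives the probability that a sample of four genes contains two alleles seen once and one allele seen twice. Summing the function over all partitions of a fixed sample size should return 1, which is a convenient sanity check.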

Start-up demonstration testing and applications
N. Balakrishnan
McMaster University

In this talk, I shall first present the problem of "start-up demonstration testing" as it was originally formulated by Hahn and Gage. Then, I shall present some statistical results associated with this testing method. Following that, some extensions and generalizations that have practical implications will be considered and the necessary statistical results will be presented. Finally, some connections of this problem to problems on "runs, patterns and scans" will be mentioned.

Testing with misspecified models: a review of the literature and problems for the future
Anil Bera
University of Illinois

This century and the history of modern statistics started with Karl Pearson's (1900) goodness-of-fit test, one of the most important breakthroughs in science. The basic motivation behind this test was to see whether an assumed probability model adequately described the data on hand. Then over the first half of this century we saw the development of some general principles of testing, such as Jerzy Neyman and Egon Pearson's (1928) likelihood ratio test, Abraham Wald's test in 1943 and C.R. Rao's score test in 1947. All these tests were developed under the assumption that the underlying probability model is correctly specified. Trygve Haavelmo (1944) termed this underlying model the a priori admissible hypothesis, and he was probably the first to draw attention to the consequences of misspecification of the a priori admissible hypothesis for the standard hypothesis testing procedures. We will call this the type-III error. In this paper, we will deal with a number of ways an assumed probability model could be misspecified, and discuss how some of the standard tests could be modified to make them valid under various misspecifications. We will also discuss some adaptive tests which do not require explicit specification of the underlying probability model. Quite strangely, because of these developments, we will now possibly have less use for Pearson's goodness-of-fit test. In a way, that measures the advancement of statistical testing in this century.

The importance of geometry in multivariate analysis and some applications
Carles M. Cuadras
University of Barcelona

Geometrical concepts, including distance functions between observations, geometric variabilities and proximity functions, are used to develop some new aspects of multivariate analysis. These include the influence of principal components in comparing populations, the detection of atypical observations in discrimination with mixed variables, and the construction of orthogonal expansions for a continuous random variable. Some illustrations are given using two well-known data sets.

Joint work with Josep Fortiana.

Statistical learning and visual recognition
Donald Geman
University of Massachusetts

Statistical models and methods of inference are common in pattern recognition and computer vision. Some are based on learning from examples (e.g., recursive partitioning, neural networks, risk minimization) and some on more formal models (e.g., Markov random fields, deformable templates).

We argue that "off-the-shelf" tools for induction and modeling are usually of limited value in visual recognition, for example detecting objects of interest in a complex scene. In particular, general theories of learning do not address the constraints and other issues - geometric and photometric invariance, spatial correlation (the "statistics of natural images") and computation - which make this problem special and difficult. There is simply too much to learn or too much to model.

Instead, we advocate highly constrained inductive learning, in which tradeoffs among error rates, computation and representation are taken into account in advance. The challenge is to "hardwire" these constraints in a manner that permits "generalization" from small learning sets. Experiments in object detection, such as finding faces in cluttered scenes, suggest that a binary tree of "elementary tests" is a transparent computational device for implementing this program.

Joint work with Yali Amit.

Probability models for objects and images of objects
Stuart Geman
Brown University

Context-free grammars are easily equipped with a probability distribution and have been used in limited image-processing applications. However, context-free grammars are too "weak" to capture global image structure. On the other hand, very little is known about how to put probabilities on the much stronger context-sensitive grammars. I will suggest an approach to fitting context-sensitive grammars with probabilities and I will show the results of some experiments in Bayesian image analysis based upon probabilistic context-sensitive grammars. I will also discuss the "scaling" property of natural images: the statistics of real scenes are very nearly invariant to changes of scale. Finally, I will point out a rather unexpected connection between the scaling property of natural images and certain invariance properties of probabilistic grammars.

The statistics of vision
Ulf Grenander
Brown University

It is well known that the development of probability/statistics has been accelerated by the demand from many application areas. For example, the needs of the insurance industry for an objective foundation for the calculation of premiums and premium reserves led to a theory of graduation of mortality tables in the setting of statistical estimation. This theory had its modest beginning in the 19th century but later led to more sophisticated probabilistic studies initiating the field of stochastic processes and large deviations, among others. Or, to take another example, statistical signal processing strongly influenced statistical inference in stochastic processes and gave rise to the whole new field of information theory.

We shall argue that Computer Vision presents another application area with similar potential: it appears likely that this discipline with its obvious scientific and commercial implications will be fertile ground for new developments in statistical theory. Such a development has already started but it is in its infancy and needs the expertise of statisticians and probabilists for its further development.

To make this concrete we shall take a brief look at two central problems in Computer Vision. The first is: how do we estimate the underlying algebraic structure that should represent the scenes we are looking at, and the prior probability measure that describes the variability in the scene? The second concerns the statistical modeling of clutter, that is, scenes without objects of interest that serve as background to such objects.

These are only two out of many statistical problems in Computer Vision, but they are real problems, they are difficult, and they are likely to need new conceptual tools, and therefore they should be suitable for young researchers looking for serious challenges.

Minimum distance, M, and R-estimation in the linear model
T. P. Hettmansperger
Pennsylvania State University

Unweighted minimum distance estimators are considered for the linear model. They are derived from the unweighted Cramér-von Mises goodness-of-fit criterion and are shown to be M-estimators. We further show that R-estimators can be considered to be iterated M-estimators in this context. Analysis of the criterion function results in a new estimate of scale, a goodness-of-fit test, and a test for appropriateness of the linear model.

Diffusion models for neural activity: inference and computation
Satish Iyengar
University of Pittsburgh

Stochastic models of neural activity are a well developed application in biology. Diffusion models hold a prominent place because of the many synaptic inputs to a neuron, and because these models arise out of noisy versions of differential equations for the neural membrane's electrical properties. While the probabilistic aspects of such models have been well studied, inferential and computational procedures for them are not as well developed. In this paper, I outline the physiological background leading to these models. I then describe recent progress in parameter estimation and the computational problems that arise, especially for models that include a finite membrane time constant and reversal potentials.

Statistical estimation with quadratic loss: the role of dimension
Lucien Le Cam
University of California, Berkeley

Consider an experiment tex2html_wrap_inline916 = tex2html_wrap_inline918 given by probability measures on tex2html_wrap_inline920 and a function tex2html_wrap_inline922 . It is desired to estimate tex2html_wrap_inline924 .

Assume that tex2html_wrap_inline926 takes its values in a Hilbert space H and that the loss function is the square of the norm. Further, let T be the convex hull of the family tex2html_wrap_inline932 in the space tex2html_wrap_inline934 generated by tex2html_wrap_inline936 .

If unbiased estimates are desired and H is one dimensional, the minimax risk is the supremum of the risks of one-dimensional sub-problems in T. If no unbiasedness requirements are imposed, the minimax risk is the supremum of risks of two-dimensional problems in T. The situation changes dramatically if the Hilbert space H is infinite dimensional. If it is the Hilbert space formed by using the Hellinger distance on tex2html_wrap_inline946 , one can obtain lower and upper bounds on the minimax risk in terms of a metric dimension related to Kolmogorov's metric entropy. However, this has been done effectively only under independence or near-independence restrictions.

Since an "almost explicit" formula for minimax risks is available in the case of Hilbert valued tex2html_wrap_inline948 , the general problem is expected to be solved early in the next century.

Nonparametric and semiparametric estimating equations
Bing Li
Pennsylvania State University

In this talk I will first review the general theory of estimating equations and survey the recent nonparametric and semiparametric approaches to this subject. I will explain the issue of adaptivity, its advantages and its one-sidedness -- particularly, that adaptation, though it decreases the asymptotic variance, brings about noise that inflates the actual variance. On these grounds I will then introduce a class of semiparametric estimating equations derived not from the principle of adaptation but from the direct minimization of the asymptotic variance, subject to a roughness penalty. I will demonstrate by simulation the advantage of this method over the adaptive method for moderate sample sizes, say n=100. Finally, I will explore some other possibilities of this new principle.

Econometrics in the 21st century
G.S. Maddala
Ohio State University

The paper reviews the origins of the Econometric Society and Econometrica, the early collaboration between economists and statisticians, the subsequent rift between the two groups, the current problems of the Econometric Society, and developments in econometrics in the last two decades contrasted with the developments in the earlier years. It then speculates on the directions econometrics is going to take in the 21st century and outlines some avenues that it should take. Possible areas of collaboration between economists and statisticians are also outlined.

Stability theory for stochastic PDE's
V. Mandrekar
Michigan State University

We observe that deterministic unstable systems are stabilized by the introduction of "large" noise. This occurs in problems in ecology and physics. To consider more realistic "space-time" systems we study the stability theory for stochastic PDEs. We present recent work and problems arising from it. Finally, we raise the question regarding "largeness" of noise in systems given by SDEs. A similar question for stochastic PDEs, properly formulated and solved, can allow us to preserve ecological systems for the 21st century using randomness.

Second order corrections of the sequential bootstrap
P. K. Pathak
University of New Mexico

Rao, Pathak, and Koltchinskii (1997) have recently studied a sequential approach to resampling in which resampling is carried out sequentially one-by-one (with replacement each time) until the bootstrap sample contains tex2html_wrap_inline952 distinct observations from the original sample. In our previous work, we have established that the main empirical characteristics of the sequential bootstrap go through, in the sense of being within a distance of order tex2html_wrap_inline954 from those of the usual bootstrap. However, the theoretical justification of the second order correctness of the sequential bootstrap is somewhat difficult. It is the main topic of this investigation. Among other things, we accomplish it by approximating our sequential scheme by a resampling scheme based on the Poisson distribution with mean tex2html_wrap_inline956 and censored at X=0.
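
The resampling scheme described above can be sketched as follows. This is only an illustrative sketch: the function name, the argument m (the required number of distinct observations, whose exact value is not recoverable from this text) and the use of Python's random module are assumptions made for the example.

```python
import random


def sequential_bootstrap(sample, m, rng=None):
    """Draw from `sample` one at a time, with replacement, until the
    bootstrap sample contains m distinct original observations.

    Distinctness is tracked by position in `sample`, so tied values in the
    data still count as different observations.
    """
    if not 1 <= m <= len(sample):
        raise ValueError("m must be between 1 and len(sample)")
    rng = rng or random.Random()
    seen = set()   # indices of original observations drawn so far
    draws = []     # the bootstrap sample, in the order drawn
    while len(seen) < m:
        i = rng.randrange(len(sample))
        seen.add(i)
        draws.append(sample[i])
    return draws
```

A call such as sequential_bootstrap(data, m=round(0.632 * len(data))) uses about 0.632 n distinct observations, which is the approximate expected fraction of distinct observations in an ordinary bootstrap sample of size n; that choice of m is an assumption for illustration only. Unlike the usual bootstrap, the length of the returned sample is random.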

Joint work with G. J. Babu and C. R. Rao.

Statistical approaches to multiscale assessment of landscapes and watersheds with satellite and synoptic multivariate geospatial data
G. P. Patil
Pennsylvania State University

When a natural landscape is represented by a series of categorical raster maps of increasingly finer resolution, a multiresolution characterization of spatial pattern can be obtained in which entropy is computed at each resolution, conditional on the preceding resolution. The series of entropy values is plotted as a function of scale (resolution), resulting in a multiresolution profile of fragmentation pattern in the landscape.

When a categorical raster map is available at a single resolution only, a series of degraded maps at increasingly coarser resolutions is generated and the fragmentation profile is computed for this series. An algorithm has been developed for obtaining the profile directly from the single resolution map without having to generate and store the coarser resolution maps.

A model is also developed for generating multiresolution categorical maps and a method is presented for directly computing the fragmentation profile from the parameters of the model. These model profiles provide benchmarks for comparing results obtained from raster maps of actual landscapes that are classified from satellite imagery. Examples show that characteristic landscape types give rise to characteristic features in their fragmentation (conditional entropy) profiles.

Estimation of parameters in nonlinear regression models
Shyamal Peddada
University of Virginia

In this talk we address the problem of estimation of the unknown parameter tex2html_wrap_inline960 in a nonlinear regression model where the regression function is a "smooth" nonlinear function of tex2html_wrap_inline962 . Two types of parameter spaces are considered, namely, the p-dimensional Euclidean space R^p, and tex2html_wrap_inline966 , where SO(p) is the special orthogonal group of p x p orthogonal matrices with determinant +1. This talk is motivated by two important applications, growth of a boy during puberty and the motion of polar ice. We shall review some of the existing procedures and shall introduce new ones. Limitations of the existing and the new procedures are discussed. In the process we shall suggest several open research problems which remain to be addressed.

Joint work with Juan Zhang and Alan Rogol.

A review of canonical coordinates for multivariate data reduction and an alternative to correspondence analysis
C. R. Rao
Pennsylvania State University

A general theory of canonical coordinates is developed for reduction of dimensionality of multivariate data, assessing the loss of information, and graphical display of multivariate data. Some theorems on matrix approximations which lead to a unified theory of multivariate techniques are introduced. The theory is applied to data in two-way tables with variables in one category and populations in the other. Two types of data are considered, one with continuous measurements on the variables and another with frequencies of attributes. New biplots of populations and variables are introduced.

An alternative to correspondence analysis based on Hellinger distance instead of the usual chi-square distance is suggested. The new method has some attractive features.
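
To make the contrast between the two distances concrete, here is a small sketch that computes both between row profiles of a frequency table. The function names and the list-of-lists table layout are assumptions for this example, and the Hellinger distance is written without a normalizing constant (some authors include a factor of 1/sqrt(2)).

```python
from math import sqrt


def row_profiles(table):
    """Divide each row of a frequency table by its row total."""
    return [[x / sum(row) for x in row] for row in table]


def column_masses(table):
    """Column totals divided by the grand total."""
    total = sum(sum(row) for row in table)
    n_cols = len(table[0])
    return [sum(row[j] for row in table) / total for j in range(n_cols)]


def chi_square_distance(p, q, masses):
    """Chi-square distance between two row profiles, weighted by column masses."""
    return sqrt(sum((a - b) ** 2 / m for a, b, m in zip(p, q, masses)))


def hellinger_distance(p, q):
    """Hellinger distance between two row profiles (no normalizing constant)."""
    return sqrt(sum((sqrt(a) - sqrt(b)) ** 2 for a, b in zip(p, q)))
```

One visible difference is that the chi-square distance depends on the column masses of the whole table, whereas the Hellinger distance between two profiles depends only on those two profiles.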

Inference from sample survey data: some current issues
J.N.K. Rao
Carleton University

Traditional sample survey theory is largely based on the design-based approach. Alternative approaches to inference from survey data have also been advanced. Relative merits of the different approaches will be discussed. A conditional design-based approach is advanced. The role of model-based methods in small area estimation is discussed. Finally, quasi-score tests for use with survey data are developed.

Higher order asymptotics: costs and benefits
Nancy Reid
University of Toronto

The theory of higher order asymptotics provides a method for very accurate approximation of p-values and confidence limits in parametric inference. These approximations are not widely used in practice, for various reasons, including lack of knowledge, lack of software, and lack of appreciation. This talk will consider the pros and cons of higher order asymptotics, with a view to the practice and the theory of statistics.

Good approximations to Dirichlet processes
J. Sethuraman
Florida State University

The Dirichlet process introduced by Ferguson (1973) has proved very useful in nonparametric Bayesian analysis. A constructive definition of this process was given by Sethuraman (1994) which makes many exact calculations very tractable. However, one wishes to use hierarchical models leading to mixtures of Dirichlets and we soon lose tractability of calculations. Approximations to Dirichlet processes have to be strong enough to lead to approximations of the resulting posterior distributions. In this talk, I describe some approximations to Dirichlet processes and show that they are strong enough to lead to good enough approximations of the resulting posterior distributions. The constructive definition of the Dirichlet process in Sethuraman (1994) was given as the infinite sum tex2html_wrap_inline974 , where the weights tex2html_wrap_inline976 and the random variables tex2html_wrap_inline978 are random. In this talk, we show that by choosing the weights in a different way and truncating the series, the finite sum will be a good approximation to the Dirichlet process.
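
For orientation, the following sketch draws one realization of a truncated version of the constructive (stick-breaking) definition. It uses the standard weights and a simple truncation that puts all leftover mass on the last atom; it is not the modified weighting proposed in the talk. The names alpha (total mass of the base measure), base_sampler and n_atoms are assumptions for this example.

```python
import random


def truncated_stick_breaking(alpha, base_sampler, n_atoms, rng=None):
    """Return (weights, atoms) for a finite-sum approximation to a
    Dirichlet process draw.

    alpha        : total mass parameter (assumed name).
    base_sampler : function taking an rng and returning one draw from the
                   normalized base measure (assumed interface).
    n_atoms      : number of terms kept in the sum.
    """
    rng = rng or random.Random()
    weights, atoms = [], []
    remaining = 1.0  # length of the stick not yet broken off
    for i in range(n_atoms):
        # Break off a Beta(1, alpha) fraction; the last atom takes the rest.
        v = 1.0 if i == n_atoms - 1 else rng.betavariate(1.0, alpha)
        weights.append(remaining * v)
        atoms.append(base_sampler(rng))
        remaining *= 1.0 - v
    return weights, atoms
```

For instance, truncated_stick_breaking(2.0, lambda r: r.gauss(0.0, 1.0), 50) returns 50 weights that sum to one together with 50 atoms drawn from a standard normal base measure.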

Comparison of control vs. test treatments using distance optimality criterion
Kirti R. Shah
University of Waterloo

We consider the problem of comparison of one control treatment tex2html_wrap_inline980 with a set of v test treatments tex2html_wrap_inline984 in various settings including the completely randomised design and one- and two-way classification designs using the distance optimality criterion introduced by Sinha (1970). It turns out that the nature of the optimal designs for this criterion is quite different from that for the usual A-, D- and E-optimality criteria. Here, some complete classes of designs have been identified.

Joint work with Nripesh K. Mandal and Bikas K. Sinha.

Some recent developments in hardware and software reliability
Nozer Singpurwalla
George Washington University

The author will describe some recent work in reliability, both for hardware and for software systems. With respect to the former, emphasis will be on probability models for multi-component systems subject to failure under dynamic and random environments. With respect to the latter, models for software reliability based on a concatenation of failure rates will be emphasized. In both cases, the role of subjective probability is paramount, and this, it appears, is a fruitful direction for future developments in reliability.

The analysis of subject specific agreement
David Sprott
University of Waterloo

Two raters independently assign n subjects to q categories. The subject specific agreement between the two raters is defined to be the extent to which the probability of the assignment of a given subject to a category by one rater depends upon, or is determined by, the category to which the same subject is assigned by the other rater. A measure tex2html_wrap_inline994 of subject specific agreement is proposed, based on conditional probabilities and related to the log odds ratio, for which there is a conditional likelihood function. This last point is of paramount importance for making quantitative statements of scientific inference. The foregoing is compared with the use of the traditional measure of agreement, kappa. The proposed procedures are an example of how the increasing power of computers, characteristic of the late 20th and presumably the 21st century, should affect statistical methods and scientific inference.
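
If the traditional measure of agreement is taken to be Cohen's kappa, which is an assumption here, it can be computed from the q x q table of joint assignments as in the sketch below. The function name and the table layout (rows for one rater, columns for the other) are choices made for this example; the subject specific measure proposed in the talk is not reproduced.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square table of counts.

    table[i][j] is the number of subjects placed in category i by the
    first rater and in category j by the second rater.
    """
    q = len(table)
    total = sum(sum(row) for row in table)
    # Observed proportion of subjects on which the raters agree.
    observed = sum(table[i][i] for i in range(q)) / total
    row_margins = [sum(table[i]) / total for i in range(q)]
    col_margins = [sum(table[i][j] for i in range(q)) / total for j in range(q)]
    # Agreement expected if the two raters assigned categories independently.
    expected = sum(r * c for r, c in zip(row_margins, col_margins))
    return (observed - expected) / (1.0 - expected)
```

Kappa is a marginal summary of the whole table, which is the sense in which it differs from a measure defined through conditional probabilities for a given subject.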

Joint work with V. T. Farewell.

Some classes of fundamental problems in statistical experimental design, multivariate analysis, and sampling theory
J. N. Srivastava
Colorado State University

In this paper we examine the information contained in an experiment, in the context of sensitivity, variance-optimality, and revealing power. Many classes of problems in model identification and estimation are detailed. Problems in Multivariate Analysis are considered, including "anti-meta analysis", and building expert systems (for human health) when the number of responses is potentially infinite. Sampling Theory problems are outlined in connection with the author's class of estimators which, because of its wide applicability and accuracy, is pre-eminent in the field.

Multivariate regression with singular covariance matrix
M. S. Srivastava
University of Toronto

Classical multivariate analysis assumes that the covariance matrix is nonsingular. In many problems of practical interest, such as in medical trials, there are a large number of related response variables but fewer observations. This leads to singularity in the covariance matrix. In this paper, we propose tests and estimates for multivariate regression and growth curve models when the covariance matrix is singular.

Joint work with Dietrich von Rosen.

Degenerate, conditional and increasing order U-statistics
Gabor J. Szekely
Bowling Green State University

Many characterizations of parametric and nonparametric families of distributions lead to statistical tests involving U-statistics. In degenerate, conditional or increasing order cases the asymptotic behavior of U-statistics provides a large variety of interesting unsolved problems. In the talk some new characterizations and the corresponding test statistics will be discussed. The related open questions in many cases have surprisingly attractive forms.

Regression modelling with fixed effects - missing values and other problems
H. Toutenburg
University of Munich

The paper considers four problems in linear regression.

(i) The predictive performance of (possibly biased) restricted and mixed regression estimators with respect to a stochastic target function tex2html_wrap_inline998 , with tex2html_wrap_inline1000 a weight matrix is investigated. Under MDEP-matrix superiority, application comes from imputation for missing values resulting in a biased mixed estimator.

(ii) The basic question of detecting a non-MCAR process is demonstrated using outlier detection methods. In particular, the power of detecting a non-MCAR process is investigated using adaptations of Cook's distance, of DRSS and DXX in a simulation study.

(iii) The Gauss-Markov estimator b is the best unbiased estimator if the vector of disturbances u is multinormally distributed. Hence, if u is not multinormally distributed, there is a potential of nonlinear unbiased estimators that improve upon b.

(iv) Consider the general linear regression model. There exist a number of conditions under which OLSE and GLSE coincide. An open question is the following: What is the explicit form of a linear unbiased estimator tex2html_wrap_inline1010 for tex2html_wrap_inline1012 whose efficiency lies between that of OLSE and GLSE in the sense that tex2html_wrap_inline1014 , where tex2html_wrap_inline1016 denotes the Loewner ordering of n.n.d. matrices?

Joint work with A. Fieger, C. Heumann, V. K. Srivastava, and G. Trenkler.

Sketches on probabilistic dilation equation
Jacek Wesolowski
Warsaw University of Technology

The basic dilation equation (dile) has the form

displaymath1018

where tex2html_wrap_inline1020 's are some real (complex) constants, and uniqueness of a non-zero solution is ensured by the condition tex2html_wrap_inline1022 . Then tex2html_wrap_inline1024 . In general, the solution f need not be a proper function but may be a Schwartz distribution. The considerable interest in diles in recent years stems from the fact that they play a crucial role in wavelet constructions - see, for instance, Strang (1989), or Heil and Colella (1994). They are examples of two-scale difference equations studied thoroughly in Daubechies and Lagarias (1991, 1992).

A special type of such equations was studied in a probabilistic context of infinite Bernoulli convolutions - see for instance Kershner and Wintner (1935) or Erdös (1940). Here we study a probabilistic dilation equation (prodile) which arises as a natural version of the standard dile, if written not for densities (they do not need to exist) but for the probability measures themselves:

displaymath1028

where tex2html_wrap_inline1030 is a probability distribution of a random variable X and tex2html_wrap_inline1034 are positive constants and tex2html_wrap_inline1036 . Here r is a given positive constant which equals 2 in the basic case. In the case of symmetric Bernoulli convolutions the parameters of the respective prodile are tex2html_wrap_inline1040 and the resulting distribution is singular if only tex2html_wrap_inline1042 (Erdös, Wintner).

In the talk, other examples of prodiles will also be presented, leading to well known distributions (e.g. uniform) or not so widely used ones (for instance the de Rham density). Some basic properties of prodile distributions (solutions of prodiles) will be revealed. Three main methods of getting approximate solutions of diles (cascade algorithm, Fourier technique, dyadic interpolation) will be complemented with a Monte Carlo approach leading to approximate distribution functions or densities of the respective probability measures. We believe that the family of prodile distributions can be an important and interesting object of investigation. One of the reasons is that it contains some families of singular distributions which, though they appear in a natural way in many problems, are, in general, difficult to analyze. Here the prodile representation could be very useful. The representation is connected with some mixture property of distributions under shifting and scaling. This may also be a source of new characteristic properties of some well known distributions. A possible application to hypothesis testing for a class of singular distributions will also be presented.
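
A Monte Carlo approximation of this kind can be sketched for the symmetric Bernoulli convolution, taken here in its usual form as the distribution of the sum over k >= 0 of (random sign) * lam^k. The parameterization by lam, the truncation after n_terms terms and all function names are assumptions for this example and are not taken from the talk.

```python
import random
from bisect import bisect_right


def sample_bernoulli_convolution(lam, n_terms, rng):
    """One approximate draw: a truncated sum of independent random signs
    multiplied by powers of lam (0 < lam < 1)."""
    return sum(rng.choice((-1.0, 1.0)) * lam ** k for k in range(n_terms))


def empirical_cdf(samples):
    """Return a function x -> proportion of samples that are <= x."""
    ordered = sorted(samples)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def monte_carlo_cdf(lam, n_terms=40, n_samples=5000, seed=0):
    """Approximate distribution function built from simulated draws."""
    rng = random.Random(seed)
    draws = [sample_bernoulli_convolution(lam, n_terms, rng)
             for _ in range(n_samples)]
    return empirical_cdf(draws)
```

With lam = 0.5 the limiting distribution is uniform on [-2, 2], so the returned function should be close to (x + 2) / 4 there, which gives a simple check of the simulation before it is applied to singular cases.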

References

- Daubechies, I., Lagarias, J. (1991), Two-scale difference equations: I. Existence and global regularity of solutions. SIAM J. Math. Anal. 22, 1388-1410.
- Daubechies, I., Lagarias, J. (1992), Two-scale difference equations: II. Local regularity, infinite products and fractals. SIAM J. Math. Anal. 23, 1031-1079.
- Erdös, P. (1940), On the smoothness properties of a family of symmetric Bernoulli convolutions. Amer. J. Math. 62, 180-186.
- Heil, C., Colella, D. (1994), Dilation equations and the smoothness of compactly supported wavelets. In: Wavelets. Mathematics and Applications (J.J. Benedetto, M.W. Frazier, eds), CRC Press, Boca Raton.
- Kershner, R., Wintner, A. (1935), On symmetric Bernoulli convolutions. Amer. J. Math. 57, 541-548.
- Strang, G. (1989), Wavelets and dilation equations: a brief introduction. SIAM Rev. 31, 614-627.

Censoring and random truncation: a survey of some recent developments in nonparametric inference for incomplete data
Grace Yang
University of Maryland

Product-limit estimates tex2html_wrap_inline1044 of a distribution function F play an essential role in the statistical analysis of censored or randomly truncated data. They are used either directly in statistical inference about F or indirectly, such as in regression analysis or other types of problems.
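
For readers unfamiliar with the estimator, here is a minimal sketch of the product-limit (Kaplan-Meier) estimate of F for right-censored data only; random truncation is not handled. The function name and the (times, events) input format are assumptions for this example.

```python
def product_limit_cdf(times, events):
    """Product-limit estimate of the distribution function F.

    times  : observed times (failure or censoring time, whichever is first).
    events : matching flags, True for an observed failure, False if censored.
    Returns a list of (t, F_hat(t)) pairs, one per distinct failure time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)   # subjects still under observation just before t
    survival = 1.0        # running product estimate of 1 - F
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        failures = 0
        leaving = 0
        # Group all observations tied at time t.
        while i < len(data) and data[i][0] == t:
            failures += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if failures:
            survival *= 1.0 - failures / at_risk
            curve.append((t, 1.0 - survival))
        at_risk -= leaving
    return curve
```

When the largest observation is censored, the estimate never reaches 1, which is one simple way to see why the behavior of the estimator in the tail needs separate attention.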

Prior to the publication of Gill (1983) on the weak convergence of tex2html_wrap_inline1050 for right censored data over the entire support of F, properties of tex2html_wrap_inline1054 were studied and known almost exclusively on restricted intervals [0, b] for which tex2html_wrap_inline1058 . These may be sufficient for computing survival probabilities in biostatistics. In the case of regression analysis, one typically imposes complicated assumptions to circumvent the "tail" problem. Not knowing the tail behavior of tex2html_wrap_inline1060 has clearly limited its application to other important statistical functions such as sample moments and the mean residual life, whose properties depend on F over the entire support.

As an introduction, the product-limit estimate for censored and randomly truncated data will be constructed by using a model identification approach. Some recent developments on the asymptotic properties of tex2html_wrap_inline1064 , with special attention to the results on the tail behavior of tex2html_wrap_inline1066 , will be highlighted. A randomization technique that unifies the treatment of a discrete F in a broad category of problems will be discussed.





Craig L. Zirbel
Mon Apr 20 17:11:04 EDT 1998