Principal Points for
Mixed Effects Models: Applications to Identifying
Placebo Responders
(with Eva Petkova), submitted for publication
Principal points are cluster means for theoretical distributions. For
longitudinal data where each observation corresponds to a curve,
principal points can be used to
determine a set of representative curve profiles that optimally
represent the distribution. This paper presents a method of determining
maximum likelihood estimators of principal points for linear mixed
models. The method can incorporate covariates as well. The
results are applied to an anti-depressant study to identify
prototypical drug and placebo response profiles.
Model
Misspecification: Finite Mixture
or Homogeneous?
(with Dong Yun and Eva Petkova, Statistical
Modelling, to appear in volume 8, 2008)
A common problem in statistical modelling is to distinguish between a
finite mixture distribution and a homogeneous non-mixture
distribution. Finite mixture models
are widely used in practice, and mixtures of normal densities are often
indistinguishable from homogeneous non-normal densities. This
paper illustrates what happens when
the EM algorithm for normal mixtures is applied to a distribution that
is a homogeneous non-mixture distribution. In particular, a population-based EM algorithm for
finite mixtures
is introduced and applied directly to density functions instead of
sample data. The population-based EM algorithm is used to find
finite mixture approximations to common homogeneous distributions.
An example regarding the nature of a placebo response in drug treated
depressed subjects is used to illustrate ideas.
R code is available that fits a 2-component univariate normal mixture
to a given density function using the population-based EM algorithm.
The R software is freely available at CRAN.
In the program, the user specifies the density function g(y). The
default density is a gamma density.
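The idea of running EM on a density rather than on sample data can be sketched as follows. This is a minimal illustration and not the R program referred to above: it assumes that the population version simply replaces the sample averages of the usual EM updates with integrals against g(y), approximated here by a midpoint rule on a truncated grid. The function names, the gamma shape and scale, the grid limits, and the starting values are all assumptions chosen for illustration.

```python
import math

def gamma_pdf(y, shape=2.0, scale=1.0):
    """Gamma density g(y); the shape and scale values are illustrative."""
    if y <= 0.0:
        return 0.0
    return y ** (shape - 1.0) * math.exp(-y / scale) / (math.gamma(shape) * scale ** shape)

def normal_pdf(y, mu, sigma):
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def population_em(g, lo, hi, n_grid=2000, n_iter=200,
                  init=(0.5, 1.0, 1.0, 3.0, 1.0)):
    """Fit p*N(mu1, s1^2) + (1-p)*N(mu2, s2^2) to the density g on [lo, hi]."""
    h = (hi - lo) / n_grid
    ys = [lo + (i + 0.5) * h for i in range(n_grid)]   # midpoint rule nodes
    w = [g(y) * h for y in ys]                         # probability mass per cell
    total = sum(w)
    w = [wi / total for wi in w]                       # renormalize after truncation
    p, mu1, s1, mu2, s2 = init
    for _ in range(n_iter):
        # E-step: posterior probability of component 1 at each grid point
        tau = []
        for y in ys:
            a = p * normal_pdf(y, mu1, s1)
            b = (1.0 - p) * normal_pdf(y, mu2, s2)
            tau.append(a / (a + b) if (a + b) > 0.0 else 0.5)
        # M-step: weighted moments, with integrals against g in place of sample means
        p = sum(t * wi for t, wi in zip(tau, w))
        mu1 = sum(t * wi * y for t, wi, y in zip(tau, w, ys)) / p
        mu2 = sum((1.0 - t) * wi * y for t, wi, y in zip(tau, w, ys)) / (1.0 - p)
        s1 = math.sqrt(sum(t * wi * (y - mu1) ** 2 for t, wi, y in zip(tau, w, ys)) / p)
        s2 = math.sqrt(sum((1.0 - t) * wi * (y - mu2) ** 2
                           for t, wi, y in zip(tau, w, ys)) / (1.0 - p))
    return p, mu1, s1, mu2, s2

p, mu1, s1, mu2, s2 = population_em(gamma_pdf, 0.0, 20.0)
print("mixing proportion:", p)
print("component 1 (mean, sd):", mu1, s1)
print("component 2 (mean, sd):", mu2, s2)
```

Because the M-step uses weighted first and second moments, the fitted mixture reproduces the mean and variance of g(y) on the grid, which gives a simple sanity check on the output.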
Linear
Transformations and the k-Means
Clustering Algorithm: Applications to Clustering Curves
(2007, The American Statistician, 61, 34-40)
Functional data can be clustered by plugging estimated regression
coefficients from individual curves into the k-means algorithm. Clustering
results can differ
depending on how the curves are fit to the data. Estimating
curves using different sets of basis functions corresponds to different
linear transformations of the data.
k-means clustering is not invariant to linear transformations of
the data. The optimal linear transformation for clustering will
stretch the distribution
so that the primary direction of variability aligns with actual
differences in the clusters. It is shown that clustering the raw
data will often give results
similar to clustering regression coefficients obtained using an
orthogonal design matrix. Clustering functional data using an L^2
metric on function
space can be achieved by clustering a suitable linear transformation of
the regression coefficients. An example where depressed
individuals are treated
with an antidepressant is used for illustration.
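The lack of invariance to linear transformations can be seen in a toy example. The sketch below is not taken from the paper: it uses four made-up points and finds the optimal 2-means partition by exhaustive search (rather than by the iterative k-means algorithm) so that the result does not depend on starting values. The function names are hypothetical.

```python
from itertools import product

def within_ss(points, labels):
    """Total within-cluster sum of squares for a labelling into clusters 0 and 1."""
    total = 0.0
    for c in (0, 1):
        members = [pt for pt, lab in zip(points, labels) if lab == c]
        cx = sum(pt[0] for pt in members) / len(members)
        cy = sum(pt[1] for pt in members) / len(members)
        total += sum((pt[0] - cx) ** 2 + (pt[1] - cy) ** 2 for pt in members)
    return total

def best_two_partition(points):
    """Exhaustive search for the 2-cluster partition minimizing the k-means criterion."""
    best_labels, best_value = None, float("inf")
    for rest in product((0, 1), repeat=len(points) - 1):
        labels = (0,) + rest          # fix the first label to avoid mirrored duplicates
        if sum(labels) == 0:          # skip the partition with an empty cluster
            continue
        value = within_ss(points, labels)
        if value < best_value:
            best_labels, best_value = labels, value
    return best_labels

raw = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]
# Linear transformation: shrink the first coordinate by a factor of 10.
shrunk = [(0.1 * x, y) for x, y in raw]

print(best_two_partition(raw))      # groups the points by their first coordinate
print(best_two_partition(shrunk))   # groups the same points by their second coordinate
```

The same four observations are split left versus right before the transformation and top versus bottom after it, so the choice of basis (equivalently, of linear transformation) determines which direction of variability drives the clusters.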
Latent
Regression Analysis
(with Eva Petkova, submitted for publication)
An important question in clinical studies is whether there exist distinct
disease classes (e.g. cancer versus no cancer) or whether instead there
exists a continuous latent variable representing disease severity (e.g.
level of depression). To answer this question, a latent
regression model is proposed which represents a generalization of a
finite mixture model. The finite mixture model is cast as a
regression with a latent Bernoulli predictor. The latent
regression model generalizes the finite mixture model by allowing the
latent predictor to have a continuous distribution on the interval

(0,1). An EM algorithm is used to estimate parameters of the latent
regression model. Examples and simulations are given to
illustrate the latent regression model. In particular, the latent
regression model is illustrated in a depression treatment study to
determine if there exist two distinct classes of subjects (those who
experience a placebo effect and those who do not), or if instead
everyone experiences a placebo effect over a continuous range.
A
Parametric k-Means
Algorithm
(2007,
Computational Statistics, 22, 71-89)
The k points that optimally
represent a distribution (usually in terms of a squared error loss) are
called the k principal
points. This paper presents a computationally intensive method
that automatically determines the principal points of a parametric
distribution. Cluster means from the k-means algorithm are
nonparametric estimators of principal points. A parametric k-means approach is introduced
for estimating principal points by running the k-means algorithm on a very large
simulated data set from a distribution whose parameters are estimated
using maximum likelihood.
Theoretical and simulation results are presented comparing the
parametric k-means algorithm to the usual k-means algorithm. An example
on determining sizes of gas masks is used to illustrate the parametric
k-means algorithm.
R-code for the Parametric k-means algorithm:
- parametrickmeans.r
R-function for estimating k principal points of a multivariate normal
distribution using the parametric k-means algorithm.
- SCtest.r
R-function that performs the parametric bootstrap test for
self-consistency using the parametric k-means
algorithm.
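The parametric k-means idea described in the abstract above can be sketched for the simplest case, a univariate normal distribution with k = 2. This is an illustrative sketch and not the contents of parametrickmeans.r: the data values, the function names, the simulation size, and the starting centers are all made up. For a normal distribution the two principal points are known in closed form, mu - sigma*sqrt(2/pi) and mu + sigma*sqrt(2/pi), which gives something to compare against.

```python
import math
import random

def normal_mle(data):
    """Maximum likelihood estimates of the mean and standard deviation (divisor n)."""
    n = len(data)
    mu = sum(data) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in data) / n)
    return mu, sigma

def kmeans_1d(xs, k, n_iter=50):
    """Plain k-means (Lloyd's algorithm) on one-dimensional data."""
    xs = sorted(xs)
    n = len(xs)
    centers = [xs[int((j + 0.5) * n / k)] for j in range(k)]  # start at sample quantiles
    for _ in range(n_iter):
        sums, counts = [0.0] * k, [0] * k
        for x in xs:
            j = min(range(k), key=lambda c: (x - centers[c]) ** 2)  # nearest center
            sums[j] += x
            counts[j] += 1
        new = [sums[j] / counts[j] if counts[j] else centers[j] for j in range(k)]
        if max(abs(a - b) for a, b in zip(new, centers)) < 1e-9:
            break
        centers = new
    return sorted(centers)

def parametric_kmeans_normal(data, k=2, n_sim=20000, seed=0):
    """Estimate k principal points: fit by ML, simulate a large sample, run k-means."""
    mu_hat, sigma_hat = normal_mle(data)
    rng = random.Random(seed)
    simulated = [rng.gauss(mu_hat, sigma_hat) for _ in range(n_sim)]
    return kmeans_1d(simulated, k)

# Made-up observations, for illustration only.
data = [4.1, 5.3, 6.2, 4.8, 5.9, 5.0, 6.6, 3.9, 5.5, 4.7]
mu_hat, sigma_hat = normal_mle(data)
points = parametric_kmeans_normal(data)
exact = [mu_hat - sigma_hat * math.sqrt(2.0 / math.pi),
         mu_hat + sigma_hat * math.sqrt(2.0 / math.pi)]
print("parametric k-means estimate:", points)
print("closed form at the ML estimates:", exact)
```

Running ordinary k-means directly on the ten observations would give the nonparametric estimator; the parametric version instead clusters the large simulated sample, so its output should be close to the closed-form values evaluated at the maximum likelihood estimates.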
On
Estimation in
Compartment Modeling with an Input Function
(2006, with Ogden, Biostatistics,
7, 115-129)
Abstract: In some nonlinear regression situations, one
or more of the parameters in the expression for the regression function
is estimated from a separate data source. In such a case, the
typical
estimation procedure is to estimate the appropriate parameters from the
separate data, then plug these estimated values into the expression for
the regression function for the estimation of the rest of the
parameters.
This situation arises frequently in compartment modeling when there is
an external "input function" to the system. This paper
addresses
the general question of the estimation of parameters and their standard
errors in nonlinear regression when some parameters are estimated
separately.
One important application of this method is for estimation of rate
parameters
and their standard errors in a compartmental system when parameters
from
an input function are estimated from separate data. An example
and
a simulation study are provided to illustrate the results and to study
the performance of the proposed methodologies.
Allometric
Extension
for
Multivariate Regression Models
(2006, with Ivey, Journal
of Data Science, 4,
479-495)
Abstract: In multivariate regression, interest lies in
how the response vector depends on the covariates. A multivariate
regression
model is proposed where covariates explain variation in the response
only
in the direction of the first principal component axis. This model
allows a clear interpretation in situations where the first principal
component
has a meaningful interpretation. In particular, in allometric studies
where
the first principal component is considered a size variable, the model
stipulates that the covariates affect only the size and not the shape
of an organism. We show that the model naturally generalizes the
two-group
allometric extension model to the framework of multivariate regression
where groups differ conditionally on a set of covariates. An example
which
motivated this model is illustrated.
Linear Conditional Expectation for Discretized
Distributions
(2004, with Sanders, Journal of
Applied Statistics, 31,
361-372)
Abstract: Many statistical methods for continuous
distributions
assume a linear conditional expectation. Components of multivariate
distributions
are often measured on a discrete ordinal scale based on a
discretization
of an underlying continuous
latent variable. The results in this paper show that common examples
of discretized bivariate and trivariate distributions will have a
linear
conditional expectation. Examples and simulations are provided to
illustrate
the results.
Clustering Functional Data
(2003, with Kimberly K. J. Kinateder, The
Journal of Classification, 20, 93-114)
Abstract: The problem of clustering functional data is
addressed. Results on principal points (cluster means for probability
distributions)
are given for functional Gaussian distributions. Examples and
simulations
are provided to illustrate results.
Estimating the Average Slope
(2003, Journal
of Applied Statistics, 30, 389-396)
Abstract: In the regression setup Y = f(X) + e, an example
is illustrated where the average slope E[f'(X)] is to be
estimated.
A simple solution is to use the slope obtained from fitting a least
squares line as an estimator of
E[f'(X)]. If f(X) is not linear, then the simple linear regression
model is the wrong model. However, in certain circumstances, the
slope from the simple linear regression model is a correct estimator
for
the average slope of the response. This paper investigates when
the
slope of a least squares line is a suitable estimator of the average
slope
of the response.
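One well-known circumstance of this kind follows from Stein's lemma: if X is normally distributed, then Cov(X, f(X)) / Var(X) = E[f'(X)], so the population least squares slope equals the average slope even when f is nonlinear. Whether this is the exact condition studied in the paper is not stated in the abstract; the simulation below is only an illustrative sketch, with f(x) = sin(x), a standard normal X, and a noise level all chosen for convenience.

```python
import math
import random

rng = random.Random(1)
n = 100_000

# X ~ N(0, 1), Y = f(X) + e with f(x) = sin(x) and small normal noise (illustrative choices).
xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
ys = [math.sin(x) + rng.gauss(0.0, 0.1) for x in xs]

# Slope of the least squares line of Y on X.
xbar = sum(xs) / n
ybar = sum(ys) / n
ols_slope = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
             / sum((x - xbar) ** 2 for x in xs))

# Monte Carlo estimate of the average slope E[f'(X)] = E[cos(X)].
average_slope = sum(math.cos(x) for x in xs) / n

# For a standard normal X, E[cos(X)] = exp(-1/2).
print("least squares slope:", ols_slope)
print("average slope:", average_slope)
print("exp(-1/2):", math.exp(-0.5))
```

The fitted line is clearly the wrong model for sin(x), yet its slope and the average slope estimate both settle near exp(-1/2).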
Identifying Placebo Responders Among Drug
Treated Subjects
(2003, with Petkova and Ogden, Journal
of the American Statistical Association, 98, 850-858)
Abstract: Identification of placebo responders among
subjects
treated with active drug has significant clinical and research
implications.
In clinical practice when a patient treated with medication improves,
this
improvement may be attributed to the chemical component of the drug
itself,
a "placebo effect", or some combination of these. Determining the
proper
subsequent treatment and maintenance of the patient may be aided
greatly
by understanding the mechanism of a patient's improvement.
In a research context, classification of patient response has bearing
on the way efficacy and effectiveness clinical trials are designed and
conducted. This paper presents a framework for studying placebo
response
in diverse areas of medicine.
In order to identify placebo responders among drug treated patients,
a profile of the clinical status over time (outcome profile) is
estimated
for each subject. Self-consistent partitioning techniques are
used
to group subjects based on the amount of curvature in the profile
as well as the overall trend in the profile. The resulting
partitions
determine representative profiles for subjects in the drug group which
can subsequently be used to classify patients. The proposed
method
is applied to data from a clinical trial for treatment of depression
involving
placebo and the active drug phenelzine. Data from the placebo arm
of the study is used to help validate the procedure since the
drug-treated
and placebo-treated subjects should share common profiles.
Last updated March 20, 2008