Thursday, January 29, 2009

MDS and social psychology

Searching JPSP by scholar. The 12 results found are categorized as follows:

A. Structure of Emotion

1. Russell (1980) A circumplex model of affect: 28 emotion-denoting adjectives are reduced to a 2D space: pleasure-displeasure and arousal-sleepiness.
  • In the same year, Russell and Pratt (1980) also talked about the two dimensions on the meaning that persons attribute to environments.
  • Russell and Bullock (1985) followed up on Russell (1980) to show that the two dimensions reveal a basic property of the human conception of emotions, rather than represent an artifact that is due to semantic relations learned along with the emotion lexicon.
  • Russell, Weiss, and Mendelsohn (1989) followed up to develop a single-item scale, the Affect Grid, to quickly assess affect along the dimensions of pleasure-displeasure and arousal-sleepiness.
  • Feldman (1995) interpreted the 2D as valence-focus and arousal-focus and suggested their relation to Positive Affect and Negative Affect.
  • Barrett (2004) followed up on Feldman (1995) to talk about how valence-focus and arousal-focus are related to cognitive structure of emotion language vs. phenomenological experience.
  • Extending Russell's model, Larsen, McGraw, and Cacioppo (2001) argued that people can feel happy and sad at the same time; they do not have to experience positive-negative emotions in a bipolar way.

B. Structure of Self-Other Relationship

2. Falbo (1977) Multidimensional scaling of power strategies: 16 strategies of "How I Get My Way." reduced to a 2D space: (a) rational/nonrational and (b) direct/indirect.

3. Bartholomew and Horowitz (1991) examined a model of individual differences in adult attachment in which two underlying dimensions, the person's internal model of the self (positive or negative) and the person's internal model of others (positive or negative), were used to define four attachment patterns. (as seen in General Discussion)

4. Wiggins, Phillips, and Trapnell (1989) interpersonal circumplex: dominant/submissive and agreeable/cold-hearted.
  • Gurtman (1992) applied this to plot individuals' profiles of high/low trust and high/low Machiavellianism.
5. Walker and Hennig (2004) studied the underlying 2D for the three exemplars of morality: just, brave, and caring, and found different 2D for each of them.

6. Abele and Wojciszke (2007) found that a large number of trait names can be organized into the 2D space of agency and communion.

7. Grouzet et al. (2005) found that 11 types of goals can be organized into a 2D space of intrinsic (e.g., self-acceptance, affiliation) versus extrinsic (e.g., financial success, image), and self-transcendent (e.g., spirituality) versus physical (e.g., hedonism). This result has cross-cultural validity.

Wednesday, January 28, 2009

(Incomplete) list of MDS researchers

  1. Warren S. Torgerson:

    • former professor at Johns Hopkins
    • developed MDS while he was a PhD student
    • known for classical scaling (a.k.a. Torgerson scaling) in MDS
    • The solution from Torgerson scaling can be used as an initial configuration; however, it is only a rational starting configuration, and the solution iterated from it is still prone to local minima

  2. Louis E. Guttman:

    • former president of the Psychometric Society
    • developed the Guttman loss function (available in SYSTAT)

  3. Roger N. Shepard:

    • former president of the Psychometric Society
    • professor of cognitive psychology at Stanford University (Emeritus)
    • known for Shepard diagram

  4. Joseph B. Kruskal:

    • former president of the Psychometric Society
    • former president of the Classification Society of North America
    • developed stress formula 1 and formula 2
    • developed the program of KYST (Kruskal, Young, & Seery, 1973)

  5. Forrest W. Young:

    • former president of the Psychometric Society
    • professor of quantitative psychology at the University of North Carolina at Chapel Hill (Emeritus)
    • developer of ALSCAL (alternating least squares scaling) (available in SPSS)

  6. J. Douglas Carroll:

    • former president of the Psychometric Society
    • professor of management and psychology at Rutgers University
    • developer of INDSCAL (individual differences scaling)

  7. Jan de Leeuw:

    • former president of the Psychometric Society
    • developer of smacof package in R

  8. Lawrence J. Hubert:

    • former president of the Psychometric Society
    • developer of combinatorial data analysis methods
    • developer of dynamic programming approaches to data analysis
    • developer of city-block MDS

  9. Ingwer Borg and Patrick J. F. Groenen:


Software

  1. R:
    • Package: stats

    • Package: proxy
      • dist(): distance matrix

    • Package: MASS (Modern Applied Statistics with S)
      • isoMDS(): Kruskal's non-metric MDS (an example can be found here)
      • Shepard(): for drawing Shepard diagram
      • sammon(): Sammon's non-metric MDS (similar to Kruskal's non-metric MDS but independently developed)

    • Package: smacof (Scaling by MAjorizing a COmplicated Function; a paper is here)
      • smacofSym(): for symmetric dissimilarity matrices
      • smacofRect(): for rectangular input matrices, i.e., unfolding
      • smacofIndDiff(): individual difference MDS
      • smacofSphere.primal(): projection of the resulting configurations onto spheres
      • smacofSphere.dual(): indirect function to solve linear problems, sometimes faster than primal
      • sim2diss(): convert similarity matrix to dissimilarity matrix

    • Package: labdsv (Laboratory for Dynamic Synthetic Vegephenomenology)
      • nmds(): application of isoMDS()

    • Package: vegan (R functions for vegetation ecologists)
      • metaMDS(): an integration of initMDS(), isoMDS(), postMDS(), and wascores()
      • procrustes(): for the Procrustes Problem
      • wcmdscale(): weighted classical (metric) multidimensional scaling

    • Package: rggobi
      • ggobi(): interactive multidimensional scaling using ggobi and ggvis for display

    • Useful Links

  2. SYSTAT:
    • use EM to estimate missing data in nonmetric unfolding model
    • power transformation (metric MDS)
    • log transformation (metric MDS)

  3. PERMAP: a highly entertaining, interactive tool to explore perceptual mapping

  4. SPSS: proxscal, prefscal, alscal

  5. MATLAB: mdscale()
A more complete list of MDS software can be found here.

Tuesday, January 27, 2009

Internal and external analyses

To facilitate the interpretation of the dimensions in the reduced space, we may do internal or external analyses.

In internal analysis, we use the same proximity data, run an alternative analysis method (e.g., cluster analysis) on them, and embed the results within the MDS configuration. If different methods all converge on the same interpretation, that interpretation is well supported!

In external analysis ("property fitting"), we use supplementary data. Specifically, we may try to predict the property (collected on the objects) for object_i from the 2D coordinates for the objects through multiple regression.

For example, in a study, the objects are 14 stressful experiences relevant to early parenting, and the two dimensions are labeled as "major vs. minor child problems" and "child welfare vs. self-welfare". The external property is "infuriating", and we want to predict "infuriating" for each of the 14 objects from the 2D coordinates for the 14 objects, which results in a directed line. It is found that infuriating tends to be associated with the problems of self-welfare as opposed to the welfare of the child.

In external analysis, we regress a given external attribute of the objects (e.g., "infuriating") on the 2D coordinates of the objects (i.e., dim 1 and dim 2), and the resulting unstandardized multiple regression coefficients form a point in the 2D space. A directed line is then drawn from the origin to that point. The projections of the objects onto this line give one value per object, a weighted combination of dim 1 and dim 2, and these values correspond best to the external attribute (Borg & Groenen, 2005, pp. 77-79).
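To make the property-fitting step concrete, here is a minimal sketch in plain Python. The coordinates and the "infuriating" ratings below are made up for illustration (they are not the data from the parenting study), and `fit_property` is a hypothetical helper name, not a function from any MDS package. It regresses the attribute on dim 1 and dim 2 and then projects the objects onto the fitted direction.

```python
import math

def fit_property(coords, attribute):
    """Regress an external attribute on 2D MDS coordinates by least squares.

    Returns (b1, b2, r): the unstandardized slopes for dim 1 and dim 2, and the
    multiple correlation between the attribute and its fitted values.
    """
    n = len(coords)
    mean_x = sum(x for x, _ in coords) / n
    mean_y = sum(y for _, y in coords) / n
    mean_a = sum(attribute) / n
    # Centering absorbs the intercept, leaving a 2x2 system of normal equations.
    xs = [x - mean_x for x, _ in coords]
    ys = [y - mean_y for _, y in coords]
    zs = [a - mean_a for a in attribute]
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxz = sum(x * z for x, z in zip(xs, zs))
    syz = sum(y * z for y, z in zip(ys, zs))
    det = sxx * syy - sxy * sxy  # zero only if the objects lie on one line
    b1 = (sxz * syy - syz * sxy) / det
    b2 = (syz * sxx - sxz * sxy) / det
    fitted = [b1 * x + b2 * y for x, y in zip(xs, ys)]
    r = math.sqrt(sum(f * f for f in fitted) / sum(z * z for z in zs))
    return b1, b2, r

# Hypothetical 2D coordinates of five objects and a made-up attribute rating.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, -1.0)]
infuriating = [3.0, 5.0, 2.0, 4.0, 8.0]

b1, b2, r = fit_property(coords, infuriating)

# The directed line runs from the origin to the point (b1, b2).
# Projecting each object onto that direction gives one value per object.
direction_norm = math.hypot(b1, b2)
projections = [(b1 * x + b2 * y) / direction_norm for x, y in coords]
```

The multiple correlation `r` gives a rough sense of how well the directed line represents the attribute; a low value would suggest the attribute is not well explained by the two dimensions.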

Monday, January 26, 2009

The scaling: Basic concepts

The goal of scaling is to minimize the discrepancy between the proximities in the original space and the distances in the reduced space. Specifically,

p_ij is the proximity (typically, dissimilarity) between object_i and object_j in the original space, whereas d_ij is the Euclidean distance between object_i and object_j in the reduced space.

We use a linear regression equation to predict d_ij from p_ij, and dhat_ij is the predicted value of d_ij. Then, we want to minimize the difference between d_ij and dhat_ij, using least squares. Here, we have the raw stress index (which we want to minimize):

$$\text{raw stress} = \sum_{i<j} \left( d_{ij} - \hat{d}_{ij} \right)^2$$

Because the dimensions in the reduced space can be arbitrarily stretched or contracted, we normalize the raw stress index by the sum of squared distances:

$$\frac{\sum_{i<j} \left( d_{ij} - \hat{d}_{ij} \right)^2}{\sum_{i<j} d_{ij}^2}$$

Also, a square root places the index in the same unit as d_ij, so we have the normalized stress index (which we want to minimize):

$$\text{Stress-1} = \sqrt{\frac{\sum_{i<j} \left( d_{ij} - \hat{d}_{ij} \right)^2}{\sum_{i<j} d_{ij}^2}}$$

(Note: this is Kruskal's stress formula 1.)


Typically, a monotone regression (a.k.a. isotonic regression) is used instead of a linear regression, so that only the rank order of the proximities has to be preserved by the distances; this is non-metric MDS. If a linear regression is used, it is metric MDS.
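The difference between the two regressions can be seen in a small sketch. The code below is plain Python written for illustration only (the function names are mine, and the configuration is made up); it computes Kruskal's stress formula 1 for a fixed configuration, once with dhat_ij from a linear regression and once with dhat_ij from a monotone regression (pool-adjacent-violators). The proximities are deliberately set to the squared distances, a monotone but nonlinear transformation.

```python
import math

def pairwise_distances(config):
    """Euclidean distances d_ij for all pairs i < j, in a fixed pair order."""
    n = len(config)
    return [math.dist(config[i], config[j])
            for i in range(n) for j in range(i + 1, n)]

def linear_fit(p, d):
    """dhat from the least-squares line d = a + b * p (metric MDS)."""
    n = len(p)
    mp, md = sum(p) / n, sum(d) / n
    b = (sum((x - mp) * (y - md) for x, y in zip(p, d))
         / sum((x - mp) ** 2 for x in p))
    a = md - b * mp
    return [a + b * x for x in p]

def monotone_fit(p, d):
    """dhat from monotone regression via pool-adjacent-violators (non-metric MDS).

    Assumption: tied proximities are simply taken in sort order.
    """
    order = sorted(range(len(p)), key=lambda k: p[k])
    blocks = []  # each block holds [sum of d, count]
    for k in order:
        blocks.append([d[k], 1])
        # Merge backwards while block means violate the non-decreasing order.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    dhat = [0.0] * len(p)
    pos = 0
    for s, c in blocks:
        for k in order[pos:pos + c]:
            dhat[k] = s / c
        pos += c
    return dhat

def stress1(d, dhat):
    """Kruskal's stress formula 1."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(d, dhat))
                     / sum(x * x for x in d))

# Made-up 2D configuration of four objects.
config = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]
d = pairwise_distances(config)
p = [x * x for x in d]  # proximities: a monotone, nonlinear function of d

stress_linear = stress1(d, linear_fit(p, d))
stress_monotone = stress1(d, monotone_fit(p, d))
```

Here the monotone regression fits the distances exactly because the rank order is preserved, while the linear regression leaves some stress. A real MDS program would also move the points to reduce stress; this sketch only evaluates the index for one fixed configuration.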

According to Kruskal and Wish (1978), with non-metric MDS, at least 9 objects are required for a 2D solution, while at least 13 objects are required for a 3D solution.



Degenerate solution:

According to the Merriam-Webster dictionary, degenerate means "being mathematically simpler (as by having a factor or constant equal to zero) than the typical case".

In MDS, a degenerate solution is one with a zero (or very close to zero) stress value but retaining no (or minimal) structural information about the data. For example, the objects cluster into a few (e.g., 2) nodes and the dimensions are uninterpretable.
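A toy illustration of how this can happen in non-metric MDS, using made-up dissimilarities for six objects in two groups (all within-group values are smaller than all between-group values). If every object collapses onto its group's point, the distances take only two values, yet they never contradict the rank order of the proximities, so a monotone regression fits them exactly. The helper names below are hypothetical.

```python
import math
from itertools import combinations

def distances(config):
    """Euclidean distances for all pairs i < j, in itertools.combinations order."""
    return [math.dist(a, b) for a, b in combinations(config, 2)]

def fits_ranks_exactly(p, d):
    """True if d never decreases as p increases.

    In that case monotone regression reproduces d exactly (dhat = d), so
    stress is zero. Tied proximities are ordered by their distances here,
    i.e. ties in p are allowed to map to unequal distances.
    """
    ordered = [dk for _, dk in sorted(zip(p, d))]
    return all(x <= y for x, y in zip(ordered, ordered[1:]))

# Made-up dissimilarities, pairs in the order (0,1), (0,2), ..., (4,5).
# Objects 0-2 form one group and objects 3-5 the other:
# within-group values are 1 to 3, between-group values are 6 to 9.
p = [1, 2, 7, 8, 9, 3, 6, 8, 7, 9, 7, 6, 2, 1, 3]

# Degenerate configuration: each group collapses onto a single point.
collapsed = [(0.0, 0.0)] * 3 + [(1.0, 0.0)] * 3
d = distances(collapsed)

zero_stress = fits_ranks_exactly(p, d)
distinct_distances = sorted(set(d))
```

The stress is zero, but the configuration has thrown away every difference among the within-group and among the between-group dissimilarities. Seeing only a handful of distinct distances is one informal warning sign.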

Friday, January 23, 2009

Why do we need MDS?

Initially, researchers want to interpret a set of objects in terms of their relationships. However, the proximities (typically, dissimilarities) among them are in a high-dimensional space, which is beyond human comprehension. Being troubled, the researchers think,

Heck! Why don't we try to project the objects into a 2D space and display them on an X-Y plane? As human beings, we are much more familiar with an X-Y plane and such an interpretation will be more exciting!

Thus, dimension reduction and therefore information loss is involved in MDS, and the general purpose of an MDS program is to preserve the proximities between objects in the high-dimensional space as much as possible. An example of MDS in social psychology is that the 11 factors of the Aspiration Index are visually represented in a 2D plane. (And don't you like it more when you are familiar with the way of interpreting the results?!)

Some notes:

1. MDS is a visualization tool. The goal is to reduce the observed complexity in the data matrix to lower dimensions (2 or 3) for humans to visualize.

2. MDS is a descriptive tool, rather than an inferential tool (de Leeuw, 2001). However, a representative sample should be recruited in order to generalize the description to the population.

3. MDS is more flexible than factor analysis: (a) it doesn't require that the underlying data are distributed as multivariate normal, and (b) it can be applied to any kind of distances or similarities, rather than just the computed correlation matrix.

4. MDS is different from cluster analysis. The goal of MDS is not to group/partition objects, but users can still visually cluster objects based on MDS.

5. MDS is related to the self-organizing map (SOM) because they both enable visualizing low-dimensional views of high-dimensional data. However, SOM preserves data neighborhood, whereas MDS does not.

6. Besides dimensional representation (more exploratory), another goal of MDS is configural verification (more confirmatory).

7. The labeling of a dimension in MDS is arbitrary. The only requirement is that the two ends sum to zero at the center. It is similar to, but not the same as, bipolar, because it doesn't say anything about mutual exclusivity of the two ends in reality.

8. The number of dimensions is usually 2 (at most 3). On the one hand, the number should not be just 1; otherwise, all gradient-based methods in one dimension will typically result in local optima. On the other hand, the number should not exceed 3; otherwise, visualization could be very difficult.

9. Another example of MDS would be to visualize the travel-times between cities. In the matrix, each row and each column would correspond to a city. MDS could then recreate a map containing the cities, solely from the matrix. This map would look similar to the actual map of city locations, but would differ in interesting ways. Cities connected by faster than average transportation passageways would appear closer together, while roadblocks would move cities apart.
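The city example in note 9 can be sketched with classical (Torgerson) scaling. The code below is a minimal plain-Python illustration, not the algorithm of any particular package: it double-centers the squared distances and extracts the two leading eigenvectors by power iteration. The 4x4 matrix is made up (four hypothetical cities at the corners of a 3-by-4 rectangle), and the sketch assumes the matrix is symmetric and roughly Euclidean; real travel times may not be, in which case the map is only an approximation.

```python
import math

def classical_mds(dist, dims=2, iters=200):
    """Classical scaling of a symmetric distance matrix (sketch)."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in sq]
    grand = sum(row) / n
    # Double centering: B = -1/2 * J * D^2 * J (row means equal column means).
    b = [[-0.5 * (sq[i][j] - row[i] - row[j] + grand) for j in range(n)]
         for i in range(n)]
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        # Arbitrary start vector; assumed not orthogonal to the eigenvector sought.
        v = [1.0 + i * i for i in range(n)]
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break  # no variance left to extract
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue for the unit vector v.
        lam = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
        scale = math.sqrt(max(lam, 0.0))
        for i in range(n):
            coords[i][k] = scale * v[i]
        # Deflate so the next pass finds the next-largest eigenvalue.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return coords

# Made-up distances between four hypothetical cities.
D = [
    [0.0, 3.0, 5.0, 4.0],
    [3.0, 0.0, 4.0, 5.0],
    [5.0, 4.0, 0.0, 3.0],
    [4.0, 5.0, 3.0, 0.0],
]

cities = classical_mds(D)

# Compare the distances on the recovered map with the input matrix.
max_err = max(abs(math.dist(cities[i], cities[j]) - D[i][j])
              for i in range(4) for j in range(4))
```

The recovered map matches the input distances only up to rotation, reflection, and translation, which is why an MDS map of cities can come out flipped or turned relative to a real one.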