Four Principles for Personality Assessment, Dan Ozer, UC Riverside.

*) It's a sketch of the area. Condensing and elaborating Meehl's (1972) seven principles, we get four:

1) The measure's "content" is fully rational within a psychological theory and is appropriate to specific, defined assessment circumstances. I.e., it seems to make sense in terms of the theory. E.g., V should be prima facie like violations.

2) The internal structure of items in a measure should match the requirements of both the relevant psychological theory and the measurement model. (Even in single-trait models, test-item behaviors have multiple influences.)

3) The measure has demonstrably high validity for the most theoretically relevant inferences, and this validity is not derived through theoretically unwarranted channels.

4) The implicative relations of the measure are well explored, and the internal structure of the measure and important validity inferences are invariant over theoretically and pragmatically defined generalization criteria.

That is: in the construct-validation process, what is the logic, and how does it yield problems and methodological solutions?

Then:

1) What does the construct "mean" to the scientist AND the subject, operationally, as extractable data? What AND how. Answers:

*) How the idea came to us.
*) What we know when we know a person:
**) As generality over all contexts.
**) As context-specific: where, when, role, motive, wrt what inventory of strategies, motives, skills.
**) a) Individual-specific, or b) re: Identity, re: Life Story.
*) Data dimensions. Modes: self-description (past vs. present), capability, prior vs. observed behavior, psychophysiology. Source: self, other, instrument, school &c records. Observer effect: hidden, seen. Task given to the producer. Etc.

Justify your methods, appropriate to circumstances, and call out likely sources of invalid variance. Esp. consider Source. SLOT: Self-reports, Life outcomes, Observer ratings, Tests in context. Big5 is garbage-in, garbage-out (O-in, O-out). S/O "content" is in the descriptors used. L data are multivariate functions, hence remote. T data should be coded tightly.

*) So the content of the measures should be inferentially close to the construct measured. Have a theory of the instrument and how it measures the construct. How much info does the producer need? How different are the items on the construct? What predicts this variation? Hence, a theory of the instrument. S- and O-data have no theory of the instrument as of 1995.

*) Assessment circumstances interact with items: artificiality, self-aggrandizing presentation, incomprehensibility, and hidden effects. Observer effects like contempt-since-close, or acting high-status with high-status observers. Personality work ignores these: scientist guilds select items, tests, questions, data types.

2) Internal structure should fit theory.

*) Even a single-trait theory has many effects on each test item, so theorize this, and that theory should fit the correlation matrices (a minimal one-factor sketch is below). But it's hard when the multiple independent variables overlap.
*) SEM > common factor modeling: SEM attends to the measurement model.
*) A LATENT variable is invisible but causal to measurables. CFM weights each measurable ("effect indicators") on the latent variable (and on each other?), calling it good.
*) An EMERGENT variable is visible and DEFINED as a function of its inputs, e.g., income is emergent on sources of income. Hence the measurements CAUSE the emergent variable's value (by definition). (A simulation contrasting the two is below.)
*) SEM? Show the relations among measures.
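To make "internal structure should fit theory" concrete, a minimal sketch (my own toy example, not Ozer's): a single-trait theory implies a one-factor measurement model whose implied correlation matrix is Sigma = lambda lambda' + Theta, and the observed correlations should match it. The loadings below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical loadings of four items on one trait; unique variances chosen
# so each item has unit variance (so Sigma is a correlation matrix).
lam = np.array([0.8, 0.7, 0.6, 0.5])           # factor loadings
theta = 1.0 - lam**2                            # unique (residual) variances

implied = np.outer(lam, lam) + np.diag(theta)   # model-implied correlations

# Simulate data from the one-factor model and compare observed vs. implied.
n = 5000
trait = rng.standard_normal(n)                  # latent trait scores
items = trait[:, None] * lam + rng.standard_normal((n, 4)) * np.sqrt(theta)
observed = np.corrcoef(items, rowvar=False)

print(np.round(implied, 2))
print(np.round(observed, 2))   # matches implied, up to sampling error
```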
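And a toy simulation (mine, hypothetical numbers) contrasting the latent/emergent distinction: effect indicators are caused by the latent variable, so they intercorrelate; causal indicators define the emergent variable (income as the sum of income sources) and need not correlate with each other at all.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# LATENT: extraversion causes three ratings (effect indicators).
extraversion = rng.standard_normal(n)
ratings = extraversion[:, None] * 0.7 + rng.standard_normal((n, 3)) * 0.7
print(np.round(np.corrcoef(ratings, rowvar=False), 2))   # all positive

# EMERGENT: income is DEFINED as the sum of independent sources
# (causal indicators), mutually uncorrelated here by construction.
wages, dividends, rents = rng.exponential(1.0, (3, n))
income = wages + dividends + rents
sources = np.stack([wages, dividends, rents], axis=1)
print(np.round(np.corrcoef(sources, rowvar=False), 2))   # ~zero off-diagonal

# Each source still correlates with income, because it causes income:
print(np.round([np.corrcoef(s, income)[0, 1] for s in (wages, dividends, rents)], 2))
```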
3) Justify measures/sampling and their implementation. Multidimensional measures are no problem.

Validity comes in three flavors: content (did you capture the idea fully?), criterion (did you predict the external criterion well?), and construct (does your assessment measure your construct, enabling valid inferences from it?). Validity is a property of an inference, not of a construct.

'Insisting that there be evidence of appropriate relations to external criteria as part of any construct validity argument is fully consistent with Cronbach and Meehl's (1955) original notion: "Numerous successful predictions dealing with phenotypically diverse 'criteria' give greater weight to the claim of construct validity than do fewer predictions, or predictions involving very similar behavior" (p. 295).' (This is N+V's strength.)

Cronbach (1995) says: include controls for method effects for convergent validity, using the construct's theory and knowledge of method effects within the construct. Ozer, p. 682: "Contrary to the conventional view of trait and method effects as 'crossed,' it now seems apparent that regarding method effects as nested within trait effects offers greater promise." Makes no sense to me. Anyhow, use MTMM, expecting certain patterns (toy sketch below). "The rules ... are those of rational argument."

----------------

4) Mutual implications among the measures are well explored, including their internal structure, and validity inferences are invariant over theoretically and pragmatically defined generalization criteria.

Uh, yeah. Know your measures, and show your construct's predictions are good conceptually and practically. He wants careers dedicated to the construct: explore widely, apply widely, before generalizing. So you can use your scale scientifically and for applications. It's easy to do a crappy job, hard to make sharp instruments. Item response theory (estimate a standard error of measurement for each subject or item) might help; see the sketch at the end. But it's necessary.
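A toy sketch of the MTMM patterns one expects (the classic Campbell & Fiske crossed logic, my simulation, not the paper's nesting proposal): each measure = trait effect + method effect + error; monotrait-heteromethod correlations (the validity diagonal) should beat heterotrait-monomethod correlations, which are inflated by shared method variance. All weights are made up.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000
t1, t2 = rng.standard_normal((2, n))     # two uncorrelated traits
m1, m2 = rng.standard_normal((2, n))     # two method factors (e.g., S and O)

def measure(trait, method):
    # hypothetical weights: .7 trait, .4 method, remainder error
    return 0.7 * trait + 0.4 * method + 0.59 * rng.standard_normal(n)

X = np.stack([measure(t1, m1), measure(t2, m1),   # trait1/method1, trait2/method1
              measure(t1, m2), measure(t2, m2)],  # trait1/method2, trait2/method2
             axis=1)
R = np.corrcoef(X, rowvar=False)
print(np.round(R, 2))
print("monotrait-heteromethod (validity diagonal):", round(R[0, 2], 2), round(R[1, 3], 2))
print("heterotrait-monomethod (shared method only):", round(R[0, 1], 2), round(R[2, 3], 2))
```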
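Finally, a sketch of why IRT gives a per-subject standard error of measurement (assuming a Rasch model for simplicity; the item difficulties are invented): test information at ability theta is I(theta) = sum_i p_i(1 - p_i), so the conditional standard error SE(theta) = 1/sqrt(I(theta)) varies with theta, unlike the single SEM of classical test theory.

```python
import numpy as np

def rasch_p(theta, b):
    """Probability of a keyed response under the Rasch model."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def conditional_se(theta, difficulties):
    p = rasch_p(theta, np.asarray(difficulties))
    info = np.sum(p * (1.0 - p))        # Fisher information of the test
    return 1.0 / np.sqrt(info)

difficulties = [-1.5, -0.5, 0.0, 0.5, 1.5]   # hypothetical item difficulties
for theta in (-2.0, 0.0, 2.0):
    print(theta, round(conditional_se(theta, difficulties), 2))
# SE is smallest where the items are targeted (theta near 0) and grows in
# the tails -- measurement precision depends on the subject, not just the test.
```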