Clinical Psychological Science 2013 Keeley 16 29

Clinical Psychological Science, http://cpx.sagepub.com/
Empirical Article
The Commutative Property in Comorbid Diagnosis: Does A + B = B + A?
Jared W. Keeley, Chafen S. DeLao, and Claire L. Kirk
Mississippi State University
Clinical Psychological Science 2013 1: 16; 1(1) 16–29; © The Author(s) 2013
Originally published online 17 October 2012 (OnlineFirst Version of Record: Oct 17, 2012; Version of Record: Dec 14, 2012)
DOI: 10.1177/2167702612455742
The online version of this article can be found at: http://cpx.sagepub.com/content/1/1/16
Published by: http://www.sagepublications.com, on behalf of the Association for Psychological Science
Abstract
The Diagnostic and Statistical Manual of Mental Disorders (4th edition, text revision) assumes an additive model for describing comorbid symptomatology, including the commutativity of disorder descriptions across order of presentation (e.g., A + B = B + A). Given the high prevalence of individuals with comorbid conditions, it is important to investigate if clinicians follow an additive model when conceptualizing disorders. Three studies involving a total of 138 clinicians tested assumptions of commutativity for conceptualizations of three disorders, namely major depressive disorder (MDD), generalized anxiety disorder (GAD), and antisocial personality disorder (ASPD), by either fixed-choice or free-response descriptions of all possible pairwise comparisons of the disorders. Clinicians demonstrated less-than-perfect commutativity for all disorder combinations, and this finding was replicated across two samples. In addition, MDD and ASPD tended to overshadow the presence of GAD in combinations. These results challenge the additive assumptions of the current diagnostic system and may suggest the order in which diagnoses are conceptualized influences the resulting symptomatology.
Keywords: comorbidity, Diagnostic and Statistical Manual of Mental Disorders (DSM), major depressive disorder, generalized anxiety disorder, antisocial personality disorder
Received 5/10/12; Revision accepted 6/25/12
Psychiatric disorders are among the most common disorders in the United States, with approximately 50% of noninstitutionalized people reporting at least one disorder during their lifetimes and 30% reporting at least one disorder occurring in the past year (Kessler et al., 1994). Psychiatric disorders, however, often occur in bunches (Kessler, Chiu, Demler, & Walters, 2005). When two or more psychiatric disorders occur simultaneously, the coexisting disorders represent psychiatric comorbidity. The term comorbidity was introduced by Feinstein in 1970 and is defined as “any distinct additional clinical entity that has existed or that may occur during the clinical course of a patient who has the index disease under study” (p. 456). Although comorbidity is not unique to psychiatric disorders, it has emerged as a ubiquitous construct.
Psychiatric comorbidity is commonplace both in the community (Kessler, Chiu, et al., 2005) and in treatment-seeking populations (Evren, Barut, Saatcioglu, & Cakmak, 2006), whereby half of the individuals meeting criteria for any single mental disorder will also meet criteria for two or more mental disorders (Kessler, Chiu, et al., 2005). When multiple disorders occur, said disorders may overlap simply by chance (e.g., an individual with clinical depression may also have a common cold). Among psychiatric disorders, however, the rates of comorbidity greatly exceed the expected rates predicted by chance alone (i.e., the product of the base rates of the two disorders; Boyd et al., 1984; Kessler, Chiu, et al., 2005; Kessler et al., 1994).
Corresponding Author: Jared W. Keeley, P.O. Box 6161, Department of Psychology, Mississippi State University, Mississippi State, MS 39762. E-mail: [email protected]
Although psychiatric comorbidity is a common phenomenon, the current diagnostic system for psychopathology in the United States (Diagnostic and Statistical Manual of Mental Disorders, 4th edition, text revision [DSM-IV-TR]; American Psychiatric Association [APA], 2000) does not explicitly address the manner in which clinicians should conceptualize comorbid cases. In fact, the word comorbidity is not even mentioned in the DSM-IV-TR (APA, 2000), although the concept of diagnostic co-occurrence is addressed. In the Multiaxial Assessment section, the DSM-IV-TR (APA, 2000) states that when an individual has one or more disorders on an axis, the principal diagnosis or reason for the visit should be listed first under its respective axis and all subsequent diagnoses should be listed beneath the principal diagnosis on their respective axes.
The recording system of the DSM-IV-TR (APA, 2000) implicitly follows an additive model whereby the symptoms of one disorder are added to the symptoms of the second (and possibly the third, fourth, and fifth) to complete the clinical picture (e.g., Disorder A + Disorder B = Disorder AB). For example, the symptoms of an individual with major depressive disorder (MDD) and generalized anxiety disorder (GAD) may include sad mood (from MDD), persistent worry (from GAD), and sleep disturbance (common to both). Although the implicit additive model seems logical, research studying how humans combine concepts generally has found that people do not always follow additive strategies (Chater, Lyon, & Myers, 1990; Springer & Murphy, 1992; Wisniewski, 1996).
One assumption of an additive model of combination is that symptom presentation is commutative. In other words, a person with a principal diagnosis of MDD who subsequently is diagnosed with GAD should look the same as someone who has a principal diagnosis of GAD and is later diagnosed with MDD (i.e., MDD + GAD = GAD + MDD). In theory, clinicians should provide an identical symptom profile when describing MDD + GAD and GAD + MDD because the disorders are additive. Research, however, has shown that clinicians do not always follow an additive strategy (Keeley & Blashfield, 2010). The current study examines the assumption that comorbid disorder descriptions are commutative.
Clinicians’ conceptualizations of comorbid cases can be considered a special case of what cognitive psychologists term conceptual combination. A small but rich literature has examined how humans combine concepts in natural language (Chater et al., 1990; Hampton, 1988; Osherson & Smith, 1981; Springer & Murphy, 1992). The focus of this work has been on noun-noun combinations, like zebra pants or refrigerator door. Some combinations are relatively commonplace, like refrigerator door, and usually are a means of specifying language: a refrigerator door is a special kind of door.
Other combinations, like zebra pants, are more novel. Individuals must undergo some sort of cognitive process when surmising the meaning of the new combination. For example, zebra pants might be pants made out of a striped material or pants made specifically for a zebra.
There are two major theories that attempt to account for how individuals combine concepts. The first, termed the competition among relations in nominals (CARIN) model, proposes that a modifier and a head noun are joined through a particular relation (Gagne, 2002; Gagne & Shoben, 1997). For example, zebra pants could be pants made of zebra material; the concepts are linked using the “made of” relation. In this theory, each modifier and noun have particular types of relations with which they are more commonly associated. The meaning of a combination is chosen among possible relations based on a probabilistic weighting of the available relations. The second, competing theory of conceptual combination posits that there are two types of processes for combining concepts. The dual-process theory holds that although some combinations are joined by relations, as in the CARIN model, others involve a parallel property mapping process (Wisniewski, 1996, 1997; Wisniewski & Love, 1998). An example of property mapping would be taking the property “striped” from zebra and applying it to a pair of pants in the combination zebra pants. In this theory, property mapping is not simply a different form of a relation (specifically, the identity relation), as it is a mapping of alignable features of the two concepts involved and does seem to be an independent process (Estes, 2003).
It is interesting to note that the description of conceptual combinations does not appear to be commutative (Hampton, 1988, 1997; Storms, De Boeck, Van Mechelen, & Geeraerts, 1993; Storms, De Boeck, Van Mechelen, & Ruts, 1996; Storms, Ruts, & Vandenbroucke, 1998). Reciprocal combinations do not have identical descriptions or properties.
For example, exemplars of the combination sports that are also games are not the same as exemplars of games that are also sports (Hampton, 1988). Classical set theory would predict that members of a combination must be a subset of the two parent concepts; that is, all sports that are games must also be sports. Similarly, the overlap between the two concepts of sports and games should lead to identical sets for the reciprocal combinations. However, for this example and many others, participants do not provide commutative descriptions of the combined concepts (Hampton, 1997). Conceptual combinations also demonstrate what has been termed “the dominance effect” (Storms et al., 1993). Generally speaking, one member of the combination tends to dominate the features of the combination. For example, the combination of pet birds tends to have many more features of birds than of pets (Storms et al., 1996). Again, the dominance effect would not be predicted by classical set theory. However, both the CARIN and dual-process theories can accommodate dominance of a pair by weighting the influence of each member, although the theories would account for the strength of that dominance by different means. The dominance effect might be expressed in diagnostic concepts in a somewhat unconventional fashion, given how the effect has traditionally been studied by cognitive psychologists. Typically, dominance has interacted with order of presentation (i.e., pets that are also birds tested against birds that are also pets). Because the feature lists of the concepts are unknown, this method allows a relativistic comparison of the two lists to help determine if one concept was dominated. With diagnostic concepts, the feature list is more determined, due to the concepts’ definitions in the diagnostic manual. Thus, it is possible to demonstrate the dominance of one concept over another in a single combination by determining which features belong to which parent concept. 
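The idea of reading dominance off a single combination, by checking which parent concept each feature belongs to, can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `dominance_share` and the feature lists are hypothetical (they are not study data), and counting features by simple set membership is an assumption about how "belonging" to a parent concept might be operationalized.

```python
def dominance_share(combination, parent_a, parent_b):
    """Split a combination's features by the parent concept they belong to.

    Returns the proportion of the combination's features that belong only to
    parent A, only to parent B, to both parents, or to neither parent.
    """
    combination, parent_a, parent_b = set(combination), set(parent_a), set(parent_b)
    if not combination:
        raise ValueError("the combination has no features to attribute")
    counts = {
        "only_a": len(combination & (parent_a - parent_b)),
        "only_b": len(combination & (parent_b - parent_a)),
        "shared": len(combination & parent_a & parent_b),
        "neither": len(combination - parent_a - parent_b),
    }
    total = len(combination)
    return {label: count / total for label, count in counts.items()}


# Hypothetical feature lists, for illustration only (not the study's data).
mdd = {"sad mood", "anhedonia", "sleep disturbance", "feeling worthless"}
gad = {"persistent worry", "restlessness", "sleep disturbance", "muscle tension"}
mdd_gad = {"sad mood", "anhedonia", "feeling worthless",
           "sleep disturbance", "persistent worry"}

shares = dominance_share(mdd_gad, mdd, gad)
print(shares)
```

In this made-up example, most of the combination's features come from the first parent alone, which is the pattern one would read as that parent dominating the pair.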
Keeley and Blashfield (2010) examined mental health clinicians’ conceptualizations of comorbid psychopathology as an extension of conceptual combination in general. They found that one property of combination, overextensions, was present in clinicians’ descriptions of comorbid disorders, challenging an additive model of diagnostic comorbidity. The current study examines another property of combination, commutativity, given that the current diagnostic system does not predict (or accommodate) meaningful symptom differences for reciprocal pairs of diagnoses (e.g., MDD + GAD vs. GAD + MDD). Rather, both the CARIN and dual-process models would predict some noncommutative pairings, depending on the role and nature of the words in the combination. Clinicians might judge the specific etiologies and symptoms of diagnostic categories (considered to be the “features” of the concept parallel to the “features” of zebra or pants) to have differential effects in combination. Thus, two disorders that are phenomenologically and etiologically similar should have more commutativity than disorders that are less compatible. Therefore, we selected two disorders (MDD and GAD) that share considerable symptom overlap, to the degree that some consider them to be different expressions of the same condition (Brown, Chorpita, & Barlow, 1998; Clark & Watson, 1991; Mineka, Watson, & Clark, 1998), and a third disorder that is phenomenologically and etiologically dissimilar from both (antisocial personality disorder [ASPD]; Krueger, Markon, Patrick, & Iacono, 2005). We predict that some degree of noncommutativity will occur overall but that noncommutativity will be greater for combinations including ASPD. In addition, we expect that the features of one disorder might overshadow the features of another in combination, just as occurs with the dominance effect.
Study 1
Study 1 was designed to examine if professionals follow the commutative property when describing different reciprocal orders of comorbid mental disorders. Given the unstated assumption that the classification system follows an additive model, it is important to test if the model holds under working conditions. This study utilized a relatively conservative method for describing disorders’ symptoms. Participants were asked to describe disorders using a fixed list of predetermined symptoms in order to control for variability in response. This method was chosen to ease the interpretability of results but may have limited the scope and generalizability of the findings. Study 2 then addressed these limitations by altering the method and replicating the results.
Method
Participants. Participants were 35 members of the Association for Behavioral and Cognitive Therapies (ABCT). Most participants were women (n = 22, 62.86%; men n = 13, 37.14%) and specialized in working with adults (n = 21, 60.00%, vs. working with children, n = 7, 20.00%, vs. other, n = 7, 20.00%). The mean age was 47.94 (SD = 11.14) with a mean of 17.80 (SD = 10.83) years of experience in the field. Most held a Ph.D. (n = 33, 94.3%; Psy.D., n = 1, 2.86%; other, n = 1, 2.86%). The sample was geographically representative of all regions in the United States. Two participants who failed to follow instructions were excluded from all analyses.
Materials. Participants noted the presence of the symptoms of three disorders: MDD, GAD, and ASPD. In an effort to reduce the time required to complete the task, only three stimuli were used. These three disorders were chosen because they represent two disorders that are conceptually and phenomenologically very similar (MDD and GAD; Brown et al., 1998; Clark & Watson, 1991; Mineka et al., 1998) and one disorder that is dissimilar to both (ASPD; Krueger et al., 2005). Each stimulus was presented on a separate page. At the top of the first page, a heading read, “Describe an individual with . . .” where the ellipsis represented the name of one of the three disorders used in the study. The next two pages asked participants what would change in their descriptions if one of the other two disorders in the study were added, such that the heading now read, “What would change if the person with . . . also had . . . ?” All possible combinations were presented and counterbalanced across participants.
Under the heading on each page was a list of 120 descriptions. These descriptions were chosen to be representative of the entire domain of psychopathology and personality functioning. It was important to include symptoms beyond simply the DSM criteria for each disorder to ensure that participants were not producing results based only on the constriction of their response options. The descriptions were taken from a major psychopathology assessment instrument (the Personality Assessment Inventory [PAI], Morey, 1991) and the currently most investigated theory of personality functioning, the Big Five personality factors (Costa & Widiger, 1994). Of the 120 descriptors included, 60 came from the PAI and 60 came from the Big Five factors. For the PAI, the 60 descriptions were taken from the 15 clinical and treatment scales, excluding items on the validity and interpersonal scales. The 60 items selected for inclusion were those from each scale that had the highest loadings per scale in factor analytic studies (Morey, 1991). An equal number of items were selected from each scale, assuming that the items were not directly redundant. Items were rephrased to be one to three words in length (e.g., “feeling worthless,” “fear of abandonment”). The remaining 60 descriptors came from the Big Five personality factors. Each of the five factors has six facets. A representative descriptor for the extremes of each facet was included (2 extremes × 6 facets × 5 factors = 60 items; e.g., “accomplished” vs. “aimless” as the two poles of achievement striving). The 120 descriptions were presented in alphabetical order.
Participants also completed a brief demographic questionnaire assessing their sex, age, years of experience in the field, type of clients usually seen, degree, American Board of Professional Psychologist status, state of residency, and familiarity with a variety of assessment measures, including familiarity with the five factor model and DSM systems.
Procedure. Six hundred members of ABCT were contacted through the mail with information about the study. Those expressing interest were instructed to return a postcard with their preferred mailing address. Sixty-eight potential participants returned postcards and received the materials (11.33% return rate); of those, 35 completed the study (51.47% return rate). Participants first completed the demographic questionnaire, which was followed by an instruction sheet for the task. Participants were instructed to create descriptions of three initial disorders (by marking all descriptors that applied to the principal diagnosis) and note how those descriptions would change (by either adding or subtracting symptoms) when the principal diagnosis became comorbid with an additional diagnosis. The instructions explicitly stated that participants should complete the task based on their own clinical experience rather than attempting to reproduce the DSM criteria. Each participant saw each of the three disorders singly (MDD, GAD, and ASPD) as well as in all possible combinations (MDD + GAD, MDD + ASPD, GAD + MDD, GAD + ASPD, ASPD + MDD, and ASPD + GAD). The order in which the single disorders and their combinations were presented was counterbalanced across participants. For the initial description of a single disorder, participants were instructed to circle all descriptors from the list of 120 that they believed to be relevant for that disorder.
For a comorbid combination, participants were instructed to circle additional symptoms relative to the comorbid combination and cross out symptoms that no longer applied as a result of the secondary diagnosis. In that way, the methodology explicitly reflected participants’ intentional changes to rule out possible chance omissions or commissions when describing comorbid pairs.
Results
Intrarater agreement. The dependent variable of interest in this study is the amount of agreement for a participant between reciprocal orders of disorder pairs (e.g., MDD + GAD and GAD + MDD). For ease of presentation, we refer to a disorder pair with the convention “MDD & GAD” to represent the agreement between the reciprocal comorbid pairs. We calculated agreement using Cohen’s (1960) kappa statistic. Kappa is a metric commonly used in interrater agreement because it accounts for chance levels of agreement. For this study, it is important to account for chance agreement because the nature of the disorder description task could inflate agreement artificially. Of the list of 120 descriptors, often a large number would not be selected as relevant to either pair. This high number of exclusions relative to descriptors included would inflate agreement but would be largely meaningless on a theoretical level. For example, a participant might exclude paranoid ideation from both reciprocal pairs of MDD & GAD because paranoid ideation is not typically associated with MDD, GAD, or MDD & GAD, but that exclusion is not truly “agreement” on the commutativity of the disorder descriptions. Kappa is able to factor out the amount of agreement predicted by chance based on the number of descriptions included versus those excluded.
The description of the comorbid pair (e.g., MDD + GAD) was generated by taking the descriptors selected for the first disorder (MDD) and then adding the symptoms selected for the second disorder (GAD) and removing symptoms specifically excluded (those that the participant crossed out) from the second disorder. We were then able to calculate kappa by examining the number of descriptors a participant included in both pairs (e.g., MDD + GAD and GAD + MDD), the number included for one pair but not the other, and the number excluded from both. We calculated kappa for each of the three reciprocal disorder pairs (MDD & GAD, MDD & ASPD, GAD & ASPD) for each participant. Table 1 displays the mean and median kappa value for each disorder pair across all participants, as well as measures of variability.
The first thing to note about Table 1 is that participants varied widely in their agreement regarding the reciprocal pairs. Indeed, nearly the full range of variability (0.00–1.00) is present for two of the disorder pairs (MDD & ASPD and GAD & ASPD) and about half the range of variability for the other (MDD & GAD). Thus, some participants expressed near perfect commutativity for a pair, whereas others displayed very little agreement. The standard deviations for each represent an average variability corresponding to 15% of the possible range of the scale. Clearly, clinicians approached the commutativity of these disorder pairs in different ways. However, it could be that the majority of the sample conceptualized the pairs in a consistent, commutative fashion. Thus, we also tested the commutativity of the disorder pairs directly.
Table 1. Means, Medians, and Variability of Kappa Values for the Disorder Pairs in Study 1
Cohen’s kappa         MDD & GAD    MDD & ASPD    GAD & ASPD
Mean                  .7589        .6659         .6190
Median                .7620        .6666         .6320
Standard deviation    .1397        .1589         .1594
Minimum               .4234        .1215         .2652
Maximum               .9785        1.0000        .9409
Note: MDD = major depressive disorder; GAD = generalized anxiety disorder; ASPD = antisocial personality disorder.
Test of commutativity. Perfect commutativity (i.e., MDD + GAD = GAD + MDD) would be represented by a kappa value of 1.00. In that case, no descriptors would be given to one disorder pair that were not also present in the other. Thus, we tested whether the mean kappa value for each disorder pair was statistically equal to 1.00. We entered the kappa values for the three disorder pairs into an overall multivariate analysis of variance (MANOVA). The overall model was significant, Wilks’s λ (3, 30) = .031, p < .001, indicating differences between or within variables. Custom hypothesis tests indicated that the mean value for each pair was significantly different from 1.00, MDD & GAD F(1, 32) = 98.28, p < .001, η² = .75, 95% confidence interval (CI) for mean [.7094, .8085]; MDD & ASPD F(1, 32) = 145.87, p < .001, η² = .82, 95% CI for mean [.6096, .7223]; and GAD & ASPD F(1, 32) = 188.61, p < .001, η² = .85, 95% CI for mean [.5625, .6755]. Indeed, the confidence intervals for these means do not approach 1.00. Rather, these means are spread well below the value of 1.00, as indicated by their relatively large effect sizes. Therefore, the sample evidenced lower kappa values than perfect commutativity for each disorder pair.
Although the DSM-IV-TR (APA, 2000) would predict perfect commutativity (i.e., a kappa value of 1.00), perfection is a rarely attained state. To allow for human error, we subsequently compared the mean kappa value for each disorder pair to a reliability of .85. The comparison point was developed in light of the Structured Clinical Interview for DSM Disorders (SCID; i.e., SCID-I and SCID-II) test-retest values for MDD, GAD, and ASPD. With a 7- to 10-day test-retest interval, the average kappa value for MDD was .73 and for GAD was .63 (Zanarini & Frankenburg, 2001); with a 1- to 3-week test-retest interval, the average kappa value for MDD was .64 and for GAD was .56 (Williams et al., 1992). With a 1- to 3-week test-retest interval, the average kappa value for ASPD was .76 (First et al., 1995).
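As a rough illustration of the Study 1 agreement measure, the sketch below builds a comorbid-pair description (first disorder's descriptors, plus additions, minus crossed-out items) and computes Cohen's kappa for the two reciprocal orders over the 120-item list. The function names, the integer descriptor indices, and the example selections are hypothetical and chosen only for illustration; the kappa formula is the standard two-category form, which is assumed here to match the authors' calculation.

```python
def comorbid_description(first, added, removed):
    """Descriptors for a pair: first disorder's items, plus additions, minus crossed-out items."""
    return (set(first) | set(added)) - set(removed)


def cohen_kappa(desc_ab, desc_ba, n_items=120):
    """Cohen's kappa for include/exclude decisions across two reciprocal orders."""
    both = len(desc_ab & desc_ba)        # included in both orders
    only_ab = len(desc_ab - desc_ba)     # included only in the A + B order
    only_ba = len(desc_ba - desc_ab)     # included only in the B + A order
    neither = n_items - both - only_ab - only_ba  # excluded from both orders
    p_observed = (both + neither) / n_items
    # Chance agreement from the marginal include/exclude rates of each order.
    p_expected = (
        (both + only_ab) * (both + only_ba)
        + (only_ba + neither) * (only_ab + neither)
    ) / n_items ** 2
    if p_expected == 1.0:
        # Degenerate case (everything or nothing selected in both orders).
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical selections, using indices 0..119 to stand in for the 120 descriptors.
mdd_gad = comorbid_description(first=range(0, 18), added=range(18, 22), removed={0, 1})
gad_mdd = comorbid_description(first=range(7, 25), added=range(25, 27), removed=set())

kappa = cohen_kappa(mdd_gad, gad_mdd)
raw_agreement = (len(mdd_gad & gad_mdd) + 120 - len(mdd_gad | gad_mdd)) / 120
print(round(kappa, 3), round(raw_agreement, 3))
```

In this made-up case the raw agreement is high mainly because most of the 120 descriptors are excluded from both orders, while kappa is noticeably lower. That gap is the inflation the chance correction is meant to remove.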
Because the test-retest intervals in the SCID reliability studies cited above were substantially longer than the 30-min interval in the current study, we believed a kappa value of .85 represented a fair adjustment. Custom hypothesis tests indicated that the mean value for each pair was significantly different from .85, MDD & GAD F(1, 32) = 14.03, p < .001, η² = .305; MDD & ASPD F(1, 32) = 44.29, p < .0001, η² = .58; GAD & ASPD F(1, 32) = 69.32, p < .0001, η² = .68. Therefore, the sample evidenced lower kappa values than .85 for each disorder pair.
Differences in commutativity. Although each disorder pair demonstrated a degree of noncommutativity, some disorder pairs appeared closer to a perfect value of 1.00 than other pairs. In other words, some disorder pairs might be more commutative than others. Within the same overall MANOVA reported above, we tested for differences between disorder pairs. The pair MDD & GAD had a higher mean kappa value than either the MDD & ASPD pair, F(1, 32) = 16.89, p < .001, η² = .35, or the GAD & ASPD pair, F(1, 32) = 43.39, p < .001, η² = .58. The MDD & ASPD and GAD & ASPD pairs did not differ significantly from each other, F(1, 32) = 3.05, n.s., η² = .09. This pattern indicates that combinations involving incongruent disorders (i.e., those including ASPD) evidenced lower agreement than did combinations with congruent disorders (MDD & GAD).
The results indicate that group averages for commutativity are lower than theorized. Nonetheless, it could be that some individuals consistently evidence low agreement across the disorder pairs, whereas others are consistently high. In other words, participants might be consistent in their relative level of commutativity across the three disorder pairs. The level of agreement across disorder pairs was related, MDD & GAD to MDD & ASPD r(31) = .63, p < .001; MDD & GAD to GAD & ASPD r(31) = .67, p < .001; and MDD & ASPD to GAD & ASPD r(31) = .53, p < .01.
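The Study 1 tests of each mean kappa against 1.00 and .85 can be cross-checked from the summary statistics in Table 1. The sketch below assumes that each custom hypothesis test is equivalent to a one-sample t test on the kappa values, so that F(1, n - 1) equals the squared t statistic, and that n = 33 (35 completers minus the 2 excluded participants, consistent with the reported 32 error degrees of freedom). The function name `one_sample_f` is hypothetical.

```python
import math


def one_sample_f(mean, sd, n, target):
    """Squared one-sample t statistic, i.e., F with (1, n - 1) degrees of freedom."""
    standard_error = sd / math.sqrt(n)
    t = (mean - target) / standard_error
    return t ** 2


N = 33  # assumption: 35 completers minus 2 excluded, matching df = (1, 32)

# (mean kappa, standard deviation) for each disorder pair, taken from Table 1.
TABLE_1 = {
    "MDD & GAD": (0.7589, 0.1397),
    "MDD & ASPD": (0.6659, 0.1589),
    "GAD & ASPD": (0.6190, 0.1594),
}

f_vs_perfect = {pair: one_sample_f(m, s, N, 1.00) for pair, (m, s) in TABLE_1.items()}
f_vs_benchmark = {pair: one_sample_f(m, s, N, 0.85) for pair, (m, s) in TABLE_1.items()}

for pair in TABLE_1:
    print(f"{pair}: F vs 1.00 = {f_vs_perfect[pair]:.2f}, F vs .85 = {f_vs_benchmark[pair]:.2f}")
```

Under these assumptions the recomputed values land within rounding distance of the reported statistics (for example, roughly 98.3 for MDD & GAD against 1.00), which suggests the reported F values follow directly from the means and standard deviations in Table 1.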
The medium size of the correlations among the disorder pairs, however, indicates that individuals were not always consistent in the amount of commutativity they displayed. Thus, a participant might have a relatively low kappa value for one disorder pair and a relatively high value for another. Across participants, individuals displayed different patterns of which disorder pairs were high and low. Thus, it does not appear that the effect of noncommutativity is due simply to an individual being more or less likely to maintain the commutative property when conceptualizing comorbid diagnoses.
Study 2
Study 2 was conducted to supplement the findings of Study 1, which found a degree of noncommutativity for disorder pairs. However, the forced-choice symptom selection method of Study 1 may have limited or distorted the descriptions clinicians offered for the disorders. Specifically, because clinicians were limited in their possible responses, they might have suppressed descriptions that they considered relevant to one (decreasing commutativity) or both (increasing commutativity) disorders. Thus, Study 2 employed an open-ended, free-response method in which clinicians simply provided any descriptions they saw fit for a disorder. In addition, the descriptions of the combinations were gathered holistically rather than as a function of change, as in Study 1. The method of first describing one disorder and then noting changes may have biased participants’ responses by leading them toward producing changes, thereby assuming the different combinations should be noncommutative. Further, the sample for Study 1 came from ABCT, so it is possible that participants were more homogeneous regarding their theoretical orientations than the general population, which could influence the way in which the participants conceptualized the disorders.
The sample for Study 2 was selected to balance this bias by being more representative of the distribution of theoretical orientations in the population while losing some geographic representation.
Method
Participants. For the second study, participants were 45 psychologists licensed in the states of Maine and Vermont. In contrast to Study 1, this sampling strategy focused on individuals from a limited geographic region representing a wider range of theoretical orientations. Most participants were women (n = 25, 55.56%; men, n = 20, 44.44%), and the average age was 53.48 years (SD = 10.80). The average years of experience was 26.03 (SD = 9.71). Most participants held a Ph.D. (n = 28, 62.22%), with some having a Psy.D. (n = 9, 20.00%) or other degree (Ed.D. n = 4, 8.89%; M.S. or M.A. n = 3, 6.67%; and other n = 1, 2.22%). They described themselves as being in the fields of clinical (n = 29, 64.44%), counseling (n = 13, 28.89%), school (n = 2, 4.44%), and other (n = 1, 2.22%). About half (n = 24, 53.33%) of the participants considered themselves to hold a cognitive-behavioral theoretical orientation, with 17.78% (n = 8) describing themselves as eclectic or integrative, 11.11% (n = 5) describing themselves as interpersonal or psychodynamic, and 4.44% (n = 2) endorsing a family systems approach. A remaining 13.33% (n = 6) listed another theoretical orientation that was not duplicated in the sample. Many participants worked in private practice (n = 22, 48.89%), with 13.33% (n = 6) working in an outpatient community setting, 6.67% (n = 3) working in a hospital or inpatient setting, 11.11% (n = 5) working in a federal or state position, 6.67% (n = 3) working in a school, 4.44% (n = 2) working in a university counseling center, 2.22% (n = 1) working in a rehabilitation facility, and 6.67% (n = 3) working in some other setting.
Ten participants were excluded because they failed to properly complete the materials or left portions blank; the final sample size was 35. The 10 participants excluded did not differ from the rest of the sample regarding age, sex, years of experience, degree type, field (clinical, counseling, school), theoretical orientation, or work setting.
Materials. The same three disorders were used as stimuli in this study. However, participants described the disorders in a free-response task rather than with the forced-choice response presented in Study 1. Further, each disorder and combination were rated independently rather than by adding and subtracting descriptors from the single disorder description (i.e., participants were asked to create new descriptions for the disorder combinations as opposed to amending their descriptions from the single disorder). Participants also completed a demographics questionnaire assessing age, sex, years of experience, type of clients, frequency of consulting the DSM, theoretical orientation, work setting, degree level, and field of degree. The clinicians were asked to rate their perceived familiarity with each of the disorders, on a 5-point Likert-type scale, as well as the perceived similarity of the three disorders on a scale of –3 to 3 (with 3 being most similar and 0 being a neutral midpoint).
Procedure. Participants were selected through publicly available lists of licensed psychologists in the states of Maine and Vermont. We initially invited 1,419 clinicians to participate by mail, and 69 participants (4.86% return rate) returned postcards indicating their willingness to do so. We then mailed the study materials to those 69 clinicians, of whom 45 (65.22% return rate) returned materials. Participants first completed the demographic questionnaire, which was followed by a set of instructions. Participants were asked to provide descriptions of the disorders (or disorder pairs) based on their clinical experience.
The instructions explicitly stated that this task was not a test of clinical knowledge and that participants were not expected to try to reproduce DSM criteria. Clinicians were asked not to reference any materials when completing the task but to base their descriptions on their own experience. They were allowed to provide as many descriptions as they liked for each stimulus, but they were asked to limit the length of each description to two or three words. They were provided with a content-irrelevant example of a list of features one might generate for the characteristics of a bird (e.g., feathers, wings, lays eggs, kept as pet). Clinicians then described each of the single disorders (presented in random order), followed by all of the possible pairwise combinations (also presented in random order). This sequence ensured equal exposure to each of the stimuli prior to rating the comorbid pairs so that all were equally primed in memory while also controlling for possible order effects. As with Study 1, participants were asked to describe each reciprocal order of the disorder pairs (e.g., MDD + GAD and GAD + MDD).

Results

Intrarater agreement. As an extension of Study 1, the dependent measure of interest in this study was also the amount of agreement for a participant between reciprocal orders of disorder pairs (e.g., MDD + GAD and GAD + MDD). For ease of presentation, we refer to a disorder pair with the same convention (e.g., “MDD & GAD”) to represent the agreement between the disorder pairs. Intrarater agreement was calculated using percentage agreement, which was the number of descriptors shared between MDD & GAD divided by the total number of descriptors for MDD & GAD including the descriptors that were unique to MDD + GAD and GAD + MDD.
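The percentage agreement measure described above can be illustrated with a minimal sketch. It assumes that descriptors are compared as exact strings after trimming and lowercasing; the article does not say how free-response descriptors were matched, so that rule, the function name `percentage_agreement`, and the example descriptor lists are all hypothetical.

```python
def percentage_agreement(first_order, second_order):
    """Shared descriptors divided by all distinct descriptors across both orders.

    Assumption: two descriptors count as "shared" only if they are identical
    after trimming whitespace and lowercasing.
    """
    a = {d.strip().lower() for d in first_order}
    b = {d.strip().lower() for d in second_order}
    all_descriptors = a | b  # shared plus those unique to either order
    if not all_descriptors:
        raise ValueError("at least one descriptor is required")
    return len(a & b) / len(all_descriptors)


# Hypothetical descriptor lists for one participant (not data from the study).
mdd_gad = ["sad mood", "worry", "poor sleep", "fatigue"]
gad_mdd = ["worry", "poor sleep", "restlessness", "sad mood"]

agreement = percentage_agreement(mdd_gad, gad_mdd)
print(f"MDD & GAD agreement: {agreement:.2f}")
```

In this made-up case three descriptors are shared and two are unique to one order, so agreement is three out of five. A value of 1.0 would correspond to perfect commutativity for that participant.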
Although percentage agreement does not control for chance agreement, as kappa does, artificially inflated agreement was not a factor because this study’s methodology incorporated a free-response task as opposed to a forced-choice task. In other words, the set of excluded symptoms was unknown (and theoretically infinite) because participants recorded only descriptions that were present. As such, it is not possible to calculate kappa, but it is also irrelevant because the number of descriptions a participant generated directly indicates the amount of agreement. Percentage agreement was calculated for each reciprocal pair (e.g., MDD + GAD and GAD + MDD). First, we calculated the descriptive statistics for each reciprocal pair. Table 2 shows the mean, standard deviation, and maximum and minimum values for percentage agreement of each reciprocal pair of disorders. Again, the variability in agreement was large, with nearly the full range of values present for each reciprocal pair. One may be tempted to compare the percentage agreement values to the kappa values shown in Study 1; however, the metrics are not directly comparable. Thus, rather than make direct comparisons to the values, we completed the same analytical procedure done in Study 1 to examine the reliability of the findings.

Table 2. Means, Standard Deviations, and Maximum and Minimum of Percentage Agreement Values for the Disorder Pairs in Study 2

Percentage agreement    MDD & GAD    MDD & ASPD    GAD & ASPD
Mean                    .8270        .7758         .8306
Standard deviation      .2796        .3530         .3071
Minimum                 .1429        0.0000        .0556
Maximum                 1.0000       1.0000        1.0000

Note: MDD = major depressive disorder; GAD = generalized anxiety disorder; ASPD = antisocial personality disorder.

Test of commutativity. As with Study 1, a percentage agreement value of 1.0 would represent perfect agreement, which is what would be predicted given the theoretical stance of the diagnostic system.
Thus, we examined the percentage agreement values to determine if they were in fact equal to 1.0 (which would represent perfect intrarater agreement) or if clinicians deviated from purely commutative combination. We entered the three percentage agreement variables into an overall MANOVA, which was significant, Wilks’s λ (3, 32) = .644, p < .01. The mean percentage agreement values for each disorder pair (see Table 2) were each significantly lower than 1.0, MDD & GAD F(1, 34) = 13.39, p < .001, η² = .28, 95% CI for mean [.7310, .9231]; MDD & ASPD F(1, 34) = 14.11, p < .001, η² = .29, 95% CI for mean [.6546, .8971]; and GAD & ASPD F(1, 34) = 10.65, p < .01, η² = .24, 95% CI for mean [.7251, .9361]. The mean values, which are significantly lower than 1.0 percentage agreement, show that once again participants did not follow the commutative property when combining disorders. In addition, to be consistent with replicating the results from Study 1, mean percentage agreement values for each disorder pair were compared to .85 to account for a reasonable degree of expected error or inconsistency. Custom hypothesis tests indicated that the mean value for each pair was not significantly different from .85, MDD & GAD F(1, 34) = 0.24, n.s., η² = .007; MDD & ASPD F(1, 34) = 1.54, n.s., η² = .04; and GAD & ASPD F(1, 34) = 0.140, n.s., η² = .004.

Differences in commutativity. As before, the reciprocal pairs do not evidence perfect commutativity, but it could be that one pair of disorders is more commutative than another. Thus, in the same overall MANOVA, we conducted planned pairwise comparisons among the three percentage agreement variables. In this case, the mean percentage agreement levels for each reciprocal disorder pair were equal to each other (all ps > .05). Thus, when completing a free-response task, there do not appear to be differences in commutativity rate across similar and dissimilar disorder pairings.
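The tests of each mean against 1.0 and against .85 reported in the test of commutativity can be approximated from the summary statistics in Table 2. The sketch below is an illustration, not the authors' analysis code: it assumes that each single-degree-of-freedom test is equivalent to a squared one-sample t statistic and that the reported effect size is partial eta squared, F / (F + error df). The function name `one_sample_f` is hypothetical.

```python
import math

N = 35  # final Study 2 sample size, giving 34 error degrees of freedom

# (mean, standard deviation) of percentage agreement, copied from Table 2
TABLE_2 = {
    "MDD & GAD": (0.8270, 0.2796),
    "MDD & ASPD": (0.7758, 0.3530),
    "GAD & ASPD": (0.8306, 0.3071),
}


def one_sample_f(mean, sd, n, target):
    """Return (F, eta squared) for a test of one mean against a fixed target.

    Assumption: F(1, n - 1) equals the square of the one-sample t statistic,
    and eta squared is computed as F / (F + error df).
    """
    standard_error = sd / math.sqrt(n)
    t = (mean - target) / standard_error
    f = t ** 2
    eta_squared = f / (f + (n - 1))
    return f, eta_squared


results = {
    pair: {target: one_sample_f(mean, sd, N, target) for target in (1.0, 0.85)}
    for pair, (mean, sd) in TABLE_2.items()
}

for pair, by_target in results.items():
    for target, (f, eta_sq) in by_target.items():
        print(f"{pair} vs {target:.2f}: F(1, {N - 1}) = {f:.2f}, eta^2 = {eta_sq:.3f}")
```

Because the inputs are the rounded values from Table 2, the output should agree with the reported statistics only to within rounding.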
However, similar to Study 1, participants were not always consistent within themselves in commutativity rate across the pairs. Even though the average commutativity rates did not differ, participants varied within themselves to a moderate degree in how high or low the pairs were, relatively, MDD & GAD to MDD & ASPD r(33) = .48, p < .01; MDD & GAD to GAD & ASPD r(33) = .77, p < .001; and MDD & ASPD to GAD & ASPD r(33) = .66, p < .001.

Study 3

Studies 1 and 2 demonstrated a noncommutativity effect across two different methodologies. Study 3 addresses the issue of the dominance effect (Storms et al., 1993), whereby one disorder might dominate the features of the combination. This study employed a fixed-choice methodology similar to that of Study 1 but did not include both possible orderings of a pair of disorders. However, participants did provide independent ratings of the disorder combinations (as in Study 2) rather than noting changes (as in Study 1). These independent ratings provide a clearer structure for testing the dominance effect, as the change-based methodology could lead one to over-rely on the features of the initial concept. However, examining dominance within the free-response methodology of Study 2 would be unclear, as participants might use slightly different terminology to list two symptoms (sad vs. depressed) in the reciprocal pairs. Therefore, we returned to utilizing a fixed-choice description method. Further, because traditional methods of studying dominance have focused on stimuli where the list of features is unknown, we controlled for the individual’s understanding of the single disorder concepts by using his or her ratings of individual disorders to determine which features belong to which parent disorders in the combination.

Method

Participants. Participants for this study came from two samples: clinicians licensed in the state of Florida and professionals belonging to ABCT. The Florida state licensing board provides a list of all licensed practicing psychologists in the state. Of the 4,028 clinicians licensed in the state, 500 were randomly selected to receive information about the study. The second sample consists of members of ABCT. The association provides a list of all members’ mailing addresses. At the time of data collection, there were 2,423 members, not including graduate students or associated members (e.g., vendors). The 69 members who listed their home state as Florida also were excluded so that they did not have a double chance of being selected. Five hundred members of ABCT were randomly selected to receive information about the study. The average participant was 46 years old (SD = 11.33) with 20 years of experience (SD = 10.45). Most participants were women (63%). The majority of participants had experience working with adults (88%), many had experience working with adolescents (66%), and 48% of the sample had experience with children. Forty-five percent of participants consulted the DSM once a week, 34% consulted it once a month, 15% consulted it rarely, and 6% consulted it daily. The largest portion of participants listed their theoretical orientation as cognitive-behavioral (63%), with an additional 19% describing themselves as either integrative or eclectic and 6% as strictly behavioral. The remaining 12% were single representations of orientations, including psychodynamic, interpersonal, applied developmental, humanistic, and so forth. Participants worked in a variety of settings, including private practice (41%), faculty position (16%), hospital or medical center (14%), community mental health or outpatient clinic (11%), Veterans Administration (5%), college counseling center (5%), and others. The two samples did not differ significantly on any of these demographic variables except for orientation, χ²(3) = 11.73, p < .01.
As expected, the ABCT sample had a higher proportion of cognitive-behavioral or behavioral clinicians, whereas a Florida clinician was more likely to be eclectic or integrative or in the 12% of “other” orientations. Participants were initially mailed an information letter about the study and a postcard to return if they were interested in participating. Those who returned postcards were mailed the study materials. The return rate for the first mailing of postcards was 9.70% (40 from Florida, 57 from ABCT). Of those who returned postcards, 75.26% returned the study materials (25 from Florida, 48 from ABCT). Three participants did not fully complete the materials, leaving the final number of participants at 70. Materials. The same three disorders (MDD, GAD, and ASPD) were used as stimuli in this study. Participants described disorders using the same list of 120 descriptors as in Study 1. However, the instructions and stimuli were altered in the following ways. First, participants were instructed to describe each disorder or disorder pair by circling relevant descriptors from the list of 120. They performed these ratings independently, such that each disorder combination was rated from scratch (rather than noting changes after making an initial rating). Also, not all disorder pairs were represented. Only the following orders of pairs occurred: MDD + GAD, MDD + ASPD, and GAD + ASPD. Participants also completed a demographic questionnaire that assessed their age, sex, years of clinical experience, domain of experience (adults, adolescents, children), frequency with which they consult the DSM, theoretical orientation, and primary work setting. In addition, participants were asked to rate their familiarity with each of the disorders included in the study on a 5-point Likert-type scale ranging from not at all to very. 
Participants also rated the perceived similarity of the possible disorder pairs on a Likert-type scale ranging from –3 (dissimilar) to 3 (similar) with 0 as a neutral midpoint.

Procedure. Participants were given as much time as they needed to complete the materials. They were, however, asked to complete the materials in one sitting if possible. The packet of materials for the study was organized such that participants completed the demographic questionnaire first, then saw the instruction sheet for the task, followed by the task itself. The three single disorders were presented first in random order, followed by the three two-way combinations and one three-way combination in random order. This arrangement ensured equal exposure to the stimuli prior to encountering the combinations while controlling for possible order effects.

Results

The conceptual combination literature has noted that when concepts are combined, sometimes one of the constituents will dominate the features of the combination. To examine if this result occurred in this study, the number of symptoms included in a combination from one of its constituents was divided by the total number of symptoms for the combination minus the number of overextensions (symptoms that were not included in either single disorder description). This procedure calculated the percentage of symptoms in the combination that were included from the constituent concept while controlling for overextensions, which were not a part of either constituent. These values are presented in Table 3. Note that the combined percentage of symptoms from the constituents is greater than 100% because some symptoms overlapped across constituent concepts. For example, a participant may have included six symptoms in the descriptions of MDD and GAD that occurred in both. The pattern revealed in Table 3 indicates that GAD was dominated by both MDD and ASPD in combinations, but MDD and ASPD exerted equal weight.
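The dominance calculation described above can be sketched for a single participant as follows. This is a minimal illustration under stated assumptions: descriptors are treated as exact labels from a fixed list, the function name `constituent_shares` is hypothetical, and the symptom sets are invented for the example rather than taken from the study's 120-descriptor list.

```python
def constituent_shares(combination, constituents):
    """Percentage of a combination's symptoms that come from each constituent.

    The denominator is the number of symptoms in the combination minus the
    overextensions (symptoms not found in any single-disorder description).
    """
    combo = set(combination)
    parents = {name: set(features) for name, features in constituents.items()}
    # Symptoms of the combination that appear in at least one constituent,
    # i.e., the combination with overextensions removed.
    inherited = combo & set().union(*parents.values())
    if not inherited:
        raise ValueError("combination shares no symptoms with its constituents")
    return {
        name: 100 * len(combo & features) / len(inherited)
        for name, features in parents.items()
    }


# Hypothetical ratings from one participant (not data from the study).
single_disorders = {
    "MDD": {"sad", "fatigue", "insomnia", "worthless"},
    "GAD": {"worry", "insomnia", "tense"},
}
mdd_plus_gad = {"sad", "fatigue", "insomnia", "worry", "irritable"}

shares = constituent_shares(mdd_plus_gad, single_disorders)
print(shares)
```

In this invented case "irritable" is an overextension and "insomnia" belongs to both constituents, so the two shares sum to more than 100%, which mirrors the overlap noted in the text.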
A dominance effect did appear to occur within this data set. One explanation for the dominance effect might be that clinicians are less familiar with GAD relative to MDD and ASPD. Participants rated their familiarity with GAD (M = 4.51, SD = 0.65) as relatively close to but still significantly different from their familiarity with MDD (M = 4.70, SD = 0.52), F(1, 69) = 7.38, p < .01, η² = .10. Participants rated their familiarity with ASPD (M = 3.58, SD = 1.12) as significantly lower than MDD, F(1, 69) = 71.28, p < .001, η² = .51, and GAD, F(1, 69) = 41.89, p < .001, η² = .38.

Table 3. Means (and SDs) for the Percentage of Each Constituent Concept Included in a Combination in Study 3

Combination         MDD             GAD             ASPD            Test
MDD + GAD           75.02 (11.95)   57.70 (14.02)   -               F(1, 69) = 55.07, p < .001, η² = .44
MDD + ASPD          60.86 (14.47)   -               57.31 (16.15)   F(1, 69) = 1.30, ns
GAD + ASPD          -               48.30 (17.59)   65.40 (17.16)   F(1, 69) = 20.28, p < .001, η² = .23
MDD + GAD + ASPD    53.17 (14.44)   37.67 (14.14)   49.13 (16.07)   F(2, 138) = 18.66, p < .001, η² = .21a

Note: MDD = major depressive disorder; GAD = generalized anxiety disorder; ASPD = antisocial personality disorder.
a Post hoc comparisons across the three were MDD to GAD F(1, 69) = 70.87, p < .001, η² = .51; MDD to ASPD F(1, 69) = 1.58, n.s.; GAD to ASPD F(1, 69) = 18.50, p < .001, η² = .21.

General Discussion

This article consists of a series of three studies: Studies 1 and 2 examined the commutativity of disorder pairs, whereas Study 3 explored the dominance effect. Both Studies 1 and 2, despite different methodologies, demonstrated a degree of noncommutativity between disorder pairs. In the first two studies, participants were asked to note the presence of symptoms of MDD, GAD, and ASPD, using a different procedure in each study. In Study 1, participants, who were clinicians representing all regions of the United States identifying with similar theoretical orientations, were asked to circle descriptors from a list that they believed to be relevant to a disorder. In Study 2, participants were asked to describe the same three disorders but were able to respond freely with an open-ended prompt instead of choosing descriptors from a list. These clinicians came from a limited geographic region but identified with a representative range of theoretical orientations. Both Studies 1 and 2 found that participants were not always consistent within themselves in commutativity across pairings, as they varied widely in their agreement regarding the pairs. In addition, it was found that participants did not follow an additive model, as the samples evidenced lower kappa and percentage agreement values than perfect commutativity for each disorder pair. However, in Study 2, participants did not differ from what might be considered reasonable human error given the reliability with which the disorders are rated in other contexts.

A strictly additive model of the combination of comorbid mental disorders would require commutativity. The order of presentation of the conditions should not affect the symptom picture. Clinicians in these two samples, however, did not operate in accordance with that assumption. Consistent with predictions from cognitive psychology models of combination (Gagne, 2002; Gagne & Shoben, 1997; Wisniewski, 1996, 1997), examples of noncommutativity occurred (but were not exclusive), and for Study 1, a compatible combination of MDD and GAD produced more commutativity than combinations including noncompatible ASPD (although this pattern was not replicated in Study 2). These studies did not explore which features of ASPD relative to GAD and MDD created the difference, and this would be an interesting area for further exploration.

Even the DSM, despite its unarticulated additive model, seems reluctant to hold commutativity for disorders to be true. In the quote provided in the introduction to this article, the authors of the DSM-IV-TR (APA, 2000) state that the “primary” disorder should be listed first, indicating that its primacy in some way differentiates that disorder from others in the clinical picture. The reason for that difference could be as simple as administrative convenience, such as listing the diagnosis that was the primary focus of the appointment. However, listing a diagnosis in the first position could also imply causal primacy, implicitly (or explicitly) stating that one disorder caused the others or occurred temporally before the others. In the prior two examples, it is still possible that the symptom picture of comorbid disorders could be additive, even though one disorder might exert dominance over the others. A third possibility is that listing one disorder first implies that the symptom picture changes by bringing certain features to the forefront and perhaps reducing the likelihood of others. Whether these effects are intentional or unintentional, the order of the diagnoses appears to have an effect on how clinicians conceptualize the symptomatology of comorbid diagnoses.

There are several possible explanations for the findings of this study. To explain the results of Studies 1 and 2, one could suggest that clinicians are straying from the additive guidelines of the DSM, which models the true “state of nature.” Another possibility is that clinicians are deviating from their knowledge gained in graduate school whereby clinical experience has led to rater drift. Rater drift occurs when “raters unintentionally redefine their scoring criteria or standards over time” (Wheeler, Haertel, & Scriven, 1992, p. 12). Rater drift in the studies described above would consist of any deviation from DSM criteria, guidelines, or both. Rater drift is a well-documented and accepted phenomenon in academic (Congdon & McQueen, 2000; Lunz & Stahl, 1990), occupational (Borman, 1977; Cascio & Valenzi, 1977; Thorndike, 1920), and clinical (McLaughlin, Ainslie, Coderre, Wright, & Violato, 2009; Ventura, Liberman, Green, Shaner, & Mintz, 1998) settings. Research has shown that even objective structured clinical examinations are not immune to rater drift, which often occurs through differential rater function over time (DRIFT; McLaughlin et al., 2009). DRIFT describes rater biases that result from rater-by-time interactions (Wolfe, Moulder, & Myford, 2001) and is one explanation for rater drift that would help explain the deviation that occurs in clinicians as they gain more clinical experience. Furthermore, the organization responsible for the dissemination of the SCID recommends that SCID group sessions with feedback and comparison be conducted periodically to minimize rater drift (SCID, 2011). If rater drift occurs in objective structured diagnostic assessments, it is reasonable to assume that rater drift will occur, and likely be more prevalent, in subjective assessments, such as diagnosing psychiatric disorders in clinical settings. Although DRIFT is one potential source for rater drift, it is not alone. Primacy or recency effects, practice or fatigue, and differential centrality or extremism are other potential sources for rater drift (Wolfe et al., 2001). In addition to the more standard sources of rater drift, other clinicians, doctors, and nurses with whom clinicians work could potentially influence clinicians’ diagnoses. Working with other professionals who did not train with the DSM could lead to deviations from its additive model. In addition, if a clinician works alone and receives little to no feedback on his or her diagnoses, it could be easier for rater drift to occur. To solve this problem, additional continuing education requirements may be considered to create a more cohesive diagnostic model across the discipline.

Another interpretation of the lack of commutativity among disorders is that clinicians are accurately modeling an aspect of psychopathology that our current diagnostic system has yet to accommodate. The findings of Studies 1 and 2 may reflect clinicians’ interpretations of changes in symptom likelihood that have resulted from their collective experience. If that is the case, further research is needed to confirm the possibility that symptom expression could vary as a function of disorder presentation, either causally or temporally. Indeed, some literature has found temporal relationships between comorbid disorders. For example, in a longitudinal study of adolescents, Gallerani, Garber, and Martin (2010) found that prior externalizing disorders (oppositional defiant disorder or conduct disorder) predicted later depression, but not vice versa. Similarly, in a large community longitudinal study in Bavaria, a previous anxiety disorder predicted the later presence of MDD, but the converse was not true (Fichter, Quadflieg, Fischer, & Kohlboeck, 2010). Other studies have examined the causal relationship between comorbid conditions. As outlined by Rhee, Hewitt, Corley, Willcutt, and Pennington (2005), there are a variety of liability models that could account for various patterns of comorbidity. Specific tests of these models have found that some comorbid conditions can be considered alternate forms of the same underlying condition (e.g., Mineka et al., 1998; Rhee et al., 2006). Alternatively, other models suggest that a disorder might have a differential effect, increasing the chance of an individual developing a second comorbid condition (e.g., Pennington, Groisser, & Welsh, 1993; Purvis & Tannock, 2000; Schachar & Tannock, 1995).
Despite these efforts, studies to date have focused only on the presence or absence of a disorder. The results of Studies 1 and 2 suggest that the effects could be more subtle and create differences in the likelihood of the expression of particular symptoms in comorbid conditions. For example, the presence of GAD might increase the chance that a person also diagnosed with MDD displays insomnia rather than hypersomnia. To our knowledge, such specific tests have not yet been conducted on comorbid pathology. If there are differences in disorder presentation based on order, it could be very valuable to clinicians and clients alike for the DSM committee to begin formulating ways to conceptualize multiple diagnoses. Because it is overwhelming and near impossible to create a diagnostic scheme for every combination of disorders in the DSM, it could potentially be very valuable to examine and record common patterns across comorbid disorders in a general manner. Some researchers have suggested that there are potential advantages to the DSM working toward a dimensional approach as opposed to its current categorical model (Clark, Watson, & Reynolds, 1995; Widiger, 1992; Widiger & Clark, 2000). It has been proposed that dimensional approaches offer a greater amount of information that is more relevant clinically, as well as the ability to rate the severity of the dysfunction instead of simply its presence or absence. One reason why incorporating severity into the DSM would be beneficial is that it has been found to be a good predictor of both comorbidity and the course and chronicity of disorders (Clark et al., 1995; Watson, 2005).

Study 3 examined the dominance effect to see if one disorder would dominate the features of a combination of disorders. This study had a larger number of participants than Studies 1 and 2, and the clinician participants represented several regions of the United States and a variety of theoretical orientations.
Participants were asked to circle descriptions of disorders (MDD, GAD, and ASPD), as in Study 1, but each disorder was rated independently instead of adding or crossing out symptoms that no longer applied when a secondary diagnosis was added. Results showed that GAD was dominated by both MDD and ASPD; however, ASPD and MDD exerted equal weight. Therefore, a dominance effect was found in Study 3. In regard to Study 3, there are several possible explanations for the dominance effect occurring. One option is that the participants are less familiar with GAD than with MDD and ASPD, which is why it was dominated. However, clinicians were least familiar with ASPD. If familiarity were the factor driving dominance, ASPD would have been dominated by MDD and GAD. Another explanation for the existence of the dominance effect in Study 3 is the prevalence of the three disorders. It is possible the two more prevalent disorders could dominate the lesser. According to the National Institute of Mental Health, MDD has a lifetime prevalence rate of 16.5% and a 12-month prevalence rate of 6.7% in adults (Kessler, Berglund, et al., 2005; Kessler, Chiu, et al., 2005). The 12-month prevalence rate of ASPD is 1%, and information has not yet been gathered on the lifetime prevalence rate of ASPD (Lenzenweger, Lane, Loranger, & Kessler, 2007). Lastly, GAD has a lifetime prevalence rate of 5.7% and a 12-month prevalence rate of 3.1% (Kessler, Berglund, et al., 2005; Kessler, Chiu, et al., 2005). Therefore, ASPD is the least prevalent of the three disorders, with GAD having a much higher prevalence rate than ASPD. Therefore, prevalence rate is not a viable explanation for why GAD was dominated. Another possible explanation of why GAD was dominated in these combinations is because clinicians consider it a less severe or less disabling disorder. 
Clinicians could have considered MDD and ASPD as more important when conceptualizing the combination because they believed those two disorders to be more functionally impairing. However, research suggests that individuals with GAD experience a similar level of impairment as those with MDD (Hunt, Slade, & Andrews, 2004; Kessler, DuPont, Berglund, & Wittchen, 1999; Stein & Heimberg, 2004). A recent study found that individuals seeking treatment for GAD reported meaningful levels of disability. Participants with GAD reported severe disability across several domains, including romantic and social relationships, home and family responsibilities, mood regulation, education, employment, and hobbies (Henning, Turk, Mennin, Fresco, & Heimberg, 2007). If the dominance effect seen in Study 3 was driven by level of impairment, MDD and ASPD should not have dominated GAD because GAD is considered to cause impairment levels similar to those of MDD. Lastly, the dominance effect seen in Study 3 could be a result of diagnostic overshadowing. The term diagnostic overshadowing was first coined by Reiss, Levitan, and Szyszko in 1982 and refers to the clinical error that occurs when several disorders are present, but the features of one condition preclude or overshadow consideration of others (Wood & Tracey, 2009). For example, studies have shown that when presented with case studies involving intellectual disability, clinicians tend to overlook agoraphobia (Levitan & Reiss, 1983), mood disturbances (Reiss et al., 1982), thought disorders (Garner, Strohmer, Langford, & Boas, 1994), avoidant personality disorder (Reiss et al., 1982), and overall severity of psychopathology (Alford & Locke, 1984).
Research outside of the intellectual disability realm has further shown that conditions such as traumatic brain injury and epilepsy overshadow psychiatric disorders (Garner et al., 1994) and learning disabilities and hearing impairment overshadow behavior disorders (Goldsmith & Schloss, 1984). It is possible that the more salient features of MDD or ASPD could have overshadowed the less salient features of GAD, not because they are more severe but because clinicians perceive them to be more severe.

Limitations

There are a number of limitations to the current set of studies. First, the sample sizes (particularly for Studies 1 and 2) are modest. However, small sample sizes are the norm in the cognitive studies on which the current methodologies are based (Chater et al., 1990, n = 10 for Study 1, n = 18 for Study 2, n = 10 for Study 3; Gagne & Shoben, 1997, n = 39; Springer & Murphy, 1992, n = 20 for Studies 1, 2, and 3). The within-subject nature of the design requires relatively few participants in order to achieve adequate statistical power. Given the smallest effect size for the planned analyses (η² = .24), the samples exceeded the minimum sample size to achieve power of .95 (n = 32; Faul, Erdfelder, Buchner, & Lang, 2009; Faul, Erdfelder, Lang, & Buchner, 2007). Further, the fact that the results replicated between Studies 1 and 2, in spite of small sample size, speaks to the robustness of the findings. Nonetheless, the size of the samples could limit the generalizability of the results, as relatively few clinicians were represented.

An additional limitation is the low return rate for each study. It could reasonably be argued that self-selection effects prevent the samples of clinicians from being representative of the population. There was no compensation for participants in these studies, such that busy professionals were asked to volunteer their valuable time to complete the study. Those who did so may be different on some important variable that led them to respond differently than the average clinician. As such, we urge the reader to take the results of this study to be preliminary until the findings can be replicated in larger studies. Nonetheless, we argue that the balance of geographic region, work setting, and theoretical orientation found within these samples decreases the likelihood that the findings are spurious or due to exogenous variables. Further, the response rates obtained in these studies are similar to those from other investigations (Falvey & Hebert, 1992; Keeley & Blashfield, 2010; Sciutto & Cantwell, 2005).

A third limitation is that the samples were restricted in several key ways. First, for Study 1 and part of Study 3, the samples came from ABCT, a professional organization heavily represented by cognitive-behavioral therapists. Second, the remaining samples came from limited geographic regions and also included large portions of cognitive-behaviorally oriented clinicians. Further, all participants were psychologists. It could be that mental health professionals of different orientations or disciplines could conceptualize comorbid conditions in a different pattern, based on their different training and practices. Future studies should include a wider variety of professionals to determine if such systematic differences exist.

A final limitation regards the nature of the data themselves. It is crucial to note that these studies represent what clinicians consider these disorder descriptions to be like. The findings are not the result of examining actual symptom patterns in a clinical population. Although we would hope that clinicians’ conceptualizations of pathology would correspond to some clinical reality, it is almost certain that there would be differences. For that reason, the results of this study should be considered an interesting hypothesis about how clinical pathology might be structured rather than a definitive report of “the state of nature.” Just as clinical experience has informed scientific hypotheses since the beginning of clinical psychology, so do we believe that the conceptualizations of the clinicians in these samples might inform an important effect that has heretofore been unstudied in the psychopathology literature.

Future directions and implications

This study revealed information about how clinicians conceptualize disorders and the role of the dominance effect. It would be useful for researchers to now take this information and examine the effect of clinicians’ disorder conceptualization strategies on the types of treatments offered to clients. How are clinicians establishing their clients’ primary diagnoses? Determining base rates for the reasons clinicians list a diagnosis first in actual practice could go a long way toward standardizing the meaning of those primary placements. Longitudinal work on the interactive course of commonly comorbid disorders (see Borsboom, Cramer, Schmittmann, Epskamp, & Waldorp, 2011, for a mathematical model approach) could elucidate if order effects are causative in nature. In other words, it might be that particular patterns exist (e.g., MDD is more likely to lead to GAD than the other way around) that could account for why order effects exist, especially if that work progressed at the level of the symptoms rather than at the level of disorders. Further, does the order of their clients’ diagnoses affect the treatment offered? If clinicians are basing their treatments (even if subconsciously) on the order of their clients’ diagnoses, are their clients receiving the appropriate treatment?
Perhaps it is appropriate to conceptualize a case on the order in which a client was diagnosed. Future work could determine if comorbidly diagnosed individuals have differential treatment outcome, depending on the order in which their issues are addressed. However, it seems as though it would be beneficial for clinicians to be aware of these biases and consistent in the way they conceptualize the cases and treatment plans of clients with multiple diagnoses. Standardization of practice is the first step toward understanding how these order effects influence conceptualization and practice. One interpretation of the findings in these studies could be that the structure of the DSM needs to be altered to better capture how clinicians are conceptualizing cases. This argument rests on the idea that the contents of the DSM are based on more than observations of psychopathology. Instead, this interpretation would conclude that the manual should be influenced by additional factors that play a role in its use, such as clinical utility (First et al., 2004; First & Westen, 2007; Verheul, 2006). Indeed, many researchers have argued (and demonstrated, in our opinion) that many sociopolitical forces exert influence on the structure of the DSM (Blashfield, 1984; Kirk & Kutchins, 1992; Sadler, 2005). Internationally, the revision of the International Classification of Diseases (11th edition) has very explicitly taken the utility of the diagnoses and organization into consideration (Reed, 2010). The results of the current studies might be one more piece to the puzzle of how the classification of psychopathology might be brought further into congruence with clinical practice. American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders (4th ed., text revision). Washington, DC: Author. Blashfield, R. K. (1984). The classification of psychopathology: Neo-Kraepelinian and quantitative approaches. New York: Plenum Press. Borman, W. 
C. (1977). Consistency of rating accuracy and rating errors in the judgment of human performance. Organizational Behavior and Human Performance, 20, 238–252. Borsboom, D., Cramer, A., Schmittmann, V., Epskamp, S., & Waldorp, L. (2011). The small world of psychopathology. PLoS ONE, 6, e27407. doi:10.1371/journal.pone.0027407 Boyd, J. H., Burke, J. D., Gruenberg, E., Holzer, C. E., Rae, D. S., George, . . . Nestadt, G. (1984). Exclusion criteria of DSM-III: A study of co-occurrence of hierarchy-free syndromes. Archives of General Psychiatry, 41, 983–989. Brown, T. A., Chorpita, B. F., & Barlow, D. H. (1998). Structural relationship among dimensions of the DSM-IV anxiety and mood disorders and dimensions of negative affect, positive affect, and autonomic arousal. Journal of Abnormal Psychology, 107, 179–192. Cascio, W. F., & Valenzi, E. R. (1977). Behaviorally anchored rating scales: Effects of education and job experience of raters and ratees. Journal of Applied Psychology, 62, 278–282. Chater, N., Lyon, K., & Myers, T. (1990). Why are conjunctive categories overextended? Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 497–508. Clark, L. A., & Watson, D. (1991). Tripartite model of anxiety and depression: Psychometric evidence and taxonomic implications. Journal of Abnormal Psychology, 100, 316–336. Clark, L. A., Watson, D., & Reynolds, S. (1995). Diagnosis and classification of psychopathology: Challenges to the current system and future directions. Annual Review of Psychology, 46, 121–153. Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37–46. Congdon, P. J., & McQueen, J. (2000). The stability of rater severity in large-scale assessment programs. Journal of Educational Measurement, 37, 163–178. Costa, P. T., & Widiger, T. A. (Eds.). (1994). Personality disorders and the five-factor model of personality. 
Washington, DC: American Psychological Association. Estes, Z. (2003). Attributive and relational processes in nominal combination. Journal of Memory and Language, 48, 304–319. Evren, C., Barut, T., Saatcioglu, O., & Cakmak, D. (2006). Axis I psychiatric comorbidity among adult inhalant dependents seeking treatment. Journal of Psychoactive Drugs, 38, 57–64. Falvey, J. E., & Hebert, D. J. (1992). Psychometric study of the clinical treatment planning simulations (CTPS) for assessing clinical judgment. Journal of Mental Health Counseling, 14, 490–507. Faul, F., Erdfelder, E., Buchner, A., & Lang, A. G. (2009). Statistical power analyses using G*Power 3.1: Tests for correlation and regression analyses. Behavior Research Methods, 41, 1149–1160. Conclusion The commutativity of mental health diagnoses is an untested assumption of an additive model of comorbid pathology. The implicit model of the DSM-IV-TR would suggest additive combination, but clinicians in these studies violated that assumption in their descriptions of three disorders. Instead, the order of presentation mattered, and certain disorders tended to dominate the symptom landscape of the comorbid pairs. Although these findings might indicate imperfect diagnostic procedures on the part of the clinicians, it is also possible that their responses reflect a new model of psychopathology that deserves further exploration. Acknowledgments We would like to thank Roger K. Blashfield for his assistance in data collection for part of this study. Declaration of Conflicting Interests The authors declared that they had no conflicts of interest with respect to their authorship or the publication of this article. References Alford, J. D., & Locke, B. J. (1984). Clinical responses to psychopathology of mentally retarded persons. American Journal of Mental Deficiency, 89, 195–197. Faul, F., Erdfelder, E., Lang, A. G., & Buchner, A. (2007). 
G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39, 175–191. Feinstein, A. R. (1970). The pre-therapeutic classification of comorbidity in chronic disease. Journal of Chronic Diseases, 23, 455–468. Fichter, M., Quadflieg, N., Fischer, U., & Kohlboeck, G. (2010). Twenty-five-year course and outcome in anxiety and depression in the Upper Bavarian Longitudinal Community Study. Acta Psychiatrica Scandinavica, 122, 75–85. First, M. B., Pincus, H. A., Levine, J. B., Williams, J. B. W., Ustun, B., & Peele, R. (2004). Clinical utility as a criterion for revising psychiatric diagnoses. American Journal of Psychiatry, 161, 946–954. First, M. B., Spitzer, R. L., Gibbon, M., Williams, J. B. W., Davies, M., Borus, J., . . . Rounsaville, B. (1995). The Structured Clinical Interview for DSM-III-R personality disorders (SCID-II). Part II: Multi-site test-retest reliability study. Journal of Personality Disorders, 9, 92–104. First, M. B., & Westen, D. (2007). Classification for clinical practice: How to make ICD and DSM better able to serve clinicians. International Review of Psychiatry, 19, 473–481. Gagne, C. L. (2002). The competition-among-relations-in-nominals theory of conceptual combination: Implications for stimulus class formation and class expansion. Journal of the Experimental Analysis of Behavior, 78, 551–565. Gagne, C. L., & Shoben, E. J. (1997). Influence of thematic relations on the comprehension of modifier-noun combinations. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23, 71–87. Gallerani, C., Garber, J., & Martin, N. (2010). The temporal relation between depression and comorbid psychopathology in adolescents at varied risk for depression. Journal of Child Psychology and Psychiatry, 51, 242–249. Garner, W. A., Strohmer, D. C., Langford, C. A., & Boas, G. J. (1994). 
Diagnostic and treatment overshadowing bias across disabilities: Are rehabilitation professionals immune? Journal of Applied Rehabilitation Counseling, 25, 33–37. Goldsmith, L., & Schloss, P. (1984). Diagnostic overshadowing among learning-disabled and hearing-impaired learners with an apparent secondary diagnosis of behavior disorders. International Journal of Partial Hospitalization, 2, 209–217. Hampton, J. A. (1988). Overextension of conjunctive concepts: Evidence for a unitary model of concept typicality and class inclusion. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 12–32. Hampton, J. A. (1997). Conceptual combination: Conjunction and negation of natural concepts. Memory & Cognition, 25, 888–909. Henning, E. R., Turk, C. L., Mennin, D. S., Fresco, D. M., & Heimberg, R. G. (2007). Impairment and quality of life in individuals with Generalized Anxiety Disorder. Depression and Anxiety, 24, 342–349. Hunt, C., Slade, T., & Andrews, G. (2004). Generalized Anxiety Disorder and Major Depressive Disorder Comorbidity in the National Survey of Mental Health and Well-Being. Depression and Anxiety, 20, 23–31. Keeley, J., & Blashfield, R. K. (2010). Clinicians’ conceptualizations of comorbid cases: A test of additive versus non-additive models. Journal of Clinical Psychology, 66, 1121–1130. Kessler, R. C., Berglund, P. A., Demler, O., Jin, R., & Walters, E. E. (2005). Lifetime prevalence and age-of-onset distributions of DSM-IV disorders in the National Comorbidity Survey Replication (NCS-R). Archives of General Psychiatry, 62, 593–602. Kessler, R. C., Chiu, W. T., Demler, O., & Walters, E. E. (2005). Prevalence, severity, and comorbidity of twelve-month DSM-IV disorders in the National Comorbidity Survey Replication (NCS-R). Archives of General Psychiatry, 62, 617–627. Kessler, R. C., DuPont, R. L., Berglund, P., & Wittchen, H. (1999). 
Impairment in pure and comorbid Generalized Anxiety Disorder and major depression at 12 months in two national surveys. American Journal of Psychiatry, 156, 1915–1923. Kessler, R. C., McGonagle, K. A., Zhao, S., Nelson, C. B., Hughes, M., Eshleman, S., . . . Kendler, K. S. (1994). Lifetime and 12-month prevalence of DSM-III-R psychiatric disorders in the United States: Results from the National Comorbidity Survey. Archives of General Psychiatry, 51, 8–19. Kirk, S., & Kutchins, H. (1992). The selling of the DSM: The rhetoric of science in psychiatry. Hawthorne, NY: Aldine de Gruyter. Krueger, R. F., Markon, K. E., Patrick, C. J., & Iacono, W. G. (2005). Externalizing psychopathology in adulthood: A dimensional-spectrum conceptualization and its implications for DSM-V. Journal of Abnormal Psychology, 114, 537–550. Lenzenweger, M. F., Lane, M. C., Loranger, A. W., & Kessler, R. C. (2007). DSM-IV personality disorders in the National Comorbidity Survey Replication. Biological Psychiatry, 62, 553–564. Levitan, G. W., & Reiss, S. (1983). Generality of diagnostic overshadowing across disciplines. Applied Research in Mental Retardation, 4, 59–64. Lunz, M. E., & Stahl, J. A. (1990). Judge consistency and severity across grading periods. Evaluation & the Health Professions, 13, 425–444. McLaughlin, K., Ainslie, M., Coderre, S., Wright, B., & Violato, C. (2009). The effect of differential rater function over time (DRIFT) on objective structured clinical examination ratings. Medical Education, 43, 989–992. Mineka, S., Watson, D., & Clark, L. A. (1998). Psychopathology: Comorbidity of anxiety and unipolar mood disorders. Annual Review of Psychology, 49, 377–412. Morey, L. C. (1991). Personality Assessment Inventory Professional Manual. Odessa, FL: Psychological Assessment Resources. Osherson, D. N., & Smith, E. E. (1981). On the adequacy of prototype theory as a theory of concepts. Cognition, 11, 35–58. Pennington, B. F., Groisser, D., & Welsh, M. C. (1993). 
Contrasting cognitive deficits in attention deficit hyperactivity disorder versus reading disability. Developmental Psychology, 29, 511–523. Purvis, K. L., & Tannock, R. (2000). Phonological processing, not inhibitory control, differentiates ADHD and reading disability. Journal of the American Academy of Child and Adolescent Psychiatry, 39, 485–494. Reed, G. (2010). Toward ICD-11: Improving the clinical utility of WHO’s international classification of mental disorders. Professional Psychology: Research and Practice, 41, 457–464. Reiss, S., Levitan, G. W., & Szyszko, J. (1982). Emotional disturbance and mental retardation: Diagnostic overshadowing. American Journal of Mental Deficiency, 86, 567–574. Rhee, S. H., Hewitt, J., Corley, R., Willcutt, E., & Pennington, B. (2005). Testing hypotheses regarding the causes of comorbidity: Examining the underlying deficits of comorbid disorders. Journal of Abnormal Psychology, 114, 346–362. Rhee, S. H., Hewitt, J., Young, S., Corley, R., Crowley, T., Neale, M., & Stallings, M. (2006). Comorbidity between alcohol dependence and illicit drug dependence in adolescents with antisocial behavior and matched controls. Drug and Alcohol Dependence, 84, 85–92. Sadler, J. (2005). Values and psychiatric diagnosis. New York: Oxford University Press. Schachar, R., & Tannock, R. (1995). Test of four hypotheses for the comorbidity of attention-deficit hyperactivity disorder and conduct disorder. Journal of the American Academy of Child and Adolescent Psychiatry, 34, 639–648. Sciutto, M. J., & Cantwell, C. (2005). Factors influencing the differential diagnosis of Asperger’s disorder and high-functioning autism. Journal of Developmental and Physical Disabilities, 17, 345–359. Springer, K., & Murphy, G. L. (1992). Feature availability in conceptual combination. Psychological Science, 3, 111–117. Stein, M. B., & Heimberg, R. G. (2004). 
Well-being and life satisfaction in Generalized Anxiety Disorder: Comparison to Major Depressive Disorder in a community sample. Journal of Affective Disorders, 79, 161–166. Storms, G., De Boeck, P., Van Mechelen, I., & Geeraerts, D. (1993). Dominance and noncommutativity effects in concept conjunctions: Extensional or intensional basis? Memory & Cognition, 21, 752–762. Storms, G., De Boeck, P., Van Mechelen, I., & Ruts, W. (1996). The dominance effect in concept conjunctions: Generality and interaction aspects. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 1266–1280. Storms, G., Ruts, W., & Vandenbroucke, A. (1998). Dominance, overextensions and the conjunction effect in different syntactic phrasings of concept conjunctions. European Journal of Cognitive Psychology, 10, 337–372. Structured Clinical Interview for DSM Disorders. (2011). Training overview: SCID training sequence of steps. Retrieved from http://www.scid4.org/training/overview.html Thorndike, E. L. (1920). A constant error in psychological ratings. Journal of Applied Psychology, 4, 25–29. Ventura, J., Liberman, R. P., Green, M. F., Shaner, A., & Mintz, J. (1998). Training and quality assurance with the Structured Clinical Interview for DSM-IV (SCID-I/P). Psychiatry Research, 79, 163–173. Verheul, R. (2006). Clinical utility of dimensional models for personality pathology. In T. Widiger, E. Simonsen, P. Sirovatka, & D. Regier (Eds.), Dimensional models of personality disorders: Refining the research agenda for DSM-V. Washington, DC: American Psychiatric Association. Watson, D. (2005). Rethinking the mood and anxiety disorders: A quantitative hierarchical model for DSM-V. Journal of Abnormal Psychology, 114, 522–536. Wheeler, P., Haertel, G., & Scriven, M. (1992). Teacher Evaluation Glossary. Kalamazoo, MI: CREATE Project, The Evaluation Center, Western Michigan University. Widiger, T. A. (1992). 
Categorical versus dimensional classification: Implications from and for research. Journal of Personality Disorders, 6, 287–300. Widiger, T. A., & Clark, L. A. (2000). Toward DSM-V and the classification of psychopathology. Psychological Bulletin, 126, 946–963. Williams, J. B. W., Gibbon, M., First, M. B., Spitzer, R. L., Davis, M., Borus, J., . . . Wittchen, H. (1992). The Structured Clinical Interview for DSM-III-R (SCID). Multi-site test-retest reliability. Archives of General Psychiatry, 49, 630–636. Wisniewski, E. J. (1996). Construal and similarity in conceptual combination. Journal of Memory and Language, 35, 434–453. Wisniewski, E. J. (1997). When concepts combine. Psychonomic Bulletin and Review, 4, 167–183. Wisniewski, E. J., & Love, B. C. (1998). Relations versus properties in conceptual combination. Journal of Memory and Language, 38, 177–202. Wolfe, E. W., Moulder, B. C., & Myford, C. M. (2001). Detecting differential rater functioning over time (DRIFT) using a Rasch multi-faceted rating scale model. Journal of Applied Measurement, 2, 256–280. Wood, D. S., & Tracey, T. J. G. (2009). A brief feedback intervention for diagnostic overshadowing. Training and Education in Professional Psychology, 3, 218–225. Zanarini, M. C., & Frankenburg, F. R. (2001). Attainment and maintenance of reliability of axis I and axis II disorders over the course of a longitudinal study. Comprehensive Psychiatry, 42, 369–374.