Exploratory Variable Clustering for Integrating Analyses

David J Pasta, Technology Assessment Group, Inc.
Lori Potter, Technology Assessment Group, Inc.

ABSTRACT

In a cross-sectional study comparing treatments for pain, we demonstrate how PROC VARCLUS was used to strengthen and simplify results and to discover the underlying structure in the data. Analysis began with standard estimation procedures (PROC GLM) applied to a large number of outcome measures. This paper traces the steps involved in developing composite scores and discusses how to apply clustering techniques to data that include single-item, scale, and subscale scores. A proposed approach to integrating and summarizing clustering results is described.

INTRODUCTION

Comparing two treatments using a large set of variables often yields results that are difficult to interpret. Using clustering techniques to reduce the variables to a manageable set of measurements often represents a fair tradeoff: an acceptable loss in the amount of variance explained in exchange for an increased understanding of the data and a clearer picture of how the variables relate.

The VARCLUS procedure in SAS was used to evaluate the validity of the pre-specified composite scales and to explore other possible scales. VARCLUS uses an iterative splitting technique to divide a group of variables into non-overlapping subgroups, each of which is approximately unidimensional. For a group of variables with a great deal of structure, a small number of clusters can explain a large proportion of the variance in the original group of variables. With variables that have little relationship, many clusters, each composed of only a handful of variables, might be needed to explain the same proportion of variance. The first k principal components will always explain at least as much variance as k cluster scores. However, the cluster scores are easier to describe and to interpret, and often the reduction in variance explained compared with principal components is not dramatic.

DATA ANALYSIS

In a cross-sectional, non-randomized study of pain management, 504 cancer patients were surveyed regarding their current mode of pain medication delivery and its impact on their quality of life. Two treatment groups (209 Group 1 versus 295 Group 2) were compared on 36 summary outcome measures. Measurements were analyzed as single question items and as scales and subscales (weighted combinations of items); the total number of original items represented is 100 (Table 1).

TABLE 1
(Table body not recoverable from the source. Surviving row: Reactions Towards Medication, Q12 a, b, TAG.)
Abbreviations:
FACT-G: Functional Assessment of Cancer Therapy-General
BPI: Brief Pain Index
TAG: Technology Assessment Group
MOS: Medical Outcomes Study
MSAS: Memorial Symptom Assessment Scale

Analysis of covariance (ANCOVA), using PROC GLM, was performed, controlling for treatment site, demographics, and cancer stage. Initial differences found between the two treatment groups were in outcome measures that, while not specifically linked to pain medication, were indicative of the patient's overall physical condition. Accordingly, we repeated the analysis using physical functioning as a covariate instead of an outcome measure. Main-effect-by-treatment interactions were tested.

Group 1 patients had lower social, emotional, and functional well-being. There were strong differences between the two treatments on side effects. Group 1 patients experienced fewer medication side effects (p=.004) and less impact from side effects (p<.001). Willingness to continue the medication was greater for men in Group 1 than for men in Group 2 (p<.01); no difference was found among women. Gender and age affected whether or not the medication met a patient's expectations. Group 1 men had higher scores, and older Group 2 patients were disappointed in how well their medication met expectations.
For the various pain measures and sleep measures, and for the symptom scores and subscales, there were few differences, and some presented interpretation difficulties. Results are summarized in Table 2.

TABLE 2
(Table body not recoverable from the source. Surviving fragments: "·2.07 (0.039)", "-0.55 (0.58)", and the footnote: omitting response of "!Oh-1" (never taken any pain med).)

Variable Clustering

Exploratory evaluation of the outcome measures to determine which variables could be combined into scales revealed a fairly simple structure. We found that six underlying scales explained a substantial proportion of the variability. The cluster scores themselves can be the first principal component of the variables in the cluster, or they can be the average of the raw (or standardized) variables. Where it was reasonable, we used the average of the raw variables in order to keep the composites as simple as possible.

As a preliminary step, we applied PROC VARCLUS to large groups of variables, including individual items, predefined scales, and predefined subscales of those scales. This meant that some variables might be included two or more times: as individual items and as part of one or more of the scales and subscales. Although such multiple inclusion of variables would horrify purists, and would not be recommended for confirmatory clustering, in practice it was very useful. When the dust settled from the clustering, we could see which variables ended up in the same cluster as the overall scale score, and which subscales wound up together and which apart. This allowed us to define six broad scales.

The six scales are summarized as follows. The Functioning scale consists of the social, emotional, and functional scores from the FACT. (The physical functioning score from the FACT was not included among the outcome measures because it was being used as a control variable.) The Symptom scale consists of the 24 frequency-times-bothersomeness MSAS items plus the 8 bothersomeness items. The Pain scale consists of the three pain items from the BPI (pain in the last week, pain on average, present pain) and the three items on pain disruption during sleep. The Sleep scale consists of five MOS sleep items (adequacy (2 items), disturbance-initiation, maintenance, and respiratory problems), omitting the MOS sleep somnolence item, "Have trouble staying awake during the day". The Satisfaction scale consists of the pain relief, ease of use, delivery satisfaction, willingness to continue medication, would recommend, met expectations, and similar relief items. Finally, the Side Effects scale consists of the frequency and impact of side effects items.

SIDE EFFECTS

A "side effects" composite was created from the simple average of Q10b, "How often do you have side effects with <medication>?", and Q10c, "How bothered are you by the side effects with <medication>?" These variables had a correlation of .85 and a Cronbach's alpha of .92. They did not correlate highly enough with any other variables to warrant extending the composite beyond these two items.

FACT

Exploratory variable clustering was performed on the items comprising the FACT scales. This clustering, using the VARCLUS procedure in SAS, tended to confirm the scales provided by the developers. The deviations are potentially instructive. Three items were assigned to different scales. Question 2a, "I feel distant from my friends," was more closely associated with the Physical Well-being scale than the Social Well-being scale. Question 3b, "I am proud of how I'm coping with my illness," was found to be more tied to Social than to Emotional Well-being. Question 4e, "I am sleeping well," was found to fit better with the Emotional Well-being scale than the Functional Well-being scale. Finally, one item, Question 2h, "I am satisfied with my sex life," was found to have relatively low association with the other items in the Social Well-being scale.

A comparison of the variation explained by a simple average of the items in the scale with the variation explained by the first principal component revealed that the simple average explained at least 93% as much variation as the first principal component on each of the four scales. That is, at most 7% of the explanatory power of a composite was lost by making the simplifying assumption of equal weights for the individual variables. This was determined by comparing the variance explained by the first principal component with the variance explained by the first centroid component (by specifying the CENTROID and COV options in PROC VARCLUS).

The four FACT scales generally explained from 42% to 50% of the variance of the individual items, and had internal consistency (as measured by Cronbach's alpha) of .75 to .78. However, the Social score explained only 28% of the variance in the seven items comprising the scale and had a Cronbach's alpha of .58. Creating a single composite of the four FACT scales was supported by the finding that 55% of the variance of the four scales was explained by the average of those scales. This single FACT composite explained about 23% of the variance in the original 26 items.

SYMPTOMS

Symptoms were assessed by 24 items in Q14 and an additional 8 items in Q15; these items comprise the MSAS. For Q14, patients indicated how often they had a symptom and how much it distressed or bothered them. We created individual combinations of frequency and bothersomeness as the product of the two responses; these scores were then analyzed. For Q15, only the bothersomeness of the symptom was elicited, and these scores were used directly.

The MSAS scales include some overlap, and preliminary analyses of those scales were not especially revealing. Therefore, exploratory clustering was undertaken using VARCLUS. We conjectured that some of the clusters of symptoms would be related to the cancer itself, some to chemotherapy and other cancer treatments, and some to the pain medication.
We found that the symptoms were not easy to classify and the empirical clusters were not entirely of one type or another. In the assignment process, we found that, due in part to the wide range of cancer types, most of the 32 items could be attributed to the disease itself, but that only some were likely to be drug- or treatment-related.

Nonetheless, the symptom clusters do have different emphases. Cluster 1 includes symptoms that are primarily psychological. They are associated with the disease and, to a lesser extent, with either treatment or pain medication. Only 2 of the 12 symptoms in this cluster are specifically physical symptoms (mouth sores and constipation), and both have low cluster correlation scores. Of the remaining 10 symptoms, only one (feeling drowsy) is not related to cancer. Cluster 2 includes symptoms that appear to be primarily gastrointestinal and chiefly treatment-related. Cluster 3 is composed of symptoms that are physical and mostly disease-related. Also, while we thought that many of the symptoms could be linked to treatment or drugs, these are not the primary side effects one might associate with either. Cluster 4 contains 4 items, mostly related to appearance, and all attributable to treatment.

SLEEP

Sleep was assessed by 6 sleep measures from the MOS, Q13a through Q13f, and by 3 measures of sleep pain disruption developed specifically for this study. Preliminary results revealed that the sleep pain disruption items functioned more as pain items than as sleep items. Accordingly, those items are included with pain. Of the 6 MOS sleep items, Q13d, "Have trouble staying awake during the day?", was found to have a low correlation (.24) with the other items. Further, the internal consistency was higher without the item (.80) than with it (.77). The other 5 items relate to quality and quantity of sleep at night, and therefore represent a reasonable scale of "nighttime sleep." Accordingly, we created a sleep composite consisting of the remaining five questions: Q13a, b, c, e, and f.

PAIN

Initial analysis showed that the three pain measures from the BPI, Q5 "... your pain at its worst in the past week," Q6 "... your pain on average," and Q7 "pain you have right now," formed a consistent scale (Cronbach's alpha .87). As noted above, exploratory variable clustering revealed that the sleep pain disruption variables were closely associated with the BPI pain measures. Those items, all developed specifically for this study, were Q13g "Wake up during the night to take your pain medication?", Q13h "Feel your sleep was interrupted because of your pain?", and Q13i "Feel frustrated at having your sleep interrupted due to your pain?"

SATISFACTION

Several of the original questionnaire items pertaining to medication satisfaction revealed a close association in the initial exploratory variable clustering. Internal consistency was strong (.78) among these items: pain relief (Q9), ease of use (Q10a), satisfaction with the treatment delivery system (Q10d), whether the patient would recommend the medication (Q10e), whether the medication met the patient's expectations (Q10f), whether the medication provided similar relief to previous pain medications (Q10g), and whether the patient would be willing to continue using the current medication (Q11). Initial clustering was performed on these items as they were originally measured, each on a different scale, and results were poor. These items were therefore all rescaled (0-100), and the composite Satisfaction score was created as a simple average of the seven items.

TABLE 3
(Table body not recoverable from the source. Surviving fragments: "Initiation, Maintenance, Respiratory problems"; "Pain".)

Results

We repeated the ANCOVA using these six measures to simplify and unify the results.
We found that the results confirmed and strengthened the conclusions based on the initial analysis of 36 items. There were no significant differences between Group 1 patients and Group 2 patients, after controlling for site and the demographic variables, on Symptoms or Sleep. The Group 1 patients reported lower scores on the Functioning scale. Men receiving the Group 1 treatment reported higher Satisfaction than other men, but no significant difference emerged for women. The most dramatic difference was found for frequency and impact of Side Effects, where Group 1 patients reported much less difficulty. Finally, the findings for Pain included an interaction with cancer stage. The Group 1 patients reported higher pain for Stages 1 and 2, lower pain for Stage 3, and similar pain for Stage 4. It should be noted that the number of patients with Stage 1 or Stage 2 cancers is small (roughly eight percent).

CONCLUSION

We found that applying variable clustering to the data allowed us to develop a more coherent and straightforward presentation to our client. Instead of having to analyze 32 separate symptom measures, for instance, we could gauge the impact of the treatment on symptoms overall. For clients who are statistically unsophisticated, details of the exploration need not be explained; we have found that most clients are comfortable with the basic description of the process. Most clients, regardless of statistical background, may grasp results more easily when dealing with a few integrated measures rather than a large and varied group of variables.
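
The two checks used when building the composites, internal consistency and the comparison of an equal-weight average with the first principal component, can be illustrated with a short Python sketch. This is not the authors' SAS code. The function names are hypothetical, the four-item correlation matrix is invented for illustration, and two assumptions are made: that alpha is computed in its standardized form from the item correlations, and that "variance explained" by a component means the sum of squared correlations between the standardized items and that component. Only the .85 correlation for the two Side Effects items is taken from the paper.

```python
import numpy as np


def standardized_alpha(corr):
    """Standardized Cronbach's alpha from an item correlation matrix."""
    corr = np.asarray(corr, dtype=float)
    k = corr.shape[0]
    # Mean of the off-diagonal correlations.
    mean_r = (corr.sum() - np.trace(corr)) / (k * (k - 1))
    return float(k * mean_r / (1.0 + (k - 1) * mean_r))


def variance_explained(corr, weights):
    """Share of total standardized variance explained by the component w'x.

    Assumption: each standardized item x_j contributes corr(x_j, w'x)^2,
    which sums to w'R^2 w / (w'R w); dividing by the item count gives a share.
    """
    corr = np.asarray(corr, dtype=float)
    w = np.asarray(weights, dtype=float)
    rw = corr @ w
    return float((rw @ rw) / (w @ rw)) / corr.shape[0]


def average_vs_first_pc(corr):
    """Return (share for equal-weight average, share for first PC, ratio)."""
    corr = np.asarray(corr, dtype=float)
    k = corr.shape[0]
    # eigh returns eigenvalues in ascending order, so the last column
    # of eigvecs is the first principal component's weight vector.
    _, eigvecs = np.linalg.eigh(corr)
    pc = variance_explained(corr, eigvecs[:, -1])
    avg = variance_explained(corr, np.ones(k))
    return avg, pc, avg / pc


# Two items with the correlation reported for the Side Effects composite.
side_effects = np.array([[1.0, 0.85],
                         [0.85, 1.0]])
print(round(standardized_alpha(side_effects), 2))  # 0.92

# Invented four-item correlation matrix, for illustration only.
toy = np.array([[1.0, 0.6, 0.5, 0.3],
                [0.6, 1.0, 0.4, 0.2],
                [0.5, 0.4, 1.0, 0.3],
                [0.3, 0.2, 0.3, 1.0]])
avg, pc, ratio = average_vs_first_pc(toy)
print(f"average: {avg:.3f}  first PC: {pc:.3f}  ratio: {ratio:.3f}")
```

Under these assumptions the two-item case gives 2(.85)/(1 + .85), which rounds to the .92 reported for the Side Effects composite. The ratio returned by average_vs_first_pc can never exceed 1, because the first principal component is the weighting that maximizes this measure of variance explained; a ratio close to 1 is what would justify equal weights.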