Clinical validation of the cingulate island sign visual rating scale in dementia with Lewy bodies

Introduction: The cingulate island sign (CIS) is a metabolic pattern on [ 18 F]fluorodeoxyglucose ([ 18 F]FDG) positron emission tomography (PET) associated with dementia with Lewy bodies (DLB). The aim of this study was to validate the visual CIS rating scale (CISRs) for the diagnosis of DLB and to explore the clinical correlates. Methods: This single-center study included 166 DLB patients and 161 patients with Alzheimer's disease (AD). The CIS on [ 18 F]FDG-PET scans was rated using the CISRs independently by three blinded raters. Results: The optimal cut-off to differentiate DLB from AD was a CISRs score ≥ 1 (sensitivity = 66%, specificity = 84%), whereas a CISRs score ≥ 2 (sensitivity = 58%, specificity = 92%) was optimal to differentiate amyloid positive DLB (n = 43 (82.7%)) and AD. To identify DLB with abnormal (n = 53 (72.6%)) versus normal (n = 20 (27.4%)) dopamine transporter imaging, a CISRs cut-off of 4 had a specificity of 95%. DLB with a CISRs score of 4 performed significantly better in tests on free verbal recall and picture based cued recall, but worse on processing speed compared to DLB with a CISRs score of 0. Conclusion: This study confirms the CISRs as a valid marker for the diagnosis of DLB with a high specificity and a lower, but acceptable, sensitivity. Concomitant AD pathology does not influence diagnostic accuracy of the CISRs. In DLB patients, presence of CIS is associated with relatively preserved memory function and impaired processing speed.


Introduction
Dementia with Lewy bodies (DLB) is clinically characterized by progressive dementia and other core features such as cognitive fluctuation, parkinsonism, rapid eye movement sleep behavior disorder, and recurrent visual hallucinations [1]. Accumulation of amyloid beta (Aβ), a key protein in the pathophysiology of Alzheimer's disease (AD), can be identified in >50% of patients with DLB. This may influence the pattern of hypometabolism in the brain and the prognosis and, thus, poses diagnostic challenges [2][3][4][5]. Distinguishing AD from DLB is important to provide patients with the optimal information and treatment, including decisions on the use of antipsychotics, to which patients with DLB have increased sensitivity [6].
The presence of the cingulate island sign (CIS) on [ 18 F]fluorodeoxyglucose (FDG) positron emission tomography (PET) has been demonstrated as a valuable diagnostic marker that is highly specific for DLB and is a supportive biomarker in the diagnostic criteria for DLB [1]. CIS is defined as preserved metabolism in the posterior cingulate cortex (PCC) relative to reduced metabolism in precuneus and cuneus [1,7].
Previous studies have primarily assessed CIS quantitatively, but such an approach can be both time consuming and require specific software or standardized acquisition of the functional and structural scan [8][9][10]. Visual assessment of CIS has shown promise and is equal to or may even outperform quantitative methods for distinguishing DLB from AD [7,11]. However, until recently a standardized approach for the visual rating of CIS has been lacking. To this end, we previously developed a standardized visual CIS rating scale (CISRs) providing good diagnostic accuracy [11].
Whereas mainly the diagnostic value of CIS has been investigated previously, the clinical features and other biomarkers associated with CIS remain largely uninvestigated. The PCC is involved in networks, subserving cognitive functions such as processing speed. Elucidating the correlates of the CIS in terms of cognitive profile, association to AD biomarkers, and dopamine transporter (DaT) imaging would advance the understanding of the presence of CIS and may improve diagnostic accuracy.
The aims of this study were to carry out a clinical validation of the diagnostic accuracy of the CISRs to differentiate DLB from AD in a larger, independent cohort from our previous study in which we developed the CISRs and further explore potential clinical correlates including neuropsychological features, DaT imaging, and the presence of concomitant AD pathology with Aβ in patients with DLB.

Study design
This is a single-center, retrospective, case-control study. We included patients who had consecutively undergone diagnostic evaluation and received a diagnosis of DLB between February 1st 2017 and February 1st 2021, and a matched group of patients with AD who had received a diagnosis in the same time period. AD patients were matched on age, Mini-Mental State Examination (MMSE) score (to match the groups on level of cognitive impairment and disease severity), date of diagnosis, and sex. Patients were included from the Memory Clinic, Danish Dementia Research Centre, Department of Neurology, Rigshospitalet. The Memory Clinic, a tertiary clinic, is a highly specialized unit that provides diagnostic evaluations and management of patients with dementia as well as rare neurodegenerative diseases. Patients are referred from general practitioners, other hospitals, and other specialists from within and outside the catchment area. Diagnostic assessment includes history taking, physical examination, assessment of cognition and function, and a structural scan (magnetic resonance imaging (MRI) or computerized tomography (CT)). Supplementary investigations with imaging biomarkers, lumbar puncture, or neuropsychological tests were conducted if necessary. [ 18 F]FDG-PET to image regional cerebral glucose consumption as a marker of neurodegeneration was performed before or in immediate conjunction with other investigations [12]. [ 11 C]Pittsburgh Compound-B ([ 11 C]PiB) PET or lumbar puncture was used to estimate cerebral Aβ deposition if AD was suspected [13]. Dopamine imaging using [ 123 I]FP-CIT SPECT or [ 18 F]FE-PE2I to assess presynaptic dopamine integrity was performed in cases with uncertainty of the diagnosis [14]. Ancillary tests were never conducted on a single day and were usually conducted within 30 days of first clinic visit.
All patients are followed up at least one year after time of diagnosis as a standard clinical routine and no diagnoses were changed at the one-year follow-up. An overview of supplementary investigations performed in the diagnostic workup of the included patients is provided in Table A.1. Inclusion criteria were patients with a clinical diagnosis of DLB [1], AD [15], mild cognitive impairment (MCI) due to DLB [16], or MCI due to AD [17] who had undergone a [ 18 F]FDG-PET scan and a structural scan of the brain (MRI or CT). The one-year rule regarding onset of parkinsonism and cognitive impairment was applied to separate DLB from Parkinson's disease dementia. Patients fulfilling clinical criteria of both AD and DLB (n = 2), and patients with [ 18 F]FDG-PET scan performed >6 months from clinical diagnosis (n = 31) were excluded (Fig. 1).

[ 18 F]FDG-PET imaging acquisition and analysis
[ 18 F]FDG-PET scans were acquired in multiple hospitals with different scanners and reconstruction methods. The majority of scans (93%) were performed at the primary center (Rigshospitalet) either on a hybrid PET/MR (n = 130) or PET/CT (n = 197) system. Remaining scans were performed at one of four different hospitals (0.3% to 3% at each).
According to international practice guidelines, 200-300 MBq of the radiopharmaceutical was administered intravenously, and the static PET scan, which lasted 5-15 min, was performed 40-60 min after the injection of [ 18 F]FDG, with either MRI acquired simultaneously on a PET/MRI scanner or in combination with a diagnostic CT scan or low-dose CT scan on a PET/CT scanner [12]. The primary center used OSEM iterative 3D PET reconstruction with a 3 mm Gaussian filter and attenuation correction performed either with CT/low-dose CT or synthetic CT based on MRI [18]. Reconstruction parameters are provided in Appendix B.

Scoring with the CISRs
Scoring of CIS with the CISRs was performed using the [ 18 F]FDG-PET scans superimposed on either CT or MRI with the standardized reading approach described previously using a clinical PET platform (Siemens Syngo.via, MI Neurology, Erlangen, Germany) [11]. In short, the presence of CIS in each hemisphere was evaluated and given a score: 0 = absent, 1 = intermediate and 2 = present, based on the degree of hypometabolism in PCC relative to the degree of hypometabolism in precuneus and cuneus while taking atrophy and ischemic lesions into consideration. The visual CIS scores of the two hemispheres were summed to give the final visual CISRs score (0 to 4: 0 = no CIS and 4 = definite CIS). Fig. 2 shows examples of CISRs scores.
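The summation rule can be written out as a minimal Python sketch. The names `HemisphereCIS` and `cisrs_score` are hypothetical and are introduced here only for illustration; the visual judgment that produces each hemisphere rating is not modeled.

```python
from enum import IntEnum


class HemisphereCIS(IntEnum):
    """Per-hemisphere visual rating as described in the text."""
    ABSENT = 0
    INTERMEDIATE = 1
    PRESENT = 2


def cisrs_score(left: int, right: int) -> int:
    """Sum the two hemisphere ratings into the final CISRs score (0 to 4).

    Passing a value outside 0..2 raises ValueError via the enum lookup.
    """
    return int(HemisphereCIS(left)) + int(HemisphereCIS(right))
```

For example, an intermediate CIS on one side and a present CIS on the other gives `cisrs_score(1, 2) == 3`.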
Rating of scans involved three raters. For the purpose of the study, one investigator (LF) was trained in the standardized reading approach for the CISRs by completing three training data sets (total n = 128 scans). Subsequently, LF rated all scans included in the study (n = 327). All scans with an initial CISRs score ≥ 1 rated by LF (n = 159) were then reevaluated by the two experienced raters (nuclear medicine physicians, IL and OH). In cases of disagreement, the two experienced raters would assess and discuss each scan to reach consensus collaboratively. Interrater agreement of the CISRs score of 30 random patients between LF and the two highly experienced raters was found to be substantial with Cohen's weighted kappa of 0.68 and 0.71, respectively, and similar to that observed between the two experienced raters (Cohen's weighted kappa of 0.65). Reading of single cases was blinded to the patient's diagnoses and other clinical data.
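To make the agreement statistic concrete, the following is a minimal Python sketch of Cohen's weighted kappa for two raters on the 0 to 4 CISRs scale. The text does not state which weighting scheme was used, so linear weights are an assumption here, and the function name `weighted_kappa` is hypothetical.

```python
from collections import Counter
from typing import Sequence


def weighted_kappa(r1: Sequence[int], r2: Sequence[int], n_categories: int = 5) -> float:
    """Cohen's weighted kappa with linear disagreement weights (assumed scheme).

    r1 and r2 hold the scores (0 .. n_categories - 1) given by two raters
    to the same scans, in the same order.
    """
    if len(r1) != len(r2) or not r1:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(r1)
    k = n_categories
    observed = Counter(zip(r1, r2))       # joint counts of (rater 1, rater 2)
    marg1, marg2 = Counter(r1), Counter(r2)  # marginal counts per rater
    obs_disagreement = 0.0
    exp_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)      # 0 on the diagonal, 1 at maximal distance
            obs_disagreement += w * observed[(i, j)] / n
            exp_disagreement += w * (marg1[i] / n) * (marg2[j] / n)
    if exp_disagreement == 0:
        raise ValueError("kappa is undefined when both raters use one identical category")
    return 1.0 - obs_disagreement / exp_disagreement
```

Identical ratings give a kappa of 1, and agreement at chance level gives a value near 0.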

Neuropsychological tests
Global cognitive function was assessed by the MMSE [19]. Additionally, memory was investigated by three memory tests applied in the subgroup of patients in whom these tests were deemed necessary at the time of diagnosis (Table A.1). The delayed recall memory test from the Addenbrooke's Cognitive Examination (ACE) was used to examine free verbal recall [20]. Picture based cued recall was assessed with the Category Cued Memory Test (CCMT-48) immediate recall [21]. Nonverbal memory was assessed using the three-minute delayed recall of Rey Complex Figure Test (RCFT) [22]. The Symbol Digit Modalities Test (SDMT) was used as a measure of processing speed [23]. For all tests, a higher score reflects a better performance.
A [ 11 C]PiB-PET scan was performed in patients with contraindications to or unwilling to undergo lumbar puncture, or when CSF biomarkers were inconclusive, according to international recommendations [26]. The scans were performed and analyzed according to international guidelines [13]. A positive [ 11 C]PiB-PET scan had two or more brain areas, each larger than a single cortical gyrus, in which gray-white matter contrast was reduced or absent, or one or more areas in which gray matter uptake was intense and clearly exceeded that in adjacent white matter. Patients with Aβ < 1000 pg/mL or an abnormal [ 11 C]PiB-PET scan were considered amyloid positive (Aβ+). In one case of discordant results between the two biomarkers, the [ 11 C]PiB-PET scan result was used.
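The classification rule above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' procedure: the function and parameter names are hypothetical, and letting the PET result take precedence whenever it is available is an interpretation of the single discordant case described in the text.

```python
from typing import Optional

CSF_ABETA_CUTOFF_PG_ML = 1000.0  # threshold stated in the text


def amyloid_positive(csf_abeta_pg_ml: Optional[float],
                     pib_abnormal: Optional[bool]) -> Optional[bool]:
    """Return True for Aβ+, False for Aβ-, None if no biomarker is available.

    Assumption: when both biomarkers exist, the PiB-PET read decides,
    which reproduces the handling of the discordant case in the text.
    """
    if pib_abnormal is not None:
        return pib_abnormal
    if csf_abeta_pg_ml is not None:
        return csf_abeta_pg_ml < CSF_ABETA_CUTOFF_PG_ML
    return None
```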

Dopamine transporter (DaT) imaging
Assessment of dopaminergic function in DLB patients was based on dopamine imaging using [ 123 I]FP-CIT SPECT or [ 18 F]FE-PE2I performed as a part of routine work-up. Scans were performed and categorized according to international guidelines [14] based primarily on visual analysis supported by semiquantitative estimations of the specific binding ratios and putamen/caudate ratios relative to age matched healthy controls. In the abnormal scans, structural CT or MRI at the time of imaging was reviewed for evidence of vascular basis of reduced tracer uptake.
Patients with dopamine imaging using [ 123 I]FP-CIT SPECT or [ 18 F]FE-PE2I for which an experienced nuclear medicine physician had found evidence for definite presynaptic degenerative dopamine deficiency were considered positive (DaT+), whereas patients with a normal scan were considered negative (DaT-).

Fig. 2. The presence of cingulate island sign (CIS) in each hemisphere was evaluated and given a score (0 = absent, 1 = intermediate and 2 = present) based on the degree of hypometabolism in the posterior cingulate cortex relative to the degree of hypometabolism in precuneus and cuneus, while taking atrophy and ischemic lesions into consideration. A score of CISRs = 0 (top row) is obtained if CIS is absent bilaterally, because the metabolism of posterior cingulate cortex is equal to or lower than the precuneus and cuneus regardless of the overall level. In CISRs = 1 there is an intermediate CIS unilaterally, e.g., because of an impression of relative hypometabolism in the posterior cingulate cortex, but less hypometabolism / more preserved than in precuneus and cuneus, and CIS absent on the other side. CISRs = 2 is either intermediate CIS bilaterally, or CIS present on one side, but absent CIS on the other. CISRs = 3 is intermediate CIS on one side and CIS present on the other, while CISRs = 4 has CIS present bilaterally.

Clinical validation of the CISRs
None of the patients from the development of the CISRs were included in the present evaluation [11]. To assess the diagnostic accuracy of the CISRs we calculated sensitivity, specificity, and balanced accuracy ((sensitivity + specificity)/2) at each cut-off. The cut-off with the maximal balanced accuracy was considered the optimal cut-off. As a sensitivity analysis, cut-offs were also determined using Youden's index. Group comparisons were made for AD versus DLB, AD versus DLB Aβ+, and DLB DaT+ versus DLB DaT-. No comparisons between DLB Aβ+ and DLB Aβ- were conducted due to a low number of DLB Aβ- patients (n = 9).
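The cut-off analysis can be sketched in Python as follows. The function names and the toy scores in the usage note are hypothetical and do not come from the study; a score at or above the cut-off is treated as a positive test for the target group.

```python
from typing import Dict, List, Sequence


def cutoff_metrics(scores_target: Sequence[int], scores_comparison: Sequence[int],
                   max_score: int = 4) -> List[Dict[str, float]]:
    """Sensitivity, specificity, balanced accuracy and Youden's J per cut-off.

    scores_target: CISRs scores of the group to detect (e.g. DLB).
    scores_comparison: CISRs scores of the comparison group (e.g. AD).
    """
    rows = []
    for cutoff in range(1, max_score + 1):
        true_pos = sum(s >= cutoff for s in scores_target)
        true_neg = sum(s < cutoff for s in scores_comparison)
        sens = true_pos / len(scores_target)
        spec = true_neg / len(scores_comparison)
        rows.append({
            "cutoff": cutoff,
            "sensitivity": sens,
            "specificity": spec,
            "balanced_accuracy": (sens + spec) / 2,
            "youden_j": sens + spec - 1,
        })
    return rows


def optimal_cutoff(rows: List[Dict[str, float]]) -> int:
    """Cut-off with maximal balanced accuracy (the lowest cut-off wins ties)."""
    return int(max(rows, key=lambda r: r["balanced_accuracy"])["cutoff"])
```

With the toy input `cutoff_metrics([0, 2, 2, 3, 4, 4], [0, 0, 0, 1, 1, 2])`, `optimal_cutoff` returns 2. Because Youden's J equals 2 × balanced accuracy − 1, both criteria rank the cut-offs in the same order.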

Assessment of education index score
An education index score was calculated to reflect the extent of patient education [27]. The number of school years (ranging from 7 to 12 years) was summed with an occupational training index (ranging from 1 to 5), giving a maximal score of 17.
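As a minimal sketch, the index is a simple sum of the two components. The function name is hypothetical, and rejecting out-of-range inputs is an assumption, since the text does not say how values outside the stated ranges were handled.

```python
def education_index(school_years: int, occupational_training_index: int) -> int:
    """Education index = school years (7-12) + occupational training index (1-5)."""
    if not 7 <= school_years <= 12:
        raise ValueError("school_years is expected to lie in 7..12")
    if not 1 <= occupational_training_index <= 5:
        raise ValueError("occupational_training_index is expected to lie in 1..5")
    return school_years + occupational_training_index
```

The extremes are `education_index(7, 1) == 8` and `education_index(12, 5) == 17`.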

Statistical analysis
Student's t-test or Welch's t-test was used for the parametric continuous data when comparing two groups and Mann-Whitney U test was used for nonparametric data. We compared multiple groups using Kruskal-Wallis test. For categorical data, Chi-squared test or Fisher's exact test was applied to test the significance. Cohen's weighted kappa was used to estimate the interrater agreement of the CISRs score between two raters. Bootstrapping with 1000 iterations was used to obtain the 95% confidence intervals (95% CI) of sensitivity, specificity, and balanced accuracy. Spearman's rank correlation or multiple linear regression with age, sex, MMSE, and education index score as covariates was used to examine the relationship between CISRs score and neuropsychological test scores. R version 4.1.2 for Windows (64-bit) was used to perform all statistical analyses. Significance level was set at 0.05 (two-sided).
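The bootstrap step can be illustrated with a short Python sketch for a single proportion such as sensitivity. The study used R and does not state which interval type was computed, so the percentile interval, the function name, and the fixed seed are assumptions made only for this illustration.

```python
import random
from typing import Sequence, Tuple


def bootstrap_ci(outcomes: Sequence[int], n_iter: int = 1000,
                 alpha: float = 0.05, seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap confidence interval for a proportion.

    outcomes: one entry per patient, 1 = correctly classified, 0 = not.
    For sensitivity, pass only the patients of the target group.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(outcomes)
    # Resample patients with replacement and recompute the proportion each time.
    stats = sorted(sum(rng.choices(outcomes, k=n)) / n for _ in range(n_iter))
    lower = stats[int((alpha / 2) * n_iter)]
    upper = stats[int((1 - alpha / 2) * n_iter) - 1]
    return lower, upper
```

A call such as `bootstrap_ci([1] * 66 + [0] * 34)` returns an interval around the observed proportion of 0.66. Balanced accuracy would need both diagnostic groups resampled in each iteration, which this sketch does not cover.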

Baseline characteristics
The baseline demographic characteristics and neuropsychological test scores are displayed in Table 1. No significant differences were observed in age, education or MMSE between the two main diagnostic groups. DLB patients had significantly higher scores on ACE delayed recall than AD patients, whereas AD patients had significantly higher scores on the SDMT. For further comparisons between DLB Aβ+ and AD, and between DLB DaT+ and DLB DaT-, see Table 1.
Table 2 shows the diagnostic ability of the CISRs to differentiate DLB from AD and DLB Aβ+ from AD. The optimal cut-off based on balanced accuracy to differentiate all DLB patients from AD patients was a CISRs score ≥ 1 with a sensitivity of 66% and a specificity of 84% yielding a balanced accuracy of 75%. A CISRs score of 3 or higher was highly specific for DLB with a specificity of 96%. The optimal cut-off to differentiate DLB Aβ+ from AD was a CISRs score ≥ 2 with a sensitivity of 58% and a specificity of 92% yielding a balanced accuracy of 75%. No significant difference was found when calculating the diagnostic accuracy without MCI patients (data not shown). Diagnostic accuracies did not vary when scans acquired using different scanner types were analyzed separately. Distribution of CIS scores in each group is presented in Fig. 3. The percentage of DLB relative to AD within a specific CISRs score group in the total study population was 29.7% in CISRs = 0, 55.2% in CISRs = 1, 81.3% in CISRs = 2, 87.5% in CISRs = 3, and 92.9% in CISRs = 4. Regarding the separation of DLB DaT+ from DLB DaT-, a cut-off of 4 on the CISRs had a specificity of 95% and a balanced accuracy of 64% (Table 3).

Neuropsychological tests
The scores of the neuropsychological tests as divided by CISRs scores are shown in Table 4. Kruskal-Wallis test showed a significant difference of test scores in ACE delayed recall (χ² = 12.9, p = 0.01), CCMT-48 immediate recall (χ² = 12.8, p = 0.01), and SDMT (χ² = 16.3, p = 0.003) when comparing all patients across CISRs groups. Multiple linear regression performed on all patients pooled (with CISRs score 0 as reference group and with age, sex, MMSE, and education index score as covariates) showed that patients with CISRs score 4 had significantly higher scores in CCMT-48 immediate recall and significantly lower scores in SDMT, but no significant difference in RCFT delayed recall. CCMT-48 immediate recall and SDMT were also significantly different between CISRs score 4 and CISRs score 0 when only DLB patients were examined. A positive correlation between CISRs score and ACE delayed recall score was found with Spearman's rank correlation both in all patients (ρ = 0.25, p = 0.002) and in DLB patients (ρ = 0.26, p = 0.04).
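For readers unfamiliar with the rank correlation reported above, the following Python sketch computes Spearman's ρ as the Pearson correlation of average ranks, which handles the many ties that a five-level score produces. The function names are hypothetical, no p-value is computed, and this is not the R routine used in the study.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation computed on the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (var_x * var_y) ** 0.5
```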

Discussion
In this study, we clinically validated the diagnostic accuracy of the CISRs for the diagnosis of DLB in a cohort independent of our previous study [11]. We found that a clear CIS as indicated by a CISRs score ≥ 3 was highly specific in differentiating DLB from AD even when DLB patients had biomarker evidence of concomitant AD pathology. Moreover, we found that a higher CISRs score was significantly associated with a better performance on memory tests but significantly worse performance on measures of processing speed both across all patients and in DLB patients.
Our findings in relation to the diagnostic value of CIS are in line with previous findings, including our initial study of the CISRs [11], with specificities ranging from 67% to 100% and accuracies ranging from 66% to 89% [7,8,10,11,28,29]. An advantage of the CISRs is that it provides a fast and standardized method to evaluate the presence and severity of CIS on [ 18 F]FDG-PET in contrast to quantitative measures and other visual approaches and may be used by less experienced raters. Previous PET and SPECT studies have calculated the CIS scores based on uptake ratios with or without partial volume correction, or based on statistical Z-scores (after comparison with normal database) derived from vendor supplied anatomical or disease specific regions of interest [8][9][10]30,31]. This makes comparisons of cut-off values of CIS ratios across studies and consequently clinical implementation difficult, if not impossible. These shortcomings are to some degree offset using the CISRs as it would be possible to evaluate the same [ 18 F]FDG-PET study consistently, with an interrater agreement of up to 0.71, independent of experience of staff, software, and methods of imaging acquisition and analysis. The interrater agreement of the CISRs between the two experienced raters was at the same level as that between the less experienced and the experienced raters. This should be taken into consideration as it may affect the usability of the scale in clinical practice. Further work expanding educational materials for rater training may be able to mitigate this shortcoming.
Compared to our initial study using the CISRs we found slightly different optimal cut-offs [11]. For distinguishing DLB from AD and DLB Aβ+ from AD in the present study, an optimal CISRs cut-off at a score of ≥ 1 and ≥ 2, respectively, was found, in contrast to ≥ 2 and ≥ 1 in the initial study [11]. However, only small variations in the balanced accuracy led to the different optimal cut-offs between the present study and our initial study [11]. This indicates that a cut-off around 1 to 2 on the CISRs is the most robust.
Many DLB patients have concomitant AD pathology with Aβ deposits in the brain and we hypothesized that it could influence the pattern of hypometabolism on [ 18 F]FDG-PET and thereby the diagnostic accuracy of the CISRs. However, we found that the balanced accuracy of the CISRs was just as high when distinguishing DLB Aβ+ from AD as when distinguishing all DLB patients from AD patients. It should be taken into consideration that only a subset of patients with DLB had undergone CSF sampling and that this group may have differed from the whole DLB cohort in ways that led the physician to order a lumbar puncture or a [ 11 C]PiB-PET scan (e.g. uncertainty of the diagnosis or atypical symptoms). This may have influenced our findings, although DLB patients fulfilled the clinical criteria for DLB and all patients in whom dual pathology was suspected to impact the clinical phenotype were excluded. It is not obvious that this would have led to an inflation of diagnostic accuracy and may indeed have influenced the results in the opposite direction.
A CISRs score of 4 separated DLB DaT+ from DLB DaT- with a specificity of 95% and with an only slightly lower specificity at a cut-off of ≥ 3, which on the other hand yielded a slightly higher balanced accuracy. This suggests that CISRs in a clinical setting can be an aid to determine whether the patient needs additional imaging biomarkers to confirm the DLB diagnosis. In other words, if a patient suspected of DLB has a CISRs score of 3 or above, further diagnostic assessment with dopamine imaging using [ 123 I]FP-CIT SPECT or [ 18 F]FE-PE2I is unlikely to add to the diagnostic accuracy as this is likely to be abnormal. On the contrary, due to the relatively low sensitivity of 66%, a CISRs score of 0 cannot exclude DLB and does not predict a negative DaT scan as also indicated by a relatively high number of patients with a diagnosis of DLB and a CISRs score of 0 who had an abnormal DaT scan. Therefore, dopamine imaging using [ 123 I]FP-CIT SPECT or [ 18 F]FE-PE2I may provide further diagnostic information in this situation. An earlier study showed that the CIS ratio increased in prodromal DLB and decreased in mild DLB which could explain why some DLB DaT+ patients had a CISRs score of 0 [32]. It should be emphasized that for clinical reading of [ 18 F]FDG-PET, other imaging features considered typical of DLB (e.g. occipital hypometabolism) and AD (e.g. parietotemporal and mesial temporal hypometabolism) are evaluated, and diagnostic accuracy of AD versus DLB is likely to be higher than reported for CIS alone.
Regarding the clinical correlates of CISRs score, we investigated three memory tests and one test tapping processing speed. The significant difference found between the CISRs groups in neuropsychological tests was likely affected by the distribution of the AD and DLB patients in the CISRs groups since patients with a CISRs score of 0 predominantly were AD patients and patients with a CISRs score of 4 predominantly were DLB patients, whereas the positive correlation with ACE delayed recall and CCMT-48 immediate recall remained significant when examining DLB patients alone. These results could suggest that preserved metabolism in PCC indicates an intact connection with the medial temporal structures (including parahippocampal formation and hippocampus) and, thus, relatively intact memory functioning of the "hippocampal type". Previous studies showed that hippocampal atrophy and medial temporal lobe atrophy were associated with white matter disruption in the PCC and a lower CIS ratio which support our findings [33,34]. A higher CIS ratio had also been associated with less clinical impairment, better awareness index, and a positive correlation with scores on the Rey Auditory Verbal Learning Test [34][35][36]. That the patients with a CISRs score of 0 could have either entirely normal or entirely decreased metabolism in all three areas (PCC, precuneus and cuneus) further complicates the interpretation of the results since a CISRs score of 0 does not provide information about the degree of hypometabolism in PCC.

Table 4. CISRs score and clinical characteristics.

This study has a number of limitations. One is the risk of circularity of the DLB diagnoses. However, a reevaluation of the diagnosis ensured that patients fulfilled the clinical criteria for DLB without using the [ 18 F]FDG-PET scan. Furthermore, this is a single center study so the robustness of the CISRs in a multicenter setting remains to be determined.
The present study used a convenience sample, which meant that the data collection including imaging acquisition was not done in a uniform way. However, this could also be taken as a strength as the diagnostic performance was demonstrated in a setting resembling clinical practice. Patients were well matched, including on MMSE score, which is a measure of global cognitive function. It is however possible that the MMSE score does not reflect the same degree of impairment in the two diagnostic groups, as the MMSE is skewed towards memory and orientation, which on average are impaired earlier and more severely in AD than in DLB. An important caveat related to the study cohort, and alluded to previously, is that only a subgroup underwent CSF sampling, DaT imaging, and neuropsychological testing, which may have biased the results.
A strength of our study is the large sample size and inclusion of a study population with a more heterogeneous disease severity, which to a much higher degree reflects the general patient population in a memory clinic.
In conclusion, we have validated the diagnostic accuracy of the CISRs for the diagnosis of DLB. Concomitant AD pathology in DLB patients does not affect the diagnostic accuracy of the CISRs. In cases of suspected DLB, a high score on the CISRs does not necessitate additional DaT imaging, whereas this may be relevant if the CISRs score is low. The CISRs has yet to be validated across diagnostic centers and the correlation with staging of DLB remains unclear. Further research is needed to investigate temporal changes of CISRs score and the ability of CISRs to predict the clinical progression of DLB, as well as the diagnostic performance in a mixed memory clinic cohort in which other diagnoses are represented.

Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Declaration of Competing Interest
All authors declare no conflicts of interest.