Trends in ischemic stroke outcomes in a rural population in the United States

INTRODUCTION
The stroke mortality rate has gradually declined due to improved interventions and controlled risk factors. We investigated the associated factors and trends in recurrence and all-cause mortality in ischemic stroke patients from a rural population in the United States between 2004 and 2018.


METHODS
This was a retrospective cohort study based on electronic health records (EHR) data. A comprehensive stroke database called "Geisinger NeuroScience Ischemic Stroke (GNSIS)" was built for this study. Clinical data were extracted from multiple sources, including EHR and quality data.


RESULTS
The study cohort comprised 8561 consecutive ischemic stroke patients (mean age: 70.1 ± 13.9 years, men: 51.6%, 95.1% Caucasian). Hypertension was the most prevalent risk factor (75.2%). The one-year recurrence and all-cause mortality rates were 6.3% and 16.1%, respectively. Although the one-year stroke recurrence increased during the study period, the one-year stroke mortality rate decreased significantly. Age > 65 years, atrial fibrillation or flutter, heart failure, and prior ischemic stroke were independently associated with one-year all-cause mortality in the stratified Cox proportional hazards model. In the cause-specific hazard model, diabetes, chronic kidney disease and age < 65 years were found to be associated with one-year ischemic stroke recurrence.


CONCLUSION
Although all-cause mortality after stroke has decreased, stroke recurrence has significantly increased in stroke patients from a rural population between 2004 and 2018. Older age, atrial fibrillation or flutter, heart failure, and prior ischemic stroke were independently associated with one-year all-cause mortality, while diabetes, chronic kidney disease and age less than 65 years were predictors of ischemic stroke recurrence.


Introduction
A recent study by the Centers for Disease Control and Prevention (CDC) indicated that the four-decade decline in stroke death rates in the United States (US) has slowed down, stalled, or in some cases, reversed in recent years [1]. Substantial variations exist in terms of the timing and magnitude of this unfavorable change. The rural-urban disparities in life expectancy widened between 1969 and 2009, with 7% of the disparity due to stroke [2]. Additionally, the incidence of stroke in rural areas of the US remains high [3] mainly due to a lower socioeconomic status and a higher prevalence of risk factors [3,4], including obesity [5], smoking [6], and lower rates of physical activity [7].
Defining disparities in stroke risk factors, outcomes, and geography that might be driving the unfavorable outcome could lead to the implementation of targeted interventions to reduce stroke burden among vulnerable populations [8]. Yet very few studies have reported the trends in stroke outcomes among a large rural population in the US (Supplemental Table 1). The goal of this study was to define trends in stroke outcomes of all-cause mortality and recurrence among ischemic stroke patients from a rural population of central Pennsylvania between 2004 and 2018 and evaluate the factors that are associated with these stroke outcomes.

Data source and study population
This was a retrospective cohort study based on data extracted from multiple sources, including Geisinger's Electronic Health Record (EHR) system, the Geisinger Quality database, and the Social Security Death database. These data sources were used to build a comprehensive stroke database called "Geisinger NeuroScience Ischemic Stroke (GNSIS)". GNSIS includes demographic, clinical, genetic, and laboratory data of 8929 ischemic stroke patients from September 2003 to May 2019. Geisinger is a fully integrated health system and the largest rural health maintenance organization (HMO) in the country. Geisinger serves a predominantly Caucasian population of 2.6 million people living in 43 counties, outside of the major metropolitan regions, in northeastern and central Pennsylvania with a very low (<5%) outmigration rate (Supplemental Fig. 1) [9,10]. The study was reviewed and approved by the Geisinger Institutional Review Board.

GNSIS population and data processing

GNSIS population
To build the GNSIS database, we first created a high-fidelity, data-driven, in-house phenotype definition for ischemic stroke. Multiple manual validations were carried out to finalize the phenotype and fine-tune the data pull criteria. Patients were included in the GNSIS database if they had (1) a primary hospital discharge diagnosis of ischemic stroke, (2) a brain magnetic resonance imaging (MRI) performed during the same encounter to confirm the diagnosis, and (3) an overnight stay in the hospital. Brain MRI is part of the stroke order set for all stroke patients with no contraindications (cardiac defibrillator, foreign body, etc.). The Current Procedural Terminology (CPT) 4 codes for brain MRIs are available in Supplemental Table 2a. The diagnosis of ischemic stroke and other health conditions was based on the International Classification of Diseases, Clinical Modification, Ninth or Tenth Revision (ICD-9-CM or ICD-10-CM) codes (Supplemental Table 2b). Manual validation of 125 randomly selected patients, including review of the MR imaging, indicated a specificity of 100%, ensuring that all patients in the GNSIS database had an accurate diagnosis of acute ischemic stroke.
In cases of multiple encounters due to recurrent cerebral infarcts, the first hospital encounter was considered as the index event. The data from all encounter types were extracted and processed to ensure the comprehensiveness of the follow-up information. This database interfaces with the Social Security Death Index to reflect updated information on the vital status.
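The three inclusion criteria and the index-event rule described above can be sketched as a simple filter. This is only an illustration, not the GNSIS data pull itself: the `Encounter` fields are hypothetical names, and treating "overnight stay" as a discharge date later than the admission date is an assumption.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Encounter:
    patient_id: str
    admit_date: date
    discharge_date: date
    primary_dx_is_ischemic_stroke: bool  # hypothetical flag derived from ICD-9-CM/ICD-10-CM codes
    brain_mri_same_encounter: bool       # hypothetical flag derived from CPT codes

def qualifies(enc):
    """Apply the three inclusion criteria to a single encounter."""
    overnight = enc.discharge_date > enc.admit_date  # assumption: overnight = discharged on a later day
    return (enc.primary_dx_is_ischemic_stroke
            and enc.brain_mri_same_encounter
            and overnight)

def index_events(encounters):
    """Return the earliest qualifying encounter per patient (the index event)."""
    index = {}
    for enc in sorted(encounters, key=lambda e: e.admit_date):
        if qualifies(enc) and enc.patient_id not in index:
            index[enc.patient_id] = enc
    return index

# Toy encounters, for illustration only.
encounters = [
    Encounter("p1", date(2009, 3, 1), date(2009, 3, 1), True, True),   # same-day discharge
    Encounter("p1", date(2010, 1, 5), date(2010, 1, 8), True, True),   # qualifies
    Encounter("p1", date(2011, 6, 2), date(2011, 6, 6), True, True),   # later recurrence
    Encounter("p2", date(2012, 4, 1), date(2012, 4, 3), True, False),  # no MRI
]
index = index_events(encounters)
```

In this toy input only `p1` enters the cohort, with the 2010 admission as the index event; the 2011 admission would be handled as follow-up information.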

Data processing
As part of data pre-processing, steps were taken to ensure the integrity and validity of the data. For instance, units were verified and reconciled if needed, and distributions of variables were assessed over time to ensure data stability. The range for each variable was defined according to expert knowledge and the available literature; outliers were assessed, replaced, or capped based on clinical data. As part of the deidentification process, the age of patients older than 89 years was masked and changed to 89. Filters were applied to ensure that the relevant variables were captured within the desired time frame and that the order of events was maintained. The last encounter of each patient was also recorded to ensure that patients were active. The admission National Institutes of Health Stroke Scale (NIHSS) score was extracted from the quality data and merged with the EHR data using medical record numbers.
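Two of the pre-processing steps above, age masking and capping of out-of-range values, can be sketched as follows. The function names and the blood pressure range are hypothetical and chosen only for illustration; the ranges actually used in GNSIS are not given in the text.

```python
def mask_age(age_years, cap=89):
    """De-identification step: ages above the cap are reported as the cap."""
    return min(age_years, cap)

def cap_to_range(value, low, high):
    """Clamp a measurement to a plausible range [low, high]."""
    return max(low, min(value, high))

# Hypothetical plausible range for systolic blood pressure (mmHg), illustration only.
SBP_RANGE = (50, 300)

masked_ages = [mask_age(a) for a in [45, 89, 93]]
capped_sbp = [cap_to_range(v, *SBP_RANGE) for v in [120, 20, 410]]
```

Capping is only one of the options mentioned above; a value that is clearly a recording error might instead be replaced or set to missing after clinical review.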

Trends in stroke outcomes: cohort and outcome definitions
For the current study, we included data from 2004 to 2018 because full-year data for 2003 and 2019, and follow-up data for 2019, were not available. As a result, 8561 out of 8929 ischemic stroke patients from GNSIS were included in the study. Genetic data were not used for this study as they were only available for approximately 20% of the patients. Supplemental Table 2c includes the full descriptions of the data elements included in the study. The outcome measures were the rates of one-year ischemic stroke recurrence and all-cause mortality following the ischemic stroke event. The stroke recurrence rate for each year was calculated by dividing the number of ischemic stroke patients during that year who had a recurrence within one year of the initial stroke event by the total number of ischemic stroke patients who completed one-year follow-up. Patients who had follow-up of less than one year or died within a year without a stroke recurrence were not included in the calculation. The one-year all-cause mortality rate was calculated by dividing the total number of patients who died within one year after the initial stroke event by the total number of stroke patients with at least one year of follow-up. Patients from 2018 were not part of the analysis of the trends in the outcomes, as the majority of patients from 2018 did not have one-year follow-up data, but they were included in the survival analysis. The ischemic stroke patients were also grouped into five-year intervals (A, B, and C). The follow-up time for ischemic stroke recurrence was defined as the time between the index stroke date and the last encounter date in the electronic health record. The follow-up time for all-cause mortality was defined as the time between the index stroke date and the end of the study period (May 20, 2019), because mortality information from the Social Security Death Index up to this date was included in the study.
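The two rate definitions above can be written out as a short sketch. The `Patient` fields are hypothetical, one year is taken as 365 days, and the eligibility rules follow a literal reading of the text; none of these details are confirmed by the source, and the study's own implementation may differ.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

ONE_YEAR = 365                 # days; assumption about how "one year" is counted
STUDY_END = date(2019, 5, 20)  # end of mortality follow-up stated in the text

@dataclass
class Patient:
    index_date: date
    last_encounter: date
    recurrence_date: Optional[date] = None
    death_date: Optional[date] = None

def within_year(p, event_date):
    """True if the event happened within one year of the index stroke."""
    return event_date is not None and (event_date - p.index_date).days <= ONE_YEAR

def recurrence_rate(patients):
    """One-year recurrence among patients who completed one-year follow-up."""
    eligible = []
    for p in patients:
        recurred = within_year(p, p.recurrence_date)
        died_without_recurrence = within_year(p, p.death_date) and not recurred
        followed = (p.last_encounter - p.index_date).days >= ONE_YEAR
        if followed and not died_without_recurrence:
            eligible.append(recurred)
    if not eligible:
        raise ValueError("no eligible patients")
    return sum(eligible) / len(eligible)

def mortality_rate(patients):
    """One-year all-cause mortality among patients with a year of mortality follow-up."""
    eligible = [p for p in patients if (STUDY_END - p.index_date).days >= ONE_YEAR]
    if not eligible:
        raise ValueError("no eligible patients")
    return sum(within_year(p, p.death_date) for p in eligible) / len(eligible)

# Toy cohort, for illustration only.
d0 = date(2010, 1, 1)
cohort = [
    Patient(d0, date(2012, 1, 1), recurrence_date=date(2010, 6, 1)),
    Patient(d0, date(2012, 1, 1)),
    Patient(d0, date(2010, 3, 1), death_date=date(2010, 3, 5)),
    Patient(d0, date(2010, 5, 1)),
]
rec = recurrence_rate(cohort)
mort = mortality_rate(cohort)
```

Note that the two denominators differ: the recurrence rate depends on the last EHR encounter, whereas the mortality rate depends only on the index date relative to the end of the study period.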

Statistical analysis
All continuous variables were summarized as mean ± standard deviation or median with interquartile range (IQR), and categorical variables as count and percentage. For comparison between groups, the chi-square test was used for categorical variables and analysis of variance (ANOVA) or the Kruskal-Wallis test was used for continuous variables. A post-hoc analysis was performed with Bonferroni correction to determine the difference between subgroups. Correlation between variables was assessed using the Spearman correlation coefficient (between numerical variables), Cramer's V (between categorical variables), and the point-biserial correlation coefficient (between categorical and numerical variables).
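As an illustration of one of the correlation measures named above, Cramer's V can be computed from a contingency table via the chi-square statistic. This is a minimal sketch without continuity or bias correction; the table is toy data and the function name is hypothetical.

```python
import math

def cramers_v(table):
    """Cramer's V for an r x c contingency table given as a list of rows."""
    rows, cols = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(rows)) for j in range(cols)]
    chi2 = 0.0
    for i in range(rows):
        for j in range(cols):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (min(rows, cols) - 1)))

# Toy 2 x 2 table, for illustration only.
v = cramers_v([[10, 20], [20, 10]])
```

V ranges from 0 (no association) to 1 (perfect association) regardless of table size, which makes it convenient for comparing pairs of categorical variables.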
The Cochran-Armitage test for trend was used to analyze the time trends in acute stroke care and stroke outcomes. One-year ischemic stroke recurrence was assessed using the cumulative incidence function, in which mortality was considered a competing risk. The Fine-Gray subdistribution hazard model and the cause-specific hazard model were also employed to examine ischemic stroke recurrence. The former allows for estimating the effect of covariates on the absolute risk of the outcome over time, whereas the latter is appropriate for causal analysis of competing risks [11]. One-year all-cause mortality was assessed using the Kaplan-Meier estimator and the Cox proportional hazards model. The log-rank test and Gray's test were used to compare groups in the Kaplan-Meier estimator and cumulative incidence function, respectively. The proportional hazards assumption was tested using the Schoenfeld residuals test, and the variables not meeting the assumption were used to stratify the models. For all analyses, p < 0.05 was considered statistically significant.
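For readers unfamiliar with the trend test named above, the following is a minimal sketch of the Cochran-Armitage statistic for event proportions across ordered groups. It assumes equally spaced scores (0, 1, 2, ...) and a two-sided normal approximation; the counts are toy values, not study data, and the study's statistical software may use a different variance or sign convention. In this sketch, a positive Z means the proportion rises with the score.

```python
import math

def cochran_armitage(events, totals, scores=None):
    """Return (Z, two-sided p) for a linear trend in proportions across ordered groups."""
    if scores is None:
        scores = list(range(len(events)))  # assumption: equally spaced scores
    n_total = sum(totals)
    p_bar = sum(events) / n_total          # pooled event proportion
    # Score-weighted sum of observed minus expected events.
    t_stat = sum(s * (r - n * p_bar) for s, r, n in zip(scores, events, totals))
    s1 = sum(n * s for s, n in zip(scores, totals))
    s2 = sum(n * s * s for s, n in zip(scores, totals))
    variance = p_bar * (1 - p_bar) * (s2 - s1 * s1 / n_total)
    z = t_stat / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value

# Toy counts for three ordered periods, for illustration only.
z, p = cochran_armitage(events=[10, 15, 25], totals=[100, 100, 100])
```

Because the sign of Z depends on how the outcome and the scores are coded, the direction of a trend is best read from the group proportions themselves rather than from the sign alone.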

Patient population and clinical characteristics
The study cohort included 8561 consecutive patients with ischemic stroke (mean age: 70.1 ± 13.9 years, men: 51.6%, Caucasian: 95.1%) who presented to one of the six Geisinger Health System centers from 2004 to 2018. The majority of patients (7354; 85.9%) were older than 55 years at the index event, and 66.3% of patients were older than 65 years at the index event. Twenty-seven (0.3%) patients were not adults at the index event. Among the patient cohort, hypertension was the most prevalent comorbidity (75.2%), followed by dyslipidemia (62.2%) and diabetes (32.4%). Out of 8561 patients, 2028 (23.7%) had all three risk factors. A past medical history of ischemic stroke, atrial fibrillation or flutter, and hypercoagulable state was seen in 9.5%, 21.7% and 1.3% of patients, respectively (Table 1). When considering the most common comorbidities, there was a significant difference among the three intervals (A, B, and C; Table 1). Post-hoc analyses demonstrated that hypertension, atrial fibrillation, dyslipidemia, history of hemorrhagic and ischemic stroke, chronic liver disease, and chronic kidney disease were significantly different between all three intervals. Heart failure, peripheral vascular disease, and chronic lung diseases (asthma, chronic obstructive pulmonary disease, and occupational lung diseases) were significantly lower in interval A, but there were no significant differences in these comorbidities between intervals B and C (Supplemental Table 3).

Table 1
Demographics, comorbidities and outcomes of ischemic stroke patients included in the study.

Among the 8534 adult patients, 14.5% reported being current smokers and 24.6% reported being former smokers at their stroke index date. Because a large proportion of patients (29.3%) did not report their smoking status, a meaningful conclusion regarding the smoking status pattern over time could not be established.
Among all patients, 31.0% were taking antihypertensives, 30.0% were on statins, and 9.1% were taking an oral anticoagulant before the index stroke. There was a significant increase in the use of statins from time interval A to intervals B and C (25.6%, 28.8%, and 32.3%; p < 0.001). However, no significant difference was observed in antihypertensive use over the same period (31.8%, 30.0%, and 31.2%; p = 0.402). A significant increase was observed in the use of warfarin (7.2%, 7.8%, and 9.1%; p = 0.024) and new oral anticoagulants (0.0%, 0.1%, and 1.4%; p ≤ 0.001) (Table 1). Warfarin was underutilized but showed a gradual increase over the years. Warfarin utilization continued to increase despite the wide availability of newer oral anticoagulants after 2010, possibly due to the cost of the newer agents.

Acute stroke care
Among the study cohort, there was an increasing trend in the rate of intravenous thrombolysis (IVT) and mechanical thrombectomy (MT) (Supplemental Table 4). The Cochran-Armitage test for trend showed an increasing trend for both IVT (p < 0.0001) and mechanical thrombectomy (p < 0.0001). Fewer than 1% of patients in the cohort received IVT in interval A; the rate of IVT increased to 5.6% in interval B and 8.7% in interval C. Similarly, the rate of mechanical thrombectomy was less than 1% in intervals A and B but significantly increased to 3.4% in interval C.

One-year ischemic stroke recurrence
Out of 8561 ischemic stroke patients, patients who were lost to follow-up or died within a year without a stroke recurrence and patients from the year 2018 were excluded from the recurrence trend analysis. Thus, 5444 patients were included in the trend analysis. Out of 5444 patients, 343 (6.3%) had a recurrence within the first year following the index stroke. Compared to intervals A (5.3%) and B (4.9%), there was a significant increase in one-year stroke recurrence in interval C (7.8%, p = 0.001). The annual rates of ischemic stroke recurrence up to 2017 are given in Supplemental Fig. 3. The Cochran-Armitage test for trend showed an increasing trend for ischemic stroke recurrence (Z = −3.6558, p = 0.0001).
The differences between patients with and without one-year ischemic stroke recurrence are given in Table 2. The rates of myocardial infarction, diabetes, chronic kidney disease, and prior ischemic stroke were significantly higher in patients with one-year stroke recurrence compared to those without recurrence.

All-cause mortality
Among the study cohort, patients who were lost to follow-up within one year and patients from the year 2018 were excluded from the trend analysis of all-cause mortality. Thus, 7563 patients were included in the all-cause mortality trend analysis. Out of 7563 patients, 1216 (16.1%) died within a year of the index stroke. When stratified by stroke recurrence, one-year all-cause mortality was not statistically different in patients with and without recurrence (16.9% vs 15.1%, p = 0.269). One-year all-cause mortality decreased significantly in interval C (14.8%) when compared to interval B (17.3%, p < 0.032) but not compared to interval A (17.0%, p = 0.141). The annual rates of ischemic stroke recurrence up to 2017 are given in Supplemental Fig. 3. The Cochran-Armitage test for trend showed a decreasing trend for one-year all-cause mortality (Z = 1.7351, p = 0.0041).
The patients who died within one year of stroke were significantly older (47.5% were men) and had significantly higher rates of comorbidities such as atrial fibrillation/flutter, myocardial infarction, heart failure, chronic lung diseases, rheumatic diseases, chronic kidney disease, peripheral vascular disease, and a history of neoplasm or ischemic/hemorrhagic stroke before the index date (Table 2). They also had higher NIHSS scores and underwent mechanical thrombectomy at a higher rate.

Table 2
Comparison of demographics, comorbidities, NIHSS and acute stroke care between the patient groups with and without stroke outcomes.

NIHSS and stroke outcomes
NIHSS was available for only 2016 patients included in the study.
Separate survival models were employed to examine the effect of NIHSS on stroke recurrence and mortality. In both the Fine-Gray subdistribution hazard model and the cause-specific hazard model, NIHSS was not associated with one-year stroke recurrence (Supplemental Table 5). NIHSS did predict one-year mortality (HR: 1.07, 95% CI: 1.06-1.09, p < 0.001; Supplemental Table 6) in the Cox proportional hazards model.
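To help interpret the reported hazard ratio, the short sketch below scales it to a multi-point difference in NIHSS. It assumes that NIHSS entered the Cox model as a continuous, linear term, so that the hazard ratio of 1.07 applies per one-point increase; the text does not state this explicitly, and the function name is hypothetical.

```python
def hazard_ratio_for_increment(hr_per_point, points):
    """Under a log-linear Cox term, the HR for a k-point difference is HR ** k."""
    return hr_per_point ** points

# Reported HR of 1.07, scaled to a 5-point NIHSS difference (assumes a per-point HR).
hr_5_points = hazard_ratio_for_increment(1.07, 5)
```

Under that assumption, a 5-point higher admission NIHSS would correspond to roughly a 40% higher hazard of death within one year.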

Discussion
This study demonstrates an increasing trend in one-year stroke recurrence over a period of 15 years despite a significant decrease in one-year mortality over the same period. The observed decrease in mortality is consistent with other studies in Europe and the United States [21-23]. The mortality decline after 2013 parallels increased availability and utilization of endovascular stent thrombectomy [21,24-27]. Geisinger has considerably increased its mechanical thrombectomy capacity since 2014; 151 (3.4%) patients in this study were treated with thrombectomy between 2014 and 2018, and 24 (0.6%) before 2014. Many other reasons including the expansion of telestroke and the development of the Get With The Guidelines (GWTG)-Stroke program might have caused the decline in the one-year mortality in Geisinger population and across the nation [28-30].
Previous studies from urban populations have shown a general decline in stroke mortality and recurrence over the years [22,36-39]. Higher stroke mortality has been seen in rural areas compared to urban areas and appears to be due to higher stroke incidence in rural areas rather than case fatality [3]. Other rural-urban differences contributing to higher incidence in rural areas are more prevalent risk factors like diabetes and hyperlipidemia [4]. People in rural areas are less likely to be screened for diabetes and less likely to achieve diabetes control [4]. The one-year stroke recurrence rate is reported to be 5% to 15% in different studies [31-35]. One-year stroke recurrence estimated with the cumulative incidence function in this study was 5%, which is lower than the Kaplan-Meier estimate (5.7% for this study) because the Kaplan-Meier estimator does not take competing risks into account and thus overestimates the risk. Contrary to most studies, where the rate of one-year recurrence declined [33,40,41], the recurrence rate in this study increased over the past fifteen years. The rise in stroke recurrence in a rural population shown by this study is similar to results from a study in rural China [42]. This disparity in the recurrent stroke trend between urban and rural populations should be further studied. In addition to the increased prevalence of risk factors in the recurrence group, the decline in stroke mortality leads to an increase in the number of stroke survivors, thus leading to a potential increase in the prevalence of recurrent stroke among survivors.

Table 3
Hazard ratios (and 95% confidence intervals) from the subdistribution hazard model and cause-specific hazard model for one-year ischemic stroke recurrence.
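The reason the Kaplan-Meier approach gives a higher recurrence estimate than the cumulative incidence function can be seen in a small sketch. It implements a basic nonparametric cumulative incidence estimate, with death as a competing event, next to the naive "one minus Kaplan-Meier" estimate that censors deaths. The data are toy values and the status coding (0 = censored, 1 = recurrence, 2 = death) is an assumption made for this illustration; this is not the software used in the study.

```python
def cumulative_incidence(records, cause=1):
    """Cumulative incidence of `cause` at the end of follow-up, other events competing."""
    event_times = sorted({t for t, s in records if s != 0})
    overall_surv = 1.0  # probability of being free of any event just before t
    cif = 0.0
    for t in event_times:
        at_risk = sum(1 for u, _ in records if u >= t)
        d_cause = sum(1 for u, s in records if u == t and s == cause)
        d_any = sum(1 for u, s in records if u == t and s != 0)
        cif += overall_surv * d_cause / at_risk
        overall_surv *= 1 - d_any / at_risk
    return cif

def naive_one_minus_km(records, cause=1):
    """1 - Kaplan-Meier for `cause`, treating competing events as censoring."""
    event_times = sorted({t for t, s in records if s == cause})
    surv = 1.0
    for t in event_times:
        at_risk = sum(1 for u, _ in records if u >= t)
        d = sum(1 for u, s in records if u == t and s == cause)
        surv *= 1 - d / at_risk
    return 1 - surv

# Toy (time, status) pairs, for illustration only.
toy = [(1, 1), (2, 2), (3, 1), (4, 0), (5, 2), (6, 1), (7, 0), (8, 0)]
cif = cumulative_incidence(toy)
naive = naive_one_minus_km(toy)
```

In the toy data the naive estimate exceeds the cumulative incidence because patients who die are treated as if they could still go on to have a recurrence.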

Stroke risk factors and predictors of poor outcomes
A recent meta-analysis showed that hypertension, diabetes, atrial fibrillation, and coronary heart disease are predictors of stroke recurrence [43]. In this study, diabetes was found to be a significant factor in both the subdistribution hazard model and the cause-specific hazard model. Another significant predictor was chronic kidney disease. Age more than 65 years was found to be associated with lower recurrence in the cause-specific hazard model. Young adults with stroke have different risk factors and etiologies than older patients and may have different rates of stroke recurrence. Epidemiologic data regarding stroke recurrence among younger patients are limited. In addition, several factors including older age [44], hypertension [45], diabetes [46,47], and current smoking [45] are associated with stroke death. Appelros et al. showed that old age, atrial fibrillation, stroke severity, and dementia were predictors of poor outcomes, including death within one year after stroke [48]. These observations are similar to the findings in this study. Chronic kidney disease was also confirmed to be an independent predictor of poor outcome.
Other studies indicated a global increase in vascular risk factors over the past years [49-52]. A study of 6032 patients over a 20-year window in Minneapolis [38] reported an increasing trend in atrial fibrillation, hypertension, diabetes, and ischemic heart disease among stroke patients. Hypertension, diabetes, current smoking, and myocardial infarction are the most prevalent risk factors reported by prospective studies in ischemic stroke populations such as Reasons for Geographic and Racial Differences in Stroke (REGARDS) [53] and the Northern Manhattan Stroke Study (NOMASS) [54]. Although we did not study the effect of predictors on the trends in this study, a significant increase in the rate of risk factors prevailed in the recurrence and one-year mortality cohorts. Additionally, the vascular risk factors may vary based on geographic population, and the trend might be higher in rural areas as compared to their urban counterparts [23]. Also, an increasing percentage of individuals older than 65 years (~25%) in rural areas [55] may be associated with a higher proportion of comorbidities in this population. With a significant increase in stroke risk factors and stroke recurrence over the years, the focus needs to be on primary and secondary prevention.
An increased administration of medications such as statins, antiplatelets, and anticoagulants was observed in this study, which is similar to other studies [52,56,57]. However, there was negligible change in the trend of antihypertensive use despite an increasing diagnostic rate of hypertension in the population. There was also a discrepancy between the rate of atrial fibrillation/flutter (21.7%) at the time of index stroke, the documented diagnosis of atrial fibrillation/flutter prior to the stroke event (12.2%), and the medication history of anticoagulants (9.1%).

Strengths and limitations
The present study is one of the few reports that captures the trends of risk factors and outcomes in a large rural population-based cohort over a period of fifteen years in the United States. GNSIS represents one of the largest rural cohorts of ischemic stroke patients across all ages. This study included patients from a specific area within a single health care system network, and the results might not be generalizable to the general population or to other rural areas with different population demographics. The use of EHR and other data sources provided a comprehensive overview of patients' clinical data with deep phenotyping and clinical evaluation as well as a large sample size. Furthermore, Geisinger has a stable patient population with a very low out-migration rate and rich longitudinal data. However, the use of these resources has its drawbacks, mainly centered around inherent noise due to the nature of the data, as well as biased patient selection. Patients were included in the GNSIS database based on strict criteria to ensure high specificity; thus, cases of ischemic stroke not fulfilling these criteria may not be represented in this database. Mortality presented in this study is all-cause mortality and not case-fatality rate, and it can be influenced by improved management of other comorbidities contributing to death. We also looked at the overall risk factors causing an ischemic stroke, irrespective of ischemic stroke subtypes, socioeconomic determinants, or genetic predispositions [58,59]. Further research is needed to determine subtype-specific stroke risk factors and trends, taking into account nonclinical risk factors as well as genetic biomarkers.

Conclusion
Although stroke mortality has decreased, stroke recurrence and several vascular risk factors have significantly increased in this rural population over the past fifteen years. More effort is still needed to control stroke risk factors in high-risk and underserved subpopulations.

Sources of funding
RZ and VA had funding support from the Geisinger Health Plan Quality Fund as well as the National Institutes of Health R56HL116832 (sub-award) during the study period. The funders had no role in study design, data collection, and interpretation, or the decision to submit the work for publication.

Data availability statement
All relevant data are available in the article and/or supplemental file. Due to privacy and other restrictions, the primary data cannot be made openly available. Deidentified data may be available subject to a data-sharing agreement with Geisinger Health System. Details about requesting access to the data are available from the corresponding author.

Declaration of competing interest
None.