Predicting post-surgical outcomes in idiopathic normal pressure hydrocephalus using clinically important changes from the cerebrospinal fluid tap test

Objective: Patients diagnosed with idiopathic normal pressure hydrocephalus (iNPH) typically experience symptom improvements after undergoing a cerebrospinal fluid-tap test (CSF-TT). These improvements are recognised as indicative of potential improvements following surgical intervention. As gait disturbance is the most common iNPH symptom, gait improvements are of predominant interest. The purpose of this study was to examine whether clinically important changes in gait and balance from the CSF-TT predict meaningful changes following surgery.
Method: The study involved analysis of data collected in a prospective observational study of 34 iNPH patients who underwent a CSF-TT and subsequent surgery. Linear regression, logistic regression and classification trees were used for predictive modelling, comparing changes from the CSF-TT with post-surgical changes in Tinetti, Timed Up and Go (TUG) and Berg Balance Scale (BBS) outcomes.

iNPH is a condition characterised by a clinical triad of gait disturbance, cognitive impairment and urinary incontinence [4]. Additionally, the diagnostic guidelines for iNPH include the presence of ventriculomegaly on radiological examination and normal cerebrospinal fluid (CSF) opening pressure on lumbar puncture [1][2][3]. Because its symptoms overlap with those of other neurodegenerative diseases, the diagnosis of iNPH can often be delayed. Commonly, iNPH is considered after the exclusion of other conditions or failure to respond to treatments for conditions such as Parkinson's disease [2,3]. While the global prevalence of iNPH remains unclear, the number of people with the condition will continue to rise as the population ages, thus adding to the burden of the disease. There are reports of prevalence ranging from 0.5% to 2.5% in the general population, with one study reporting this to be as high as 3.7% in those aged 65 years and older [5].
Insertion of a ventriculoperitoneal (VP) shunt is the primary surgical treatment option for iNPH. Not all patients, however, will benefit from surgery, with an average of 29% of patients showing no improvement [6]. In addition, an 11% risk of an adverse event occurring as a result of shunting further highlights the need for accurate identification of patients who would improve from surgery to inform clinical decision making [7]. The cerebrospinal fluid-tap test (CSF-TT) is a diagnostic test developed to identify candidates who would benefit from surgery [8]. The CSF-TT is performed via lumbar puncture to drain 30 to 50 millilitres of CSF [8]. It is theorised that patients showing improvements in their iNPH symptoms from temporary CSF drainage would benefit from VP shunt surgery. However, there is no agreement on what constitutes improvement from a CSF-TT or how to quantify when improvement has occurred, limiting its ability to identify candidates for surgery [7]. In addition, with a predictive accuracy of 62% and a negative predictive value of 37%, the CSF-TT has also been limited in excluding patients who may not benefit from VP shunt surgery [9]. Whilst gait assessment is commonly used, there remains a lack of consensus on the assessment tools that can quantify meaningful changes for these patients, thus necessitating the development and examination of outcome measures with clinical utility [10].

Journal Pre-proof

Work by Gallagher et al. demonstrated that a battery of outcome measures for gait and balance can quantify changes in symptoms from a CSF-TT. Using the Timed Up and Go (TUG), Timed Up and Go-cognition (TUG-C), Tinetti and Berg Balance Scale (BBS), minimal clinically important differences (MCIDs) were developed by which changes from a CSF-TT can be quantified [11,12]. These outcome measures, when combined, can objectively differentiate between those who improved on the CSF-TT and those who did not, with high sensitivity (90.28%) and specificity (98.58%) [11,12].
Current literature on iNPH symptoms following the CSF-TT provides varying levels of agreement on the outcome measures to be used to identify significant improvements [10]. Giannini et al. investigated gait and balance measures for the 18-metre walk test and TUG test using an instrumented gait analysis system, pre- and post-CSF-TT and 6 months following VP shunt surgery, and found statistically significant improvements over time following surgery [13]. However, this study did not use robust modelling of the kind that can determine cut-off values in these measures following a CSF-TT to predict surgical outcomes. Further, the high loss of participants at follow-up led to a small sample size at 6 months following surgery, which was inadequate to power the analysis [13]. Nikaido et al. evaluated the MCIDs and actual changes in gait and balance measurements after CSF shunt surgery in iNPH patients and found that the actual changes are comparable to the MCIDs for Functional Gait Assessment (FGA) scores [14]. In this study, positive CSF-TT results were used as one of the patient selection criteria for CSF shunt surgery; however, it remains unclear whether there were patients who did not reach a similar level of improvement after the surgery. To date, there have been no studies that have objectively compared the gait and balance improvements from the CSF-TT with the post-surgical improvements in iNPH patients.
This study aimed to identify and compare results of individual outcome measures from the CSF-TT to determine if they could predict outcomes following VP shunt surgery. It was hypothesised that patients whose gait and balance scores exceeded the MCIDs after the CSF-TT would show comparable improvements following surgery. Conversely, it was anticipated that patients whose gait and balance scores did not improve significantly after a CSF-TT would not demonstrate significant improvements after VP shunt surgery. The findings from this study were intended to improve objective interpretation of CSF-TT results in order to guide clinical decision making for iNPH patient selection for VP shunt surgery, thus preventing unnecessary surgical interventions for those who may not benefit from the surgery.

Study design and setting
The study involved analysis of data collected in a prospective observational study seeking to establish the validity and predictive ability of standardised gait and balance assessment tools in iNPH patients who underwent a CSF-TT. The data were collected in a tertiary referral neurological and neurosurgical facility in Newcastle, Australia. Ethical approval for this work was granted by the Hunter New England Local Health District Human Research Ethics Committee (HNELHD HREC Reference 13/06/19/4.02) and was co-registered by The University of Newcastle Human Research Ethics Committee (UON HREC Reference: H-2013-0384).

Patients with a diagnosis of iNPH who were admitted to undergo a CSF-TT at the facility were invited to participate and screened for their eligibility in the original study. Data for all the patients who subsequently underwent VP shunt surgery were extracted from the master dataset and included for analysis in this study.
All participants met the inclusion criteria of age over 55 years and a diagnosis of iNPH as determined by Evans' index, whereby a value higher than 0.3 on computed tomography or magnetic resonance imaging indicates enlarged ventricles. This is commonly used as a marker of ventriculomegaly for the diagnosis of iNPH [15]. These patients also presented with one or more manifestations of the clinical triad and had been diagnosed with iNPH by either a neurologist or neurosurgeon. Patients were excluded if they were unable to walk for 10 metres with assistance. This was required to ensure that the participants were able to safely complete assessments for all the outcome measures.
Based on these criteria, 74 participants admitted for a CSF-TT were included in the original study. The sample size of 74 was derived in the previous study using the minimal detectable change for the Timed Up and Go with a significance value of 0.05 and 80% power [11]. Of these 74 participants, 34 underwent VP shunt surgery and were included in this study for data analysis. Patients for VP shunt surgery were selected by the medical team based on the interpretation of CSF-TT results and the international guidelines for the diagnosis and treatment of iNPH [1][2][3]. Consent procedures for participation in the study mirrored the medical pathways, which depended on each patient's ability to provide consent.
Signed informed consent was obtained from each participant, or their guardian, with the knowledge that their data would be used for the purpose of analysis and scientific publication.

Measurements and key outcomes
All 34 participants were assessed using a battery of gait and balance outcome measures pre- and post-CSF-TT and after the VP shunt surgery. These outcomes of Tinetti, TUG and BBS are standardised and reliable clinical measures used to assess gait and balance in the elderly population [16][17][18][19]. The Tinetti is scored using an ordinal scale of 0-2 for 16 items that include nine balance and seven gait items, giving a total maximum score of 28 [16]. The TUG score is recorded as the time taken to complete a 3-metre walk, starting with rising from a chair and ending with returning to the chair [17,18]. The BBS is scored out of a total of 56 from 14 items for static and dynamic balance [19].
The VP shunt surgery was performed by the consulting neurosurgeon. All the gait and balance outcome measures were administered by the same physiotherapist prior to and after the CSF-TT and following VP shunt surgery.

Data analysis
Exploratory analysis focused on identifying factors which could predict changes in scores on the Tinetti, TUG, TUG-C and BBS following VP shunt surgery. Statistical analysis was performed in RStudio using a range of techniques for prediction modelling [20]. R version 4.0.2 (R Foundation, Boston, MA) was used for all analyses, utilising the GLM and rpart libraries.
Linear regression of each outcome measure was performed to determine if post-surgical scores could be predicted by a linear relationship between changes in these scores post CSF-TT and post-surgery.
Logistic regression was undertaken to determine if changes observed from the CSF-TT could predict a change equal to or greater than the MCID after the surgery. A classification and regression tree algorithm was used to visualise findings and confirm linear and logistic regression analysis results.
The classification tree employs a mathematical splitting criterion known as the Gini impurity to identify the importance of each feature presented to the model. This method systematically identifies and drops irrelevant features, which do not add to the predictive accuracy of the model. This method is used to understand if a reduced set of features can represent the highest value. The purpose of this dimensionality reduction was to identify which components of the BBS and Tinetti contributed to predicting post-surgical improvement by the MCID value. The visual representation was utilised to confirm if the changes in identified outcome measures after the CSF-TT could be used to predict whether VP shunt surgery results in meaningful changes.
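As an illustration of the splitting criterion described above, the following minimal Python sketch computes the Gini impurity of a node and the impurity reduction achieved by a candidate split. It is not the study's analysis, which was run with rpart in R. The function names and the label counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity of a node: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def split_impurity(left, right):
    """Size-weighted Gini impurity of the two child nodes of a binary split."""
    n = len(left) + len(right)
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)

# Hypothetical labels (not study data): 1 = reached MCID after surgery, 0 = did not.
parent = [1] * 10 + [0] * 10
left = [1] * 9 + [0] * 1    # e.g. patients whose item score improved after CSF-TT
right = [1] * 1 + [0] * 9   # e.g. patients whose item score did not improve

# A larger reduction in impurity means the item separates the two outcomes better.
gain = gini_impurity(parent) - split_impurity(left, right)
```

An item whose best split yields little or no reduction in impurity contributes nothing to the tree, which is the sense in which such features are dropped.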

Results
Demographic information for all patients is reported in Table 1. Patients ranged from 58 to 84 years of age. Mean symptom duration was 9.4 (SD = 7.3) months prior to admission for CSF-TT. Data on duration of symptoms were missing for 4 patients. Gait disturbance was the most common of the triad symptoms, with 94% of patients presenting with this symptom. Twenty-five of the 34 patients exhibited more than one triad symptom.

Comparison between CSF-TT and VP shunt surgery outcome scores
No statistically significant difference was observed between CSF-TT and VP shunt surgery for mean change from baseline in the scores for TUG, Tinetti and BBS. Table 2 lists these mean changes and p-values for the difference between CSF-TT and VP shunt surgery for each outcome measure.
Proportions of patients exceeding MCIDs showed statistically significant differences between CSF-TT and VP shunt surgery with p<0.01 for TUG, p = 0.05 for Tinetti and p = 0.01 for BBS.

Clinimetrics for the chosen outcome measures are given in Table 3. The TUG had the highest specificity and positive predictive value, both being 100%. BBS had the highest sensitivity and negative predictive value. As the TUG showed a specificity of 100%, the positive likelihood ratio could not be derived and is therefore not listed in the table.
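The clinimetric quantities above follow from a 2x2 table of CSF-TT response against surgical response. The Python sketch below uses the standard definitions and shows why the positive likelihood ratio is undefined when specificity is 100%: its denominator, 1 - specificity, is zero. The function name and the counts are hypothetical and are not taken from Table 3.

```python
def clinimetrics(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures from a 2x2 table.

    tp: reached MCID at CSF-TT and after surgery
    fp: reached MCID at CSF-TT but not after surgery
    fn: did not reach MCID at CSF-TT but did after surgery
    tn: did not reach MCID at either time point
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp) if (tp + fp) > 0 else None
    npv = tn / (tn + fn) if (tn + fn) > 0 else None
    # LR+ = sensitivity / (1 - specificity); undefined when specificity is 1.
    lr_pos = sensitivity / (1 - specificity) if specificity < 1 else None
    # LR- = (1 - sensitivity) / specificity; undefined when specificity is 0.
    lr_neg = (1 - sensitivity) / specificity if specificity > 0 else None
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
    }

# Hypothetical counts (not study data) with no false positives.
result = clinimetrics(tp=12, fp=0, fn=8, tn=14)
```

With no false positives, specificity and positive predictive value are both 1.0 and `lr_pos` is returned as `None` rather than raising a division error.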
One extreme outlier data point existed in the TUG data. This was a real observation with no measurement or data entry error. However, this data point was more than three standard deviations away from the nearest data point. Comparison of results with and without the outlier showed that the outlier had a significant influence on correlation and linear regression modelling. Hence, the results without the outlier were used for reporting the correlation and linear regression model analysis for the TUG.

Correlation between CSF-TT and post-surgery change scores
Figure 1 presents the scatter plots for the change scores of the individual outcomes after surgery with respect to the CSF-TT. Statistically significant correlations were evident between changes from CSF-TT and from VP shunt surgery for Tinetti (r = 0.41, p = 0.02, 95% CI = 0.08 to 0.67) and BBS (r = 0.61, p < 0.01, 95% CI = 0.34 to 0.72) scores. The correlation for the TUG scores (r = 0.32) had no statistical significance (p = 0.16, 95% CI = -0.12 to 0.65) with the outlier removed.
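For readers who wish to see how a correlation coefficient and its confidence interval can be obtained, the Python sketch below computes Pearson's r and an approximate interval using the Fisher z-transformation. The text does not state which interval method was used, so the Fisher approach is an assumption here, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

def fisher_ci(r, n, level=0.95):
    """Approximate confidence interval for r via the Fisher z-transformation.

    Requires |r| < 1 and n > 3.
    """
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)

# Interval for a correlation of 0.41 in a sample of 34.
low, high = fisher_ci(0.41, 34)
```

Under this assumed method, r = 0.41 with n = 34 gives an interval of roughly 0.08 to 0.66.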

Comparison between predictive models of outcome measures
Linear regression modelling using changes from the CSF-TT to predict post-surgery changes was statistically significant for Tinetti (R² = 0.14, p = 0.02, Akaike information criterion (AIC) = 178.49) and BBS (R² = 0.36, p < 0.01, AIC = 179.96). Of the three outcome measures, changes in BBS scores provided the best 'goodness of fit' to the linear regression model, with the highest coefficient of determination (R² = 0.36). With the outlier removed, the TUG did not show a statistically significant linear regression model (R² = 0.05, p = 0.16, AIC = 156.4). The linear models met all assumptions for linear regression.
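The quantities reported for each linear model can be illustrated with a minimal Python sketch of simple least-squares regression that returns the slope, intercept, R² and AIC. The AIC here is computed from the Gaussian log-likelihood with three estimated parameters (intercept, slope and residual variance), which is assumed to match the convention used by R for linear models. The function name and the example data are hypothetical and are not study data.

```python
import math

def fit_simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x, with R^2 and AIC.

    Assumes at least two distinct x values and a non-zero residual sum of squares.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - rss / tss
    # Gaussian log-likelihood evaluated at the maximum-likelihood variance rss / n.
    log_lik = -0.5 * n * (math.log(2 * math.pi) + math.log(rss / n) + 1)
    k = 3  # intercept, slope and residual variance
    aic = 2 * k - 2 * log_lik
    return {"slope": slope, "intercept": intercept, "r_squared": r_squared, "aic": aic}

# Hypothetical change scores: x = change after CSF-TT, y = change after surgery.
fit = fit_simple_ols([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
```

For these illustrative values the fitted slope is 0.6 and R² is 0.6, meaning the CSF-TT change explains 60% of the variance in the post-surgery change.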
Table 4 shows the results of the logistic regression analysis. The models were statistically significant for Tinetti (p = 0.02) and BBS (p < 0.01) to predict post-surgery response using MCIDs from the CSF-TT. The logistic regression model for the TUG did not reach statistical significance, either with (p = 0.06) or without the outlier (p = 0.07).
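The idea behind the MCID-based models can be sketched in Python by dichotomising change scores at the MCID and computing the odds ratio from the resulting 2x2 table. For a logistic regression with a single binary predictor, the exponentiated coefficient equals this cross-product odds ratio. The sketch assumes the predictor enters the model as a binary MCID indicator, which the text does not specify. The threshold of 3 points, the change scores and the function names are hypothetical.

```python
def reached_mcid(change_scores, mcid):
    """Flag each change score that is equal to or greater than the MCID."""
    return [score >= mcid for score in change_scores]

def odds_ratio(predictor, outcome):
    """Cross-product odds ratio from two equal-length lists of booleans.

    Returns None when a zero cell makes the ratio undefined.
    """
    a = sum(p and o for p, o in zip(predictor, outcome))          # both reached MCID
    b = sum(p and not o for p, o in zip(predictor, outcome))      # CSF-TT only
    c = sum(not p and o for p, o in zip(predictor, outcome))      # surgery only
    d = sum(not p and not o for p, o in zip(predictor, outcome))  # neither
    if b == 0 or c == 0:
        return None
    return (a * d) / (b * c)

# Hypothetical improvement scores (higher = better) and a hypothetical MCID of 3.
tap_change = [4, 5, 1, 0, 6, 2, 3, 0]
surgery_change = [5, 4, 2, 1, 1, 3, 4, 0]
tap_flag = reached_mcid(tap_change, mcid=3)
surgery_flag = reached_mcid(surgery_change, mcid=3)
ratio = odds_ratio(tap_flag, surgery_flag)
```

An odds ratio above 1 indicates that reaching the MCID at the CSF-TT is associated with higher odds of reaching it after surgery. For measures such as the TUG, where a lower time is better, the change score would first need its sign reversed.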

Accuracy of outcome measures
Classification tree analysis compared CSF-TT changes in individual items of Tinetti and BBS outcomes with the changes from VP shunt surgery, as shown in Figure 2. Model 1 provides the visual representation of the classification tree for Tinetti outcomes post-surgery. Change scores in the items of immediate standing balance, sitting down, trunk sway change and nudging were identified as contributing to surgical change score classification based on MCIDs.
Nine of the 10 patients who had an improvement in their immediate standing balance on the Tinetti scale after the CSF-TT improved by the MCID after surgery. Of the patients who did not improve in standing balance, none reached MCID after surgery if their sitting balance score had not improved after CSF-TT. Four of the six patients who had not improved standing balance but had improved their sitting down, trunk sway change and nudge scores after CSF-TT showed improvement exceeding MCID after the surgery. The classification tree analysis for Tinetti showed an accuracy of 79% with these 4 items. The accuracy of a model containing these features of the Tinetti was unchanged from a model with all features.

The classification tree for BBS outcomes is shown as Model 2 in Figure 2. Only two items on the BBS, picking up object and forward reaching, were identified as contributing factors in determining surgical outcomes exceeding MCID. Ten of the 11 patients who improved on the picking up object score improved by the MCID post-surgery. Of the 7 patients who did not improve on the picking up object score but did improve on the forward reach score, 5 exceeded the MCID post-surgery. The classification tree analysis for BBS showed an accuracy of 76% with these two features.
As with the Tinetti, the accuracy of the classification tree did not improve with the addition of all features.

Discussion
This is the first study to use predictive modelling techniques to demonstrate evidence of a predictive relationship between the changes from CSF-TT and VP shunt surgery in iNPH patients. Specifically, Tinetti and BBS outcomes from CSF-TT have a predictive value for surgical outcomes in iNPH patients. These findings were consistent across the various techniques used for the analysis, thus providing additional confirmation of the results and strengthening the findings of this study.
A statistically significant linear relationship was identified for Tinetti and BBS, suggesting that the changes in these outcomes after CSF-TT had the potential to predict proportional changes from surgery. Based on our hypothesis, we were further interested in finding whether the changes in the outcome scores after surgery are equal to or higher than MCIDs, rather than the magnitude of the actual change alone. For these two outcomes, the odds of achieving meaningful post-surgical changes increased provided changes from CSF-TT had reached MCIDs.

BBS had the highest negative predictive value of 77%, suggesting a greater ability to identify those who may not have meaningful improvements after the surgery. According to a systematic review by Mihalj et al., the highest known negative predictive value for the CSF-TT has been 50% [9]. This was based on two studies included in the review that used gait outcome measures in addition to CSF opening pressure, cognitive and continence outcome measures. The positive likelihood ratio for all three outcomes indicated an increase in the probability of achieving clinically important changes after VP shunt surgery if MCIDs for TUG, Tinetti or BBS were reached following the CSF-TT. This probability decreased if the CSF-TT did not result in changes exceeding MCIDs for any of these outcomes, as indicated by the negative likelihood ratio of 0.4 or lower.
The TUG did not produce a statistically significant predictive model using either linear or logistic regression; however, with specificity and positive predictive values of 100%, it did demonstrate substantial clinimetric properties that can be useful in determining whether a patient will improve after surgery. Earlier, Nikaido et al. had also identified the TUG as a useful tool with high potential to achieve an MCID following shunt surgery in severe iNPH cases, suggesting that it can be applied to a CSF-TT to determine the potential effect of shunting [14]. It is likely that the sample size in this study was inadequate to determine significance and draw a firm conclusion on the predictive ability of the TUG.
It would be worthwhile to explore the usefulness of this outcome by analysing data from a larger sample in future studies.
Although the predictive regression models for Tinetti and BBS were significant, the model fit was low, as indicated by high values of AIC in the regression analysis, suggesting variations between the outcomes of CSF-TT and those from surgery. This indicated that other unmeasured variables contributed to explaining the variance present in surgical change scores. Multivariate analysis including age and number of symptoms with change scores was explored; however, no significant multivariate model was identified. One explanation for this may be the small sample size, which did not provide enough data to adequately power the multivariate analysis. In addition, there may be additional factors such as post-surgical effects, duration of measurement from surgery and comorbidities influencing the gait and balance outcomes that could not be accounted for in this analysis. Nonetheless, it is important to note that irrespective of the low model fit, the positive relationship established between the CSF-TT and surgical change scores for Tinetti and BBS was significant and hence provides useful predictive information. BBS and Tinetti had similar AIC values for linear regression; however, BBS had better model fit for logistic regression, with a lower AIC value. This indicates that, compared to Tinetti, BBS would show less variation from the CSF-TT in predicting whether surgical outcomes will be meaningful or not. Because the regression models for the TUG were not significant, the predictive use of this outcome remained inconclusive, despite its lower AIC value in logistic regression analysis.
The classification tree analysis resulted in predictive accuracies of 79% using Tinetti and 76% using BBS, which were higher than the earlier reported accuracy of 62% for the CSF-TT [9]. In addition, 12% of patients in this study who had clinically important changes from CSF-TT did not show meaningful improvements after the surgery, as compared to the earlier reports of 29% of iNPH patients not improving after surgery [6]. The classification tree provided a visual representation that was easier to understand and interpret. These findings supported the results of the regression analyses and confirmed the predictive ability of Tinetti and BBS. The analysis also provided dimensionality reduction, identifying 4 items on Tinetti and 2 items on BBS that can predict meaningful changes from surgery. The negligible difference between the accuracy of the model with all features of the BBS or Tinetti and that of the smaller feature set could suggest that this subset may prove to have the highest clinical utility.

The Tinetti items which were found not to add to the accuracy of the classification tree model were: sitting balance, arising from a chair, immediate standing balance, eyes closed in standing, turning 360 degrees, and reaching forward with outstretched arm from the balance component, and initiation of gait, step length and height, foot clearance, step symmetry, path and walking time from the gait component. For the BBS, the non-predictive items were: sitting to standing, standing unsupported, sitting unsupported, standing to sitting, controlled descent from standing to sitting, transfers, standing with eyes closed, standing with feet together, turning 360 degrees, foot placement during standing, sitting to standing without the use of hands, standing on one leg and tandem stance.
Since the study was aimed at identifying the predictive value of outcomes from the CSF-TT, the individual items identified in the classification tree analysis were not investigated further.
However, these findings would be useful for future research to explore the validity of shorter versions of Tinetti and BBS with specific items that can improve predictive accuracy for surgical outcomes.

Limitations
One of the key limitations of this study was the small sample size, which provided limited data points for analysis. As a result, multivariate regression, which could identify the influence of other factors affecting post-surgical outcomes, could not be explored. Additionally, the study was limited to immediate outcomes from surgery. Long-term effects could not be determined due to the lack of follow-up. It could not be explored whether the outcomes following surgery were affected by post-operative complications, disease duration and disease severity. The use of multiple techniques to confirm our results helped to overcome this limitation to a certain extent; however, it would be beneficial to explore these outcomes using data from a large, longitudinal data set.

Conclusion
Clinically important changes in measures following a CSF-TT are useful in predicting post-surgical outcomes in iNPH patients. Our analysis found that while all three outcomes showed useful clinimetrics, Tinetti and BBS both have significant predictive value using the MCID scores, of which BBS proved to be the stronger outcome for prediction. These findings provide clinical relevance for the objective interpretation of CSF-TT outcomes. Using the MCID cut-off scores, clinicians would know the level of change required from the CSF-TT to support decision making for iNPH patient selection, identifying those who are most likely to benefit from surgery and avoiding surgery in those who are less likely to have any meaningful gains. Clinicians can consider the use of BBS and Tinetti as potential outcome measures with predictive ability to identify meaningful changes in iNPH patients.
Future studies with larger sample size and long-term observations are required to further strengthen these findings and explore additional factors that may influence surgical outcomes in iNPH patients.

Table 1 - Characteristics of the study sample (n = 34)
Age in years, mean (SD)

Table 2 - CSF tap test change scores compared to surgery change scores, and number of participants achieving MCID at the tap test compared to achieving MCID post-surgery.