PLoS Medicine

PLOS Medicine: New Articles
A Peer-Reviewed Open-Access Journal

Reducing chronic disease through changes in food aid: A microsimulation of nutrition and cardiometabolic disease among Palestinian refugees in the Middle East

Tue, 20/11/2018 - 23:00

by Sanjay Basu, John S. Yudkin, Seth A. Berkowitz, Mohammed Jawad, Christopher Millett

Background

Type 2 diabetes mellitus and cardiovascular disease have become leading causes of morbidity and mortality among Palestinian refugees in the Middle East, many of whom live in long-term settlements and receive grain-based food aid. The objective of this study was to estimate changes in type 2 diabetes and cardiovascular disease morbidity and mortality attributable to a transition from traditional food aid to either (i) a debit card restricted to food purchases, (ii) cash, or (iii) an alternative food parcel with less grain and more fruits and vegetables, each valued at $30/person/month.

Methods and findings

An individual-level microsimulation was created to estimate relationships between food aid delivery method, food consumption, type 2 diabetes, and cardiovascular disease morbidity and mortality using demographic data from the United Nations (UN; 2017) on 5,340,443 registered Palestinian refugees in Syria, Jordan, Lebanon, Gaza, and the West Bank, food consumption data (2011–2017) from households receiving traditional food parcel delivery of food aid (n = 1,507 households) and electronic debit card delivery of food aid (n = 1,047 households), and health data from a random 10% sample of refugees receiving medical care through the UN (2012–2015; n = 516,386). Outcome metrics included incidence per 1,000 person-years of hypertension, type 2 diabetes, atherosclerotic cardiovascular disease events, microvascular events (end-stage renal disease, diabetic neuropathy, and proliferative diabetic retinopathy), and all-cause mortality. The model estimated changes in total calories, sodium and potassium intake, fatty acid intake, and overall dietary quality (Mediterranean Dietary Score [MDS]) as mediators of each outcome metric. We did not observe that a change from food parcel to electronic debit card delivery of food aid or to cash aid led to a meaningful change in consumption, biomarkers, or disease outcomes. By contrast, a shift to an alternative food parcel with less grain and more fruits and vegetables was estimated to produce a 0.08 per 1,000 person-years decrease in the incidence of hypertension (95% confidence interval [CI] 0.05–0.11), a 0.18 per 1,000 person-years decrease in the incidence of type 2 diabetes (95% CI 0.14–0.22), a 0.18 per 1,000 person-years decrease in the incidence of atherosclerotic cardiovascular disease events (95% CI 0.17–0.19), and a 0.02 per 1,000 person-years decrease in all-cause mortality (95% CI 0.01 decrease to 0.04 increase) among those receiving aid. The benefits of this shift, however, could be neutralized by a small (2%) compensatory (out-of-pocket) increase in consumption of refined grains, fats and oils, or confectionaries. A larger alternative parcel requiring an increase in total food aid expenditure of 27% would be more likely to produce a clinically meaningful improvement in type 2 diabetes and cardiovascular disease incidence.
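
As a rough illustration of the mediation logic described above (a dietary shift alters intermediate risk factors, which in turn alter disease incidence), the Python sketch below runs a toy individual-level simulation; the population, coefficients, and risk equation are invented placeholders rather than values from the study.

import numpy as np

rng = np.random.default_rng(0)
n = 100_000                                   # simulated aid recipients (illustrative)

# Baseline systolic blood pressure (mm Hg), illustrative distribution
sbp = rng.normal(125, 15, n)

# Hypothetical dietary shift under an alternative food parcel:
# less sodium, higher Mediterranean Dietary Score (MDS)
delta_sodium_g = rng.normal(-0.4, 0.1, n)     # g/day, assumed
delta_mds = rng.normal(0.3, 0.1, n)           # MDS points, assumed

# Assumed mediation coefficients (mm Hg per unit change) -- placeholders
sbp_shift = 2.0 * delta_sodium_g - 0.5 * delta_mds

def hypertension_incidence_per_1000(sbp):
    """Toy annual risk model: log-odds of incident hypertension rise with SBP."""
    p = 1 / (1 + np.exp(-(-10.0 + 0.05 * sbp)))
    return 1000 * p.mean()

baseline = hypertension_incidence_per_1000(sbp)
alternative = hypertension_incidence_per_1000(sbp + sbp_shift)
print(f"Estimated change: {alternative - baseline:+.2f} per 1,000 person-years")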

Conclusions

Contrary to the supposition in the literature, our findings do not robustly support the theory that transitioning from traditional food aid to either debit card or cash delivery alone would necessarily reduce chronic disease outcomes. Rather, an alternative food parcel would be more effective, even when matched to current budget ceilings. However, compensatory increases in consumption of less healthy foods may neutralize the improvements from an alternative food parcel unless total aid funding were increased substantially. Our analysis is limited by the uncertainty of modeling long-term outcomes from shorter-term trials, by its focus on diabetes and cardiovascular outcomes for which validated equations are available rather than on all nutrition-associated health outcomes, and by its use of data from food frequency questionnaires in the absence of 24-hour dietary recall data.

Healthy volunteers' perceptions of risk in US Phase I clinical trials: A mixed-methods study

Tue, 20/11/2018 - 23:00

by Jill A. Fisher, Lisa McManus, Marci D. Cottingham, Julianne M. Kalbaugh, Megan M. Wood, Torin Monahan, Rebecca L. Walker

Background

There is limited research on healthy volunteers’ perceptions of the risks of Phase I clinical trials. In order to contribute empirically to long-standing ethical concerns about healthy volunteers’ involvement in drug development, it is crucial to assess how these participants understand trial risks. The objectives of this study were to investigate (1) participants’ views of the overall risks of Phase I trials, (2) their views of the risk of personally being harmed in a trial, and (3) how risk perceptions vary across participants’ clinical trial history and sociodemographic characteristics.

Methods and findings

We qualitatively and quantitatively analyzed semi-structured interviews conducted with 178 healthy volunteers who had participated in a diverse range of Phase I trials in the United States. Participants had collective experience in a reported 1,948 Phase I trials (mean = 10.9; median = 5), and they were interviewed as part of a longitudinal study of healthy volunteers’ risk perceptions, their trial enrollment decisions, and their routine health behaviors. Participants’ qualitative responses were coded, analyzed, and subsequently quantified in order to assess correlations between their risk perceptions and demographics, such as their race/ethnicity, gender, age, educational attainment, employment status, and household income. We found that healthy volunteers often viewed the overall risks of Phase I trials differently than their own personal risk of harm. The majority of our participants thought that Phase I trials were medium, high, or extremely high risk (118 of 178), but most nonetheless felt that they were personally safe from harm (97 of 178). We also found that healthy volunteers in their first year of clinical trial participation, racial and ethnic minority participants, and Hispanic participants tended to view the overall trial risks as high (respectively, Jonckheere-Terpstra, −2.433, p = 0.015; Fisher exact test, p = 0.016; Fisher exact test, p = 0.008), but these groups did not differ in regard to their perceptions of personal risk of harm (respectively, chi-squared, 3.578, p = 0.059; chi-squared, 0.845, p = 0.358; chi-squared, 1.667, p = 0.197). The main limitation of our study comes from quantitatively aggregating data from in-depth interviews, which required the research team to interpret participants’ nonstandardized risk narratives.
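
The contingency-table comparisons reported above (Fisher exact and chi-squared tests of risk perception by participant group) can be reproduced mechanically with SciPy; the 2×2 counts below are invented for demonstration and are not the study's data.

import numpy as np
from scipy.stats import fisher_exact, chi2_contingency

# Rows: first-year vs. experienced participants (hypothetical counts)
# Columns: rated overall Phase I trial risk as high vs. not high
table = np.array([[30, 15],
                  [88, 45]])

odds_ratio, p_fisher = fisher_exact(table)
chi2, p_chi2, dof, expected = chi2_contingency(table, correction=False)

print(f"Fisher exact: OR = {odds_ratio:.2f}, p = {p_fisher:.3f}")
print(f"Chi-squared:  chi2 = {chi2:.3f}, df = {dof}, p = {p_chi2:.3f}")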

Conclusions

Our study demonstrates that healthy volunteers are generally aware of and reflective about Phase I trial risks. The discrepancy in healthy volunteers’ views of overall and personal risk sheds light on why healthy volunteers might continue to enroll in clinical trials, even when they view trials on the whole as risky.

Automated detection of moderate and large pneumothorax on frontal chest X-rays using deep convolutional neural networks: A retrospective study

Tue, 20/11/2018 - 23:00

by Andrew G. Taylor, Clinton Mielke, John Mongan

Background

Pneumothorax can precipitate a life-threatening emergency due to lung collapse and respiratory or circulatory distress. Pneumothorax is typically detected on chest X-ray; however, treatment is reliant on timely review of radiographs. Since current imaging volumes may result in long worklists of radiographs awaiting review, an automated method of prioritizing X-rays with pneumothorax may reduce time to treatment. Our objective was to create a large human-annotated dataset of chest X-rays containing pneumothorax and to train deep convolutional networks to screen for potentially emergent moderate or large pneumothorax at the time of image acquisition.

Methods and findings

In all, 13,292 frontal chest X-rays (3,107 with pneumothorax) were visually annotated by radiologists. This dataset was used to train and evaluate multiple network architectures. Images showing large- or moderate-sized pneumothorax were considered positive, and those with trace or no pneumothorax were considered negative. Images showing small pneumothorax were excluded from training. Using an internal validation set (n = 1,993), we selected the 2 top-performing models; these models were then evaluated on a held-out internal test set based on area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and positive predictive value (PPV). The final internal test was performed initially on a subset with small pneumothorax excluded (as in training; n = 1,701), then on the full test set (n = 1,990), with small pneumothorax included as positive. External evaluation was performed using the National Institutes of Health (NIH) ChestX-ray14 set, a public dataset labeled for chest pathology based on text reports. All images labeled with pneumothorax were considered positive, because the NIH set does not classify pneumothorax by size. In internal testing, our “high sensitivity model” produced a sensitivity of 0.84 (95% CI 0.78–0.90), specificity of 0.90 (95% CI 0.89–0.92), and AUC of 0.94 for the test subset with small pneumothorax excluded. Our “high specificity model” showed sensitivity of 0.80 (95% CI 0.72–0.86), specificity of 0.97 (95% CI 0.96–0.98), and AUC of 0.96 for this set. PPVs were 0.45 (95% CI 0.39–0.51) and 0.71 (95% CI 0.63–0.77), respectively. Internal testing on the full set showed expected decreased performance (sensitivity 0.55, specificity 0.90, and AUC 0.82 for high sensitivity model and sensitivity 0.45, specificity 0.97, and AUC 0.86 for high specificity model). External testing using the NIH dataset showed some further performance decline (sensitivity 0.28–0.49, specificity 0.85–0.97, and AUC 0.75 for both). Due to labeling differences between internal and external datasets, these findings represent a preliminary step towards external validation.
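
The operating-point metrics quoted above (sensitivity, specificity, PPV, and AUC) follow directly from a model's scores and a chosen threshold; the scikit-learn sketch below shows the calculation on synthetic labels and scores, not on the study's models or data.

import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

rng = np.random.default_rng(42)
y_true = rng.integers(0, 2, 1000)                                    # synthetic ground truth
scores = np.clip(0.4 * y_true + rng.normal(0.4, 0.25, 1000), 0, 1)   # synthetic model scores

auc = roc_auc_score(y_true, scores)
y_pred = (scores >= 0.6).astype(int)                                 # example operating threshold
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)
print(f"AUC = {auc:.2f}, sensitivity = {sensitivity:.2f}, "
      f"specificity = {specificity:.2f}, PPV = {ppv:.2f}")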

Conclusions

We trained automated classifiers to detect moderate and large pneumothorax in frontal chest X-rays at high levels of performance on held-out test data. These models may provide a high specificity screening solution to detect moderate or large pneumothorax on images collected when human review might be delayed, such as overnight. They are not intended for unsupervised diagnosis of all pneumothoraces, as many small pneumothoraces (and some larger ones) are not detected by the algorithm. Implementation studies are warranted to develop appropriate, effective clinician alerts for the potentially critical finding of pneumothorax, and to assess their impact on reducing time to treatment.

The use of machine learning to understand the relationship between IgE to specific allergens and asthma

Tue, 20/11/2018 - 23:00

by Thomas A. E. Platts-Mills, Matthew Perzanowski

Thomas Platts-Mills and Matthew Perzanowski provide their expert Perspective on a translational study from Custovic and colleagues that identifies pairings of IgE that show value in estimating risk of concurrent asthma.

Predicting the risk of emergency admission with machine learning: Development and validation using linked electronic health records

Tue, 20/11/2018 - 23:00

by Fatemeh Rahimian, Gholamreza Salimi-Khorshidi, Amir H. Payberah, Jenny Tran, Roberto Ayala Solares, Francesca Raimondi, Milad Nazarzadeh, Dexter Canoy, Kazem Rahimi

Background

Emergency admissions are a major source of healthcare spending. We aimed to derive, validate, and compare conventional and machine learning models for prediction of the first emergency admission. Machine learning methods are capable of capturing complex interactions that are likely to be present when predicting less specific outcomes, such as this one.

Methods and findings

We used longitudinal data from linked electronic health records of 4.6 million patients aged 18–100 years from 389 practices across England between 1985 and 2015. The population was divided into a derivation cohort (80%, 3.75 million patients from 300 general practices) and a validation cohort (20%, 0.88 million patients from 89 general practices) from geographically distinct regions with different risk levels. We first replicated a previously reported Cox proportional hazards (CPH) model for prediction of the risk of the first emergency admission up to 24 months after baseline. This reference model was then compared with 2 machine learning models, random forest (RF) and gradient boosting classifier (GBC). The initial set of predictors for all models included 43 variables, including patient demographics, lifestyle factors, laboratory tests, currently prescribed medications, selected morbidities, and previous emergency admissions. We then added 13 more variables (marital status, prior general practice visits, and 11 additional morbidities), and also enriched all variables by incorporating temporal information whenever possible (e.g., time since first diagnosis). We also varied the prediction windows to 12, 36, 48, and 60 months after baseline and compared model performances. For internal validation, we used 5-fold cross-validation. When the initial set of variables was used, GBC outperformed RF and CPH, with an area under the receiver operating characteristic curve (AUC) of 0.779 (95% CI 0.777, 0.781), compared to 0.752 (95% CI 0.751, 0.753) and 0.740 (95% CI 0.739, 0.741), respectively. In external validation, we observed an AUC of 0.796, 0.736, and 0.736 for GBC, RF, and CPH, respectively. The addition of temporal information improved AUC across all models. In internal validation, the AUC rose to 0.848 (95% CI 0.847, 0.849), 0.825 (95% CI 0.824, 0.826), and 0.805 (95% CI 0.804, 0.806) for GBC, RF, and CPH, respectively, while the AUC in external validation rose to 0.826, 0.810, and 0.788, respectively. This enhancement also resulted in robust predictions for longer time horizons, with AUC values remaining at similar levels across all models. Overall, compared to the baseline reference CPH model, the final GBC model showed a 10.8% higher AUC (0.848 compared to 0.740) for prediction of risk of emergency admission within 24 months. GBC also showed the best calibration throughout the risk spectrum. Despite the wide range of variables included in models, our study was still limited by the number of variables included; inclusion of more variables could have further improved model performance.
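
A minimal sketch of the model comparison described above (gradient boosting versus random forest scored by cross-validated AUC) is shown below using scikit-learn on synthetic data; the study's EHR feature definitions, Cox baseline model, and tuning are not reproduced here.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for the 43-predictor feature matrix with a rare outcome
X, y = make_classification(n_samples=20_000, n_features=43, n_informative=15,
                           weights=[0.9, 0.1], random_state=0)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
models = {
    "GBC": GradientBoostingClassifier(random_state=0),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
}
for name, model in models.items():
    auc = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
    print(f"{name}: mean AUC = {auc.mean():.3f} (SD {auc.std():.3f})")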

Conclusions

The use of machine learning and addition of temporal information led to substantially improved discrimination and calibration for predicting the risk of emergency admission. Model performance remained stable across a range of prediction time windows and when externally validated. These findings support the potential of incorporating machine learning models into electronic health records to inform care and service planning.

Deep learning for chest radiograph diagnosis: A retrospective comparison of the CheXNeXt algorithm to practicing radiologists

Tue, 20/11/2018 - 23:00

by Pranav Rajpurkar, Jeremy Irvin, Robyn L. Ball, Kaylie Zhu, Brandon Yang, Hershel Mehta, Tony Duan, Daisy Ding, Aarti Bagul, Curtis P. Langlotz, Bhavik N. Patel, Kristen W. Yeom, Katie Shpanskaya, Francis G. Blankenberg, Jayne Seekins, Timothy J. Amrhein, David A. Mong, Safwan S. Halabi, Evan J. Zucker, Andrew Y. Ng, Matthew P. Lungren

Background

Chest radiograph interpretation is critical for the detection of thoracic diseases, including tuberculosis and lung cancer, which affect millions of people worldwide each year. This time-consuming task typically requires expert radiologists to read the images, leading to fatigue-based diagnostic error and lack of diagnostic expertise in areas of the world where radiologists are not available. Recently, deep learning approaches have been able to achieve expert-level performance in medical image interpretation tasks, powered by large network architectures and fueled by the emergence of large labeled datasets. The purpose of this study is to investigate the performance of a deep learning algorithm on the detection of pathologies in chest radiographs compared with practicing radiologists.

Methods and findings

We developed CheXNeXt, a convolutional neural network to concurrently detect the presence of 14 different pathologies, including pneumonia, pleural effusion, pulmonary masses, and nodules in frontal-view chest radiographs. CheXNeXt was trained and internally validated on the ChestX-ray8 dataset, with a held-out validation set consisting of 420 images, sampled to contain at least 50 cases of each of the original pathology labels. On this validation set, the majority vote of a panel of 3 board-certified cardiothoracic specialist radiologists served as reference standard. We compared CheXNeXt’s discriminative performance on the validation set to the performance of 9 radiologists using the area under the receiver operating characteristic curve (AUC). The radiologists included 6 board-certified radiologists (average experience 12 years, range 4–28 years) and 3 senior radiology residents, from 3 academic institutions. We found that CheXNeXt achieved radiologist-level performance on 11 pathologies and did not achieve radiologist-level performance on 3 pathologies. The radiologists achieved statistically significantly higher AUC performance on cardiomegaly, emphysema, and hiatal hernia, with AUCs of 0.888 (95% confidence interval [CI] 0.863–0.910), 0.911 (95% CI 0.866–0.947), and 0.985 (95% CI 0.974–0.991), respectively, whereas CheXNeXt’s AUCs were 0.831 (95% CI 0.790–0.870), 0.704 (95% CI 0.567–0.833), and 0.851 (95% CI 0.785–0.909), respectively. CheXNeXt performed better than radiologists in detecting atelectasis, with an AUC of 0.862 (95% CI 0.825–0.895), statistically significantly higher than radiologists' AUC of 0.808 (95% CI 0.777–0.838); there were no statistically significant differences in AUCs for the other 10 pathologies. The average time to interpret the 420 images in the validation set was substantially longer for the radiologists (240 minutes) than for CheXNeXt (1.5 minutes). The main limitations of our study are that neither CheXNeXt nor the radiologists were permitted to use patient history or review prior examinations and that evaluation was limited to a dataset from a single institution.
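
A common way to set up a 14-label chest radiograph classifier of the kind described above, though not necessarily the authors' exact architecture, is a DenseNet-121 backbone with a sigmoid multi-label head; the PyTorch sketch below is an illustrative skeleton, not the published CheXNeXt code.

import torch
import torch.nn as nn
from torchvision import models

NUM_PATHOLOGIES = 14

# DenseNet-121 backbone with a 14-way multi-label head (illustrative)
net = models.densenet121()
net.classifier = nn.Linear(net.classifier.in_features, NUM_PATHOLOGIES)
criterion = nn.BCEWithLogitsLoss()        # one sigmoid output per pathology

x = torch.randn(2, 3, 224, 224)           # dummy batch of frontal radiographs
labels = torch.randint(0, 2, (2, NUM_PATHOLOGIES)).float()
logits = net(x)                           # shape: (2, 14)
loss = criterion(logits, labels)
probs = torch.sigmoid(logits)             # per-pathology probabilities
print(loss.item(), probs.shape)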

Conclusions

In this study, we developed and validated a deep learning algorithm that classified clinically important abnormalities in chest radiographs at a performance level comparable to practicing radiologists. Once tested prospectively in clinical settings, the algorithm could have the potential to expand patient access to chest radiograph diagnostics.

Machine learning assessment of myocardial ischemia using angiography: Development and retrospective validation

Tue, 13/11/2018 - 23:00

by Hyeonyong Hae, Soo-Jin Kang, Won-Jang Kim, So-Yeon Choi, June-Goo Lee, Youngoh Bae, Hyungjoo Cho, Dong Hyun Yang, Joon-Won Kang, Tae-Hwan Lim, Cheol Hyun Lee, Do-Yoon Kang, Pil Hyung Lee, Jung-Min Ahn, Duk-Woo Park, Seung-Whan Lee, Young-Hak Kim, Cheol Whan Lee, Seong-Wook Park, Seung-Jung Park

Background

Invasive fractional flow reserve (FFR) is a standard tool for identifying ischemia-producing coronary stenosis. However, in clinical practice, over 70% of treatment decisions still rely on visual estimation of angiographic stenosis, which has limited accuracy (about 60%–65%) for the prediction of FFR < 0.80. One of the reasons for the visual–functional mismatch is that myocardial ischemia can be affected by the supplied myocardial size, which is not always evident by coronary angiography. The aims of this study were to develop an angiography-based machine learning (ML) algorithm for predicting the supplied myocardial volume for a stenosis, as measured using coronary computed tomography angiography (CCTA), and then to build an angiography-based classifier for the lesions with an FFR < 0.80 versus ≥ 0.80.

Methods and findings

A retrospective study was conducted using data from 1,132 stable and unstable angina patients with 1,132 intermediate lesions who underwent invasive coronary angiography, FFR, and CCTA at the Asan Medical Center, Seoul, Korea, between 1 May 2012 and 30 November 2015. The mean age was 63 ± 10 years, 76% were men, and 72% of the patients presented with stable angina. Of these, 932 patients (assessed before 31 January 2015) constituted the training set for the algorithm, and 200 patients (assessed after 1 February 2015) served as a test cohort to validate its diagnostic performance. Additionally, external validation with 79 patients from two centers (CHA University, Seongnam, Korea, and Ajou University, Suwon, Korea) was conducted. After automatic contour calibration using the caliber of the guiding catheter, quantitative coronary angiography was performed using edge-detection algorithms (CAAS-5, Pie-Medical). Clinical information was provided by the Asan BiomedicaL Research Environment (ABLE) system. The CCTA-based myocardial segmentation (CAMS)-derived myocardial volume supplied by each vessel (right coronary artery [RCA], left anterior descending [LAD], left circumflex [LCX]) and the myocardial volume subtended to a stenotic segment (CAMS-%Vsub) were measured for labeling. ML models were constructed for (1) predicting vessel territories (CAMS-%LAD, CAMS-%LCX, and CAMS-%RCA) and CAMS-%Vsub and (2) identifying the lesions with an FFR < 0.80. Angiography-based ML, employing a light gradient boosting machine (GBM), showed mean absolute errors (MAEs) of 5.42%, 8.57%, and 4.54% for predicting CAMS-%LAD, CAMS-%LCX, and CAMS-%RCA, respectively. The percent myocardial volumes predicted by ML were used to predict the CAMS-%Vsub. With 5-fold cross validation, the MAEs between ML-predicted percent myocardial volume subtended to a stenotic segment (ML-%Vsub) and CAMS-%Vsub were minimized by the elastic net (6.26% ± 0.55% for LAD, 5.79% ± 0.68% for LCX, and 2.95% ± 0.14% for RCA lesions). Using all attributes (age, sex, involved vessel segment, and angiographic features affecting the myocardial territory and stenosis degree), the ML classifiers (L2 penalized logistic regression, support vector machine, and random forest) predicted an FFR < 0.80 with an accuracy of approximately 80% (area under the curve [AUC] = 0.84–0.87, 95% confidence intervals 0.71–0.94) in the test set, which was greater than that of diameter stenosis (DS) > 53% (66%, AUC = 0.71, 95% confidence intervals 0.65–0.78). The external validation showed 84% accuracy (AUC = 0.89, 95% confidence intervals 0.83–0.95). The retrospective design, single ethnicity, and the lack of clinical outcomes may limit this prediction model’s generalized application.
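
To illustrate the two-stage idea described above (regress the subtended myocardial volume with gradient boosting, then classify FFR < 0.80 using angiographic features plus the predicted volume), the sketch below wires up generic LightGBM and scikit-learn estimators on synthetic data; the feature definitions, elastic net step, and tuning used in the study are not reproduced.

import numpy as np
from lightgbm import LGBMRegressor
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import mean_absolute_error, roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 1000
X = rng.normal(size=(n, 12))                          # angiographic features (synthetic)
vsub = 20 + 5 * X[:, 0] + rng.normal(0, 3, n)         # % myocardium subtended (synthetic)
ffr_low = (0.3 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(0, 1, n)) > 0.5  # FFR < 0.80 label

X_tr, X_te, v_tr, v_te, y_tr, y_te = train_test_split(X, vsub, ffr_low, random_state=0)

# Stage 1: predict the subtended myocardial volume with a gradient boosting machine
vol_model = LGBMRegressor(n_estimators=300).fit(X_tr, v_tr)
print("MAE (%Vsub):", round(mean_absolute_error(v_te, vol_model.predict(X_te)), 2))

# Stage 2: classify lesions with FFR < 0.80 using features plus predicted volume
X_tr2 = np.column_stack([X_tr, vol_model.predict(X_tr)])
X_te2 = np.column_stack([X_te, vol_model.predict(X_te)])
clf = LogisticRegression(penalty="l2", max_iter=1000).fit(X_tr2, y_tr)
print("AUC (FFR < 0.80):", round(roc_auc_score(y_te, clf.predict_proba(X_te2)[:, 1]), 2))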

Conclusion

We found that angiography-based ML is useful for predicting subtended myocardial territories and ischemia-producing lesions by mitigating the visual–functional mismatch between angiography and FFR. Assessment of clinical utility requires further validation in a large, prospective cohort study.

Transforming health policy through machine learning

Tue, 13/11/2018 - 23:00

by Hutan Ashrafian, Ara Darzi

In their Perspective, Ara Darzi and Hutan Ashrafian give us a tour of the future policymaker's machine learning toolkit.

Machine learning to identify pairwise interactions between specific IgE antibodies and their association with asthma: A cross-sectional analysis within a population-based birth cohort

Tue, 13/11/2018 - 23:00

by Sara Fontanella, Clément Frainay, Clare S. Murray, Angela Simpson, Adnan Custovic

Background

The relationship between allergic sensitisation and asthma is complex; the data about the strength of this association are conflicting. We propose that the discrepancies arise in part because allergic sensitisation may not be a single entity (as considered conventionally) but a collection of several different classes of sensitisation. We hypothesise that pairings between immunoglobulin E (IgE) antibodies to individual allergenic molecules (components), rather than IgE responses to ‘informative’ molecules, are associated with increased risk of asthma.

Methods and findings

In a cross-sectional analysis among 461 children aged 11 years participating in a population-based birth cohort, we measured serum-specific IgE responses to 112 allergen components using a multiplex array (ImmunoCAP Immuno‑Solid phase Allergy Chip [ISAC]). We characterised sensitivity to 44 active components (specific immunoglobulin E [sIgE] > 0.30 units in at least 5% of children) among the 213 (46.2%) participants sensitised to at least one of these 44 components. We adopted several machine learning methodologies that offer a powerful framework to investigate the highly complex sIgE–asthma relationship. Firstly, we applied network analysis and hierarchical clustering (HC) to explore the connectivity structure of component-specific IgEs and identify clusters of component-specific sensitisation (‘component clusters’). Of the 44 components included in the model, 33 grouped in seven clusters (C.sIgE-1–7), and the remaining 11 formed singleton clusters. Cluster membership mapped closely to the structural homology of proteins and/or their biological source. Components in the pathogenesis-related (PR)-10 proteins cluster (C.sIgE-5) were central to the network and mediated connections between components from grass (C.sIgE-4), trees (C.sIgE-6), and profilin clusters (C.sIgE-7) with those in mite (C.sIgE-1), lipocalins (C.sIgE-3), and peanut clusters (C.sIgE-2). We then used HC to identify four common ‘sensitisation clusters’ among study participants: (1) multiple sensitisation (sIgE to multiple components across all seven component clusters and singleton components), (2) predominantly dust mite sensitisation (IgE responses mainly to components from C.sIgE-1), (3) predominantly grass and tree sensitisation (sIgE to multiple components across C.sIgE-4–7), and (4) lower-grade sensitisation. We used a bipartite network to explore the relationship between component clusters, sensitisation clusters, and asthma, and the joint density-based nonparametric differential interaction network analysis and classification (JDINAC) to test whether pairwise interactions of component-specific IgEs are associated with asthma. JDINAC with pairwise interactions provided a good balance between sensitivity (0.84) and specificity (0.87), and outperformed penalised logistic regression with individual sIgE components in predicting asthma, with an area under the curve (AUC) of 0.94, compared with 0.73. We then inferred the differential network of pairwise component-specific IgE interactions, which demonstrated that 18 pairs of components predicted asthma. These findings were confirmed in an independent sample of children aged 8 years who participated in the same birth cohort but did not have component-resolved diagnostics (CRD) data at age 11 years. The main limitation of our study was the exclusion of potentially important allergens caused by both the ISAC chip resolution as well as the filtering step. Clustering and the network analyses might have provided different solutions if additional components had been available.
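
The clustering step described above (grouping component-specific IgE responses by their co-occurrence structure) can be sketched with SciPy's hierarchical clustering on a correlation-based distance matrix; the data below are simulated and the cut height is arbitrary, so this illustrates the technique rather than the study's analysis.

import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

rng = np.random.default_rng(3)
n_children, n_components = 213, 44
# Simulated log-sIgE matrix with one artificially correlated block of components
sige = rng.normal(size=(n_children, n_components))
sige[:, :6] += rng.normal(size=(n_children, 1))

corr = np.corrcoef(sige, rowvar=False)        # component-by-component correlation
dist = 1 - np.abs(corr)                       # correlation-based distance
np.fill_diagonal(dist, 0.0)

Z = linkage(squareform(dist, checks=False), method="average")
clusters = fcluster(Z, t=0.8, criterion="distance")   # arbitrary cut height
print("Cluster sizes:", np.bincount(clusters)[1:])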

Conclusions

Interactions between pairs of sIgE components are associated with increased risk of asthma and may provide the basis for designing diagnostic tools for asthma.

Safety, tolerability, and pharmacokinetics of long-acting injectable cabotegravir in low-risk HIV-uninfected individuals: HPTN 077, a phase 2a randomized controlled trial

Thu, 08/11/2018 - 23:00

by Raphael J. Landovitz, Sue Li, Beatriz Grinsztejn, Halima Dawood, Albert Y. Liu, Manya Magnus, Mina C. Hosseinipour, Ravindre Panchia, Leslie Cottle, Gordon Chau, Paul Richardson, Mark A. Marzinke, Craig W. Hendrix, Susan H. Eshleman, Yinfeng Zhang, Elizabeth Tolley, Jeremy Sugarman, Ryan Kofron, Adeola Adeyeye, David Burns, Alex R. Rinehart, David Margolis, William R. Spreen, Myron S. Cohen, Marybeth McCauley, Joseph J. Eron

Background

Cabotegravir (CAB) is a novel strand-transfer integrase inhibitor being developed for HIV treatment and prevention. CAB is formulated both as an immediate-release oral tablet for daily administration and as a long-acting injectable suspension (long-acting CAB [CAB LA]) for intramuscular (IM) administration, which delivers prolonged plasma exposure to the drug after IM injection. HIV Prevention Trials Network study 077 (HPTN 077) evaluated the safety, tolerability, and pharmacokinetics of CAB LA in HIV-uninfected males and females at 8 sites in Brazil, Malawi, South Africa, and the United States.

Methods and findings

HPTN 077 was a double-blind, placebo-controlled phase 2a trial. Healthy individuals age 18–65 years at low HIV risk were randomized (3:1) to receive CAB or placebo (PBO). In the initial oral phase, participants received 1 daily oral tablet (CAB or PBO) for 4 weeks. Those without safety concerns in the oral phase continued and received injections in the injection phase (Cohort 1: 3 injections of CAB LA 800 mg or 0.9% saline as PBO IM every 12 weeks for 3 injection cycles; Cohort 2: CAB LA 600 mg or PBO IM for 5 injection cycles; the first 2 injections in Cohort 2 were separated by 4 weeks, the rest by 8 weeks). The primary analysis included weeks 5 to 41 of study participation, encompassing the injection phase. The cohorts were enrolled sequentially. Primary outcomes were safety and tolerability. Secondary outcomes included pharmacokinetics and events occurring during the oral and injection phases. Between February 9, 2015, and May 27, 2016, the study screened 443 individuals and enrolled 110 participants in Cohort 1 and 89 eligible participants in Cohort 2. Participant population characteristics were as follows: 66% female at birth; median age 31 years; 27% non-Hispanic white, 41% non-Hispanic black, 24% Hispanic/Latino, 3% Asian, and 6% mixed/other; and 6 transgender men and 1 transgender woman. Twenty-two (11%) participants discontinued the oral study product; 6 of these were for clinical or laboratory adverse events (AEs). Of those who received at least 1 CAB LA injection, 80% of Cohort 1 and 92% of Cohort 2 participants completed all injections; injection course completion rates were not different from those in the PBO arm. Injection site reactions (ISRs) were common (92% of Cohort 1 and 88% of Cohort 2 participants who received CAB LA reported any ISR). ISRs were mostly Grade 1 (mild) to Grade 2 (moderate), and 1 ISR event (Cohort 1) led to product discontinuation. Grade 2 or higher ISRs were the only AEs reported more commonly among CAB LA recipients than PBO recipients. Two Grade 3 (severe) ISRs occurred in CAB recipients, 1 in each cohort, but did not lead to product discontinuation in either case. Seven incident sexually transmitted infections were diagnosed in 6 participants. One HIV infection occurred in a participant 48 weeks after last injection of CAB LA: CAB was not detectable in plasma both at the time of first reactive HIV test and at the study visit 12 weeks prior to the first reactive test. Participants in Cohort 2 (unlike Cohort 1) consistently met prespecified pharmacokinetic targets of at least 95% of participants maintaining CAB trough concentrations above PA-IC90, and 80% maintaining trough concentrations above 4× PA-IC90. Study limitations include a modest sample size, a short course of injections, and a low-risk study population.

Conclusions

In this study, CAB LA was well tolerated at the doses and dosing intervals used. ISRs were common, but infrequently led to product discontinuation. CAB LA 600 mg every 8 weeks met pharmacokinetic targets for both male and female study participants. The safety and pharmacokinetic results observed support the further development of CAB LA, and efficacy studies of CAB LA for HIV treatment and prevention are in progress.

Trial registration

ClinicalTrials.gov Registry: NCT02178800.

Hydrometeorology and flood pulse dynamics drive diarrheal disease outbreaks and increase vulnerability to climate change in surface-water-dependent populations: A retrospective analysis

Thu, 08/11/2018 - 23:00

by Kathleen A. Alexander, Alexandra K. Heaney, Jeffrey Shaman

Background

The impacts of climate change on surface water, waterborne disease, and human health remain a growing area of concern, particularly in Africa, where diarrheal disease is one of the most important health threats to children under 5 years of age. Little is known about the role of surface water and annual flood dynamics (flood pulse) on waterborne disease and human health nor about the expected impact of climate change on surface-water-dependent populations.

Methods and findings

Using the Chobe River in northern Botswana, a flood pulse river—floodplain system, we applied multimodel inference approaches assessing the influence of river height, water quality (bimonthly counts of Escherichia coli and total suspended solids [TSS], 2011–2017), and meteorological variability on weekly diarrheal case reports among children under 5 presenting to health facilities (n = 10 health facilities, January 2007–June 2017). We assessed diarrheal cases by clinical characteristics and season across age groups using monthly outpatient data (January 1998–June 2017). A strong seasonal pattern was identified, with 2 outbreaks occurring regularly in the wet and dry seasons. The timing of outbreaks diverged from that at the level of the country, where surface water is largely absent. Across age groups, the number of diarrheal cases was greater, on average, during the dry season. Demographic and clinical characteristics varied by season, underscoring the importance of environmental drivers. In the wet season, rainfall (8-week lag) had a significant influence on under-5 diarrhea, with a 10-mm increase in rainfall associated with an estimated 6.5% rise in the number of cases. Rainfall, minimum temperature, and river height were predictive of E. coli concentration, and increases in E. coli in the river were positively associated with diarrheal cases. In the dry season, river height (1-week lag) and maximum temperature (1- and 4-week lag) were significantly associated with diarrheal cases. During this period, a 1-meter drop in river height corresponded to an estimated 16.7% and 16.1% increase in reported diarrhea with a 1- and 4-week lag, respectively. In this region, as floodwaters receded from the surrounding floodplains, TSS levels increased and were positively associated with diarrheal cases (0- and 3-week lag). Populations living in this region utilized improved water sources, suggesting that hydrological variability and rapid water quality shifts in surface waters may compromise water treatment processes. Limitations include the potential influence of health beliefs and health seeking behaviors on data obtained through passive surveillance.
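
The lagged exposure-response associations reported above are the kind of relationship a Poisson regression of weekly case counts on lagged hydrometeorological covariates can capture; the sketch below uses simulated series and a single assumed 8-week rainfall lag rather than the study's multimodel inference.

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(7)
weeks = 520
df = pd.DataFrame({
    "rain_mm": rng.gamma(2.0, 10.0, weeks),                       # weekly rainfall (simulated)
    "river_m": 3 + np.sin(np.arange(weeks) * 2 * np.pi / 52),     # river height (simulated)
})
df["rain_lag8"] = df["rain_mm"].shift(8)                          # 8-week lagged rainfall
expected = np.exp(1.0 + 0.006 * df["rain_lag8"].fillna(0) - 0.1 * df["river_m"])
df["cases"] = rng.poisson(expected)                               # simulated weekly case counts
df = df.dropna()

X = sm.add_constant(df[["rain_lag8", "river_m"]])
fit = sm.GLM(df["cases"], X, family=sm.families.Poisson()).fit()
# Percent change in expected weekly cases per 10 mm of lagged rainfall
print(f"{(np.exp(10 * fit.params['rain_lag8']) - 1) * 100:.1f}% per 10 mm")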

Conclusions

In flood pulse river—floodplain systems, hydrology and water quality dynamics can be highly variable, potentially impacting conventional water treatment facilities and the production of safe drinking water. In Southern Africa, climate change is predicted to intensify hydrological variability and the frequency of extreme weather events, amplifying the public health threat of waterborne disease in surface-water-dependent populations. Water sector development should be prioritized with urgency, incorporating technologies that are robust to local environmental conditions and expected climate-driven impacts. In populations with high HIV burdens, expansion of diarrheal disease surveillance and intervention strategies may also be needed. As annual flood pulse processes are predominantly influenced by climate controls in distant regions, country-level data may be inadequate to refine predictions of climate—health interactions in these systems.

Machine learning in medicine: Addressing ethical challenges

Tue, 06/11/2018 - 23:00

by Effy Vayena, Alessandro Blasimme, I. Glenn Cohen

Effy Vayena and colleagues argue that machine learning in medicine must offer data protection, algorithmic transparency, and accountability to earn the trust of patients and clinicians.

Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs: A cross-sectional study

Tue, 06/11/2018 - 23:00

by John R. Zech, Marcus A. Badgeley, Manway Liu, Anthony B. Costa, Joseph J. Titano, Eric Karl Oermann

Background

There is interest in using convolutional neural networks (CNNs) to analyze medical imaging to provide computer-aided diagnosis (CAD). Recent work has suggested that image classification CNNs may not generalize to new data as well as previously believed. We assessed how well CNNs generalized across three hospital systems for a simulated pneumonia screening task.

Methods and findings

A cross-sectional design with multiple model training cohorts was used to evaluate model generalizability to external sites using split-sample validation. A total of 158,323 chest radiographs were drawn from three institutions: National Institutes of Health Clinical Center (NIH; 112,120 from 30,805 patients), Mount Sinai Hospital (MSH; 42,396 from 12,904 patients), and Indiana University Network for Patient Care (IU; 3,807 from 3,683 patients). These patient populations had a mean (SD) age of 46.9 years (16.6), 63.2 years (16.5), and 49.6 years (17), with a female percentage of 43.5%, 44.8%, and 57.3%, respectively. We assessed individual models using the area under the receiver operating characteristic curve (AUC) for radiographic findings consistent with pneumonia and compared performance on different test sets with DeLong’s test. The prevalence of pneumonia was high enough at MSH (34.2%) relative to NIH and IU (1.2% and 1.0%) that merely sorting by hospital system achieved an AUC of 0.861 (95% CI 0.855–0.866) on the joint MSH–NIH dataset. Models trained on data from either NIH or MSH had equivalent performance on IU (P values 0.580 and 0.273, respectively) and inferior performance on data from each other relative to an internal test set (i.e., new data from within the hospital system used for training data; P values both <0.001). The highest internal performance was achieved by combining training and test data from MSH and NIH (AUC 0.931, 95% CI 0.927–0.936), but this model demonstrated significantly lower external performance at IU (AUC 0.815, 95% CI 0.745–0.885, P = 0.001). To test the effect of pooling data from sites with disparate pneumonia prevalence, we used stratified subsampling to generate MSH–NIH cohorts that only differed in disease prevalence between training data sites. When both training data sites had the same pneumonia prevalence, the model performed consistently on external IU data (P = 0.88). When a 10-fold difference in pneumonia rate was introduced between sites, internal test performance improved compared to the balanced model (10× MSH risk P < 0.001; 10× NIH P = 0.002), but this outperformance failed to generalize to IU (MSH 10× P < 0.001; NIH 10× P = 0.027). CNNs were able to directly detect the hospital system of a radiograph for 99.95% of NIH (22,050/22,062) and 99.98% of MSH (8,386/8,388) radiographs. The primary limitation of our approach and the available public data is that we cannot fully assess what other factors might be contributing to hospital system–specific biases.
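
The prevalence-matching experiment described above can be mimicked with a simple stratified subsampling helper that draws from each site so the training pool has a chosen pneumonia prevalence per site; the sketch below runs on a synthetic DataFrame and is not the authors' pipeline.

import numpy as np
import pandas as pd

rng = np.random.default_rng(11)

def subsample_to_prevalence(df, label_col, prevalence, n, seed=0):
    """Draw n rows from df with the requested positive-label prevalence."""
    n_pos = int(round(n * prevalence))
    pos = df[df[label_col] == 1].sample(n_pos, random_state=seed)
    neg = df[df[label_col] == 0].sample(n - n_pos, random_state=seed)
    return pd.concat([pos, neg]).sample(frac=1, random_state=seed)   # shuffle

# Synthetic site cohorts with very different pneumonia prevalence
msh = pd.DataFrame({"pneumonia": (rng.random(40_000) < 0.34).astype(int), "site": "MSH"})
nih = pd.DataFrame({"pneumonia": (rng.random(100_000) < 0.012).astype(int), "site": "NIH"})

# Matched-prevalence training pool (for example, 5% positives at both sites)
train = pd.concat([
    subsample_to_prevalence(msh, "pneumonia", 0.05, 10_000),
    subsample_to_prevalence(nih, "pneumonia", 0.05, 10_000),
])
print(train.groupby("site")["pneumonia"].mean())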

Conclusion

Pneumonia-screening CNNs achieved better internal than external performance in 3 out of 5 natural comparisons. When models were trained on pooled data from sites with different pneumonia prevalence, they performed better on new pooled data from these sites but not on external data. CNNs robustly identified hospital system and department within a hospital, which can have large differences in disease burden and may confound predictions.

Prediction of myopia development among Chinese school-aged children using refraction data from electronic medical records: A retrospective, multicentre machine learning study

Tue, 06/11/2018 - 23:00

by Haotian Lin, Erping Long, Xiaohu Ding, Hongxing Diao, Zicong Chen, Runzhong Liu, Jialing Huang, Jingheng Cai, Shuangjuan Xu, Xiayin Zhang, Dongni Wang, Kexin Chen, Tongyong Yu, Dongxuan Wu, Xutu Zhao, Zhenzhen Liu, Xiaohang Wu, Yuzhen Jiang, Xiao Yang, Dongmei Cui, Wenyan Liu, Yingfeng Zheng, Lixia Luo, Haibo Wang, Chi-Chao Chan, Ian G. Morgan, Mingguang He, Yizhi Liu

Background

Electronic medical records provide large-scale real-world clinical data for use in developing clinical decision systems. However, sophisticated methodology and analytical skills are required to handle the large-scale datasets necessary for the optimisation of prediction accuracy. Myopia is a common cause of vision loss. Current approaches to control myopia progression are effective but have significant side effects. Therefore, identifying those at greatest risk who should undergo targeted therapy is of great clinical importance. The objective of this study was to apply big data and machine learning technology to develop an algorithm that can predict the onset of high myopia, at specific future time points, among Chinese school-aged children.

Methods and findings

Real-world clinical refraction data were derived from electronic medical record systems in 8 ophthalmic centres from January 1, 2005, to December 30, 2015. The variables of age, spherical equivalent (SE), and annual progression rate were used to develop an algorithm to predict SE and onset of high myopia (SE ≤ −6.0 dioptres) up to 10 years in the future. Random forest machine learning was used for algorithm training and validation. Electronic medical records from the Zhongshan Ophthalmic Centre (a major tertiary ophthalmic centre in China) were used as the training set. Ten-fold cross-validation and out-of-bag (OOB) methods were applied for internal validation. The remaining 7 independent datasets were used for external validation. Two population-based datasets, which had no participant overlap with the ophthalmic-centre-based datasets, were used for multi-resource validation testing. The main outcomes and measures were the area under the curve (AUC) values for predicting the onset of high myopia over 10 years and the presence of high myopia at 18 years of age. In total, 687,063 multiple visit records (≥3 records) of 129,242 individuals in the ophthalmic-centre-based electronic medical record databases and 17,113 follow-up records of 3,215 participants in population-based cohorts were included in the analysis. Our algorithm accurately predicted the presence of high myopia in internal validation (the AUC ranged from 0.903 to 0.986 for 3 years, 0.875 to 0.901 for 5 years, and 0.852 to 0.888 for 8 years), external validation (the AUC ranged from 0.874 to 0.976 for 3 years, 0.847 to 0.921 for 5 years, and 0.802 to 0.886 for 8 years), and multi-resource testing (the AUC ranged from 0.752 to 0.869 for 4 years). With respect to the prediction of high myopia development by 18 years of age, as a surrogate of high myopia in adulthood, the algorithm provided clinically acceptable accuracy over 3 years (the AUC ranged from 0.940 to 0.985), 5 years (the AUC ranged from 0.856 to 0.901), and even 8 years (the AUC ranged from 0.801 to 0.837). Meanwhile, our algorithm achieved clinically acceptable prediction of the actual refraction values at future time points, which is supported by the regressive performance and calibration curves. Although the algorithm achieved balanced and robust performance, concerns about the compromised quality of real-world clinical data and over-fitting issues should be cautiously considered.
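
A stripped-down version of the prediction setup described above (a random forest mapping age, current spherical equivalent [SE], and annual progression rate to a future SE, with out-of-bag validation) is sketched below on simulated refraction data; it uses none of the study's records or tuning.

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(5)
n = 5000
age = rng.uniform(6, 15, n)
se_now = rng.normal(-1.5, 2.0, n)              # current spherical equivalent (dioptres)
progression = rng.normal(-0.5, 0.3, n)         # annual SE change (dioptres/year)
horizon = 5                                    # predict 5 years ahead
se_future = se_now + horizon * progression + rng.normal(0, 0.5, n)

X = np.column_stack([age, se_now, progression])
rf = RandomForestRegressor(n_estimators=300, oob_score=True, random_state=0).fit(X, se_future)
print("OOB R^2 for future SE:", round(rf.oob_score_, 3))

# Onset of high myopia (SE <= -6.0 dioptres) scored from the out-of-bag predictions
high_myopia = (se_future <= -6.0).astype(int)
auc = roc_auc_score(high_myopia, -rf.oob_prediction_)   # lower predicted SE, higher risk
print("OOB AUC for high myopia:", round(auc, 3))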

Conclusions

To our knowledge, this study, for the first time, used large-scale data collected from electronic health records to demonstrate the contribution of big data and machine learning approaches to improved prediction of myopia prognosis in Chinese school-aged children. This work provides evidence for transforming clinical practice, health policy-making, and precise individualised interventions regarding the practical control of school-aged myopia.

Lymphopenia and risk of infection and infection-related death in 98,344 individuals from a prospective Danish population-based study

Thu, 01/11/2018 - 22:00

by Marie Warny, Jens Helby, Børge Grønne Nordestgaard, Henrik Birgens, Stig Egil Bojesen

Background

Neutropenia increases the risk of infection, but it is unknown if this also applies to lymphopenia. We therefore tested the hypotheses that lymphopenia is associated with increased risk of infection and infection-related death in the general population.

Methods and findings

Of the 220,424 individuals invited, 99,191 attended the examination. We analyzed 98,344 individuals from the Copenhagen General Population Study (Denmark), examined from November 25, 2003, to July 9, 2013, and with available blood lymphocyte count at date of examination. During a median of 6 years of follow-up, they developed 8,401 infections and experienced 1,045 infection-related deaths. Due to the completeness of the Danish civil and health registries, none of the 98,344 individuals were lost to follow-up, and those emigrating (n = 385) or dying (n = 5,636) had their follow-up truncated at the day of emigration or death. At date of examination, mean age was 58 years, and 44,181 (44.9%) were men. Individuals with lymphopenia (lymphocyte count < 1.1 × 10⁹/l, n = 2,352) compared to those with lymphocytes in the reference range (1.1–3.7 × 10⁹/l, n = 93,538) had multivariable-adjusted hazard ratios of 1.41 (95% CI 1.28–1.56) for any infection, 1.31 (1.14–1.52) for pneumonia, 1.44 (1.15–1.79) for skin infection, 1.26 (1.02–1.56) for urinary tract infection, 1.51 (1.21–1.89) for sepsis, 1.38 (1.01–1.88) for diarrheal disease, 2.15 (1.16–3.98) for endocarditis, and 2.26 (1.21–4.24) for other infections. The corresponding hazard ratio for infection-related death was 1.70 (95% CI 1.37–2.10). Analyses were adjusted for age, sex, smoking status, cumulative smoking, alcohol intake, body mass index, plasma C-reactive protein, blood neutrophil count, recent infection, Charlson comorbidity index, autoimmune diseases, medication use, and immunodeficiency/hematologic disease. The findings were robust in all stratified analyses and also when including only events later than 2 years after first examination. However, due to the observational design, the study cannot address questions of causality, and our analyses might theoretically have been affected by residual confounding and reverse causation. In principle, fluctuating lymphocyte counts over time might also have influenced analyses, but lymphocyte counts in 5,181 individuals measured 10 years after first examination showed a regression dilution ratio of 0.68.
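
Multivariable-adjusted hazard ratios of the kind reported above are typically obtained from a Cox proportional hazards model; the lifelines sketch below fits such a model to simulated follow-up data with a binary lymphopenia indicator and two covariates, purely to show the mechanics.

import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(8)
n = 20_000
df = pd.DataFrame({
    "lymphopenia": (rng.random(n) < 0.024).astype(int),   # simulated exposure
    "age": rng.normal(58, 13, n),
    "male": (rng.random(n) < 0.45).astype(int),
})
# Simulated time to first infection (exponential hazard, illustrative only)
hazard = 0.01 * np.exp(0.35 * df["lymphopenia"] + 0.03 * (df["age"] - 58))
time = rng.exponential(1 / hazard)
df["followup_years"] = np.minimum(time, 6.0)              # administrative censoring at 6 years
df["event"] = (time <= 6.0).astype(int)

cph = CoxPHFitter().fit(df, duration_col="followup_years", event_col="event")
cph.print_summary()   # the exp(coef) column gives the adjusted hazard ratios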

Conclusions

Lymphopenia was associated with increased risk of hospitalization with infection and increased risk of infection-related death in the general population. Notably, causality cannot be deduced from our data.

Correction: Signatures of inflammation and impending multiple organ dysfunction in the hyperacute phase of trauma: A prospective cohort study

Wed, 31/10/2018 - 22:00

by Claudia P. Cabrera, Joanna Manson, Joanna M. Shepherd, Hew D. Torrance, David Watson, M. Paula Longhi, Mimoza Hoti, Minal B. Patel, Michael O’Dwyer, Sussan Nourshargh, Daniel J. Pennington, Michael R. Barnes, Karim Brohi

Development and validation of a new method for indirect estimation of neonatal, infant, and child mortality trends using summary birth histories

Wed, 31/10/2018 - 22:00

by Roy Burstein, Haidong Wang, Robert C. Reiner Jr, Simon I. Hay

Background

The addition of neonatal (NN) mortality targets in the Sustainable Development Goals highlights the increased need for age-specific quantification of mortality trends, detail that is not provided by summary birth histories (SBHs). Several methods exist to indirectly estimate trends in under-5 mortality from SBHs; however, efforts to monitor mortality trends in important age groups such as the first month and first year of life have yet to utilize the vast amount of SBH data available from household surveys and censuses.

Methods and findings

We analyzed 243 Demographic and Health Surveys (DHS) from 76 countries, which collected both complete birth histories and SBHs from 8.5 million children from 2.3 million mothers, to develop a new empirically based method to indirectly estimate time trends in age-specific mortality. We used complete birth history (CBH) data to train a discrete hazards generalized additive model in order to predict individual hazard functions for children based on individual-, mother-, and country-year-level covariates. Individual-level predictions were aggregated over time by assigning probability weights to potential birth years from mothers from SBH data. Age-specific estimates were evaluated in three ways: using cross-validation, using an external database of an additional 243 non-DHS census and survey data sources, and comparing overall under-5 mortality to existing indirect methods. Our model was able to closely approximate trends in age-specific child mortality. Depending on age, the model was able to explain between 80% and 95% of the variation in the validation data. Bias was close to zero at every age, with median relative errors spanning from 0.96 to 1.09. For trends in all under-5s, performance was comparable to the methods used for the Global Burden of Disease (GBD) study and significantly better than the standard indirect (Brass) method, especially in the 5 years preceding a survey. For the 15 years preceding surveys, the new method and GBD methods could explain more than 95% of the variation in the validation data for under-5s, whereas the standard indirect variants tested could only explain up to 88%. External validation using census and survey data found close agreement with concurrent direct estimates of mortality in the NN and infant age groups. As a predictive method based on empirical data, one limitation is that potential issues in these training data could be reflected in the resulting application of the method out of sample.
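
The discrete-time hazard idea at the core of the method (each child contributes one row per age interval, with a binary death indicator modelled on age band and covariates) can be sketched with a plain logistic regression in statsmodels; the smooth terms, survey weights, and SBH aggregation step used by the authors are omitted, and all numbers are simulated.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(13)
age_bands = ["0m", "1-11m", "12-59m"]              # neonatal, post-neonatal infant, child
base_hazard = {"0m": 0.020, "1-11m": 0.010, "12-59m": 0.004}

rows = []
for child in range(5000):
    mat_edu = int(rng.integers(0, 2))              # simulated mother-level covariate
    for band in age_bands:
        p = base_hazard[band] * (0.7 if mat_edu else 1.0)
        died = rng.random() < p
        rows.append({"band": band, "mat_edu": mat_edu, "died": int(died)})
        if died:
            break                                  # no exposure after death
person_periods = pd.DataFrame(rows)

# Discrete-time hazard model: logit of death within each age band
model = smf.logit("died ~ C(band) + mat_edu", data=person_periods).fit(disp=0)
print(np.exp(model.params))                        # odds ratios by age band and covariate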

Conclusions

This new method for estimating child mortality produces results that are comparable to current best methods for indirect estimation of under-5 mortality while additionally producing age-specific estimates. Use of such methods allows researchers to utilize a massive amount of SBH data for estimation of trends in NN and infant mortality. Systematic application of these methods could further improve the evidence base for monitoring of trends and inequalities in age-specific child mortality.

Health systems thinking: A new generation of research to improve healthcare quality

Tue, 30/10/2018 - 22:00

by Hannah H. Leslie, Lisa R. Hirschhorn, Tanya Marchant, Svetlana V. Doubova, Oye Gureje, Margaret E. Kruk

Hannah Leslie and colleagues of the High-Quality Health Commission discuss in an Editorial the findings from their report that detail the improvements needed to prevent declines in individuals’ health as the scope and quality of health systems increase. Patient-centered care at the population level, improved utility of research products, and innovative reporting tools to help guide the development of new methods are key to improved global healthcare.

Late-pregnancy dysglycemia in obese pregnancies after negative testing for gestational diabetes and risk of future childhood overweight: An interim analysis from a longitudinal mother–child cohort study

Mon, 29/10/2018 - 22:00

by Delphina Gomes, Rüdiger von Kries, Maria Delius, Ulrich Mansmann, Martha Nast, Martina Stubert, Lena Langhammer, Nikolaus A. Haas, Heinrich Netz, Viola Obermeier, Stefan Kuhle, Lesca M. Holdt, Daniel Teupser, Uwe Hasbargen, Adelbert A. Roscher, Regina Ensenauer

Background

Maternal pre-conception obesity is a strong risk factor for childhood overweight. However, prenatal mechanisms and their effects in susceptible gestational periods that contribute to this risk are not well understood. We aimed to assess the impact of late-pregnancy dysglycemia in obese pregnancies with negative testing for gestational diabetes mellitus (GDM) on long-term mother–child outcomes.

Methods and findings

The prospective cohort study Programming of Enhanced Adiposity Risk in Childhood–Early Screening (PEACHES) (n = 1,671) enrolled obese and normal weight mothers from August 2010 to December 2015 with trimester-specific data on glucose metabolism including GDM status at the end of the second trimester and maternal glycated hemoglobin (HbA1c) at delivery as a marker for late-pregnancy dysglycemia (HbA1c ≥ 5.7% [39 mmol/mol]). We assessed offspring short- and long-term outcomes up to 4 years, and maternal glucose metabolism 3.5 years postpartum. Multivariable linear and log-binomial regression with effects presented as mean increments (Δ) or relative risks (RRs) with 95% confidence intervals (CIs) were used to examine the association between late-pregnancy dysglycemia and outcomes. Linear mixed-effects models were used to study the longitudinal development of offspring body mass index (BMI) z-scores. The contribution of late-pregnancy dysglycemia to the association between maternal pre-conception obesity and offspring BMI was estimated using mediation analysis. In all, 898 mother–child pairs were included in this unplanned interim analysis. Among obese mothers with negative testing for GDM (n = 448), those with late-pregnancy dysglycemia (n = 135, 30.1%) had higher proportions of excessive total gestational weight gain (GWG), excessive third-trimester GWG, and offspring with large-for-gestational-age birth weight than those without. Besides higher birth weight (Δ 192 g, 95% CI 100–284) and cord-blood C-peptide concentration (Δ 0.10 ng/ml, 95% CI 0.02–0.17), offspring of these women had greater weight gain during early childhood (Δ BMI z-score per year 0.18, 95% CI 0.06–0.30, n = 262) and higher BMI z-score at 4 years (Δ 0.58, 95% CI 0.18–0.99, n = 43) than offspring of the obese, GDM-negative mothers with normal HbA1c values at delivery. Late-pregnancy dysglycemia in GDM-negative mothers accounted for about one-quarter of the association of maternal obesity with offspring BMI at age 4 years (n = 151). In contrast, childhood BMI z-scores were not affected by a diagnosis of GDM in obese pregnancies (GDM-positive: 0.58, 95% CI 0.36–0.79, versus GDM-negative: 0.62, 95% CI 0.44–0.79). One mechanism triggering late-pregnancy dysglycemia in obese, GDM-negative mothers was related to excessive third-trimester weight gain (RR 1.72, 95% CI 1.12–2.65). Furthermore, in the maternal population, we found a 4-fold (RR 4.01, 95% CI 1.97–8.17) increased risk of future prediabetes or diabetes if obese, GDM-negative women had a high versus normal HbA1c at delivery (absolute risk: 43.2% versus 10.5%). There is a potential for misclassification bias as the predominantly used GDM test procedure changed over the enrollment period. Further studies are required to validate the findings and elucidate the possible third-trimester factors contributing to future mother–child health status.
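
Relative risks like those reported above are often estimated with a log-binomial model (a GLM with binomial family and log link); the statsmodels sketch below shows that estimation on simulated data and is not the study's analysis, with a modified-Poisson fallback noted in the comments.

import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(21)
n = 448
df = pd.DataFrame({"dysglycemia": (rng.random(n) < 0.30).astype(int)})
risk = np.where(df["dysglycemia"] == 1, 0.43, 0.105)     # illustrative outcome risks
df["prediabetes"] = (rng.random(n) < risk).astype(int)

# Log-binomial GLM: exponentiated coefficients are relative risks
logbin = smf.glm("prediabetes ~ dysglycemia", data=df,
                 family=sm.families.Binomial(link=sm.families.links.Log())).fit()
print("RR:", round(float(np.exp(logbin.params["dysglycemia"])), 2))
# If a log-binomial fit fails to converge, a common fallback is modified Poisson:
# smf.glm("prediabetes ~ dysglycemia", data=df,
#         family=sm.families.Poisson()).fit(cov_type="HC1")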

Conclusions

Findings from this interim analysis suggest that offspring of obese mothers treated because of a diagnosis of GDM appeared to have a better BMI outcome in childhood than those of obese mothers who—following negative GDM testing—remained untreated in the last trimester and developed dysglycemia. Late-pregnancy dysglycemia related to uncontrolled weight gain may contribute to the development of child overweight and maternal diabetes. Our data suggest that negative GDM testing in obese pregnancies is not an “all-clear signal” and should not lead to reduced attention and risk awareness of physicians and obese women. Effective strategies are needed to maintain third-trimester glycemic and weight gain control among otherwise healthy obese pregnant women.

A cash-based intervention and the risk of acute malnutrition in children aged 6–59 months living in internally displaced persons camps in Mogadishu, Somalia: A non-randomised cluster trial

Mon, 29/10/2018 - 22:00

by Carlos S. Grijalva-Eternod, Mohamed Jelle, Hassan Haghparast-Bidgoli, Tim Colbourn, Kate Golden, Sarah King, Cassy L. Cox, Joanna Morrison, Jolene Skordis-Worrall, Edward Fottrell, Andrew J. Seal

Background

Somalia has been affected by conflict since 1991, with children aged <5 years presenting a high acute malnutrition prevalence. Cash-based interventions (CBIs) have been used in this context since 2011, despite sparse evidence of their nutritional impact. We aimed to understand whether a CBI would reduce acute malnutrition and its risk factors.

Methods and findings

We implemented a non-randomised cluster trial in internally displaced person (IDP) camps, located in peri-urban Mogadishu, Somalia. Within 10 IDP camps (henceforth clusters) selected using a humanitarian vulnerability assessment, all households were targeted for the CBI. Ten additional clusters located adjacent to the intervention clusters were selected as controls. The CBI comprised a monthly unconditional cash transfer of US$84.00 for 5 months, a once-only distribution of a non-food-items kit, and the provision of piped water free of charge. The cash transfers started in May 2016. Cash recipients were female household representatives. In March and September 2016, from a cohort of randomly selected households in the intervention (n = 111) and control (n = 117) arms (household cohort), we collected household and individual level data from children aged 6–59 months (155 in the intervention and 177 in the control arms) and their mothers/primary carers, to measure known malnutrition risk factors. In addition, between June and November 2016, data to assess acute malnutrition incidence were collected monthly from a cohort of children aged 6–59 months, exhaustively sampled from the intervention (n = 759) and control (n = 1,379) arms (child cohort). Primary outcomes were the mean Child Dietary Diversity Score in the household cohort and the incidence of first episode of acute malnutrition in the child cohort, defined by a mid-upper arm circumference < 12.5 cm and/or oedema. Analyses were by intention-to-treat. For the household cohort we assessed differences-in-differences, for the child cohort we used Cox proportional hazards ratios. In the household cohort, the CBI appeared to increase the Child Dietary Diversity Score by 0.53 (95% CI 0.01; 1.05). In the child cohort, the acute malnutrition incidence rate (cases/100 child-months) was 0.77 (95% CI 0.70; 1.21) and 0.92 (95% CI 0.53; 1.14) in intervention and control arms, respectively. The CBI did not appear to reduce the risk of acute malnutrition: unadjusted hazard ratio 0.83 (95% CI 0.48; 1.42) and hazard ratio adjusted for age and sex 0.94 (95% CI 0.51; 1.74). The CBI appeared to increase the monthly household expenditure by US$29.60 (95% CI 3.51; 55.68), increase the household Food Consumption Score by 14.8 (95% CI 4.83; 24.8), and decrease the Reduced Coping Strategies Index by 11.6 (95% CI 17.5; 5.96). The study limitations were as follows: the study was not randomised, insecurity in the field limited the household cohort sample size and collection of other anthropometric measurements in the child cohort, the humanitarian vulnerability assessment data used to allocate the intervention were not available for analysis, food market data were not available to aid results interpretation, and the malnutrition incidence observed was lower than expected.
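
The difference-in-differences estimate used for the household cohort has a simple form: the baseline-to-follow-up change in the intervention arm minus the corresponding change in the control arm. The sketch below computes it on simulated Child Dietary Diversity Scores; an equivalent regression formulation is noted in the comments.

import numpy as np
import pandas as pd

rng = np.random.default_rng(17)

def arm(n, baseline_mean, followup_mean, label):
    """Simulate baseline and follow-up scores for one trial arm."""
    return pd.DataFrame({
        "arm": label,
        "post": np.repeat([0, 1], n),
        "cdds": np.concatenate([rng.normal(baseline_mean, 1.2, n),
                                rng.normal(followup_mean, 1.2, n)]),
    })

# Simulated Child Dietary Diversity Scores (means are illustrative only)
data = pd.concat([arm(155, 3.0, 3.6, "intervention"), arm(177, 3.1, 3.2, "control")])

means = data.groupby(["arm", "post"])["cdds"].mean().unstack()
did = ((means.loc["intervention", 1] - means.loc["intervention", 0])
       - (means.loc["control", 1] - means.loc["control", 0]))
print(f"Difference-in-differences: {did:.2f}")

# Equivalent regression: statsmodels.formula.api.ols("cdds ~ C(arm) * post", data).fit(),
# where the interaction coefficient is the difference-in-differences estimate.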

Conclusions

The CBI appeared to improve beneficiaries’ wealth and food security but did not appear to reduce acute malnutrition risk in IDP camp children. Further studies are needed to assess whether changing this intervention, e.g., including specific nutritious foods or social and behaviour change communication, would improve its nutritional impact.

Trial registration

ISRCTN Registry: ISRCTN29521514.