Findings & Outputs

The CIVIC programme commenced its research in February 2021, and by January 2022 had been able to evidence the marked effectiveness of modelling respiratory disease via over-the-counter medication purchase data (i.e. self-medication products):

◼ CIVIC’s AI models (built in partnership with NHS-X/Boots) accurately predicted respiratory mortality over 17 days in advance, every week between 2016 and 2020, in every one of the UK’s 314 local authorities (R²=0.78***). These results reflect significant advances in fidelity and preparation time (NHS).

◼ When applied to COVID-19 (2020-21), which was expected to be far more challenging, CIVIC’s XGBoost models accurately predicted mortality over 21 days in advance (R²=0.71***), again at local-area level. This far outperformed models based on official case data alone (R²=0.44**), as used in traditional epidemiological models.

◼ Work undertaken with both OLIO (food-sharing app) and Co-op Ltd has shown the ability to examine the impact of outbreaks on vulnerable communities at neighbourhood level. Interfaces are being piloted with Havering County Council, London, and expanded to other councils through support from Guy’s and St Thomas’.

Further details, and information on the research papers and outputs generated, are reported below:


1. Predicting Respiratory Disease via Shopping: The PADRUS Model

Assessing the value of integrating national longitudinal shopping data into respiratory disease forecasting models (In Review, Nature Behaviour)
Dolan, E., Goulding, J., Marshall, H., Smith, G., Long, H., Tata, L.

We investigated the value of integrating sales of non-prescription medications commonly bought to manage respiratory symptoms in order to improve forecasting of weekly registered deaths from respiratory disease at local levels across England, using >2 billion transactions from a UK high street retailer between March 2016 and March 2020. We report results from the novel AI explainability variable-importance tool, Model Class Reliance, applied to the PADRUS model. PADRUS is a machine learning model optimised to predict registered deaths from respiratory disease in 314 local authority areas across England by integrating shopping sales data, with a focus on purchases of non-prescription medications.

◼ We find strong evidence that models incorporating sales data significantly outperform those that use only traditional variables (e.g. socio-demographics, incidence and weather data).

◼ Accuracy gains are highest in winter periods, when risk to the general public is greatest (increases in R² of between 0.09 and 0.11).
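As a minimal illustration of how such a gain in R² can be measured (this is not the PADRUS code), the sketch below computes the coefficient of determination for two sets of held-out predictions against the same observed values and takes the difference. The function name `r_squared`, the variable names and all numbers are hypothetical and chosen only for illustration; they are not CIVIC data or results.

```python
from statistics import fmean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    mean_obs = fmean(observed)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: error of always predicting the observed mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when observed values are constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical weekly death counts for one area (illustrative only).
observed = [10, 12, 15, 20, 18, 14]
# Hypothetical held-out predictions from a model using traditional variables only.
baseline_pred = [12, 12, 14, 17, 16, 15]
# Hypothetical held-out predictions from a model that also uses sales data.
with_sales_pred = [11, 12, 15, 19, 17, 14]

r2_baseline = r_squared(observed, baseline_pred)
r2_with_sales = r_squared(observed, with_sales_pred)
# The "gain" is the difference in R2 on the same held-out observations.
r2_gain = r2_with_sales - r2_baseline

print(f"baseline R2:   {r2_baseline:.3f}")
print(f"with sales R2: {r2_with_sales:.3f}")
print(f"R2 gain:       {r2_gain:.3f}")
```

Both models must be scored on the same held-out weeks and areas for the difference to be meaningful; restricting the evaluation to winter weeks would give a seasonal gain of the kind described above.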

2. Examining COVID-19 and Food Insecurity: The FIMS Model

Identifying food insecurity in vulnerable communities via machine learning (Journal of Business Research)
Nica-Avram, G., Harvey, J., Smith, G., Smith, A., Goulding, J.

Food insecurity in the UK has captured public attention, especially during the recent COVID pandemic. However, estimates of its prevalence are deeply contentious. The lack of precision on the volume of emergency food assistance currently provided to those in need is made even more ambiguous due to increasing use of peer-to-peer food sharing systems (e.g. OLIO). While these initiatives exist as a solution to food waste rather than food poverty, they are nonetheless carrying a hidden share of the food insecurity burden, with the socio-economic status of technology-assisted food sharing donors, volunteers, and recipients remaining obscure. In this article we examine the relationship between food sharing and deprivation generally, before applying machine learning techniques to develop a predictive model of food insecurity based upon aggregated food sharing behaviours by OLIO users in the UK.

◼ We demonstrate that data from food sharing systems can quantify a previously hidden aspect of deprivation.

◼ We make the case for a reformed approach to modelling food insecurity in light of public health.

3. Public Attitudes & Acceptance: COVID shopping logs

Public attitudes towards sharing loyalty card data for academic health research: A Qualitative Study (BMC Medical Ethics)
Dolan, E., Shiells, K., Goulding, J., Skatova, A.

Forty participants took part in semi-structured interviews about data sharing related to either COVID-19 or ovarian/bowel cancer. Content analysis was used to identify sub-themes corresponding to the two a priori themes, attitudes and safeguards. Attitudes fell into two categories, either rational or emotive. Most ‘rational’ participants were in favour of sharing loyalty card data, and with increased understanding of the research purpose, participants expressed higher willingness to donate data. Within the ‘emotive’ category, participants raised fears about revealing location information; the importance of anonymisation; data detail; control, convenience and choice; and the need for transparency and data security. The change in hypothetical purpose of the data sharing, from COVID-19 to cancer research, had no impact on participants’ decision to donate, although it did affect their understanding of how loyalty card data could be used.

◼ This study contributes clear recommendations for researchers and the wider policy community.

◼ Following the pandemic, participants are increasingly in favour of donating loyalty card data for academic health research.

◼ However, information, choice and appropriate safeguards are key prerequisites upon which public decisions are made.

4. Data Donation: Psychological Models

Psychology of Personal data donation (PLOS ONE)
Skatova, A., Goulding, J.

GDPR fundamentally reshaped how our data is handled across every sector, enabling the general public to access data collected about them and opening up the possibility of this data being used for research that benefits the public themselves. A significant barrier for using this commercial data for academic research, however, is the lack of publicly acceptable research frameworks. Data donation — the act of an individual actively consenting to donate their personal data for research — could enable the use of commercial data for the benefit of society. However, it is not clear which motives, if any, would drive people to donate their personal data for this purpose. In this paper we present the results of a large-scale survey (N = 1,300) that studied intentions and reasons to donate personal data.

◼ We found that over half of individuals are willing to donate their personal data for research that could benefit the wider general public.

◼ We identified three distinct reasons impacting donation of personal data: opportunity to achieve self-benefit, social duty, and need to understand the purpose of data donation.