This guest blog is the first of a two-part series looking at the work that the Youth Futures Foundation and the Behavioural Insights Team (BIT) have been conducting to mobilise a support programme, ‘Reboot’, for a full impact evaluation. The impact evaluation is a randomised controlled trial, in which participants for both the Reboot group and comparison group are sourced within West of England Combined Authority (WECA) and North Somerset Council.
The Reboot programme works with young people aged 16-25 who are (or have been) looked after by statutory care services in the west of England and provides them with coaching support for up to three years to help them obtain and sustain employment, education and/or training (EET). The programme is delivered by 1625 Independent People (1625ip), a homelessness charity based in the south west of England, with funding from the Youth Futures Foundation and the West of England Combined Authority (WECA). Youth Futures Foundation is an independent What Works Centre for youth employment established to improve employment outcomes for young people from marginalised backgrounds. WECA is a combined authority within the west of England, consisting of the local authorities (LAs) of Bristol, South Gloucestershire and Bath & North East Somerset.
This blog, written by BIT, looks at how BIT conducted a feasibility study to develop a robust approach to impact evaluation.
The feasibility study
Pick 100 young people in the UK at random, and you will find that around 10-15 of them will not be in education, employment or training (NEET). Do the same for young people who have been in care, and that figure will be about three times as high; somewhere around 30-40 of them will be NEET.
It is therefore crucial that we understand what helps improve EET outcomes for care leavers, and that is why Youth Futures Foundation have been working with the Behavioural Insights Team (BIT), to prepare the Reboot programme for an impact evaluation.
This preparation (or ‘mobilisation’) work has involved two central parts:
- a qualitative ‘process study’ to understand how the programme operates, and
- a ‘feasibility study’ to investigate the readiness of the programme for an evaluation and consider how an impact evaluation could be designed.
In the world of impact evaluation, a feasibility study works like a pre-flight checklist before launch, answering critical questions like “Which evaluation design is the best fit?” and “What are the most appropriate outcome measures?”. By reviewing the available data, refining outcome measures and identifying possible roadblocks, we are not shooting in the dark; we are making design decisions informed by the reality of how a programme or a system operates in practice.
In many cases, limitations on a commissioner’s budget or timelines for the delivery of an evaluation mean a feasibility study isn’t possible. Where it is, this preliminary step is incredibly valuable. It can save time, reduce the risk of an evaluation running over budget, and increase the quality of the evaluation to ensure it is a valuable tool for making impactful change. Before commissioning an evaluation, Youth Futures Foundation were interested in understanding whether it would be possible to conduct a rigorous evaluation of the Reboot programme, and what would be needed to generate evidence at the highest possible standard. They commissioned BIT to conduct a full feasibility study to investigate.
There were four elements to our study:
- Exploring potential evaluation designs and their requirements: throughout the study, we returned to our key objective – to map what we learned against the requirements for each evaluation design
- Identifying and refining our possible outcome measures: this included a review of the available data, and power calculations to determine the sample size we would need for a trial
- Assessing the feasibility of different evaluation design options: this included working with partners to understand how feasible it would be to construct a comparison group, and what form that group could take
- Developing early evaluation design recommendations: this included identifying elements of the evaluation to pilot before the full impact evaluation
We created a decision tree to summarise what we needed to learn, and as a tool to help us communicate to partners what we were doing and why. It meant that the learning process was two-way – while the evaluation team was finding out about how local authorities worked with Reboot to refer young people into the programme, delivery teams were in turn developing a clearer understanding of what an evaluation would take in practice.
The work took almost a year to complete, running from November 2021 to October 2022. While a year can be a long time in the world of evaluation, what we learned helped us avoid some of the most common pitfalls of impact evaluation – and has given us an early view of some potential surprises.
What we learned
EET outcomes aren’t neat
Based on our work to develop a Theory of Change for Reboot, we knew the most important measure of success for the programme was a young person’s employment, education and training (EET) status. However, a young person’s EET status is anything but static: over a two-year evaluation period, they may move in and out of employment, education or training multiple times.
According to Reboot’s Theory of Change, increased engagement in EET activities early in a young person’s journey acts as a stepping stone towards longer-term employment. Outcome data suggested this also happened in practice: previous participants’ EET status varied a lot early in the programme and gradually stabilised over 18 months. Based on that, we decided to prioritise indicators measuring employment activities in the later stages of the evaluation period.
We also picked up potential challenges with data collection. While EET activity status is recorded each time a local authority adviser meets with a young person, our analysis of care leaver data suggested that data collection was less frequent for some care leavers, as it was not always possible to stay in contact with them (particularly where some may have entered work or training). To measure our proposed outcome, we’d need LAs to contact young people more often, and to speak to everyone who had participated in the trial. As it stands, LAs only have a statutory requirement to collect data on care leavers until the age of 21, whereas Reboot works with participants aged 16 to 25. Identifying these challenges early on meant we were able to plan around them, working with LAs to ensure the resource and guidance they needed to support the additional work would be available. Since the feasibility report was completed, the evaluation team have also worked closely with Youth Futures Foundation to explore the viability of drawing on LEO (Longitudinal Education Outcomes) data. Building LEO linking variables into existing trial datasets ensures that outcomes can be captured for all trial participants.
Counterfactuals are complicated
The first question we had to answer to understand the feasibility of evaluation design options was whether there would be enough young people within the four participating LAs to fill places on the Reboot programme, and still be enough left over to act as a comparison group.
The eligibility of existing care leavers for the trial was the first hurdle. While LAs provided data to show how many care leavers they had of an appropriate age, other factors ruled some of them out. We couldn’t recruit anyone who’d previously received Reboot, or those who were already in EET. LA data also didn’t identify those at risk of not entering EET – though this would also make a young person eligible for Reboot. This meant we were left to make some assumptions about the number of young people who would realistically be available to join the evaluation.
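The kind of assumption-driven estimate this left us with can be sketched as a simple eligibility funnel. Every count and share in the sketch below is invented for illustration; none of the study’s actual figures are reproduced here.

```python
def eligible_pool(care_leavers_16_25, share_prev_reboot, share_in_eet):
    """Rough eligibility funnel: start from care leavers of the right age,
    then remove previous Reboot participants and those already in EET.
    All inputs are hypothetical, not figures from the study.
    (LA data could not flag those 'at risk' of not entering EET,
    so that group is not modelled.)"""
    pool = care_leavers_16_25
    pool -= round(pool * share_prev_reboot)  # already received Reboot
    pool -= round(pool * share_in_eet)       # already in EET
    return pool

# Entirely hypothetical inputs: 1,200 care leavers of eligible age,
# 10% with prior Reboot support, 60% already in EET.
print(eligible_pool(1200, 0.10, 0.60))  # → 432
```

Each filtering step shrinks the pool multiplicatively, which is why uncertain assumptions about any one share feed through strongly into the final estimate.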
We knew from reviewing previous studies that robust evaluations of interventions as intensive as Reboot were rare, with the most similar ones generating effect sizes of 2–13 percentage points (pp) on EET outcomes. Based on what we’d seen in data from previous Reboot participants, we conducted power calculations suggesting that with 250 young people allocated to Reboot and between 144 and 417 allocated to a comparison group, we’d be able to detect effect sizes of 9.5–12.6pp. In other words, there was a risk that our analysis would not be able to detect a positive effect of Reboot even if participants did in fact benefit from the programme.
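The shape of these power calculations can be reproduced with a standard minimum-detectable-effect formula for comparing two proportions. The 40% baseline EET rate, significance level and power below are illustrative assumptions rather than the study’s actual inputs, so the outputs only loosely echo the figures above.

```python
from statistics import NormalDist

def minimum_detectable_effect(n_treat, n_comp, baseline,
                              alpha=0.05, power=0.80):
    """Approximate minimum detectable difference in proportions for a
    two-sided test, using the normal approximation and a common
    baseline rate for both groups."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z(power)          # quantile for the target power
    variance = baseline * (1 - baseline) * (1 / n_treat + 1 / n_comp)
    return (z_alpha + z_power) * variance ** 0.5

# Group sizes from the study; the 40% baseline is a hypothetical value.
for n_comp in (144, 417):
    mde = minimum_detectable_effect(250, n_comp, baseline=0.40)
    print(f"comparison n={n_comp}: ~{mde * 100:.1f}pp detectable")
```

A larger comparison group shrinks the minimum detectable effect, which is what makes the size of that group such an important design parameter.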
We had two levers to pull: maximising the number of programme participants and the size of the comparison group, and obtaining additional data to make our analysis more powerful. Since the study, we’ve worked with project partners on both of these levers – maximising our chance of creating a sample large enough to deliver a robust randomised controlled trial.
Mobilisation isn’t over until it’s over
Our headline recommendation from the feasibility study was that it would be most feasible to deliver a randomised controlled design where the comparison group would be created within the four participating LAs. This led us to the next stage of our work. Building randomisation into existing referral routes for LAs and the Reboot intervention itself would be tricky – it would mean more time and resource from partners, the introduction of new procedures and collaborating to develop detailed guidance for operational teams.
We moved rapidly from the feasibility study to the design of a pilot to do just that, working with LAs over several months to develop a new referral route that included randomisation of young people into two groups. The pilot was valuable. It surfaced some important challenges with referral conversations and helped us develop more support for LAs, who would need to explain to a young person why they may not have been allocated to the Reboot programme. It also surfaced differences in how LAs collected data, and allowed us to work with them to make data collection more consistent.
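One common way to build randomisation into a referral route like this is deterministic, auditable allocation at the point of referral. The sketch below is a generic illustration of that approach, not the mechanism used in this trial; the 50/50 split, the salt value and the identifier format are all assumptions.

```python
import hashlib

def allocate(participant_id: str, salt: str = "reboot-pilot") -> str:
    """Deterministically allocate a referral to 'programme' or
    'comparison' by hashing a stable participant identifier. The same
    ID always gets the same arm, making allocation repeatable and
    auditable. The salt and 50/50 ratio are illustrative assumptions."""
    digest = hashlib.sha256(f"{salt}:{participant_id}".encode()).hexdigest()
    return "programme" if int(digest, 16) % 2 == 0 else "comparison"

# The same young person lands in the same group if referred twice.
print(allocate("YP-0001"))
print(allocate("YP-0001") == allocate("YP-0001"))  # True
```

Because the allocation depends only on the identifier and the salt, operational teams cannot (accidentally or otherwise) influence which arm a young person joins, and the full allocation history can be re-derived for auditing.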
The result is an evaluation design that has pre-empted challenges, rather than discovered them. None of this work would have been possible without the support of our project partners, including LA operational teams, and the team who deliver Reboot.