A Q&A with Marian Wheeler, GDST Data Systems Manager, Girls' Day School Trust
As GDST Data Systems Manager at the Girls’ Day School Trust, Marian leads 26 schools in their use of data systems and MIS. The role encompasses developing best practice and organisational data standards, supporting the introduction of new systems, analytics, and developing and delivering training to empower the schools in their use of systems and data.
Marian’s passion for MIS systems and data comes from many years working in the education sector, within Local Authorities, as well as schools. Her experience includes over a decade working for a Local Support Unit, supporting around 800 schools in their use of the SIMS MIS, with a particular focus on the use of systems and data to support school improvement and consultancy work advising SLT on exploiting MIS technology to its fullest potential.
What best practice processes would you recommend for structuring and storing data outside an MIS?
Marian: Having clear and consistent naming conventions and capturing meaningful data is essential. It’s about not making assumptions that everyone understands what you’re talking about. You need to have a good foundation with a common understanding of data, so that it’s valuable to everyone.
We’re doing a lot of work at the Girls' Day School Trust (GDST) on our Data Cloud project. We’ve built a data warehouse in Microsoft Fabric, and we’re using Power BI to visualise it. We’re sourcing data from lots of different places, such as financial systems, HR systems, and MIS.
Part of this work is around creating a data dictionary so that you give users a clear explanation of what a particular field means. If you’ve got calculated data you need to explain the formula; the data dictionary will help you clearly define the calculations. It’s something I’ve noticed in SIMS Next Gen, where there are definitions provided for the calculations around attendance data.
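As an illustration of the idea, a data dictionary entry for a calculated field could be sketched as below. This is a minimal sketch, not the GDST's or SIMS's actual structure: the `DictionaryEntry` class, the `attendance_pct` field name, and the formula text are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DictionaryEntry:
    """One data dictionary entry (hypothetical structure)."""
    field: str
    description: str
    source_system: str
    formula: Optional[str] = None  # only set for calculated fields


# Hypothetical entry: the field name and formula are assumptions for illustration.
DATA_DICTIONARY = {
    "attendance_pct": DictionaryEntry(
        field="attendance_pct",
        description="Percentage of possible sessions a pupil attended.",
        source_system="MIS",
        formula="sessions_attended / sessions_possible * 100",
    ),
}


def describe(field: str) -> str:
    """Return a plain-language explanation of a field, including any formula."""
    entry = DATA_DICTIONARY[field]
    text = f"{entry.field} ({entry.source_system}): {entry.description}"
    if entry.formula:
        text += f" Calculated as: {entry.formula}"
    return text


print(describe("attendance_pct"))
```

Keeping the formula next to the description means anyone reading a dashboard figure can look up how it was calculated.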
That’s one of the most important pieces. If you don’t get the labelling right and establish a clear, shared understanding, you’re in danger of making that data meaningless, because the data and analysis are open to interpretation. People may draw conclusions that aren’t necessarily correct, and your decision-making can become flawed as a result.
We’re using cloud storage for our source data files, which is the most cost-effective way for us and also means that it’s scalable and accessible when we need it.
Clear use of metadata also ensures that we’re descriptive about our data, can categorise it clearly, and that it is searchable.
These are just some of the main elements of good practice for structuring and storing data outside your MIS, and together they will give you a good foundation.
How do you maintain data quality?
Marian: This is a key part of the data governance piece. Within my remit in the GDST, I’m ensuring that we have good quality data because if we don’t, we’re basing decisions on data which could be incomplete. At the GDST, we have multiple schools, meaning it becomes much more critical to have data standards which allow us to clearly identify specific data items that we see as key to our decision-making and also ensure that we understand exactly what they mean and how they’re recorded across all our schools. This could include things like the frequency of data recording. I think this applies equally to a single school. It’s about having clear documentation and setting standards around specific data, which would also include the intent. So ask yourself, ‘How will we analyse it [the data]? Why is that data important?’
There are lots of tools that we’re using, within SIMS itself and also using Power BI, to look at data quality, like ensuring our data is complete, because you don’t want to base decisions on data where you think you’ve got your entire cohort included, but actually only 25% of them have data. So, looking where data is missing is key.
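A completeness check of this kind can be sketched in a few lines of Python. This is only an illustration, not how SIMS or Power BI do it: the `completeness` function, the `reading_score` field, and the sample cohort are hypothetical.

```python
def completeness(records, field):
    """Return the fraction of records with a non-empty value for `field`."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)


# Hypothetical cohort of four pupils; only one has a reading score recorded.
cohort = [
    {"pupil_id": 1, "reading_score": 104},
    {"pupil_id": 2, "reading_score": None},
    {"pupil_id": 3, "reading_score": ""},
    {"pupil_id": 4},  # field missing entirely
]

coverage = completeness(cohort, "reading_score")
print(f"{coverage:.0%} of the cohort has a reading score")
```

Reporting the coverage alongside any average makes it obvious when a figure describes only part of the cohort.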
We’re also looking at the validity of data. That includes data fields that enforce a choice, a lookup in other words. So, it could be a choice of yes, no, or maybe, and at some point in time, we’ve decided not to record ‘maybes’ anymore. We need to understand whether the data we have is still valid, so we can identify potentially invalid data items. You get that a lot, particularly in schools, with some of the key data fields that the DfE collects, for example, ethnicity and languages. They will have defined a list of values that everyone should be recording against, and then the DfE may well change that list of values at some point. So that can mean that historic, as well as current data, is potentially invalid, and you might need to consider whether you’re actually going to use that in your data set or whether you need to update it.
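The yes/no/maybe example can be sketched as a check of recorded values against the current lookup list. This is a minimal sketch under stated assumptions: the `consent` field, the allowed values, and the sample records are hypothetical, and a real check would use the lookup list defined in your own system or by the DfE.

```python
from collections import Counter

# Hypothetical current lookup list; "Maybe" has been retired.
CURRENT_VALUES = {"Yes", "No"}


def invalid_values(records, field, allowed):
    """Count recorded values that are no longer in the allowed lookup list."""
    return Counter(
        r[field]
        for r in records
        if r.get(field) is not None and r[field] not in allowed
    )


records = [
    {"pupil_id": 1, "consent": "Yes"},
    {"pupil_id": 2, "consent": "Maybe"},
    {"pupil_id": 3, "consent": "No"},
    {"pupil_id": 4, "consent": "Maybe"},
    {"pupil_id": 5, "consent": None},  # missing, a completeness issue instead
]

print(invalid_values(records, "consent", CURRENT_VALUES))
```

Missing values are deliberately left out here, so that validity and completeness are reported as separate issues.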
The other side of the coin with data validity is ensuring that data is sense-checked. We all use lots of systems now where we’re collecting data from different people directly, where parents, pupils, or staff are filling the data in themselves rather than us seeing it first. It goes straight into our systems, and because of that, we need to be careful. Although the values they’re recording may be valid, it doesn’t mean that the data is necessarily correct. One simple example could be recording sex as male or female, and also recording someone’s title: Mr, Mrs, etc. Cross-checking those two fields against each other can highlight cases where somebody has selected Mr for the title when their sex is recorded as female. It can significantly affect your overall analysis if you don’t think to look at the quality of the data you’re working with.
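The title and sex cross-check could be sketched as follows. The mapping of titles to an implied sex is an assumption made for illustration, and the field names are hypothetical; the check only flags records for a person to review and does not correct anything.

```python
# Hypothetical mapping of titles to the sex they usually imply.
# Titles such as "Dr" or "Mx" are omitted because they imply nothing.
TITLE_IMPLIES = {"Mr": "Male", "Mrs": "Female", "Miss": "Female", "Ms": "Female"}


def title_sex_mismatches(records):
    """Return records whose title and recorded sex disagree, for manual review."""
    flagged = []
    for r in records:
        expected = TITLE_IMPLIES.get(r.get("title"))
        sex = r.get("sex")
        # Only flag when both values are present and they contradict each other.
        if expected is not None and sex is not None and sex != expected:
            flagged.append(r)
    return flagged


contacts = [
    {"id": 1, "title": "Mr", "sex": "Male"},
    {"id": 2, "title": "Mr", "sex": "Female"},   # both values valid, but inconsistent
    {"id": 3, "title": "Dr", "sex": "Female"},   # title implies nothing
    {"id": 4, "title": "Mrs", "sex": None},      # missing, not a mismatch
]

for record in title_sex_mismatches(contacts):
    print("Check record", record["id"])
```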
We also regularly perform data cleansing activities, looking for duplicates and clearing down historical data, both of which could introduce bias and make your analysis less meaningful. If we have data going back 20 years or so, it may no longer be something that we want to include within the data set that we’re analysing.
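A cleansing pass of this kind could look like the sketch below. What counts as a duplicate and how far back to keep data are decisions for each organisation; the key fields, the cut-off date, and the sample records here are hypothetical.

```python
from datetime import date


def cleanse(records, cutoff):
    """Drop records dated before `cutoff` and keep the first copy of each duplicate.

    Two records are treated as duplicates when pupil, date, and measure all
    match; that definition is an assumption for this sketch.
    """
    seen = set()
    kept = []
    for r in records:
        if r["recorded_on"] < cutoff:
            continue  # too old for the data set being analysed
        key = (r["pupil_id"], r["recorded_on"], r["measure"])
        if key in seen:
            continue  # duplicate of a record already kept
        seen.add(key)
        kept.append(r)
    return kept


records = [
    {"pupil_id": 1, "recorded_on": date(2004, 9, 1), "measure": "reading", "value": 98},
    {"pupil_id": 2, "recorded_on": date(2023, 9, 1), "measure": "reading", "value": 101},
    {"pupil_id": 2, "recorded_on": date(2023, 9, 1), "measure": "reading", "value": 101},
    {"pupil_id": 3, "recorded_on": date(2023, 9, 1), "measure": "reading", "value": 110},
]

clean = cleanse(records, cutoff=date(2015, 1, 1))
print(len(records), "records in,", len(clean), "records kept")
```

The function returns a filtered copy and leaves the source records untouched, so the choice of cut-off can be revisited later.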
Data standards are about having a clear purpose for why we’re collecting the data, how we are processing it, what we’re planning to do with it, and what decisions that involves. I think it’s crucial that you have that well laid out, to ensure that you get good-quality data.
One of the big things that we are always actively working on is understanding the ownership of data systems or key areas of data. For example, you might have your Deputy Head (Pastoral) being responsible for all attendance, welfare data, etc. From that perspective, they should be leading the effort to share understanding with the people who work with that data on a day-to-day basis, so that those people understand why good-quality data is important. That is how data standards are enforced in practice.
One key piece in all of this is ensuring that you have the skills, knowledge, and resources to support your data strategy. For example, if you’re considering working with an MIS, you need to ensure that you have adequately skilled people who know what they’re doing with the system. Person specifications in job descriptions should reflect the skills required to effectively carry out the role and these should be reviewed on a regular basis. Skills audits and other similar activities all feed into overall data quality. This is not about barriers to recruitment, but about ensuring staff are appropriately empowered and training is put in place where necessary.
We also need to ensure that when we examine our data strategy, including how we collect, process, and analyse data, we are sufficiently resourcing it and ensuring the intended outcomes are achievable. If we’re not, then again, that’s likely to impact the quality of data and the insights that we gain from it. We should be doing things like regularly reviewing job descriptions and skills audits, and identifying where there are gaps, so that we’re planning effectively around data, not only for what we’ve got now but also for what we’re thinking of implementing going forward. This means we’re always in a good position and can be confident that the foundations of our data insights are reliable. I think it’s a critical piece, and it’s something that people don’t always think about.
We are seeing such a big shift now in education generally to being much more data-focused and making decisions based on it. Therefore, we need to have a data-literate workforce that understands the analysis and dashboards presented to them.
One of the critical pieces we are working on within the GDST is around ensuring that when we make dashboards available to different groups, we’re supporting those groups in using those insights and providing tailored training sessions. We can do all of this wonderful work and have this great data; it can be lovely, clean, and of good quality, and we can create amazing insights from it – but if the people who are making the decisions at the end of the day aren’t confident and don’t have the right skill sets to work with the data, that’s a real problem. There’s a whole piece of work in education about understanding who our audiences are and tailoring analysis appropriately. All of this feeds into getting good-quality data insights.
Marian’s LinkedIn profile: https://www.linkedin.com/in/marian-wheeler/
The GDST website: https://www.gdst.net/about-us/about-the-gdst/