The Future Of

Big Data and Our Health

Episode Summary

How do health, economics and supercomputing combine to make us healthier? Professors Suzanne Robinson and Andrew Rohl discuss health economics in the digital age.

Episode Notes

The allocation of public health services is no guessing game. The government relies on vast datasets – and the analysts who can identify patterns within them – to understand the health of our population and where services are most needed.

In this episode, David is joined by Professor Suzanne Robinson, from Curtin University’s School of Public Health, and Professor Andrew Rohl, from the Curtin Institute for Computation, to discuss health economics in the digital age.

How do the worlds of economics, health and supercomputing combine to make us healthier? [00:30]
How do health authorities get their data and how do they keep it private? [01:33]
What sort of information are we learning from big data that we didn’t know before? [04:03]
How do computers find patterns? [08:06]
How is machine learning applied in a health context? [10:07]
What are we learning about our health from the data that’s been gathered? [13:43]
Are we any closer to learning why Indigenous Australians and those in regional areas have poorer health outcomes? [14:41]
What’s next with health data analytics? [16:15]
What impact will the changes to My Health Record have? [20:08]

Learn more

Got any questions, or suggestions for future topics?

Email thefutureof@curtin.edu.au.

Curtin University supports academic freedom of speech. The views expressed in The Future Of podcast may not reflect those of the university.

Music: OKAY by 13ounce Creative Commons — Attribution-ShareAlike 3.0 Unported — CC BY-SA 3.0 Music promoted by Audio Library

You can read the full transcript for the episode here.

Episode Transcription

Intro (00:00):

This is The Future Of, where experts share their vision of the future and how their work is helping shape it for the better.

David (00:09):

I'm David Blayney. The allocation of public health services isn't a guessing game. The government relies on vast datasets and the analysts who can make sense of them to understand the health of our population and where services are needed most. To discuss this topic, with us today are Professor Suzanne Robinson from Curtin University's School of Public Health and Professor Andrew Rohl from the Curtin University Institute for Computation. Thank you very much for joining us today. How do the worlds of economics, health and supercomputing combine to make us healthier?

Professor Suzanne Robinson (00:46):

Should I start? I'd say that economics is in everything that we do. So health's no exception. In terms of what it does for, for health is, there's scarce resources in health. So we don't have ... we have a limited number of hospitals. We have a limited number of staff. We have lots of limitations in terms of our resource. So economics can help us understand where we might want to allocate our resources to be both efficient - so, to make sure we spend our money wisely - and to have very good health outcomes for our population.

Professor Andrew Rohl (01:14):

And so computing now comes into the picture because there's so much more data being collected and perhaps more importantly people understanding there might be value in that data and then start making that data available, understanding that with these large computing systems, we can now extract data that just, sorry, extract information that wasn't possible in the past.

David (01:33):

How do the health authorities get their data and how do they keep it private? There's a lot of people who'd quite like to get a hold of our information, insurance companies being one ... how do they make sure that it's being used properly?

Professor Suzanne Robinson (01:50):

So data is a very serious business. Lots of companies, not just health, are very keen to get consumer information so they can organise their business. In health, the health system takes it very seriously, especially in the public sector. We get information from multiple places. So from clinical records, when doctors write things on paper, or into their iPads, it goes, gets uploaded onto systems. We collect every time people have a transaction for Medicare, which is how we pay for funds, we capture that sort of activity-type data. So a lot of what we capture does relate to how we fund healthcare. So it's activity-based funding and we're very good at capturing that type of information. So we have a vast amount of data. In terms of privacy, there's privacy laws and legislation in Australia from the federal government and some states, and we have to work within those privacy, that privacy legislation. So we do take it seriously and Curtin take that very seriously as well.

Professor Andrew Rohl (02:47):

So also, from the technology perspective, we can protect data by implementing, you know, good practices. And in fact, Curtin is one of the few universities that is ISO 9001 compliant ... is that the right number?

David (02:58):

Tell us more!

Professor Suzanne Robinson (03:02):

ISO 27001 certification. So we have, Curtin are one of the few universities across Australia that have ISO 7 ... 7,000 .... 2000, oh gosh, sorry! ISO certification - I'll leave it there. ISO certification that means that we take the business of data very seriously. And we recently had our sort of inspection by our auditors to see how well we do against the criteria that are set by the certification. And one of the comments which I was very pleased and proud of, was that we do have a culture - it's not just a tick box - we have a culture in terms of our privacy and how seriously we take our information. It does mean we sit behind locked doors and often, because the way we analyse and look at data, the rest of the world aren't allowed to see. So we have lots of patient-level information.

Professor Suzanne Robinson (03:49):

So it's important that we take that seriously. And we also, I should say, on that then, support our health services. Some of the services that aren't as sophisticated in terms of their own technology to be able to act and comply around privacy legislation.

David (04:03):

What sort of information are we learning from big data that we didn't know before? What can we learn from big data and how can we learn it?

Professor Suzanne Robinson (04:14):

Lots. So there's lots in big data. What's very, quite exciting, is our ability to link data sets. So our data sets could be very large so we can link GP - so when people go to see their general practitioner - we can link that data with hospital data. What that allows us to do is it allows us to look across the sort of the continuum of care. So look, a patient doesn't just touch one service - a patient with chronic disease will touch multiple services. And what we can do is see how that patient interacts with the service. For example, if somebody hits an emergency department and that's quite a costly event and often not a nice event for that individual, we can start to look at where we might intervene earlier to support people to, to keep them out of emergency services and actually out of hospital. So data allows us to do those things. It also allows us to do ... to have new discoveries, and we're looking at work now with chronic kidney disease and we're looking at whether or not we should fly patients down from country services, can we ... from the country regions of Australia. Can we do things differently? Can we get access to services in a different way? I suppose Andrew, you might want to think about from that sort of machine learning and AI perspective what you can do with the data and how exciting that is from, from your team.
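
The linkage described above can be sketched roughly in Python. This is only an illustration under stated assumptions: the record fields, patient IDs and the `link_records` and `gp_visits_before_hospital` helpers are hypothetical, and real linked health datasets are de-identified, access-controlled and far larger.

```python
from collections import defaultdict
from datetime import date

# Hypothetical, made-up records for illustration only.
gp_visits = [
    {"patient_id": "P1", "date": date(2019, 3, 1), "reason": "kidney function check"},
    {"patient_id": "P2", "date": date(2019, 3, 5), "reason": "flu symptoms"},
    {"patient_id": "P1", "date": date(2019, 4, 2), "reason": "blood pressure review"},
]
hospital_events = [
    {"patient_id": "P1", "date": date(2019, 4, 20), "service": "emergency department"},
    {"patient_id": "P3", "date": date(2019, 5, 1), "service": "emergency department"},
]


def link_records(gp, hospital):
    """Group events from both sources by patient and sort each timeline by date."""
    timelines = defaultdict(list)
    for row in gp:
        timelines[row["patient_id"]].append((row["date"], "GP", row["reason"]))
    for row in hospital:
        timelines[row["patient_id"]].append((row["date"], "hospital", row["service"]))
    return {pid: sorted(events) for pid, events in timelines.items()}


def gp_visits_before_hospital(timeline):
    """Count GP contacts before the first hospital event; None if never in hospital."""
    first_hospital = next((d for d, source, _ in timeline if source == "hospital"), None)
    if first_hospital is None:
        return None
    return sum(1 for d, source, _ in timeline if source == "GP" and d < first_hospital)


linked = link_records(gp_visits, hospital_events)
for patient_id, timeline in linked.items():
    print(patient_id, gp_visits_before_hospital(timeline))
```

Once both sources sit on one timeline per patient, questions such as "how many GP contacts came before the emergency presentation?" become simple queries, which is the kind of continuum-of-care view the speaker is describing.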

Professor Andrew Rohl (05:30):

So I think the primary outcomes that are happening from AI at the moment are really around the ability of computers to be very good at recognising patterns, providing you've got a large number of patterns to show them in order to train them. And so there's a particularly interesting project that I've learnt in. So one issue we have in the health service is, people make mistakes, right, for a whole variety of reasons. And so one thing that people often want to do is say, well, we can protect against that by making two people do an assessment of any particular thing. If we look at radiology, because that's one area that has been completely transformed by this, then actually the model some people are doing is no, you'd be looked at by a radiologist and by a computer, and so it turns out actually the computer often picks things up that even trained radiologists don't.

Professor Andrew Rohl (06:19):

But more importantly you've got that kind of check that you can have - that the diagnosis that has been made by the radiologist is correct and by bringing the fact that they're the ones with the human expertise, where they can see the computer might have put something up that is unusual or not very often seen, but the computer will have seen it because it's looked at a massive training set, on one hand, but then also on the other hand, it may be pointing out to the radiologist that maybe there was something that they hadn't thought of.

David (06:48):

Forgive me for asking perhaps a bit of a dumb question. When does data become big? What exactly is big data?

Professor Andrew Rohl (06:57):

So, there's no real definition of 'big'. It's not a certain minimum size.

David (07:03):

It's not like you get up to one terabyte and it's like, ‘ah, that's big!’

Professor Andrew Rohl (07:04):

No. So our radio astronomers, which they can - it's a very large group doing a lot of data analysis, data science at Curtin - you know, they have petabytes of data. So they have a phenomenal amount of data. So clearly they are in the big data realm. I think mainly what people are thinking about when they talk about big data is a mix of data coming in probably in some live mechanism. So you've got this kind of flow of data that you need to interact with as it comes in. Typically it's more than one data source so it's a bit different. Radio astronomers only look at one data source - it's coming from their telescope - but many of the things we're talking about today, you need to bring in data from, from different things. And so actually resolving ... so some of the big challenges are resolving the fact you've got data coming from different sources that are telling you different things that might be measured on different time scales. How do you stick that into a coherent system so you can make sensible decisions?

David (08:06):

How exactly do, well not exactly - we'll go for sort of high-level explanation - how do computers find patterns? Do we have to sort of say, 'right, find a pattern between this data set and this data set please'? Or do we just sort of throw the data in like sort of raw effluent and then it sort of ... out comes a nicely sort of sorted graph or spreadsheet for us to use?

Professor Andrew Rohl (08:29):

I'm not sure that raw effluent's a good example! But in terms of putting data in, so, the most commonly used form of machine learning is what we call supervised machine learning. And what that means is that you have a large amount of data. It can be image data, but not necessarily. But the image data one is perhaps easier to conceptualise. So you have this large amount of data and then you have a label associated with each of those images. And so, for example, it might be that you're going to label in a whole bunch of images, the animal that is in that image. So you just have an image of a giraffe and then in the computer it knows that that's a giraffe. Okay? So you go through, give the computer, say a million bits of data that might have 20 or 30 different labelled animals in it. And then the computer from using machine learning can actually be trained to recognise each of those animals. However, if you give it an animal that's not in your data set, it has no idea.
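
To make the idea of labelled training data concrete, here is a minimal sketch of supervised learning in Python. It is not the method used by any system mentioned in the episode: it swaps images for two made-up numeric features per animal and uses a simple nearest-centroid classifier, and the `train` and `predict` helpers are hypothetical names.

```python
import math

# Hypothetical training data: (height_m, neck_length_m) plus a label per example.
training = [
    ((5.0, 2.0), "giraffe"), ((5.5, 2.4), "giraffe"),
    ((1.4, 0.5), "zebra"), ((1.3, 0.4), "zebra"),
    ((3.0, 0.6), "elephant"), ((3.3, 0.7), "elephant"),
]


def train(examples):
    """Learn one average feature vector (centroid) per label."""
    sums = {}
    for features, label in examples:
        total, count = sums.get(label, ([0.0] * len(features), 0))
        sums[label] = ([t + f for t, f in zip(total, features)], count + 1)
    return {label: [t / count for t in total] for label, (total, count) in sums.items()}


def predict(model, features):
    """Return the known label whose centroid is closest to the new example."""
    return min(model, key=lambda label: math.dist(model[label], features))


model = train(training)
print(predict(model, (5.2, 2.1)))  # close to the giraffe examples
print(predict(model, (0.3, 0.1)))  # a cat-sized animal the model never saw
```

The second call still returns one of the three labels the model was trained on. That is the limitation described above: a supervised model can only answer with labels it has seen, so an animal outside the training set is forced into the nearest known category.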

David (09:31):

So is this why whenever I log onto Ticketmaster or any website, it says like, 'right, click on all the cars and stop signs, please'. That sort of thing. Am I helping to train a computer by doing that?

Professor Andrew Rohl (09:43):

You absolutely are. So you're helping Google further train their machine learning models that are identifying different bits and pieces. I think - self evident - the reason why they're asking you about, identify cross walks and traffic lights, is they're spending a lot of money in doing automated cars. And so they need very good and very well trained machine learning to identify all of the things that you come across in a typical car ride.

David (10:07):

So you've talked about images, well, putting labels on images is the example that you used. How about in a health context? What sort of data is being, being fed into these computers and what kind of information are we getting out of them?

Professor Andrew Rohl (10:21):

So there's a lot of data. I think Suzanne's already discussed about the sorts of data we get from GPs, the sorts of data we get from Medicare. So every time that you go and get a prescription filled, for example, that creates a record in Medicare that we know what's there. It turns out that it doesn't really matter what the data is from that supervised learning thing. It's just the fact that you've got some data and then associated with a label and then you can provide that to the computer to answer those sort of things.

Professor Andrew Rohl (10:54):

The other aspects I think around data science are things like looking for kind of patterns in data. Ok, what I just said was one, one of those. But you know, for example, we can look for patterns if we've got data on people presenting to the GP and then what sort of drugs are being prescribed. Okay. And so from there, there might be very interesting links between the illness that people are presenting with and then what's being prescribed, which may or may not be appropriate. I said ... I think we all know ... that advances in medicine are made at a pretty staggering rate. And so, you know, how does your GP keep up with all of those things that are happening? These are the sorts of things that you can see in that data.

Professor Suzanne Robinson (11:37):

And we do. We are currently looking at Curtin in terms of inappropriate antibiotic prescribing. And we are using population-based datasets to look at that. So when we say that, back to that big data, what's a large data set? You're talking whole populations and looking again for patterns. We might not be doing it with machine learning necessarily, from my groups and the economists, but we do draw on that sort of idea and approach to, to look for these types of patterns to see why people behave in a certain way and can we see anything from the data. And then of course it's important to go and explore that with our general practitioners to see if the data matches what their viewpoints might be.

David (12:12):

So what have we ... well, you mentioned antibiotics. What have we learned from this sort of analytical research about how antibiotics are being prescribed correctly or incorrectly?

Professor Suzanne Robinson (12:25):

So we call it inappropriate antibiotic prescribing and basically it's just about prescribing. And we have guidelines. So the research looks at how clinicians are prescribing against the guidelines, which are based on scientific evidence. Sometimes clinicians will prescribe and although it's against the guidelines, it's quite the right thing to do. And other times we know that it's not. So we know that there's a bit, you know, there's this, we know internationally the issues with resistance and inappropriate prescribing. So we can see ... I suppose what that allows us to do is, it gives us the evidence to say, to go out and have a conversation with clinicians around 'this is what's happening'. And then we can explore why that might be and hopefully change practice. So we're also then thinking of next stages of that research. Where there's visualisations or there's information shown on screen as to question why somebody might be prescribing a certain type of drug. So the data allows us to do that. But the evidence is really important. So bringing evidence together.

Professor Suzanne Robinson (13:24):

Clinicians and policymakers do understand information and when you present information back to people, it's not just 'somebody said' that might have happened, you know, one person said, 'I think I've been inappropriately given these antibiotics' - it's actually saying, 'at a population level, this type of activity is happening' and that makes it much more powerful.
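
A population-level comparison of prescribing against guidelines might look something like the following Python sketch. Everything here is a placeholder: the condition and drug names, the `guideline` table and the `off_guideline_rates` helper are hypothetical and do not reflect real clinical guidance or the Curtin team's actual analysis.

```python
from collections import Counter

# Hypothetical guideline table; an empty set means no antibiotic is advised.
guideline = {"condition_a": {"drug_x"}, "condition_b": set()}

# Hypothetical prescription records, one per prescribing event.
prescriptions = [
    {"condition": "condition_a", "drug": "drug_x"},
    {"condition": "condition_a", "drug": "drug_y"},
    {"condition": "condition_b", "drug": "drug_x"},
    {"condition": "condition_b", "drug": "drug_x"},
]


def off_guideline_rates(records, guideline):
    """Return, per condition, the share of prescriptions not listed in the guideline."""
    totals, off = Counter(), Counter()
    for record in records:
        condition = record["condition"]
        totals[condition] += 1
        if record["drug"] not in guideline.get(condition, set()):
            off[condition] += 1
    return {condition: off[condition] / totals[condition] for condition in totals}


rates = off_guideline_rates(prescriptions, guideline)
print(rates)
```

A rate like this only flags where prescribing departs from the guideline. As noted in the discussion, some departures are clinically justified, so the figures are a starting point for a conversation with clinicians rather than a verdict.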

David (13:43):

And what are, what else are we learning about health outcomes from the, from the data that we've gathered?

Professor Suzanne Robinson (13:49):

So we gain data from lots of different places. We know that the population is well, we know that we've got a very good health system from the data because we're living a lot longer, which is one indicator of a good outcome. So length of life has increased. We know that quality of life has increased. We do also know that quality of life has not increased for all of the population. So for example, Indigenous population, we know that the life expectancy of people living in country WA - Indigenous or non-Indigenous - is much lower than the life expectancy of people that live in the metro region. So when we have that type of information, what that does is allow policy makers to think about what resources or services they might need to target for those particular populations. So having that information around health outcomes can allow us to then think about how can we improve health outcomes and equity and other factors.

David (14:41):

It won't come as a surprise to anyone who pays attention to health that Indigenous Australians and people who live in regional areas tend to have worse health outcomes than people who live in this, in the city, or people who are not Indigenous. Are we any closer to figuring out why?

Professor Suzanne Robinson (14:58):

So that's a really complex question that the data allows us to go some way to answer. And when we talk about data, and we've talked about health data today, what we really need to explore and rethink about country health is to look at the broader social determinants of health and data around education, crime and other datasets. And we are starting to look at that data and to try to link that data with health data to give us a greater understanding because we need to target, like, really health is the, the sticky plaster at the end. We need to target ahead of that in terms of services. So providing good shelter for people, providing them with healthy food as an option that's not too expensive and providing an education. We know and we've known that for a long time, that those things are important. So health data is important for understanding health services, but we need to go broader than that to be able to understand the complexities of why we've got different health outcomes in contrast to metro. The other thing to say around that is access to services and sometimes, well, we spend a lot of money bringing people from country into the metro region and that's not always the appropriate thing to do for that individual. So using the evidence and the data we're allowed to, we can track people and we can look at whether those decisions were a good decision or not, clinically. And other factors.

David (16:15):

What's next when it comes to health data analytics? What's left to learn? Everything obviously, but what are we going to learn?

Professor Suzanne Robinson (16:22):

It's broader than just the data. So digital health is very broad. So I talk about data as part of digital health. There's a whole industry out there. We've got wearables, we've got new technologies. We've got infrastructure for data where you start to look at using big data sets to, for clinical decision making. So when a clinician makes a decision for a patient, our datasets are massive, but that clinician only needs one or two pieces of information. So we're currently working with our clinical colleagues to think about what pieces of information are important at what points in the decision making with the patient. And also too, other things we're doing, we're looking with, with consumers or patients to think about how we can use activities that we use - data from wearables and other technology - to help them to shape their own behaviours. Because it's all about behaviours, you know, when we think about people's health and wellbeing, a lot of it is about our behaviours and our decisions that we make. So getting people to understand that a little bit better and then to think about how can we use that sort of technology alongside the data. And that technology just produces lots of data. But actually the technology itself is important for people's health.

Professor Andrew Rohl (17:30):

So I think on the kind of the technical side with, say, especially in machine learning, is that ... I've talked about supervised learning and as I said that's kind of, has, taken the world by storm and it really is impacting almost every activity, human activity, that's out there. So the big next thing there is, as I said before, if you give the computer something that it's not seen before, it doesn't really know what to do. And so there's a whole new area which has been around just as long as supervised learning called unsupervised learning, but it's in unsupervised learning - where you're not giving it labelled data - where what the computer is trying to do is actually detect anomalies or something that is an unusual behaviour, that we think some of the next big steps in artificial intelligence will be made.

David (18:13):

And how are we doing in terms of getting towards unsupervised learning?

Professor Andrew Rohl (18:18):

So as I say they exist and in some cases they work quite well for a specific problem. But they are quite some way off being general in the sense that the supervised learning is general, right? We can chuck any, any labeled data set of any type. So it doesn't have to just be images, it can be movies, it can be, let's say, in a health context we've got traces of people's heartbeats for example. All of these things can be put into supervised learning. So unsupervised learning needs to be - at the moment is much more narrow-focused - and so we really want to make that a much more general - a much more general set of algorithms that you can, as I say, put almost any data set and then it'll start to tell you what in that data set is unusual compared to the rest of the data set.
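
As a small illustration of finding "what in that data set is unusual" without any labels, here is a sketch in Python. It assumes one narrow technique, a modified z-score based on the median absolute deviation, applied to made-up heart-rate readings. The `find_anomalies` helper and the 3.5 threshold are illustrative choices, not something described in the episode.

```python
import statistics


def find_anomalies(values, threshold=3.5):
    """Flag values far from the median, using the median absolute deviation (MAD)."""
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        return []  # every value is (almost) identical, so nothing stands out
    return [
        (index, value)
        for index, value in enumerate(values)
        # 0.6745 rescales the MAD so the score is comparable to a standard z-score
        if 0.6745 * abs(value - median) / mad > threshold
    ]


# Hypothetical resting heart-rate readings (beats per minute) with one odd value.
heart_rates = [62, 64, 61, 63, 65, 62, 118, 63, 64, 60]
print(find_anomalies(heart_rates))  # prints [(6, 118)]
```

No one told the code which reading was abnormal; it only compared each value with the rest. It also shows the narrowness mentioned above: this rule suits a single stream of numbers and would need rethinking for images, text or linked records.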

David (19:06):

And one final question for both of you. What are you, what are you working on now? What are you looking forward to, in terms of research in the future?

Professor Suzanne Robinson (19:15):

So we're part of a digital health CRC and both Andrew's group and my group are involved in a project that's looking at chronic kidney disease across Western Australia and has potential to look across other states in Australia. And we're using all of the different types of skills that we've got. So it's exciting that we've got our economists and statisticians working with our data scientists and with our clinical colleagues to think around how we can support people living with chronic disease, kidney disease, to have better health outcomes. We can't change the fact they've got chronic disease. But from using that data and the evidence and using that with clinical - the clinical world - we can certainly start to slow the progression of the disease, hopefully. So that will be from a clinical point of view, but the data's really important to clinicians understanding the progression of the disease and how that impacts on people's health and wellbeing to help them make good decisions.

David (20:08):

I have one actually final, final question, so I'm going to be a little bit naughty and tack on a little bonus question at the end. Suzanne, last year the My Health Record system went from being opt in to opt out, and so now unless you asked not to have one, we all have a digital record of our health information. What impact does that have on your discipline?

Professor Suzanne Robinson (20:25):

So if it's successful, if the My Health Record is successful, and it's - that's a big if, I think - if it is successful then that empowers a lot of things. It empowers research in that we will have information that (we will not just be able to readily access, but we will have to go through different processes to access that information) but if we can access that information, we actually have a whole round set of information about an individual as they track through the system. Currently it doesn't capture all the information that will be relevant so we would still have to link to other datasets. From a consumer perspective that's really helpful, in that when you go from one clinician to the next, you have that information to share. It also means that it could impact on clinician, on patient behaviour. So consumers might start to behave differently if they too can access that data. And I suppose from our perspective, we can start to analyse that and see if things like the medical health record are making any difference on behaviours. So are we seeing people accessing services more because they've got information, or are we seeing them access services less, and those types of questions.

David (21:28):

Sounds like there's a lot of work to be done in this field. I think we'll leave it there. Thank you very much Suzanne and Andrew for coming in and for sharing your knowledge on this topic.

Professor Suzanne Robinson (21:37):

Thank you.

Professor Andrew Rohl (21:37):

Thank you.

David (21:38):

You've been listening to The Future Of - a podcast powered by Curtin University. If you'd like to learn more about anything that we've discussed today, you can get in touch with us by following the links in the show notes. Bye for now.