WEBVTT

1
00:00:12.280 --> 00:00:14.730
Dr. Suchi Saria: Mike, can I ask a quick question while we're waiting?

2
00:00:14.960 --> 00:00:22.050
Michael Littman: Yeah, yeah, though people are arriving. So we're talking at the front of the auditorium while people filter in. Go ahead.

3
00:00:22.050 --> 00:00:27.830
Dr. Suchi Saria: Okay, got it. I was just curious, are we sort of aiming for an hour with Q&A?

4
00:00:29.187 --> 00:00:33.469
Michael Littman: I think the slot goes for an hour fifteen, including Q&A.

5
00:00:33.470 --> 00:00:34.599
Dr. Suchi Saria: Okay, perfect.

6
00:00:35.770 --> 00:00:37.890
Michael Littman: I can double check that in my calendar.

7
00:00:38.250 --> 00:00:39.340
CISE AD | Gregory Hager: No, that's correct.

8
00:00:39.610 --> 00:00:41.609
Michael Littman: All right. We have confirmation.

9
00:00:42.280 --> 00:00:47.400
Michael Littman: And then there's a follow-up meeting with NSF staff. That's a separate thing.

10
00:00:48.140 --> 00:00:48.970
Dr. Suchi Saria: Perfect.

11
00:00:49.930 --> 00:00:50.919
Michael Littman: Cool, cool, cool.

12
00:00:51.930 --> 00:00:55.939
Michael Littman: All right. So it looks like people are still filtering in,

13
00:00:56.190 --> 00:00:57.202
Michael Littman: so I will

14
00:00:57.730 --> 00:01:00.480
Michael Littman: wait a moment or 2 for the numbers to level off.

15
00:01:01.000 --> 00:01:02.520
Dr. Suchi Saria: Are you there in person?

16
00:01:03.000 --> 00:01:05.500
Michael Littman: I am in the building at NSF right now. Yeah.

17
00:01:05.970 --> 00:01:09.779
Michael Littman: And I was actually on campus at Hopkins on Monday.

18
00:01:10.410 --> 00:01:11.150
Dr. Suchi Saria: Nice.

19
00:01:11.150 --> 00:01:13.140
Michael Littman: Are you there, or are you in New York?

20
00:01:13.140 --> 00:01:15.990
Dr. Suchi Saria: San Diego right now, speaking from a hotel.

21
00:01:16.120 --> 00:01:18.150
Dr. Suchi Saria: because I'm in a different meeting.

22
00:01:19.320 --> 00:01:20.150
Dr. Suchi Saria: Sadly.

23
00:01:20.150 --> 00:01:20.710
Michael Littman: Joining.

24
00:01:21.350 --> 00:01:30.910
Michael Littman: Okay, I feel like we're at sort of trickle level of increase at the moment in terms of participants. So I will kick this thing off.

25
00:01:30.960 --> 00:01:38.399
Michael Littman: Thank you, everybody, for coming to the National Science Foundation CISE Directorate Distinguished Lecture Series.

26
00:01:38.570 --> 00:01:52.139
Michael Littman: And I'm very honored to introduce today's speaker, Professor Suchi Saria. She is the John C. Malone Associate Professor of Computer Science, of Statistics, and of Health Policy at the Johns Hopkins University.

27
00:01:52.220 --> 00:02:01.900
Michael Littman: And one of the reasons that we were very excited to invite Suchi as an NSF CISE Distinguished Lecturer was because she's working on so many interesting things, and of course,

28
00:02:01.900 --> 00:02:26.529
Michael Littman: that also made scheduling her talk a bit of a challenge because she's doing so many interesting things. And in fact, I found out that she's calling in today from San Diego, because there's another set of meetings that she's involved in there, and she's taking time out to kind of talk with us. But we're here now, and I'm really glad that she's able to join us. Her work has received recognition in numerous forms, including best paper awards at machine learning, informatics and medical venues.

29
00:02:26.530 --> 00:02:50.390
Michael Littman: But since this is an NSF event, I'll highlight her NSF Computing Innovation Fellowship, which she got in 2011. It's also worth mentioning that she is the founder and CEO of Bayesian Health, which is just a great name. I love that name. And in the words of Forbes magazine, founder Suchi Saria's startup Bayesian Health offers software to help hospital staff identify high-risk patients.

30
00:02:50.420 --> 00:03:01.789
Michael Littman: Its products evaluate health history and medical records to empower healthcare providers to take timely, lifesaving actions for patients at risk of critical conditions like sepsis, deterioration, and pressure injuries.

31
00:03:01.940 --> 00:03:11.389
Michael Littman: The company has raised 15 million dollars from investors including Andreessen Horowitz, and another 15 million in grants from the National Institutes of Health,

32
00:03:11.420 --> 00:03:29.219
Michael Littman: DARPA, and, most importantly, and I'm sure it says that in the Forbes magazine blurb, the National Science Foundation. This is me again now. She's given various flavors of TED talks as well as keynote slots at many major conferences in AI, machine learning, and medicine.

33
00:03:29.220 --> 00:03:42.949
Michael Littman: I personally have been following Suchi's work since she was a PhD student with Daphne Koller, and I'm eagerly anticipating hearing what she'll share with us today. So let's all give a warm, albeit silent, welcome to Dr. Suchi Saria. Thank you for being here, Suchi.

34
00:03:43.910 --> 00:03:51.869
Dr. Suchi Saria: Thank you for having me. I'm very excited to be able to join. I wish I could see the folks who are there in person or the folks who are attending

35
00:03:52.772 --> 00:03:55.339
Dr. Suchi Saria: I structured my talk.

36
00:03:55.540 --> 00:03:56.709
Dr. Suchi Saria: you know, when

37
00:03:57.180 --> 00:04:04.570
Dr. Suchi Saria: As the invite went out, I got a whole bunch of folks from completely different areas of computing and math and CS

38
00:04:04.590 --> 00:04:09.950
Dr. Suchi Saria: and other areas, who wrote to me saying, "Oh, you're giving this lecture," which made me think the audience that is

39
00:04:10.674 --> 00:04:18.479
Dr. Suchi Saria: hearing this, the intended audience, is very broad, and not just people who do AI and machine learning research all day long.

40
00:04:18.519 --> 00:04:25.269
Dr. Suchi Saria: So with that in mind, I've structured my talk so it won't be entirely boring to people who don't do AI/ML all day.

41
00:04:25.300 --> 00:04:29.029
Dr. Suchi Saria: It will be a collection of broad descriptions of problems,

42
00:04:29.080 --> 00:04:34.889
Dr. Suchi Saria: a little bit of an overview of sort of a 10-year journey of bringing some ideas to life,

43
00:04:34.970 --> 00:04:40.990
Dr. Suchi Saria: and then in some areas we'll do a deep dive into recent state-of-the-art ideas

44
00:04:41.170 --> 00:04:43.120
Dr. Suchi Saria: on the AI/ML front.

45
00:04:44.450 --> 00:04:55.089
Dr. Suchi Saria: With that in mind, I'll jump in. So here's a sort of a paper. Normally I would have hidden the dates and done a real-time poll

46
00:04:55.200 --> 00:04:58.750
Dr. Suchi Saria: on this. And the paper is basically talking about

47
00:04:58.780 --> 00:05:06.219
Dr. Suchi Saria: how rapid advances in information science and computer science are going to completely transform the field of medicine and the practice of medicine,

48
00:05:06.410 --> 00:05:10.640
Dr. Suchi Saria: and you know, reimagine the role of caregivers.

49
00:05:10.940 --> 00:05:21.420
Dr. Suchi Saria: And, you know, when you read it, at least for me, I feel like it's the kind of thing I would expect to be written today, because of the amount of excitement for what computer science can do,

50
00:05:21.480 --> 00:05:25.559
Dr. Suchi Saria: except the article was written in 1970,

51
00:05:25.570 --> 00:05:41.399
Dr. Suchi Saria: which, you know, as a researcher, I want to kind of learn from history. So when I read that, it was kind of disappointing to me, because I'm like, holy moly, what if 30 years from now, you know, we've made all these promises now, and 30 years from now we look back, and we sort of still feel like,

52
00:05:41.800 --> 00:05:46.980
Dr. Suchi Saria: you know, we were so excited. And then, you know, not as much materialized. Here's another

53
00:05:47.130 --> 00:05:50.300
Dr. Suchi Saria: article from 1994,

54
00:05:50.660 --> 00:05:55.500
Dr. Suchi Saria: which gives a report card for computer-assisted diagnosis: grade C.

55
00:05:55.530 --> 00:06:13.879
Dr. Suchi Saria: So as a younger researcher, looking into this really bothered me. And so one of the things that I often think about is how to make sure we don't repeat this, and that inspired a lot of the trajectory of my work. But

56
00:06:13.920 --> 00:06:40.000
Dr. Suchi Saria: what's also exciting is that it's not just about the methods work; it's not just about the tech itself. It's also about the ecosystem within which the tech is getting embedded. And a lot has changed about the ecosystem in which the tech is getting embedded, which is why, I think, we're finally at this turnaround point where we are going to see this crazy, rapid adoption. Or at least I can palpably see

57
00:06:40.010 --> 00:06:42.200
Dr. Suchi Saria: very meaningful difference in the last

58
00:06:42.290 --> 00:06:45.530
Dr. Suchi Saria: four to five years. So the first

59
00:06:45.640 --> 00:06:58.650
Dr. Suchi Saria: evidence of that: this slide is already outdated. There are upwards of 500, maybe 600 now, maybe more, AI-enabled devices that have been approved by the FDA

60
00:06:59.180 --> 00:07:01.770
Dr. Suchi Saria: you know, in the last few years.

61
00:07:02.172 --> 00:07:09.480
Dr. Suchi Saria: Many of these are in imaging. I'll talk a little bit more about opportunities in multimodal AI.

62
00:07:09.530 --> 00:07:10.164
Dr. Suchi Saria: But

63
00:07:10.890 --> 00:07:18.180
Dr. Suchi Saria: there's definite evidence of, you know, ideas going from what we would call the lab bench

64
00:07:18.230 --> 00:07:19.540
Dr. Suchi Saria: to the bedside

65
00:07:20.340 --> 00:07:26.710
Dr. Suchi Saria: in terms of the real-world practical impact these ideas can have. Here are some real

66
00:07:26.840 --> 00:07:29.398
Dr. Suchi Saria: technologies today that are live.

67
00:07:29.940 --> 00:07:34.590
Dr. Suchi Saria: all invented in the last five to seven years,

68
00:07:34.730 --> 00:07:36.849
Dr. Suchi Saria: and really core fundamental

69
00:07:37.260 --> 00:07:47.990
Dr. Suchi Saria: NSF-funded technology translating. So the first one: today's ultrasound machines are often very expensive.

70
00:07:48.260 --> 00:07:51.609
Dr. Suchi Saria: They're hard to access. But it turns out

71
00:07:51.670 --> 00:08:05.559
Dr. Suchi Saria: there are new kinds of low-cost, handheld ultrasounds being built, coupled with AI, which now makes it so that you don't necessarily have to go to a very expensive facility for an ultrasound

72
00:08:05.580 --> 00:08:10.780
Dr. Suchi Saria: and bring in, you know, ultrasound experts or specialists.

73
00:08:10.830 --> 00:08:40.799
Dr. Suchi Saria: You can start to get screening for complications that could have been caught earlier at your nearby urgent care center, which, you know, you're much more likely to visit frequently. So it's a very good example of how AI, combined with novel point-of-care diagnostics, is dramatically improving access to high-quality care, by both making it cheaper and making it easier to access.

74
00:08:41.360 --> 00:08:56.100
Dr. Suchi Saria: The third example here, diabetic retinopathy, is one where today patients have to go to, again, a specialist to get diagnosed for diabetic retinopathy, making it less likely that they know about it.

75
00:08:56.280 --> 00:09:04.819
Dr. Suchi Saria: You'd have to have a premeditated reason for being there in the first place. This is a fully autonomous device that you can now put in a primary care provider's office.

76
00:09:04.900 --> 00:09:31.120
Dr. Suchi Saria: And what the device does is it allows you to take a retinal scan and automatically identify if the person has diabetic retinopathy. Again, the whole thing is cheaper than it would have been going to a specialist. But what's also exciting is, again, you go to a primary care provider way more often than you go to a specialist. So your chance of accessing the technology and knowing if you're at risk is much higher, which means you can treat more proactively.

77
00:09:31.120 --> 00:09:43.750
Dr. Suchi Saria: The last example, shown in the middle, is stroke diagnosis, where essentially, you know, in stroke, time is money; every minute matters.

78
00:09:43.750 --> 00:10:02.480
Dr. Suchi Saria: So the ability to take an image, have AI automatically interpret it, flag high-risk cases, and make sure those patients are getting transferred to the right center in a timely fashion means you're not waiting for the specialist to have the time to go read it. It dramatically accelerates our ability to get to treatment, and as a result

79
00:10:03.062 --> 00:10:04.780
Dr. Suchi Saria: can be lifesaving.

80
00:10:05.240 --> 00:10:15.591
Dr. Suchi Saria: So: really compelling examples of solutions that leverage AI to, you know, meaningfully improve

81
00:10:16.850 --> 00:10:19.349
Dr. Suchi Saria: quality of care and patient outcomes.

82
00:10:20.570 --> 00:10:47.679
Dr. Suchi Saria: I'm going to talk a little bit more about multimodal AI. So another super exciting thing that's happened since 2010, which is not that far back in the past, is the advent of electronic health records. Right? So prior to 2010, a lot of our infrastructure in the U.S. was very focused on paper records. You'd go to a clinician's office, they would be writing your notes in paper records or some local infrastructure, and then effectively

83
00:10:47.680 --> 00:11:12.719
Dr. Suchi Saria: you'd go the next time, and they would hear you again and respond. But what's very much changed in the last 14 to 15 years is that now most health systems in this country, like hospitals, provider practices, etc., have gone electronic, which means all of the information about, you know, your history, what was given to you, how you responded, etc., is now in computable form.

84
00:11:12.810 --> 00:11:19.372
Dr. Suchi Saria: and what that now makes possible is suddenly giving us a view into what is,

85
00:11:19.870 --> 00:11:23.709
Dr. Suchi Saria: you know, not just a patient's journey, but also the quality of care.

86
00:11:23.810 --> 00:11:50.370
Dr. Suchi Saria: and a new opportunity to use this data to invent new treatments, to enable earlier diagnosis and more precise targeting of treatments, but then also to deliver care in a much more precise way, by using this information to, you know, change the way we practice in real time. So this is what I view as one of the most interesting, compelling ecosystem shifts that has happened to medicine, which is a massive enterprise.

87
00:11:50.370 --> 00:11:59.680
Dr. Suchi Saria: It suddenly made possible the use of computing within medicine to meaningfully transform both the way care is delivered and human outcomes.

88
00:12:00.690 --> 00:12:27.110
Dr. Suchi Saria: After this slide I'll start going into more of the technical details. This is my last high-level overview. But a lot of what you'll see is very heavily colored by my real-world experience of taking almost a decade of research and methodological ideas we developed in the lab and then translating them through this company that Michael described, Bayesian Health.

89
00:12:27.488 --> 00:12:53.180
Dr. Suchi Saria: Why do we call it Bayesian? It's not necessarily because it's using purely Bayesian techniques. It's because the Bayesian way of reasoning, when possible, is the most sound, optimal way of reasoning: you can incorporate prior data with real-time data, do uncertainty quantification, know what data to trust and what not to trust, and update as new information arrives. And the idea is to be able to give this kind of

90
00:12:53.641 --> 00:13:03.240
Dr. Suchi Saria: quality and capability to our frontline care providers in, you know, one of the most important areas that matter to all of us, which is our human health.
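
The Bayesian reasoning described here, combining a prior with streaming observations and quantifying the remaining uncertainty, can be sketched with a textbook Beta-Bernoulli update. This is a minimal generic illustration, not Bayesian Health's actual model; the function names are made up for this example.

```python
# Minimal sketch of Bayesian updating: a Beta prior over an event rate is
# updated as new 0/1 observations arrive, and the posterior concentrates
# as evidence accumulates. Illustrative only, not the deployed system.

def beta_update(alpha, beta, observations):
    """Update a Beta(alpha, beta) prior with a sequence of 0/1 observations."""
    for obs in observations:
        alpha += obs        # count of positive outcomes
        beta += 1 - obs     # count of negative outcomes
    return alpha, beta

def posterior_mean(alpha, beta):
    """Posterior point estimate of the event rate."""
    return alpha / (alpha + beta)

# Prior belief: uniform over the event rate (Beta(1, 1), i.e. weakly held).
alpha, beta = 1.0, 1.0
# Real-time data stream: two positive outcomes out of ten.
alpha, beta = beta_update(alpha, beta, [0, 0, 1, 0, 0, 0, 0, 1, 0, 0])
print(posterior_mean(alpha, beta))  # 0.25: the prior has shifted toward the data
```

The same pattern scales up: each new observation moves the posterior, and the spread of the Beta distribution (not shown here) quantifies how much the estimate should still be trusted.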

91
00:13:03.480 --> 00:13:25.730
Dr. Suchi Saria: And how do we do this in a way that uses state-of-the-art technology? So that's sort of the company we started, Bayesian Health. And through Bayesian I've had the chance to learn and develop a lot of real-world experience partnering with health systems nationally, both big and small, not just, you know, very rich

92
00:13:25.870 --> 00:13:43.070
Dr. Suchi Saria: academic medical centers, but also rural community hospitals, where, you know, there's really great need for these kinds of technologies to up-level the quality of care. And so you'll hear a little bit about some learnings there that have then informed a lot of the methodologic research I'll talk about.

93
00:13:43.420 --> 00:13:49.909
Dr. Suchi Saria: Okay. So with that in mind, I'll dive into more of a concrete example.

94
00:13:49.990 --> 00:13:52.700
Dr. Suchi Saria: I've been really lucky to be funded by NSF

95
00:13:52.880 --> 00:14:03.469
Dr. Suchi Saria: in my methodologic work, and have partnered on applying these methods in a variety of practical disease areas: autoimmune diseases,

96
00:14:03.570 --> 00:14:16.259
Dr. Suchi Saria: neurologic diseases, infectious diseases, and so on and so forth, as inspiration for the foundational work we do today. For the purpose of this talk, I will pick one example area, called sepsis.

97
00:14:16.430 --> 00:14:20.080
Dr. Suchi Saria: I've, you know, really gone deep in this space

98
00:14:20.190 --> 00:14:34.979
Dr. Suchi Saria: and done about, you know, a decade of work to build solutions that are going to meaningfully advance the standard of care. So I'll use this as an example to describe how 10 years of, like,

99
00:14:35.070 --> 00:14:41.840
Dr. Suchi Saria: a bunch of ideas fit together to get you to where we are today. So just a little bit of background:

100
00:14:41.850 --> 00:14:48.440
Dr. Suchi Saria: Sepsis is an example of a diagnostic error. Today, diagnostic errors are still considered the third leading cause of death.

101
00:14:48.797 --> 00:14:55.970
Dr. Suchi Saria: These often occur because the patient didn't get the right diagnosis or got a delayed diagnosis, when, you know, maybe the treatments were there had they gotten diagnosed,

102
00:14:56.110 --> 00:15:09.340
Dr. Suchi Saria: and there would have been an opportunity to meaningfully change the outcome. Why does it occur? All sorts of reasons, right? It's easy to miss data. It's easy to be biased in interpreting results. It's easy to over-rely on impressions. Burnout.

103
00:15:09.410 --> 00:15:11.020
Dr. Suchi Saria: Failure to order tests.

104
00:15:11.270 --> 00:15:24.529
Dr. Suchi Saria: Sepsis is an example of an area where delays in diagnosis or misdiagnosis lead to a massive toll on human outcomes, and it turns out, in hospitals, this is the leading cause of hospital death.

105
00:15:25.950 --> 00:15:32.510
Dr. Suchi Saria: For sepsis in particular, you know, again, like stroke, in sepsis time is money.

106
00:15:32.630 --> 00:15:51.880
Dr. Suchi Saria: There's data showing every hour of delay in sepsis has a meaningful association with an increase in mortality. It turns out in sepsis we do have some available treatments, but the treatments are more effective if you can identify the condition early. So the big bottleneck we want to solve is:

107
00:15:51.880 --> 00:16:18.479
Dr. Suchi Saria: how do we make sure we identify a patient who is septic as early as possible in their course, so that we basically have a shot at rescuing them? One in three patients die once they reach septic shock, which is: a patient gets an infection, the infection leads to a systemic response which starts to attack your organ systems, leading to organ failure, organ damage, and death if not controlled in a timely fashion.

108
00:16:18.480 --> 00:16:32.140
Dr. Suchi Saria: This is your immune system becoming overactive in response to tackling your infection. Now, it turns out, when you're in that reactive state, where you reach shock, mortality rates are like one in three. So really, really high.

109
00:16:33.970 --> 00:16:42.800
Dr. Suchi Saria: We started working on this around 2013, 2014, and in 2015 wrote one of the earliest papers showing how you could take

110
00:16:42.840 --> 00:17:02.179
Dr. Suchi Saria: EMR data, the electronic health record infrastructure that was becoming commonplace at the time, combined with machine learning techniques that make sense of this data, to be able to identify sepsis early. So that was, you know, 2015. It showed promise. But then, to get it from

111
00:17:02.240 --> 00:17:13.369
Dr. Suchi Saria: "this could be promising" to something that is very practical, deployable, and usable was another 10 years. And I'll talk a little bit about what it did take.

112
00:17:14.450 --> 00:17:25.760
Dr. Suchi Saria: And I have photos here of some of the students who contributed a great deal, and now have very good careers that they're pursuing after graduation.

113
00:17:26.230 --> 00:17:37.339
Dr. Suchi Saria: And this work has now led to, you know, a huge amount of downstream activity, thousands of publications now. The slide is already outdated in terms of the amount of activity in this space.

114
00:17:37.430 --> 00:17:47.519
Dr. Suchi Saria: I'll start with the end result, which is from recent work. These are three studies we published on the cover of Nature Medicine; they came out in 2022, through the summer.

115
00:17:47.770 --> 00:17:51.729
Dr. Suchi Saria: What the papers showed, at a super high level:

116
00:17:51.820 --> 00:18:17.569
Dr. Suchi Saria: these were pretty large, pragmatic studies. They were done across five hospital sites, both academic and community, overall spanning nearly three quarters of a million patients, so 750,000 patients. This was work funded in part by an SBIR grant, to do the pragmatic study and some of its analyses.

117
00:18:17.880 --> 00:18:26.509
Dr. Suchi Saria: There were 4,400 clinicians who participated as part of the study. It's one of the biggest practical trials of AI in medicine.

118
00:18:26.710 --> 00:18:47.120
Dr. Suchi Saria: The first thing we showed: you know, clinicians would often say, especially at a place like Hopkins, which ranks very well in safety and quality, "Well, we're already very good. I don't know the extent to which AI can really help me." And so here what we showed was, essentially, with AI running in the background at first, the ability to identify sepsis

119
00:18:47.190 --> 00:18:58.390
Dr. Suchi Saria: meaningfully earlier than physicians did: on average 5.7 hours earlier on patients who ended up becoming septic and eventually died in the hospital. So, like, earlier than

120
00:18:58.540 --> 00:19:24.420
Dr. Suchi Saria: the frontline physicians. This is the system running in the background; it's not impacting the standard of care yet. And that's sort of like 10x higher performance than alternative tools that exist and the current standard of care. The next thing was: okay, maybe it's possible to identify this early. The next question is, will clinicians adopt it? The bottleneck of clinician or physician adoption has been a big one, and part of it is physician trust

121
00:19:24.420 --> 00:19:38.799
Dr. Suchi Saria: in this technology, and how it can be delivered in a way that they trust. So with NSF, we were funded on this grant called human-machine teaming: how do we enable humans and machines to be more effective together than each individually alone?

122
00:19:38.810 --> 00:19:49.019
Dr. Suchi Saria: And using some of the ideas in that grant, we did both a pragmatic human-factors study as well as quantitative evaluations, and showed,

123
00:19:49.040 --> 00:20:11.710
Dr. Suchi Saria: over a two-and-a-half-year period, with 4,400 clinicians using the platform and tool, nearly 89% physician adoption. Typical adoption for CDS tools or alerts and alarms in the EMR is something like 10%. So a very meaningful improvement over current approaches.

124
00:20:11.710 --> 00:20:27.640
Dr. Suchi Saria: And then in sepsis we know, if you have earlier detection and you have meaningful adoption, then you expect to see changes in outcome. And some of the outcomes you expect to see change are reductions in mortality and reductions in complication rates, which tend to also then cut

125
00:20:28.128 --> 00:20:40.490
Dr. Suchi Saria: utilization. And that's sort of what we saw. It was really personally rewarding to me to see the real impact on human life, with nearly 20% reductions in mortality

126
00:20:40.590 --> 00:20:42.770
Dr. Suchi Saria: in this pragmatic study.

127
00:20:43.850 --> 00:20:44.750
Dr. Suchi Saria: So

128
00:20:45.113 --> 00:20:59.019
Dr. Suchi Saria: let's talk a little bit about some learnings along the way: what did it take us to get there? I could be here for two days giving lots of talks on lots of components, so I'll highlight just a few example ideas for the rest of this talk.

129
00:20:59.190 --> 00:21:03.819
Dr. Suchi Saria: But at a super high level, I put the challenges in four big buckets.

130
00:21:03.830 --> 00:21:07.559
Dr. Suchi Saria: The first were around modeling, which is, you know,

131
00:21:07.740 --> 00:21:35.910
Dr. Suchi Saria: ultimately, there were a huge number of modeling challenges we needed to solve, relative to how the current state of the art looked. The first one was: ultimately, sepsis is still 2 to 3% of the population. It's not something that you see in 30% or 40%. So it's what I would call a needle-in-a-haystack problem. The signal-to-noise problem is real here, because, one, it's not that prevalent, but two, there are lots of things that mimic

132
00:21:35.910 --> 00:21:39.889
Dr. Suchi Saria: sepsis, so it's easy to miss, and

133
00:21:39.890 --> 00:22:04.109
Dr. Suchi Saria: easy to misidentify, or to have a very high false-alerting rate, which basically makes the system not very useful in practice and gives it a low likelihood of adoption. So how do we solve the signal-to-noise problem? The second thing was that, you know, a lot of times when we are learning from data, having access to high-quality ground truth is very important. But what should ground truth look like?

134
00:22:04.120 --> 00:22:25.777
Dr. Suchi Saria: And how do we develop ground truth in a scalable fashion? Right? We can build high-quality AI models, but part of the challenge is, if they need millions of labels, and we need our physicians to sit down and chart-review each chart, and each chart review could easily take 30 minutes, how do you then get access to high-quality labels from which you can learn? So there's the opportunity to,

135
00:22:26.240 --> 00:22:54.180
Dr. Suchi Saria: you know, build AI replay machines that are human-in-the-loop and allow you to generate high-quality gold-standard data that is the equivalent of human-adjudicated or expert-adjudicated data. It's also very easy to overfit to multimodal data. So a lot of the opportunity is in leveraging ideas from causal inference, combined with high-dimensional multimodal learning, to be able to build models that

136
00:22:54.250 --> 00:23:04.809
Dr. Suchi Saria: are more grounded in causal factors we know can have impact, which then allows us to build models that are more intelligible, actionable, and higher-accuracy as well.

137
00:23:05.333 --> 00:23:26.800
Dr. Suchi Saria: Then there are lots of things around bias in data collection and missingness. So, you know, for instance, a common assumption we often make in machine learning algorithms is to think about data as missing at random, or missing completely at random.

138
00:23:26.810 --> 00:23:30.687
Dr. Suchi Saria: It turns out in these scenarios, often, when a measurement is made,

139
00:23:31.460 --> 00:23:59.110
Dr. Suchi Saria: it was intentional, or when a measurement is not made, that can be intentional, and informative. So you want to take missingness into account. The missing-completely-at-random, or MCAR, assumption is often false in practice, and you can actually do a whole lot better if you think harder about the missingness patterns. And so there are real opportunities for creating methodology that leverages the complexity of the domain, to build methods that are far more accurate.
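
One common, generic way to exploit informative missingness (an illustration of the idea, not necessarily her group's method) is to carry a was-measured indicator alongside each imputed value, so a downstream model can learn from the clinician's decision to order, or not order, a test. The `featurize` helper and the lactate example below are hypothetical.

```python
# Sketch: when missingness is informative, encoding *whether* a value was
# measured as its own feature preserves signal that plain imputation throws
# away. E.g., a lactate lab is often ordered precisely because the clinician
# suspects something is wrong.

def featurize(record, columns):
    """Turn a raw record with possible None values into
    (imputed value, was-measured indicator) pairs."""
    features = []
    for col in columns:
        value = record.get(col)
        measured = value is not None
        features.append(value if measured else 0.0)  # naive zero-imputation
        features.append(1.0 if measured else 0.0)    # missingness indicator
    return features

record = {"heart_rate": 112.0, "lactate": None}
print(featurize(record, ["heart_rate", "lactate"]))
# [112.0, 1.0, 0.0, 0.0]: the model can learn that an unordered lactate
# means something different from a measured lactate of 0
```

Under MCAR the indicator columns would carry no signal; when missingness is informative, as described above, they often do.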

140
00:24:00.610 --> 00:24:29.020
Dr. Suchi Saria: And a lot of these ideas led to a collection of papers in NeurIPS and ICML, and in Nature Medicine and the New England Journal of Medicine, and you'll hear about a few of these. The second bucket of challenges was around what I would call safe, reliable, and robust learning. It was one thing to take data in the lab and learn these models, and to show in an analysis that the method performs really well. But when we move these systems from the lab to the real world, and across many sites,

141
00:24:29.160 --> 00:24:39.789
Dr. Suchi Saria: you suddenly have to think about, you know: how do we monitor for shifts and drifts? How do we monitor for things that are changing? And how do we update over time?

142
00:24:40.100 --> 00:24:42.140
Dr. Suchi Saria: And so I'll go a lot deeper into that.
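
One very simple version of the shift-and-drift monitoring mentioned here can be sketched as a two-sample comparison between a reference window and a live window of a feature. This is a generic illustration under stated assumptions, not a description of the deployed system's monitors.

```python
# Hedged sketch of a dataset-shift monitor: compare a live window of a
# feature against a reference window with a z-test on the mean, and flag
# drift when the standardized difference is large. Illustrative only.

import math

def mean(xs):
    return sum(xs) / len(xs)

def variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def drift_z_score(reference, live):
    """Standardized absolute difference of means between two samples."""
    se = math.sqrt(variance(reference) / len(reference) +
                   variance(live) / len(live))
    return abs(mean(live) - mean(reference)) / se

reference = [98.6 + 0.1 * (i % 7) for i in range(200)]    # stable temperatures
live_ok = [98.6 + 0.1 * (i % 7) for i in range(50)]       # same distribution
live_shifted = [99.8 + 0.1 * (i % 7) for i in range(50)]  # e.g. recalibrated sensor

print(drift_z_score(reference, live_ok) > 3.0)       # False: no alarm
print(drift_z_score(reference, live_shifted) > 3.0)  # True: flag for review
```

In practice one would monitor many features, use distribution-level tests rather than just means, and tie alarms to a retraining or review workflow; the point here is only the shape of the check.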

143
00:24:42.260 --> 00:24:49.530
Dr. Suchi Saria: The next piece is delivery: how do we deliver it? So the human-machine teaming component: how do we deliver it in a way that builds clinician trust?

144
00:24:49.650 --> 00:24:51.259
Dr. Suchi Saria: And finally,

145
00:24:51.290 --> 00:24:59.780
Dr. Suchi Saria: how do we do real-world pragmatic evaluations? I think this is such an exciting opportunity. Historically, we've relied on massive,

146
00:24:59.860 --> 00:25:10.049
Dr. Suchi Saria: premeditated RCTs that are ridiculously expensive and, as a result, extremely hard to do, and a key bottleneck for bringing new ideas

147
00:25:10.050 --> 00:25:31.860
Dr. Suchi Saria: into the field. So one of the questions is: now that we have EMRs, we have the ability to embed technologies within the EMR, and there's a lot of variation in provider practice patterns. Can we take advantage of that variation to design trials that are far more statistically efficient, but can also run in scenarios that better represent the real world,

148
00:25:32.235 --> 00:25:44.650
Dr. Suchi Saria: and can still give you really high-quality evidence? So that, suddenly, the cost of the trials you're implementing goes way down, and the ability to leverage these trials to show impact

149
00:25:44.740 --> 00:25:48.709
Dr. Suchi Saria: improves. The speed at which we can show impact improves dramatically.

150
00:25:49.530 --> 00:25:51.989
Dr. Suchi Saria: So I'll start with the first component.

151
00:25:52.240 --> 00:26:12.700
Dr. Suchi Saria: So let's talk a little bit about the modeling pieces first. Traditionally, what people were doing I can bucket into, you know, 4 key areas: the quality of the inputs, which is what kinds of data you're putting into your machine; the quality of the labels, which is how much labeled data you have;

152
00:26:12.800 --> 00:26:35.810
Dr. Suchi Saria: the quality of the learning strategies, which is how you build learning strategies that really push on the kinds of complexity you have in this domain; and then, finally, rather than a frozen system, the ability to build in monitoring, tuning, and bias-mitigation strategies, to get to systems that are far more accurate than what exists. So in the current state of the art, typically you have

153
00:26:35.810 --> 00:26:44.230
Dr. Suchi Saria: a limited set of inputs, because these tools are often thought of as diagnostics: you want to collect only a small number of inputs, in a very controlled fashion,

154
00:26:44.380 --> 00:26:53.000
Dr. Suchi Saria: and then you're combining them with labels, where very often you have very limited data to train from because of the burden of label acquisition.

155
00:26:53.080 --> 00:27:02.790
Dr. Suchi Saria: And when people have tried to do large labeling studies, they're basically collecting really noisy labels based on what are called billing codes, which is not very good.

156
00:27:03.030 --> 00:27:25.819
Dr. Suchi Saria: Third, they're often using off-the-shelf, generic learning strategies, not really tuned and tailored to the kinds of complexity we see in this data. And fourth, most existing models are frozen; they don't really adapt to the real world. So we pushed on all 4 of these dimensions to get to the quality of results I showed you up front.

157
00:27:26.440 --> 00:27:54.710
Dr. Suchi Saria: And so here's a really simple, pragmatic illustration of how much each of these dimensions matters. What I'm showing you on the x-axis is sensitivity, which is your detection rate, and on the y-axis is positive predictive value, which is the true alerting rate, or the opposite of the false alerting rate. We want positive predictive value to be high, we want sensitivity to be high, and ideally we want to operate in a region here.

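As an aside for readers following along in code, the two axes she defines come straight from confusion-matrix counts. A minimal sketch (the counts below are made-up numbers, purely for illustration):

```python
def sensitivity_ppv(tp, fp, fn):
    """Sensitivity (detection rate) and positive predictive value from
    confusion-matrix counts: tp = true positives, fp = false positives,
    fn = false negatives (missed cases)."""
    sensitivity = tp / (tp + fn)  # fraction of true cases that were detected
    ppv = tp / (tp + fp)          # fraction of alerts that were correct
    return sensitivity, ppv

# Hypothetical example: 90 true cases caught, 10 missed, 60 false alarms.
print(sensitivity_ppv(tp=90, fp=60, fn=10))  # (0.9, 0.6)
```

Reducing false alarms at fixed sensitivity, as she describes next, corresponds to moving up this (sensitivity, PPV) plane at a fixed x-position.
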
158
00:27:54.740 --> 00:28:09.699
Dr. Suchi Saria: And here what we're showing is, basically, I took those 4 axes and said: all else equal, if all we did was go from this sort of curated, narrow set of inputs to a dramatically expanded set of inputs that includes the vast complexity of multimodal data that exists,

159
00:28:09.850 --> 00:28:26.739
Dr. Suchi Saria: how would we do in terms of improved performance? And what we're showing here is, in the 80 to 90% sensitivity range, the ability to reduce false alarm rates by nearly 20 to 30%. So very, very meaningful.

160
00:28:27.000 --> 00:28:42.750
Dr. Suchi Saria: On the second slide, what I'm showing now is: okay, all else equal, I kept the inputs the same and the learning strategy the same (of course, learning strategies interact with all of these), and what I changed is the quality of the labels, the quality of the data I'm learning from.

161
00:28:42.860 --> 00:28:48.050
Dr. Suchi Saria: And so this isn't changing the size of the data, but the quality of the actual labeled data.

162
00:28:48.260 --> 00:28:56.369
Dr. Suchi Saria: What we're seeing here is basically in the same sensitivity range, nearly 40 to 70% improvements in positive predictive value. So very meaningful.

163
00:28:56.420 --> 00:29:11.150
Dr. Suchi Saria: Now, on this third slide, what I'm showing is: what if I combined them, so we have both richer inputs and better labels? The 2 actually interact, and we see almost 200 to 300% improvements in PPV.

164
00:29:11.400 --> 00:29:13.959
Dr. Suchi Saria: And then finally, sort of

165
00:29:16.350 --> 00:29:21.160
Dr. Suchi Saria: I can hear a little bit of noise. I assume I should ignore it and keep going.

166
00:29:22.240 --> 00:29:25.329
Michael Littman: Yes, it turns out I was unmuted the entire time, and

167
00:29:25.860 --> 00:29:26.610
Dr. Suchi Saria: Okay? Bye.

168
00:29:26.610 --> 00:29:28.059
Michael Littman: There was background noise. I'm so sorry. Please.

169
00:29:28.060 --> 00:29:30.269
Dr. Suchi Saria: No worries at all, Michael. We couldn't hear anything.

170
00:29:30.690 --> 00:29:42.400
Dr. Suchi Saria: So I spoke a little bit about the inputs and labels; I'm going to talk a little bit about the learning strategies. It probably seems most intuitive to this audience that the quality of the learning strategy should make a huge impact,

171
00:29:42.480 --> 00:29:48.190
Dr. Suchi Saria: though in practice what is really interesting is, you know, if you remember this very famous Peter Norvig quote.

172
00:29:48.450 --> 00:30:03.459
Dr. Suchi Saria: Peter Norvig was at Google in the 2010s, and as Google grew, there grew this understanding that, actually, we don't know the extent to which the models really matter and the extent to which the learning strategies really matter.

173
00:30:03.650 --> 00:30:07.759
Dr. Suchi Saria: Ultimately, maybe it's just about the amount of data you have and the size of the data you have

174
00:30:07.970 --> 00:30:17.659
Dr. Suchi Saria: But it turns out that in this domain, because the data is so complex and there's so much richness in it, if you can truly exploit it, it dramatically

175
00:30:17.750 --> 00:30:22.873
Dr. Suchi Saria: impacts the quality of the models. So here's a paper,

176
00:30:23.750 --> 00:30:27.290
Dr. Suchi Saria: and I'm just gonna present a couple examples to show.

177
00:30:27.510 --> 00:30:55.420
Dr. Suchi Saria: This was a paper published in TPAMI, and what it talked about was intelligently leveraging missingness. Typically, the way people handle missing data is you either do last-one-carried-forward, or you ignore the samples (you don't include the modalities where there's a lot of missingness), or, if you include them, you use some simple imputation strategy for filling in the data.

178
00:30:55.987 --> 00:31:11.830
Dr. Suchi Saria: By contrast, you can leverage context to do intelligent interpolation, where you're not just filling the value in, but maintaining uncertainty over what it might have been. So that's number one. And number two,

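The "intelligent interpolation that maintains uncertainty" she describes can be illustrated with a small Gaussian-process posterior over an irregularly sampled vital sign. This is a generic sketch of the idea, not the model from the TPAMI paper; the kernel, length-scale, noise level, and toy heart-rate values are all my own illustrative choices:

```python
import numpy as np

def gp_interpolate(t_obs, y_obs, t_query, length=2.0, sigma_n=0.1):
    """Posterior mean and std of an irregularly sampled signal at query
    times, under a squared-exponential GP prior: interpolation that carries
    uncertainty instead of a single filled-in value."""
    def k(a, b):
        return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length**2)
    K = k(t_obs, t_obs) + sigma_n**2 * np.eye(len(t_obs))  # noisy gram matrix
    Ks = k(t_query, t_obs)
    mean = Ks @ np.linalg.solve(K, y_obs)
    cov = k(t_query, t_query) - Ks @ np.linalg.solve(K, Ks.T)
    std = np.sqrt(np.clip(np.diag(cov), 0.0, None))  # clip tiny negatives
    return mean, std

# Heart-rate-like signal measured at irregular times (hours):
t = np.array([0.0, 1.0, 4.0, 5.0])
y = np.array([80.0, 82.0, 95.0, 96.0])
_, std_gap = gp_interpolate(t, y, np.array([2.5]))  # mid-gap query
_, std_obs = gp_interpolate(t, y, np.array([1.0]))  # at an observed time
print(std_gap[0] > std_obs[0])  # True: more uncertainty inside the data gap
```

The downstream model then consumes the whole predictive distribution, not just the filled-in mean.
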
179
00:31:11.900 --> 00:31:22.159
Dr. Suchi Saria: You leverage the uncertainty in each of the missing measurements to then project your uncertainty into the actual output. So let's say in this example, you're trying to forecast

180
00:31:22.180 --> 00:31:51.650
Dr. Suchi Saria: sepsis versus no sepsis: you're giving calibrated uncertainty intervals around your prediction, in this case your probability of having sepsis. And you can leverage that to create what is an optimal stopping problem. Right? So you can think, in a Bayes decision-theoretic way: is it better for me to stop, which is to alert, or am I unsure enough that maybe I should collect a little bit more data, and

181
00:31:51.750 --> 00:31:55.029
Dr. Suchi Saria: then reconsider whether or not to alert the next time.

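The alert-versus-wait decision she describes can be sketched as a simple threshold rule on a calibrated probability interval. This is a toy illustration of the Bayes decision-theoretic idea, not the published method; the false-alarm and miss costs are hypothetical:

```python
def alert_policy(p_low, p_high, cost_fa=1.0, cost_miss=10.0):
    """Stopping rule on a calibrated interval [p_low, p_high] for P(event):
    alert when even the lower bound makes alerting cheaper in expectation;
    declare no alert when even the upper bound is below the threshold;
    otherwise wait and collect more data."""
    threshold = cost_fa / (cost_fa + cost_miss)  # alert iff P(event) exceeds this
    if p_low > threshold:
        return "alert"
    if p_high < threshold:
        return "no_alert"
    return "wait"  # interval straddles the threshold: gather another measurement

print(alert_policy(0.40, 0.60))  # alert: even the lower bound clears the threshold
print(alert_policy(0.01, 0.05))  # no_alert
print(alert_policy(0.05, 0.30))  # wait: too uncertain to commit either way
```

With cost_miss ten times cost_fa, the threshold is 1/11, so only confidently low-risk intervals suppress the alert outright.
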
182
00:31:55.060 --> 00:32:07.850
Dr. Suchi Saria: And when you take an approach like this, it turns out you can again dramatically improve accuracy. So here what we're showing is a comparison to standard approaches; we used a whole slew of them, so I've simplified the slide greatly.

183
00:32:08.070 --> 00:32:20.429
Dr. Suchi Saria: The full results are in the paper. We show that this more intelligent wait-and-watch type strategy, which leverages intelligent uncertainty quantification, can actually,

184
00:32:20.560 --> 00:32:22.247
Dr. Suchi Saria: at the same

185
00:32:22.820 --> 00:32:31.090
Dr. Suchi Saria: at the same PPV, dramatically increase sensitivity, almost by 30 to 40%.

186
00:32:31.100 --> 00:33:01.070
Dr. Suchi Saria: Sorry, 300 to 400%, so a 3x to 4x improvement over standard approaches. So that's one type of strategy as an example. Here's another example. What we did was basically say: when you're looking at large-scale observational data from EHRs, what you're really looking at is sequential data. And with sequential data, the way people would often learn predictive models is, they would say, okay, let me look at the outcome

187
00:33:01.220 --> 00:33:09.209
Dr. Suchi Saria: and then work backwards to the data. Let's say you wanted to predict: is this person at risk for mortality?

188
00:33:09.300 --> 00:33:26.210
Dr. Suchi Saria: So a patient comes into the hospital, and at a given time you're trying to predict risk of mortality. The standard practice was to say: let me look at whether or not they died in the hospital (that's the outcome), let me take all the variables that exist to date as the inputs, and then train a model to do prediction.

189
00:33:26.480 --> 00:33:27.360
Dr. Suchi Saria: The

190
00:33:28.340 --> 00:33:58.069
Dr. Suchi Saria: What has now become well understood, though it wasn't well understood at the time, is that this is a scenario where the actions the physicians take dramatically impact the next step, and the next step, and the next step. So you're really learning in a sequential decision-making setting, but a sequential decision-making setting where there's a policy driving the actions. So instead of training a risk objective, which is y given x, we can train a counterfactual objective, which is

191
00:33:58.310 --> 00:33:59.480
Dr. Suchi Saria: y

192
00:33:59.570 --> 00:34:08.248
Dr. Suchi Saria: given x and do(the actions they chose), which means you can now start to think about alternative actions.

193
00:34:08.620 --> 00:34:11.950
Dr. Suchi Saria: First of all, it's sort of

194
00:34:12.010 --> 00:34:33.500
Dr. Suchi Saria: theoretically the right thing to do. Empirically, you learn models that are way more sensible, because the model now actually understands the impact of actions on the predictive state. And your forecast of the predictive state isn't just about the inputs you happen to have at the time of prediction, but also the possible action sequences that might go into impacting the outcome.

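One generic way to turn the "y given x and do(actions)" objective into something trainable is inverse-propensity weighting of the observed samples. This is a standard sketch of that idea, not the specific estimator from her papers; the arrays are synthetic:

```python
import numpy as np

def ipw_loss(y, yhat, propensity):
    """Counterfactual squared loss: re-weight each observed sample by
    1 / P(observed action | history), so the average targets outcomes under
    do(action) rather than outcomes under the historical treatment policy."""
    w = 1.0 / np.clip(propensity, 1e-3, None)  # clip to stabilise rare actions
    w = w / w.mean()                           # self-normalise the weights
    return float(np.mean(w * (y - yhat) ** 2))

# Synthetic check: a perfect predictor has zero loss regardless of weights.
y = np.array([1.0, 0.0, 1.0])
print(ipw_loss(y, y, np.array([0.9, 0.2, 0.5])))  # 0.0
```

Samples whose actions the behavior policy rarely took get up-weighted, so the fitted model stops conflating "what physicians did" with "what would happen".
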
195
00:34:34.380 --> 00:35:00.379
Dr. Suchi Saria: So the ability to leverage causal inference and counterfactual objectives to improve training, another strategy, dramatically improves the intelligibility, actionability, and quality of the models. And then, finally, the last slide on modeling. This is more recent work from my lab, on taking the transformer-based models we have today, which have been shown to be very effective on language data, and really adapting them to multimodal data.

196
00:35:00.380 --> 00:35:15.000
Dr. Suchi Saria: And the adaptation to this kind of high-dimensional multimodal data isn't as simple as: let's just take the models, take all our streaming data, turn it into inputs, and go

197
00:35:15.200 --> 00:35:17.170
Dr. Suchi Saria: either regress or classify.

198
00:35:18.170 --> 00:35:21.400
Dr. Suchi Saria: But the challenge here is again taking into account

199
00:35:21.870 --> 00:35:33.489
Dr. Suchi Saria: the complexities of the individual modalities and how they relate, in order to improve the quality, intelligibility, and actionability of the resulting models. This is very new work.

200
00:35:34.650 --> 00:35:38.339
Dr. Suchi Saria: We have a number of papers here, but also lots in progress.

201
00:35:38.700 --> 00:35:40.327
Dr. Suchi Saria: Okay, so that's

202
00:35:41.160 --> 00:35:47.099
Dr. Suchi Saria: a little bit on the modeling front. I'm going to now talk a little bit about generalization when you move from lab to the real world setting.

203
00:35:48.284 --> 00:35:49.780
Dr. Suchi Saria: So in 2020,

204
00:35:50.080 --> 00:35:54.270
Dr. Suchi Saria: we wrote this paper. This was an invited perspective in the New England Journal of Medicine.

205
00:35:54.380 --> 00:35:59.239
Dr. Suchi Saria: and what it talked about was today in medicine, we're starting to consider.

206
00:35:59.450 --> 00:36:09.679
Dr. Suchi Saria: You know, there's been a history of applying data-driven tools in the form of what are called clinical decision support tools. But these CDS tools,

207
00:36:09.820 --> 00:36:17.350
Dr. Suchi Saria: you know, we don't really think about them as tools whose performance can really change, adapt, and degrade over time

208
00:36:17.540 --> 00:36:24.749
Dr. Suchi Saria: as the environment shifts, the population shifts, and practice patterns shift. That concept, in

209
00:36:25.682 --> 00:36:35.399
Dr. Suchi Saria: medicine, was relatively underappreciated and not as broadly understood. So in 2020, this perspective

210
00:36:35.530 --> 00:37:03.129
Dr. Suchi Saria: took our experience on the AI front and married it with the experience of deploying practical tools, to say: in practice, here are the 20 ways in which things can actually drift and shift, different kinds of drifts and shifts, with examples. And therefore, to really deploy AI robustly, we need to think more holistically and systematically about how we put a monitoring system in place that detects these,

211
00:37:03.280 --> 00:37:13.478
Dr. Suchi Saria: and put correction loops in place that allow us to give guarantees, either that our system won't drift or shift, or that it can auto-tune to get to a level of performance

212
00:37:14.190 --> 00:37:24.180
Dr. Suchi Saria: that is as expected, so it's not behaving in unreliable ways. It's a paper that then spawned a whole bunch of subsequent work.

213
00:37:25.440 --> 00:37:28.419
Dr. Suchi Saria: Here are some examples of shifts in practice.

214
00:37:28.750 --> 00:37:30.199
Dr. Suchi Saria: So, for instance.

215
00:37:30.380 --> 00:37:35.040
Dr. Suchi Saria: all this work in sepsis, in around 2016,

216
00:37:35.060 --> 00:37:49.270
Dr. Suchi Saria: 2017, led to a big shift in policy, where the Centers for Medicare and Medicaid Services started realizing we should be able to improve sepsis outcomes if we can figure out a way to incentivize providers to be more vigilant. And so

217
00:37:49.430 --> 00:38:04.090
Dr. Suchi Saria: they put requirements in place to make sure certain tests were being measured, in order to show that the providers were being vigilant. Which means that the frequency with which certain tests were ordered,

218
00:38:04.710 --> 00:38:06.419
Dr. Suchi Saria: and the patterns of

219
00:38:06.430 --> 00:38:11.020
Dr. Suchi Saria: practice in how and when those tests were ordered, changed.

220
00:38:11.080 --> 00:38:15.999
Dr. Suchi Saria: So a system that leveraged as inputs test values

221
00:38:16.530 --> 00:38:39.900
Dr. Suchi Saria: and the frequency of measurement of those test values would suddenly no longer be reliable. So here below is a paper where we talk about this, and we show how, if you had instead implemented systems with more robust learning algorithms that leverage counterfactual objectives and causal inference, you could have built systems that are more robust to these kinds of practice-pattern changes.

222
00:38:40.684 --> 00:38:42.740
Dr. Suchi Saria: I've got another example.

223
00:38:42.950 --> 00:38:59.749
Dr. Suchi Saria: This is a paper from Mount Sinai, where they showed that an imaging system by mistake learned a lot of local patterns around the device rather than the disease itself, and as a result produced a model that did very well in one hospital but didn't generalize.

224
00:39:00.500 --> 00:39:17.289
Dr. Suchi Saria: So to tackle these, there are 2 categories of solutions. One is: what can we do prior to deployment to mitigate these kinds of shifts? The second is: what can we do

225
00:39:17.300 --> 00:39:23.829
Dr. Suchi Saria: post-deployment to monitor, detect, and enable learning for these kinds of shifts?

226
00:39:24.150 --> 00:39:37.210
Dr. Suchi Saria: We've spent quite a bit of time on both, and I'll give a little glimpse into these areas. So the first area: in machine learning and AI, we often think of this as invariant learning,

227
00:39:37.720 --> 00:39:51.250
Dr. Suchi Saria: or shift-stable learning, or robust learning, the idea being you want model performance to not deteriorate in unanticipated ways. You want to be able to give guarantees about a model's performance under shifts.

228
00:39:51.470 --> 00:40:03.220
Dr. Suchi Saria: And sort of the question is, how do we do that? And so, under invariant learning or shift stable learning, there's a vast variety of methods that people have invented in order to be able to do that.

229
00:40:03.320 --> 00:40:04.530
Dr. Suchi Saria: One big

230
00:40:04.580 --> 00:40:14.640
Dr. Suchi Saria: class of these methods is what are called data-driven, domain-unaware methods. Essentially, the way it works is, you might say: okay, let me go collect data from lots of different environments.

231
00:40:14.800 --> 00:40:38.430
Dr. Suchi Saria: By collecting data from lots of different environments, my data inherently captures the kind of variation that exists. And for my learning algorithm to generalize across these different environments, by construction it'll be forced to learn features that are invariant, or that generalize, across these different domains. So, pros: super easy to use.

232
00:40:38.430 --> 00:40:56.659
Dr. Suchi Saria: Cons: you really require data collection from lots of different environments, which in medicine turns out to be a really expensive enterprise. But also, this is mostly empirically motivated; you don't have guarantees on performance. The guarantees basically rely on you collecting enough data

233
00:40:57.050 --> 00:41:03.790
Dr. Suchi Saria: that you can capture the desired invariances. So a different, parallel type of approach is

234
00:41:03.840 --> 00:41:14.080
Dr. Suchi Saria: actually using domain knowledge to enforce invariances. So here the idea is, using causal graphs, you can use domain knowledge to specify

235
00:41:14.100 --> 00:41:22.632
Dr. Suchi Saria: the shifts of interest that you want to be invariant to, and then learn models that are stable against them. So,

236
00:41:23.916 --> 00:41:34.679
Dr. Suchi Saria: as an example, there's a class of work we did in this area. Very often, when anybody wants to do causal inference

237
00:41:34.820 --> 00:41:42.120
Dr. Suchi Saria: practically, people get very scared, because causal inference is very theoretical: we very rarely have access to a causal DAG,

238
00:41:42.390 --> 00:42:04.920
Dr. Suchi Saria: it's very hard to use in practice, and it also requires understanding your data and domain, a very painful exercise. So, as a result, historically, these kinds of approaches have had less traction in practical environments. Since I have access to both the methods and the practice, I'm in a unique position to challenge that myth. So here, what we did was basically

239
00:42:05.320 --> 00:42:32.530
Dr. Suchi Saria: we took real-world practical data and learned partial DAGs from that data. So while you can't learn the full causal graph, you can get partial DAGs from the data, and combine those with your own understanding of the domain to start identifying desired invariances. So, for instance, I won't go into the details of this, but basically, in the examples I just gave you, you can identify which dependencies in the PDAG you would like to be invariant to,

240
00:42:32.640 --> 00:42:38.199
Dr. Suchi Saria: and then, essentially, you can think of any learning problem,

241
00:42:38.230 --> 00:42:58.670
Dr. Suchi Saria: any learning model like this. This approach is basically model-agnostic: it doesn't matter whether you're using transformer-based models, simple logistic regression, or a hierarchical mixture of experts; it's independent of the choice of model used. The intuition of the method is as follows: you learn a PDAG, and you specify the dependencies you don't want to learn.

242
00:42:59.250 --> 00:43:08.139
Dr. Suchi Saria: Typically people think of this as pruning, where you remove all those variables. But here, instead of removing the variables, what we're going to do is remove those edges in the graph,

243
00:43:08.300 --> 00:43:13.109
Dr. Suchi Saria: which means removing the learning method's ability to learn those dependencies.

244
00:43:13.300 --> 00:43:14.400
Dr. Suchi Saria: And now

245
00:43:14.660 --> 00:43:19.260
Dr. Suchi Saria: effectively, what you end up learning: instead of learning one big consolidated model,

246
00:43:19.570 --> 00:43:29.630
Dr. Suchi Saria: which is one big joint distribution using whatever advanced or fancy model you want to use, you are instead learning a collection of models that stitch together.

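The "remove the edges, not the variables" step can be sketched on a toy graph. The variable names here (E for practice environment, M for measurement pattern, X for physiology, Y for outcome) are my own illustrative labels:

```python
def remove_unstable_edges(parents, unstable):
    """Graph-surgery step: given a (P)DAG as {node: set of parents} and a
    set of unstable edges (u, v), return a graph in which the learner is no
    longer allowed to pick up those dependencies."""
    pruned = {v: set(ps) for v, ps in parents.items()}
    for u, v in unstable:
        pruned[v].discard(u)  # drop the edge, but keep the variable
    return pruned

# Toy sepsis graph: physiology X -> outcome Y; environment E and outcome Y
# both drive the lab-measurement pattern M. The E -> M edge is what shifts
# across sites, so we cut it; each remaining factor gets its own model.
pdag = {"X": set(), "E": set(), "Y": {"X"}, "M": {"E", "Y"}}
stable = remove_unstable_edges(pdag, {("E", "M")})
print(stable["M"])  # {'Y'}: M's factor may still use Y, but no longer E
```

Fitting one model per remaining factor, and multiplying them at prediction time, is the "collection of models that stitch together" she describes.
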
247
00:43:29.680 --> 00:43:36.220
Dr. Suchi Saria: And by doing so, you can actually give guarantees about the resulting method. First of all, it's a wrapper method: it can be

248
00:43:36.330 --> 00:43:40.119
Dr. Suchi Saria: used with any state of the art model that exists.

249
00:43:40.567 --> 00:43:54.052
Dr. Suchi Saria: So it's very flexible, and you can give theoretical guarantees around the procedure. It is sound, which means the returned distribution is going to be invariant to the shifts we just specified,

250
00:43:54.650 --> 00:44:02.290
Dr. Suchi Saria: using the invariance spec, as desired. The procedure is complete, which means that if it fails, then there is no estimable

251
00:44:02.777 --> 00:44:16.372
Dr. Suchi Saria: invariant distribution, and you have to relax your constraints. You basically have to say: okay, well, I can't get something fully invariant,

252
00:44:17.122 --> 00:44:33.797
Dr. Suchi Saria: so you're willing to accept more noise, if you will, as you move from one environment to the other. And it's also efficient: it's the most efficient estimator you can get.

253
00:44:34.450 --> 00:44:35.340
Dr. Suchi Saria: so

254
00:44:35.480 --> 00:44:38.049
Dr. Suchi Saria: super exciting, practical.

255
00:44:38.070 --> 00:44:40.890
Dr. Suchi Saria: flexible, and can be applied in practice.

256
00:44:41.216 --> 00:44:45.959
Dr. Suchi Saria: I link to a collection of papers on this topic here, if anyone is curious.

257
00:44:46.110 --> 00:44:58.540
Dr. Suchi Saria: And you know, there's been a fair amount of interest and follow-up on this. So then, moving to the second topic here, which is more around mitigation post-deployment, an area where we're doing a whole lot of active work now.

258
00:44:58.540 --> 00:45:17.740
Dr. Suchi Saria: And so I'll talk a little bit about this. So here's the idea: we have systems that are making predictions. The question is, can we use uncertainty quantification methods to figure out, you know, a predictive confidence interval around the prediction? How much do I trust it?

259
00:45:17.940 --> 00:45:21.149
Dr. Suchi Saria: But then the exciting question, then here is

260
00:45:21.260 --> 00:45:42.040
Dr. Suchi Saria: one, you could imagine, if you could do this well (I already gave you all the examples of how you can improve accuracy by building more intelligent policies that leverage this to change system behavior, and I'll show you results later), you can also use this to change the way humans are teaming with the system. Right? You can essentially

261
00:45:42.360 --> 00:45:49.400
Dr. Suchi Saria: enable intelligent suppression, and you can enable more transparency. And I'll talk a little bit about that.

262
00:45:50.410 --> 00:45:57.639
Dr. Suchi Saria: So here, again, there's been great interest in the last 4 to 5 years in wrapper methods. So,

263
00:45:57.800 --> 00:46:12.519
Dr. Suchi Saria: 10 to 15 years ago, for uncertainty quantification we would often think of Bayesian methods. Right? So the model is Bayesian, and if you can take a fully Bayesian view, you can propagate uncertainty all the way into any variable that you care to estimate.

264
00:46:12.720 --> 00:46:20.989
Dr. Suchi Saria: But it turns out doing full Bayesian inference is extremely expensive in practice.

265
00:46:21.020 --> 00:46:40.649
Dr. Suchi Saria: We want to be able to leverage other forms of modeling approaches that aren't Bayesian in nature, and still be able to do some level of uncertainty quantification. So the benefit of these wrapper methods is they do not constrain the model to a particular form.

266
00:46:42.190 --> 00:46:44.690
Dr. Suchi Saria: And in these wrapper methods,

267
00:46:44.750 --> 00:46:51.840
Dr. Suchi Saria: you want them to also be accurate and informative. So what does it mean to be accurate and informative? The interval should contain the true label.

268
00:46:51.850 --> 00:46:54.250
Dr. Suchi Saria: We want to make sure that the

269
00:46:54.550 --> 00:47:03.160
Dr. Suchi Saria: Well, containing the true label is easy if I make my uncertainty estimate very wide. So you also want the uncertainty estimate to not be too wide.

270
00:47:03.750 --> 00:47:08.319
Dr. Suchi Saria: And so you want it to be narrow, and narrow is informative.

271
00:47:08.930 --> 00:47:22.059
Dr. Suchi Saria: And then you want it to be sample-efficient, which means that if shifts do occur, you want to be able to recognize them as quickly as possible. You want it to be computationally efficient, which means you can implement these kinds of approaches in the real world;

272
00:47:22.120 --> 00:47:24.320
Dr. Suchi Saria: Otherwise it's expensive.

273
00:47:24.330 --> 00:47:30.110
Dr. Suchi Saria: and you want to have guarantees to the extent possible, which means you want to be able to say something about the quality of your estimates.

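Split conformal prediction is a canonical example of a wrapper with exactly these properties: distribution-free coverage around any point predictor, with interval width driven by held-out residuals. A minimal regression sketch, with synthetic calibration residuals:

```python
import numpy as np

def split_conformal_interval(cal_residuals, y_pred, alpha=0.1):
    """Wrap ANY point predictor with a distribution-free interval.
    cal_residuals = |y - yhat| on a held-out calibration set; under
    exchangeability the interval covers the true label with
    probability >= 1 - alpha."""
    n = len(cal_residuals)
    # Finite-sample-corrected rank: the ceil((n+1)(1-alpha))-th smallest residual.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    q = np.sort(cal_residuals)[min(k, n) - 1]
    return y_pred - q, y_pred + q

# Synthetic calibration residuals from some black-box model:
rng = np.random.default_rng(0)
resid = np.abs(rng.normal(0.0, 1.0, 499))
lo, hi = split_conformal_interval(resid, y_pred=10.0, alpha=0.1)
print(lo < 10.0 < hi)  # True: a symmetric band around the point prediction
```

Note the guarantee is marginal coverage under exchangeability only; the drift and shift settings discussed next are exactly where this assumption breaks.
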
274
00:47:31.330 --> 00:47:49.310
Dr. Suchi Saria: So in practice, the challenges with the methods that exist have been twofold. First, a lot of the methods people were working on in theory were often assuming IID data. But real-world data, like I showed you with the drifts and shifts, are not IID.

275
00:47:49.390 --> 00:48:16.969
Dr. Suchi Saria: And second, to be usable in the real world, you kind of need to trade off the computational demands against the statistical demands, meaning that if something does drift, you identify it as early as possible. And the reason you want that is because, in many of these high-stakes applications we're talking about, there could be patient harm. So the ability to know as early or as quickly as possible is actually quite helpful. So

276
00:48:17.490 --> 00:48:42.560
Dr. Suchi Saria: here, the work in our lab is essentially in a few different areas. What I'm doing here is laying out the landscape of approaches in distribution-free uncertainty quantification: on the y-axis is statistical efficiency, on the x-axis is computational efficiency, and traditional methods were either statistically very efficient but computationally inefficient, or computationally very efficient but statistically inefficient.

277
00:48:43.012 --> 00:48:47.700
Dr. Suchi Saria: So the question was: is there a way to trade off the 2?

278
00:48:47.800 --> 00:49:07.169
Dr. Suchi Saria: That's one area where we pushed. And second, to be able to do it in a way where we can relax the assumption that the data are IID, allowing other forms of shift. And so in work published in papers at NeurIPS and ICML and a few others, what we showed is methods that allow us to both relax

279
00:49:07.764 --> 00:49:28.885
Dr. Suchi Saria: assumptions, and we point to other related work here. It's a super exciting area, with lots of exciting work to be done on how we can achieve better trade-offs, but also relax the assumptions to be more favorable to real-world scenarios like standard covariate shift and feedback covariate shift. And then, more recently,

280
00:49:30.010 --> 00:49:32.669
Dr. Suchi Saria: in a recent paper that came out at ICML,

281
00:49:33.029 --> 00:49:46.929
Dr. Suchi Saria: we talk about how we can get these kinds of estimates to be high quality, with guarantees under not just standard covariate shift, but multi-feedback covariate shift, and really other kinds of more complex

282
00:49:46.940 --> 00:49:48.999
Dr. Suchi Saria: drifts and shifts that can occur.

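The covariate-shift relaxation she alludes to is often implemented, in the weighted-conformal literature, by importance-weighting the calibration residuals with a test/train likelihood ratio before taking the quantile. A generic sketch of that idea, with uniform weights as a sanity check (in practice the weights would come from a density-ratio model, which is not shown here):

```python
import numpy as np

def weighted_quantile(scores, weights, q):
    """Quantile of `scores` under normalised importance weights."""
    order = np.argsort(scores)
    s, w = scores[order], weights[order]
    cdf = np.cumsum(w) / np.sum(w)
    return s[np.searchsorted(cdf, q)]

def weighted_conformal_radius(cal_residuals, cal_weights, test_weight, alpha=0.1):
    """Covariate-shift conformal: re-weight calibration residuals by the
    test/train likelihood ratio w(x) before taking the (1 - alpha) quantile,
    restoring coverage when the input distribution has shifted."""
    scores = np.append(cal_residuals, np.inf)      # the test point's own slot
    weights = np.append(cal_weights, test_weight)  # its weight w(x_test)
    return weighted_quantile(scores, weights, 1 - alpha)

resid = np.arange(1.0, 100.0)  # 99 synthetic calibration residuals
r = weighted_conformal_radius(resid, np.ones(99), test_weight=1.0)
print(r)  # 90.0: uniform weights recover the ordinary unweighted quantile
```

When the test point's weight dominates (a query far from the calibration data), the quantile lands on the infinite slot and the interval honestly becomes uninformative rather than silently miscalibrated.
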
283
00:49:49.258 --> 00:50:00.950
Dr. Suchi Saria: I know, in the interest of time, I won't have the chance to go into any one of these papers in great detail, but I'll tie it all together: how can these methods now be used in practice, and what does this enable? So,

284
00:50:01.520 --> 00:50:02.913
Dr. Suchi Saria: here? Essentially

285
00:50:04.050 --> 00:50:13.625
Dr. Suchi Saria: one way to think about this is, you know, we're now able to use these methods to start implementing more intelligent interfaces for suppression. So,

286
00:50:14.280 --> 00:50:23.080
Dr. Suchi Saria: actually, I'll come back to that in 2 seconds. I want to talk about one more piece of work, around drift and shift detection, before we talk about the use of these methods

287
00:50:23.160 --> 00:50:25.139
Dr. Suchi Saria: in practice. So

288
00:50:25.450 --> 00:50:32.011
Dr. Suchi Saria: Another type of example: previously, we were talking about uncertainty quantification and

289
00:50:32.560 --> 00:50:37.470
Dr. Suchi Saria: getting access to predictive intervals, which we can now use for designing more intelligent interfaces.

290
00:50:37.803 --> 00:50:43.956
Dr. Suchi Saria: This is another piece of work, with my postdoc, where what we basically looked at is:

291
00:50:44.350 --> 00:51:01.060
Dr. Suchi Saria: during COVID, one of the interesting things that happened was, you know, there's an algorithm for sepsis detection that many hospitals used, and it turns out COVID mimics a lot of the rules, or predictors, that the existing algorithm was using. So as a result,

292
00:51:01.070 --> 00:51:09.509
Dr. Suchi Saria: you know, it started firing all over the place, leading to alarm fatigue but also risk of harm to patients. And so

293
00:51:10.130 --> 00:51:17.339
Dr. Suchi Saria: naturally, you can think of this as open-set domain adaptation: the idea of, can we identify novel classes

294
00:51:17.450 --> 00:51:42.419
Dr. Suchi Saria: that are coming up, and identify them as quickly as possible. But traditionally, when people talk about novel category detection or open-set domain adaptation, they're again thinking about scenarios with very rigid assumptions about the background distribution, namely that the background distribution isn't shifting at all. But that's again not true in the real world, right? When you go from the winter to the summer, there's shift in populations,

295
00:51:42.750 --> 00:51:48.970
Dr. Suchi Saria: there could be other kinds of shifts in practice patterns, et cetera. So the idea here was to think about:

296
00:51:49.290 --> 00:51:51.758
Dr. Suchi Saria: can we relax this assumption around,

297
00:51:52.530 --> 00:52:10.186
Dr. Suchi Saria: you know, the background distribution, to allow benign shifts, or shifts that are more reflective of the real world. And so here my postdoc came up with some really, really nice ways to relax it, via

298
00:52:10.830 --> 00:52:14.190
Dr. Suchi Saria: what we call the SCAR assumption, which is:

299
00:52:14.440 --> 00:52:17.550
Dr. Suchi Saria: let's assume that in the background distribution,

300
00:52:17.580 --> 00:52:37.562
Dr. Suchi Saria: the appearance of novel subgroups, or certain kinds of anomalies, is rare. And if we can say that certain kinds of anomalies are rare, then under that scenario we can basically give guarantees on our ability to detect, and to detect as quickly and early as possible.

301
00:52:38.120 --> 00:52:47.691
Dr. Suchi Saria: It's called the scarcity-of-unicorns assumption, and it dramatically improves, again, our ability to detect these kinds of changes in real-world scenarios.

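One way to picture the scarcity-of-unicorns idea, allowing benign background shift while assuming that genuinely novel subgroups are rare before they emerge, is a running check that alarms when the rate of high-novelty points exceeds what the calibration period permits. This is a toy sketch under that framing; the novelty score, window size, and thresholds here are illustrative placeholders, not the published method.

```python
import numpy as np

def calibrate_cutoff(background_scores, tolerated_rate=0.01):
    """Pick a novelty-score cutoff so that at most `tolerated_rate` of
    background points exceed it (i.e., 'unicorns' are rare at baseline)."""
    return float(np.quantile(background_scores, 1 - tolerated_rate))

def detect_emergence(stream_scores, cutoff, window=100,
                     tolerated_rate=0.01, factor=10.0):
    """Return the index of the first window whose exceedance rate is
    `factor` times the tolerated baseline rate, or None if no alarm.
    Benign shifts nudge the rate only slightly; an emerging novel
    subgroup pushes it far past the calibrated bound."""
    exceed = np.asarray(stream_scores) > cutoff
    for start in range(0, len(exceed) - window + 1):
        if exceed[start:start + window].mean() > factor * tolerated_rate:
            return start
    return None

rng = np.random.default_rng(1)
background = rng.normal(size=5000)                 # calibration period
cutoff = calibrate_cutoff(background, 0.01)
# A benign shift (small mean drift) followed by a novel cluster at +4.
stream = np.concatenate([rng.normal(0.2, 1.0, 1000),
                         rng.normal(4.0, 0.5, 200)])
hit = detect_emergence(stream, cutoff, window=100)
```

In this toy run, the benign drift alone stays well under the alarm threshold, and the detector fires shortly after the novel cluster begins at index 1000.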
302
00:52:49.640 --> 00:52:55.289
Dr. Suchi Saria: okay, so let's now talk a little bit about how we doing on time. I can't see my clock.

303
00:52:55.802 --> 00:52:59.860
Michael Littman: Yeah, it's 7 minutes before 1 p.m. East Coast time. So.

304
00:52:59.860 --> 00:53:00.250
Dr. Suchi Saria: But if you.

305
00:53:00.250 --> 00:53:01.860
Michael Littman: Kind of land the plane.

306
00:53:02.020 --> 00:53:15.150
Dr. Suchi Saria: Lovely. Okay, so I'll try to land the plane in the next 7 minutes; I think I should be successful. So here's what we're going to talk a little bit about: we talked a lot about methods, in terms of translation from the lab to the real world.

307
00:53:15.547 --> 00:53:19.240
Dr. Suchi Saria: Now, a little bit about human-machine teaming and use. And so

308
00:53:20.020 --> 00:53:29.330
Dr. Suchi Saria: here, essentially, this was a paper that came out in Nature Digital Medicine, where we basically did a study on

309
00:53:29.720 --> 00:53:38.309
Dr. Suchi Saria: what are the mental models of physician trust, what are barriers to getting physician trust, and, you know, what are some roadblocks we need to uncover

310
00:53:38.400 --> 00:53:41.795
Dr. Suchi Saria: and overcome in order to build trust with

311
00:53:42.370 --> 00:53:44.040
Dr. Suchi Saria: machine learning systems.

312
00:53:44.060 --> 00:53:47.319
Dr. Suchi Saria: Super interesting, because it taught us a lot about

313
00:53:47.780 --> 00:53:55.850
Dr. Suchi Saria: both models of trust, but also some barriers we would need to think about in order to build systems that would get adopted.

314
00:53:55.880 --> 00:54:09.609
Dr. Suchi Saria: We tackled several of those learnings in building a system that we then deployed within that pragmatic study and trial I spoke about, and in a quantitative study showed we were able to drive very high adoption.

315
00:54:10.130 --> 00:54:15.239
Dr. Suchi Saria: And so I'll talk a little bit about some of those learnings. One was more around

316
00:54:15.808 --> 00:54:21.940
Dr. Suchi Saria: accuracy: you need the system to be accurate. If the system

317
00:54:22.060 --> 00:54:27.380
Dr. Suchi Saria: has a very high false-alerting rate, or doesn't give valuable insight early enough,

318
00:54:27.390 --> 00:54:35.960
Dr. Suchi Saria: ultimately, clinicians are very smart: they're not going to use it, because they don't see the output as informative. So that's one, at a very high level. Number two,

319
00:54:36.260 --> 00:54:39.529
Dr. Suchi Saria: you know, they have mental models of how

320
00:54:39.560 --> 00:54:47.290
Dr. Suchi Saria: the disease works. So if the system contradicts the way they think it works, you know, they're less likely to trust it. So again, thinking about

321
00:54:47.340 --> 00:55:04.990
Dr. Suchi Saria: intelligibility is very important. Turns out they don't care a whole lot about interpretability. So, for instance, the example they would often give is: I don't need to know how every bit of my fMRI machine works, and I don't know what neurons are firing for my colleague,

322
00:55:05.090 --> 00:55:13.679
Dr. Suchi Saria: whom I trust to give me a diagnosis or to help me with the case; but I do understand the way they think and why they think what they think, or

323
00:55:13.880 --> 00:55:19.510
Dr. Suchi Saria: I have the ability to take what they're telling me and use it in enhancing my own way of reasoning.

324
00:55:19.530 --> 00:55:38.920
Dr. Suchi Saria: And so those were some really interesting ideas: basically, being able to build a teaming model where there's a very clear role for what information the system is providing and how it's providing it, and whether the clinician understands what information is being provided, in a manner that they know how to ingest and incorporate into their own reasoning procedure.

325
00:55:39.350 --> 00:55:42.730
Dr. Suchi Saria: and then that allows them to be more effective with it.

326
00:55:42.820 --> 00:55:51.549
Dr. Suchi Saria: Now, when you go down that line, it turns out there are also lots of ways in which you can create under-reliance. So under-reliance is basically this.

327
00:55:51.640 --> 00:56:00.060
Dr. Suchi Saria: The clinician doesn't trust the system and won't use it. But there's also over-reliance, which is, you start trusting it too much, which is a problem when the AI is not right.

328
00:56:00.070 --> 00:56:13.020
Dr. Suchi Saria: So this collection of work kind of stems from that: starting to think about under-reliance, normal reliance, and, you know, interventions we can do, based upon the outputs we've generated, to

329
00:56:13.110 --> 00:56:23.269
Dr. Suchi Saria: study and improve over- and under-reliance. And so, a couple of quick takeaways; again, in the interest of time, I won't go into depth. But here, what we're doing is, again:

330
00:56:23.890 --> 00:56:45.469
Dr. Suchi Saria: it's an RCT, a randomized trial, where we are presenting AI advice in different forms, correct advice versus incorrect advice. And then we're also changing the interface, meaning we're presenting different types of explanations along with the advice, and we're also changing whether or not we give

331
00:56:45.798 --> 00:56:55.010
Dr. Suchi Saria: the user access to the AI's confidence. This is a paper in radiology that's coming up, for anyone who's curious.

332
00:56:55.010 --> 00:57:17.770
Dr. Suchi Saria: A bunch of really interesting experiments here. We also, from a user perspective, had experts and non-experts. Experts means these are specialists in medicine who see this as their domain; non-experts, meaning they're still medical experts, but maybe not specialists in this domain. So, for imaging data, emergency medicine physicians would be considered non-task experts, and radiologists would be considered task experts.

333
00:57:17.810 --> 00:57:20.659
Dr. Suchi Saria: And so, a couple of interesting takeaways here.

334
00:57:21.440 --> 00:57:25.849
Dr. Suchi Saria: So it turns out, basically, there are some kinds of explanations

335
00:57:26.000 --> 00:57:34.840
Dr. Suchi Saria: physicians like. So we measured impact on diagnostic accuracy, and what we saw was basically this: under correct advice,

336
00:57:35.090 --> 00:57:58.139
Dr. Suchi Saria: diagnostic performance improved a great deal more when the explanation was a local type of explanation as opposed to a global type. A local explanation localizes where in the image something is wrong, and why; a global one says, you know, based on other images like this one, let me give you some examples of other images where this was thought to be true. Now, with local advice, it turns out,

337
00:57:58.260 --> 00:58:05.329
Dr. Suchi Saria: they they know how to ingest it better. They know how to reason it with it better, and as a result their performance is impacted more.

338
00:58:05.400 --> 00:58:14.400
Dr. Suchi Saria: Now, the good news is, when the AI is correct, it actually improves diagnostic performance, because they're more likely to rely on it compared to when they're given global advice.

339
00:58:15.520 --> 00:58:19.669
Dr. Suchi Saria: So: high-confidence, local AI explanations. Now, even more interesting:

340
00:58:19.750 --> 00:58:40.110
Dr. Suchi Saria: they're more likely to persuade, more likely to sway, non-task experts than task experts. There's, you know, now evidence in the literature that task experts are more likely to be biased against AI, which is: if they know it's AI advice, they tend not to take it as seriously. But if the same advice came and you told them it's from a human, they're more likely to take it.

341
00:58:40.680 --> 00:59:08.249
Dr. Suchi Saria: Turns out non-task experts have less of this bias, and for non-task experts the local explanations are very, very helpful. So in a lot of these access-type applications, where we're moving applications into a setting like the primary care scenario, or the ED scenario, where we are trying to move away from specialists, there's sort of this exciting insight: you can design them in a way that non-task experts are likely to rely on and use, and that helps improve diagnostic performance.

342
00:59:08.540 --> 00:59:14.550
Dr. Suchi Saria: There's also this second really interesting concept of trust, and "simple trust."

343
00:59:14.580 --> 00:59:20.784
Dr. Suchi Saria: And the way we measured simple trust was this. You know, in a given image there's a whole bunch of concepts;

344
00:59:21.600 --> 00:59:31.339
Dr. Suchi Saria: you know, they're basically annotating the image, or not annotating exactly, but giving their impression, on this image of a chest X-ray, of a number of different clinical things that could be going on.

345
00:59:31.440 --> 00:59:45.789
Dr. Suchi Saria: So what we're measuring is basically this: let's say there are 6 things going on, the AI identified 4 of them, the expert identified 2 of them, and they agreed on 2 of the 6. So the percentage alignment is 2 out of 6.

346
00:59:45.860 --> 00:59:51.000
Dr. Suchi Saria: And then the question is, how quickly did they align? Like, how long did they hover on the image?

347
00:59:51.110 --> 00:59:58.239
Dr. Suchi Saria: And if you take the percentage alignment divided by the amount of time it took, that's a metric for measuring simple trust.

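The simple-trust metric as described, the fraction of findings on which AI and reader agree divided by the time it took the reader, is straightforward to compute. A minimal sketch under that definition; the finding labels below are illustrative, not from the study:

```python
def simple_trust(ai_findings, reader_findings, all_findings, seconds_on_image):
    """Simple trust = (fraction of all findings on which the AI and the
    reader agree) / (time the reader spent), per the definition in the talk.
    Higher is better: more agreement, reached faster."""
    agreed = len(set(ai_findings) & set(reader_findings))
    alignment = agreed / len(set(all_findings))
    return alignment / seconds_on_image

# The talk's example: 6 findings present, the AI flags 4, the reader flags 2,
# and they agree on 2 of the 6, so alignment = 2/6.
score = simple_trust(
    ai_findings={"effusion", "cardiomegaly", "edema", "atelectasis"},
    reader_findings={"effusion", "cardiomegaly"},
    all_findings={"effusion", "cardiomegaly", "edema", "atelectasis",
                  "pneumothorax", "fracture"},
    seconds_on_image=30.0,
)
```

As described in the talk, better explanations raise the numerator (more agreement) and shrink the denominator (faster agreement), so both effects push simple trust up.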
348
00:59:58.634 --> 01:00:00.700
Dr. Suchi Saria: And so the idea is:

349
01:00:00.850 --> 01:00:12.730
Dr. Suchi Saria: if the explanations make sense to them and they're able to get to the right answers, that's good; that makes the numerator go up. If they can get there quickly, that makes the denominator go down. Simple trust goes up.

350
01:00:13.250 --> 01:00:14.250
Dr. Suchi Saria: And so

351
01:00:14.780 --> 01:00:31.609
Dr. Suchi Saria: I think this is the last data slide, but basically, again, interesting takeaways here: local explanations more easily drive simple trust, meaning physicians are more likely both to agree and to get there faster, compared to global explanations.

352
01:00:31.630 --> 01:00:40.910
Dr. Suchi Saria: On the flip side, it's also possible to be overly reliant. And one of the interesting next steps from here is something like:

353
01:00:42.160 --> 01:01:02.040
Dr. Suchi Saria: can we maybe use other mechanisms, like thinking of the uncertainty interval, or the confidence of the AI, as a way to modulate, you know, when to trust and when not to trust, so that they're less likely to over-rely in the scenarios where the AI is wrong, and they continue to adopt and

354
01:01:02.270 --> 01:01:11.921
Dr. Suchi Saria: perhaps improve reliance in the scenarios where the AI is correct, because that would lead to a system that would be more performant. So with that, I'll start wrapping up, which is to say, you know,

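The next step sketched here, using the AI's uncertainty interval to modulate when advice is surfaced so that reliance concentrates where the model is likely right, could look like a simple gate. This is an illustration only; the threshold and message format are made up for the example:

```python
def gate_advice(prediction, interval_low, interval_high, max_width=0.3):
    """Show AI advice only when its predictive interval is narrow enough;
    otherwise defer to the clinician. Returns (show_advice, message)."""
    width = interval_high - interval_low
    if width > max_width:
        # Wide interval: suppress the advice to discourage over-reliance.
        return False, "Model uncertain; no advice shown."
    return True, (f"AI advice: {prediction:.2f} "
                  f"(interval {interval_low:.2f} to {interval_high:.2f})")

shown_confident, _ = gate_advice(0.82, 0.75, 0.90)  # width 0.15: shown
shown_uncertain, _ = gate_advice(0.55, 0.20, 0.90)  # width 0.70: suppressed
```

The conformal-style intervals discussed earlier in the talk are one natural source for `interval_low` and `interval_high`, since their coverage guarantee is what makes the width a meaningful trust signal.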
355
01:01:12.700 --> 01:01:37.399
Dr. Suchi Saria: it's extremely hard to do research that both has good, sound theoretical, methodologic foundations and is married to practical, real-world problems. It's interesting; I feel like, as a professor of computer science, it's extremely easy to get carried away, or to be incentivized to just write, you know, papers at ICML and NeurIPS and

356
01:01:37.400 --> 01:02:04.969
Dr. Suchi Saria: you know, machine learning venues where you know the bar for the quality in terms of you know our deep understanding of the problem, and whether the quality of the evaluations are really all that good can really sway us into believing something is actually working when it's not, or can sway us into believing. Something is very useful when it's not so. It takes a lot more work to be able to understand the real world. But deep understanding of real world requirements often inspires.

357
01:02:04.980 --> 01:02:31.689
Dr. Suchi Saria: I think, some of the most exciting foundational work. It's slower, it's more painful, but it's just exciting what is possible. And so I want to encourage and embrace the marriage of the two. The latter is real, painful work, but again, from a reward perspective it's very valuable in terms of leading us to innovations that really matter in practice.

358
01:02:31.700 --> 01:02:34.510
Dr. Suchi Saria: And then just a last thread about

359
01:02:35.902 --> 01:02:38.857
Dr. Suchi Saria: treating algorithms like prescription drugs. I think,

360
01:02:39.590 --> 01:02:48.670
Dr. Suchi Saria: as I spend more time... you know, my early research was in AI and machine learning, but more focused on robotics, or

361
01:02:48.910 --> 01:02:54.330
Dr. Suchi Saria: kind of domain-agnostic. And in the last decade I've spent more and more time learning the innards of medicine,

362
01:02:54.470 --> 01:03:07.589
Dr. Suchi Saria: kind of seeing the rigor with which people pursue deeply understanding whether or not something is working. And I think, as the two get married, and as we see AI getting out into the real world more and more,

363
01:03:08.104 --> 01:03:20.499
Dr. Suchi Saria: There's a real opportunity to bolster AI safety research. Today, a lot of AI safety research is focused on what I call AI alignment, which is, you know, for a very highly capable AI system:

364
01:03:20.984 --> 01:03:31.630
Dr. Suchi Saria: is it showing undesirable behavior? Like, for instance, it's taking control; it's starting to do things that we didn't mean for it to do, by taking control.

365
01:03:31.730 --> 01:03:37.900
Dr. Suchi Saria: But I also think, in parallel, there's a significant opportunity in understanding AI risk, leveraging the existing regulatory lens of

366
01:03:37.930 --> 01:03:39.509
Dr. Suchi Saria: risk-benefit trade-off,

367
01:03:39.620 --> 01:04:03.159
Dr. Suchi Saria: and then really thinking about this notion of reliability around intended use. So we had sort of an expectation of what we wanted it to do; the question is, is it doing that in the real world? And that notion of reliability is, you know, where a lot of the second half of my talk focused, and where I think there's a lot to be done in accelerating AI adoption. This is my last slide. Thank you very much.

368
01:04:07.110 --> 01:04:32.150
Michael Littman: All right. There's thunderous applause that is being completely filtered out by the Internet, but I feel it in my bones. Thank you so much for working through that with us. Yeah, there's a lot of material, a lot of technical stuff, because there's the whole medicine side of things, there's the whole statistics side of things, and then there's the whole computer science side of things. And you really do bring all those things together to, yeah, have a positive impact. So I think

369
01:04:32.150 --> 01:04:56.469
Michael Littman: it's just delightful to get to hear about it. So I'm very glad that we've got some questions that have cropped up. We've got about 10 min that we could engage with you on on questions. So let me kick things off. One of the things that we ask our speakers to do often is to say, just a little bit about how you got here. So what was the path that you took that brought you to this particular topic, and and studying it in this particular way.

370
01:04:57.560 --> 01:05:07.289
Dr. Suchi Saria: Yeah, so my early background: I grew up in India, and it was perfectly reasonable to be a nerd in India, so I got into computer science and AI research quite early.

371
01:05:07.613 --> 01:05:32.810
Dr. Suchi Saria: And in particular, you know, this is like '96, '97, '98. I know there were people in CISE who were doing it, but it certainly wasn't a popular field to the same extent it is today. And most of my interest was in building machines that were smart and intelligent and could do things for me, because I'm a pretty lazy person; that, I thought, was like the coolest thing ever. And then, fast forward: I actually had

372
01:05:32.810 --> 01:05:43.780
Dr. Suchi Saria: little to no interest in biology and medicine, which is kind of pathetic given what you heard today in the talk. And then around 2008, '09, '10, while I was at Stanford,

373
01:05:44.020 --> 01:05:46.910
Dr. Suchi Saria: the HITECH Act was just about to pass, and I sort of

374
01:05:47.150 --> 01:05:55.750
Dr. Suchi Saria: got exposed to the kinds of challenges that I thought would emerge as these new kinds of data came to be,

375
01:05:56.050 --> 01:06:03.250
Dr. Suchi Saria: and the need for AI/ML to work in these kinds of settings. And that was just fascinating to me. The

376
01:06:03.280 --> 01:06:30.559
Dr. Suchi Saria: potential for impact was fascinating, but I also realized that it would require really getting into the weeds of a different field, which is highly complicated and often daunting, I think, as a computer scientist. But I was going through an early midlife crisis around, you know, what was I going to be as I grew up? And it felt like this was a huge, untapped area where we really ought to be spending more time. Anyway, so that's how I got interested, and then I had fascinating collaborators who brought me along.

377
01:06:31.480 --> 01:06:48.870
Michael Littman: That's great. Thank you. Thanks for sharing that with us. Because, yeah, I think you're right that really having to roll up your sleeves and become an expert in this other domain is so daunting to so many people. But I think what you've helped highlight is how important it is for actually having that kind of impact.

378
01:06:48.980 --> 01:07:05.949
Michael Littman: It's a computer scientist who just say, Hey, I'm just. I'm inspired by health or or electricity grids, or whatever it is, and then just make kind of a formal model of it, and study that formal model are never going to have the same kind of real world impact as the people who get in there and figure out what's actually going on. So.

379
01:07:05.950 --> 01:07:18.940
Dr. Suchi Saria: I want to add one little line there, which is not really a plug in any way. But I actually felt like, for me, moving to a place like Hopkins was also very critical for that. Like, once I started to feel like this is an area

380
01:07:19.598 --> 01:07:35.040
Dr. Suchi Saria: where I wanted to learn more, I think what has been really exciting for me at a place like Hopkins is to see that they really encourage that, because of APL, because of all the history there of collaborating with DARPA and various agencies around real-world problems:

381
01:07:35.200 --> 01:07:48.360
Dr. Suchi Saria: the opportunity to be in an environment that encouraged that broad curiosity. There are departments where that's not valued to the same extent. So that really also helped shape,

382
01:07:48.460 --> 01:07:51.720
Dr. Suchi Saria: you know, how I developed as a researcher.

383
01:07:52.090 --> 01:07:59.520
Michael Littman: Do you have any advice for people who are out there who want to study these things? But they're afraid that it's going to have negative career repercussions.

384
01:07:59.680 --> 01:08:01.419
Dr. Suchi Saria: I mean, I think, like.

385
01:08:01.610 --> 01:08:09.419
Dr. Suchi Saria: so this is also probably going to sound quite terrible to this kind of audience, but I used to often say:

386
01:08:09.510 --> 01:08:15.329
Dr. Suchi Saria: I care about the work and the results almost more than I care about being a researcher, a professor, and being successful.

387
01:08:15.450 --> 01:08:22.039
Dr. Suchi Saria: you know, like, I was okay. If I didn't get tenure, I was also completely okay. If I didn't.

388
01:08:22.060 --> 01:08:28.979
Dr. Suchi Saria: 6, you know, sort of the the I mean. It all turned out to be highly like in my favor.

389
01:08:29.000 --> 01:08:43.660
Dr. Suchi Saria: Having said that, I felt like that helped relax some of the anxieties. But the advice is: really go find other people you can partner with, who will bring you along, and then find mentors, senior mentors,

390
01:08:43.700 --> 01:08:48.369
Dr. Suchi Saria: who have been successful in this way, and sort of, you know, try to understand,

391
01:08:48.420 --> 01:09:12.400
Dr. Suchi Saria: because there are also lots of ways in which applied research can go wrong, right? The scenarios where you're basically not able to do impactful work in either field, because you've spent so much time so diluted that you weren't able to really leverage your strengths in either; that's the failure scenario you're trying to avoid. And so, you know, if you have the right mentors, and you're sort of being very... it's certainly the harder thing to do, but it's doable.

392
01:09:13.630 --> 01:09:35.959
Michael Littman: Outstanding. We got a couple of questions specifically about the recording of this talk and the slides, and if you didn't see it, Edgar, who's our IT specialist handling these talks, said that those will become available on the website. So know that those are covered; we're not going to ask Suchi to directly email each of you with the slides. We'll have them posted.

393
01:09:36.109 --> 01:09:54.660
Michael Littman: all right. So a lot of the questions came in anonymously. But this one has a name on it, Syed? Ramis Nakvi asked. I keep coming across discussion on AI's impact on healthcare, especially patient safety. The impact can be negative or positive. What do you think the most important negative impact or concern in this context is nowadays.

394
01:09:55.200 --> 01:10:07.369
Dr. Suchi Saria: I think the most likely negative impact is basically over-reliance when the AI is incorrect, right? So we want to build AI in a way where it's deeply validated to be correct.

395
01:10:07.470 --> 01:10:16.810
Dr. Suchi Saria: We want to build the interfaces in a way that promotes reliance in the right scenarios and avoids over-reliance in the wrong scenarios.

396
01:10:18.200 --> 01:10:18.890
Michael Littman: Very good

397
01:10:19.801 --> 01:10:28.759
Michael Littman: This one, maybe, is more technical than I actually understand, but hopefully you'll understand it: what were the performance metrics to compare against CDS tools?

398
01:10:28.760 --> 01:10:37.780
Dr. Suchi Saria: Yeah. So, interestingly, it's extremely common, when we try to compare performance, to use what I call lab metrics.

399
01:10:37.840 --> 01:10:45.510
Dr. Suchi Saria: So lab metrics are typical model metrics, like accuracy, the ROC curve, AUC-ROC, and things like that.

400
01:10:46.016 --> 01:10:56.140
Dr. Suchi Saria: We definitely want to do that. But that alone is not enough. And that was actually one of our very early learnings that shaped the methods that we developed. So, for instance.

401
01:10:56.500 --> 01:11:03.619
Dr. Suchi Saria: So my answer to this is actually pretty long if I go into the details of every metric, but the super short answer was:

402
01:11:03.860 --> 01:11:06.800
Dr. Suchi Saria: we really wanted to understand current standard of care

403
01:11:07.040 --> 01:11:13.070
Dr. Suchi Saria: and think about working backwards from current standard of care. Like, how do we measure performance against

404
01:11:13.110 --> 01:11:22.360
Dr. Suchi Saria: today's standard of care, which is physician performance? So that's number one. Number two, against other alternative tools, other tools they might be using.

405
01:11:22.430 --> 01:11:49.010
Dr. Suchi Saria: And then, based on that, come up with meaningful metrics that actually quantify impact. So that means: is it about earlier detection? Is it about earlier detection in all types of cases? Is it about reducing false alarms? Why is that a problem? Maybe because it's causing context switching, which is costing time. So we work backwards to come up with metrics that are more relevant in practice, and then we kept going backwards to say: how does this, then, translate into model metrics,

406
01:11:49.090 --> 01:11:52.779
Dr. Suchi Saria: and how do we measure it in terms of model metrics to be able to get to the result?

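The working-backwards idea here, starting from clinically meaningful quantities like lead time before the event and false alarms per case rather than only lab metrics like AUC-ROC, can be sketched directly. The timestamps and field names below are hypothetical, purely for illustration:

```python
def deployment_metrics(alert_times, event_time):
    """Clinically oriented metrics for one patient encounter.

    alert_times: hours (from admission) at which the model alerted
    event_time:  hour of the adverse event, or None if none occurred
    Returns the lead time of the earliest alert before the event, and
    the count of false alarms (alerts with no event, or after it)."""
    if event_time is None:
        return {"lead_time_h": None, "false_alarms": len(alert_times)}
    early = [t for t in alert_times if t <= event_time]
    lead = (event_time - min(early)) if early else None
    late = [t for t in alert_times if t > event_time]
    return {"lead_time_h": lead, "false_alarms": len(late)}

# Hypothetical case: earliest alert at hour 10 before an event at hour 16.
m = deployment_metrics(alert_times=[10.0, 14.5], event_time=16.0)
```

Metrics like these can then be traced back to the model-level quantities (thresholds, operating points on the ROC curve) that produce them, which is the direction of reasoning described in the answer.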
407
01:11:52.810 --> 01:11:56.720
Dr. Suchi Saria: And then, ultimately, you can do all the positing you want, but there's,

408
01:11:56.800 --> 01:12:09.329
Dr. Suchi Saria: eventually, because of the lack of very clean gold standards, the real-world implementation, where you actually get to see, across the diversity of cases, what happens in practice. And so that was sort of the way the real-world trial really came to be.

409
01:12:10.380 --> 01:12:22.370
Michael Littman: So there are a couple of versions of this question, and I think this relates to your answer to the previous question, so maybe this is a quick one. But how were physicians convinced to accept the technology so readily?

410
01:12:22.800 --> 01:12:27.950
Michael Littman: Like you said, 89% versus the normal 10% for other tools, I believe you said.

411
01:12:27.950 --> 01:12:32.150
Dr. Suchi Saria: Yeah, I think that's actually... like, probably,

412
01:12:32.480 --> 01:12:40.100
Dr. Suchi Saria: if there were one hill to climb, then we would all climb that hill and be done, but it turns out there are like 15 or 16 hills we have to climb along the way,

413
01:12:40.290 --> 01:12:46.349
Dr. Suchi Saria: and some of the intuition I gave was around improving performance; some of it was around the way the information was delivered.

414
01:12:46.740 --> 01:12:48.750
Dr. Suchi Saria: What was the teaming interface?

415
01:12:48.870 --> 01:13:02.599
Dr. Suchi Saria: How you implement matters a lot in terms of gaining trust. Is it easy to use? Is it presented in the way they expect it to be? For the things that are novel, do they understand why they are novel? Is it in a language that they understand? Does it follow their trust model?

416
01:13:02.800 --> 01:13:11.390
Dr. Suchi Saria: Those are sort of the kinds of things we have to do to drive adoption. And, you know, that paper I pointed to would be an interesting one to look at, just as a starting point.

417
01:13:11.570 --> 01:13:12.170
Michael Littman: Nice.

418
01:13:13.130 --> 01:13:32.160
Michael Littman: All right, we're almost out of time, but let me see if I can squeeze in another question. Sanjana Mendu asked: Thank you for the insightful talk. How do you recommend navigating health-related ML problems with a much narrower time window for both monitoring behavior and adapting to model responses? For example, how do you foresee these models impacting clinical practice in domains like surgery?

419
01:13:32.370 --> 01:13:49.220
Dr. Suchi Saria: I actually think the setting that I use is quite close to a setting where it impacts in real time and the window of opportunity is relatively short, compared to the typical, you know, chronic disease, like diabetes, et cetera,

420
01:13:49.370 --> 01:13:57.960
Dr. Suchi Saria: or COPD, where maybe there are short-term effects but lots of long-term effects. But in the scenarios I spoke about, and surgery would be an example,

421
01:13:58.293 --> 01:14:08.799
Dr. Suchi Saria: you know, there are lots of shorter-term things you're looking at. So hopefully the work I presented here was actually more of a model where, even in the toughest environments where you need rapid response,

422
01:14:08.840 --> 01:14:10.220
Dr. Suchi Saria: You can actually

423
01:14:10.240 --> 01:14:12.599
Dr. Suchi Saria: kind of make these kinds of methods come to life.

424
01:14:14.780 --> 01:14:20.769
Michael Littman: This is a variation of a question from Mutin Yilmaz.

425
01:14:20.970 --> 01:14:25.589
Michael Littman: Uncertainty modeling is really important in the work that you're doing, and of course in this domain as a whole.

426
01:14:26.080 --> 01:14:36.270
Michael Littman: Where in the modeling process do you think are the most important places to consider uncertainty? Can you imagine considering uncertainty at the very beginning, before you've even done the modeling?

427
01:14:37.390 --> 01:14:39.550
Dr. Suchi Saria: Oof I think

428
01:14:39.710 --> 01:14:46.845
Dr. Suchi Saria: pretty much everywhere is my answer to this question. Like, the more you do, the better you do. So it's more like,

429
01:14:47.290 --> 01:14:53.640
Dr. Suchi Saria: how much money do you have from the various agencies to think about what to model? And then you can kind of push your

430
01:14:53.690 --> 01:14:55.639
Dr. Suchi Saria: limit on taking advantage of it.

431
01:14:56.400 --> 01:14:58.059
Michael Littman: Alright! Fair enough.

432
01:14:58.260 --> 01:15:05.179
Michael Littman: fair enough. All right, so we are out of time. I want to again thank you so much for coming, participating in this, and sharing your insights with folks. There are.

