WEBVTT

1
00:00:00.020 --> 00:00:02.130
Wendy Nilsen: Where we are, live.

2
00:00:10.820 --> 00:00:11.869
David Berkowitz: All right.

3
00:00:12.830 --> 00:00:14.840
David Berkowitz: Well, welcome everyone.

4
00:00:14.970 --> 00:00:26.370
David Berkowitz: My name is Dave Berkowitz, and I'm the AD, or Assistant Director, for the U.S. National Science Foundation's Directorate for Mathematical and Physical Sciences, or MPS.

5
00:00:26.630 --> 00:00:35.729
David Berkowitz: And I'm delighted to welcome you to today's Smart Health frontier symposium, which has the title Driving Innovation with Fundamental Science:

6
00:00:35.910 --> 00:00:40.260
David Berkowitz: Precision Medicine through Mathematics and Statistics.

7
00:00:40.710 --> 00:00:44.809
David Berkowitz: We have three distinguished panelists who'll be speaking today, all

8
00:00:45.050 --> 00:00:52.099
David Berkowitz: at the interface of mathematical and statistical science and biomedical technology.

9
00:00:52.350 --> 00:00:55.020
David Berkowitz: Before we get into their talks,

10
00:00:55.160 --> 00:01:02.389
David Berkowitz: I want to highlight some of the features of the Smart Health program that undergirds their work.

11
00:01:03.020 --> 00:01:08.719
David Berkowitz: We are hosting this symposium to celebrate the

12
00:01:08.850 --> 00:01:16.460
David Berkowitz: 10-year-plus anniversary of the Smart Health program, which is a multidisciplinary collaboration between the National Science Foundation

13
00:01:16.680 --> 00:01:18.669
David Berkowitz: and the National Institutes of Health.

14
00:01:18.850 --> 00:01:25.290
David Berkowitz: The Smart Health program is designed to accelerate innovative research in this area, bridging the gap

15
00:01:25.400 --> 00:01:29.610
David Berkowitz: between fundamental biomedical research and

16
00:01:29.690 --> 00:01:53.600
David Berkowitz: the mathematical sciences, and transforming biomedical science and public health in the United States. In so doing, a key component of this has been to create sustainable partnerships among mathematicians, statisticians, scientists, engineers, and biomedical researchers, so that we can ensure the most innovative and transformative

17
00:01:53.670 --> 00:02:00.110
David Berkowitz: research going forward and targeting the most relevant biomedical questions.

18
00:02:01.010 --> 00:02:12.630
David Berkowitz: We would like biomedical research to take full advantage of the power of mathematics and statistics as they build their programs as they build out their research directions.

19
00:02:12.740 --> 00:02:17.599
David Berkowitz: Advances in mathematics, AI, and other technological innovations

20
00:02:17.720 --> 00:02:26.419
David Berkowitz: are poised today to make widespread changes in the delivery of medicine, ultimately and in biomedical research.

21
00:02:26.860 --> 00:02:38.729
David Berkowitz: But the integration of these advancements in fundamental science, MPS science, with health-related science has been somewhat slow, and it is our intent to catalyze this.

22
00:02:39.330 --> 00:02:52.869
David Berkowitz: Today's symposium is going to showcase how advances in math can develop novel approaches to precision medicine that have the potential to be paradigm-shifting in healthcare and improve the health of

23
00:02:52.930 --> 00:03:18.289
David Berkowitz: millions of Americans. Wider adoption of such emerging approaches will not be possible without a principled assessment of such new techniques to ensure their reliability, safety, and fairness. So this seminar will showcase how math and statistics, and innovations in these fundamental sciences, help to develop such foundational approaches in precision medicine,

24
00:03:18.410 --> 00:03:39.850
David Berkowitz: attacking such diverse problems as individualized risk assessment for glaucoma, AI-assisted stroke rehabilitation, and predictive models for maternal health for women of color. Together, mathematical and statistical researchers in the biomedical community can bring this research to fruition to generate a healthier

25
00:03:39.960 --> 00:03:53.470
David Berkowitz: United States of America. And with that I would like to pass the baton to my esteemed colleague, Wendy Nilsen, from the CISE directorate, who will introduce our speakers. Wendy.

26
00:03:53.780 --> 00:04:13.039
Wendy Nilsen: Great. Thank you, David. We're so grateful to be partnering with the MPS directorate as well as with our colleagues at NIH, so it's an exciting effort. So we're going to have three wonderful speakers today. Carlos Fernandez. Sorry, Carlos. I messed that one up

27
00:04:13.432 --> 00:04:25.199
Wendy Nilsen: and Annie Qu, and then Christopher Wikle will be presenting. And I'm gonna introduce each one specifically as they go. But I'm gonna take these slides down now.

28
00:04:25.756 --> 00:04:38.199
Wendy Nilsen: And I'm gonna let Annie put her slides up while I give you a brief on Annie. Annie Qu, PhD, is a Chancellor's Professor in the Department of Statistics at the University of California, Irvine.

29
00:04:38.450 --> 00:05:03.280
Wendy Nilsen: Her research focuses on solving fundamental issues regarding structured and unstructured large-scale data and developing cutting-edge statistical methods and theory and machine learning and algorithms for personalized medicine, text mining, recommender systems, medical imaging data, and mobile health data for complex heterogeneous data. Before joining UC Irvine, Dr. Qu was the Data

30
00:05:03.280 --> 00:05:08.920
Wendy Nilsen: Science Founder Professor of Statistics and the Director of the Illinois Statistics Office

31
00:05:09.000 --> 00:05:28.990
Wendy Nilsen: at the University of Illinois, Urbana-Champaign. She also serves as Journal of the American Statistical Association Theory and Methods co-editor from 2023 to 2025, and as IMS program secretary from 2021 to 2027. With that I'm going to hand it over to Dr. Qu.

32
00:05:30.270 --> 00:05:44.320
Annie Qu, PhD: Thank you, Wendy, for the very kind introduction. I'm very grateful for this opportunity, which allows me to share my research. Also, before I start, I'd like to thank Yuri gear for organizing this event.

33
00:05:44.950 --> 00:05:46.130
Annie Qu, PhD: So

34
00:05:47.838 --> 00:06:03.970
Annie Qu, PhD: Okay, so first let me give you some introduction to wearable health technology. Wearable devices are actually transforming our healthcare. More and more people are wearing a wearable device, such as a smartwatch,

35
00:06:03.970 --> 00:06:28.920
Annie Qu, PhD: smart ring, or Fitbit. I can just show you: this is a smart ring I'm currently wearing every day. So there are a lot of applications for wearable devices. And today I'm going to just focus on the wearable devices which can track our stress and sleep and also physical activity. So you can see this screenshot; this is actually from

36
00:06:28.920 --> 00:06:44.319
Annie Qu, PhD: my own data for one night. You can see my sleep parameters, which include deep sleep, rapid eye movement time, light sleep, and total duration, and also the minimum heart rate.

37
00:06:44.320 --> 00:07:08.210
Annie Qu, PhD: I don't have all the screenshots here, but for this data you also have heart rate variability, which is a physiological measurement to track our physiological stress. So you see that if you wear the smart ring just for one day, you get a lot of information. It also tracks steps,

38
00:07:08.210 --> 00:07:34.189
Annie Qu, PhD: movement, and exercise intensity. On the right screen, I also show you the mobile app we developed. We call this ecological momentary assessment (EMA) tracking, which is a kind of subjective measurement of our emotion and well-being: how happy you are, how content you are, and so forth.

39
00:07:34.190 --> 00:07:49.640
Annie Qu, PhD: So for this type of data, we encounter some unique challenges. I'm going to talk about some of the challenges, probably not all of them. And these challenges also motivate us to develop innovative statistical methodology.

40
00:07:49.890 --> 00:08:03.239
Annie Qu, PhD: First, we realize we have this irregularity due to what we call multi-resolution observation. This occurs when, say, you have non-uniform

41
00:08:03.240 --> 00:08:25.799
Annie Qu, PhD: time intervals within each time series. This is not surprising, because always keep in mind that we are doing an observational study. You have no control over when people wear the device and when they do not. People just take off their ring or their watch when they take a shower or charge the battery.

42
00:08:25.800 --> 00:08:49.870
Annie Qu, PhD: And also we have varying time intervals across multiple time series. I'm going to give a little bit more detail later. In addition, on top of that, we also have a lot of irregularity among subjects, in particular high heterogeneity among the subjects. We also encounter a large amount of missing data,

43
00:08:49.870 --> 00:09:13.500
Annie Qu, PhD: as we have already mentioned, due to the observational nature of the studies. And in some projects we also have a very small sample size. For example, for the pregnant women, we were only able to recruit about 20-something subjects, and that's all we have. So, getting into a little bit more detail about multi-resolution time series data: we have mentioned many

44
00:09:13.500 --> 00:09:37.750
Annie Qu, PhD: measurements, including your movement, your heart rate, your heart rate variability to measure your stress, and also your well-being. Very commonly, it's impossible to measure them at the same resolution. There's no point in asking someone "Are you happy?" every 5 minutes, right? It's totally pointless.

45
00:09:37.750 --> 00:10:02.380
Annie Qu, PhD: And also some longitudinal data could have a low resolution due to technical limitations, because some data are calculated based on other measurements, and therefore you need a certain amount of accumulated information to do the calculation. And here I give you an example of the irregularity and heterogeneity among

46
00:10:02.380 --> 00:10:26.760
Annie Qu, PhD: subjects. Some people are morning persons; some people are evening persons. So you clearly see the stress level really varies across different subjects. And in the right figure, we notice the observation times and the frequency of measurements also fluctuate among different individuals. The first figure is about the heart rate,

47
00:10:26.760 --> 00:10:37.580
Annie Qu, PhD: and the lower figure is about the stress. You see the different resolutions, the different time points, and the different gaps where the measurements are missing.

48
00:10:39.860 --> 00:10:48.990
Annie Qu, PhD: So how can we handle these challenges? So I'm talking about data integration in a sense that

49
00:10:49.487 --> 00:11:13.350
Annie Qu, PhD: we have these observed time series with all the irregularity. So the idea is: can we do some transformation, in the sense that we project these multi-resolution time series into a dynamic latent space? Then this irregular data, with a lot of missing values,

50
00:11:13.350 --> 00:11:37.680
Annie Qu, PhD: can be transformed into more continuous, more informative latent information, and this is the dynamic latent representation. The reason we can do this is that there are intrinsic correlations among all these time series. Therefore, we can borrow information through these correlations to do this projection.

51
00:11:38.030 --> 00:12:02.839
Annie Qu, PhD: In addition, we can also borrow information across subjects, because clearly certain physiological phenomena are shared among all subjects. For example, if the room temperature is very high, everyone's heart rate is going to increase. So this applies to everyone. And I also learned over time that if you want to have a good night's sleep, you try to

52
00:12:02.840 --> 00:12:13.109
Annie Qu, PhD: make sure your room temperature is as low as you can tolerate, and this really helps your sleep, because it makes your heart rate slow down.

53
00:12:13.450 --> 00:12:14.180
Annie Qu, PhD: Okay.

54
00:12:16.100 --> 00:12:44.769
Annie Qu, PhD: Okay, so I'll give you a little bit more detail on how we handle both the individualized information and the population-wise common information. Here we model this with a decomposition, in the sense that we have a dynamic latent factor, theta, which uses information from the individual, and this can capture the individual trend.

55
00:12:44.850 --> 00:13:11.349
Annie Qu, PhD: On the other hand, we can also borrow information from the population of subjects, and this is captured by the latent factor F. Through this kind of multiplication, which we can also call an inner product, we are able to capture the interaction between the individual effect and the common population-wise information, so we can better explain the data.

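NOTE
A minimal sketch of the decomposition idea described in the preceding cues,
not the speaker's actual model. It assumes a deliberately simplified, static
rank-1 form y[i][t] ~ theta[i] * f[t], where theta[i] is an individual factor
and f[t] is a factor shared across subjects. The function name, the toy data,
and the alternating least squares fit are all hypothetical illustrations of
how shared structure can fill in a missing reading.
```python
def fit_rank1(y, n_iter=50):
    """Alternating least squares for y[i][t] ~ theta[i] * f[t]; None marks a gap."""
    n_subj, n_time = len(y), len(y[0])
    theta = [1.0] * n_subj
    f = [1.0] * n_time
    for _ in range(n_iter):
        # Update the shared factor f using every subject observed at time t.
        for t in range(n_time):
            rows = [i for i in range(n_subj) if y[i][t] is not None]
            den = sum(theta[i] ** 2 for i in rows)
            if den > 0:
                f[t] = sum(y[i][t] * theta[i] for i in rows) / den
        # Update each individual factor theta[i] from that subject's own data.
        for i in range(n_subj):
            cols = [t for t in range(n_time) if y[i][t] is not None]
            den = sum(f[t] ** 2 for t in cols)
            if den > 0:
                theta[i] = sum(y[i][t] * f[t] for t in cols) / den
    return theta, f
# Hypothetical toy data: 3 subjects, 4 time points, one missing reading.
y = [[1.0, 0.5, 2.0, 1.5],
     [2.0, 1.0, None, 3.0],
     [3.0, 1.5, 6.0, 4.5]]
theta, f = fit_rank1(y)
imputed = theta[1] * f[2]  # estimate for the missing entry of subject 1
```
The missing entry is recovered only because the other subjects were observed
at that time point, which is the borrowing of information the talk describes.
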
56
00:13:11.890 --> 00:13:30.940
Annie Qu, PhD: Okay, so to give you an example: when we do this kind of modeling, we are able to achieve something that other methods are not able to achieve. Here the top figure is the heart rate, and the blue dots are the observed training data. So suppose you know all that information,

57
00:13:30.940 --> 00:13:42.570
Annie Qu, PhD: and the lower figure is the stress. You notice that in one time period the data points are missing; we don't have observations during this time period.

58
00:13:42.570 --> 00:13:49.750
Annie Qu, PhD: Using our method, we are able to capture that during this time period

59
00:13:49.750 --> 00:14:01.539
Annie Qu, PhD: the stress level is actually very high for this subject. So this is very useful. In contrast, for other methodologies, such as a deep

60
00:14:01.540 --> 00:14:29.659
Annie Qu, PhD: recurrent neural network, the green line, the prediction is pretty much flat. So they're not able to incorporate the information from the heart rate to infer that the stress level is also very high during this time period. Now notice that during this time period there's no exercise, so there's no reason why the person suddenly has a high heart rate, except that the stress level could be very high.

61
00:14:29.660 --> 00:14:54.509
Annie Qu, PhD: Okay, so if you're interested in more details, you can also check our paper. Another thing I want to talk about is the Oura ring data for the pregnant women, where we find both homogeneous findings and heterogeneous findings. For example, it's actually not surprising that if you have more deep sleep, it leads to lower stress. And actually, if you have

62
00:14:54.510 --> 00:15:18.699
Annie Qu, PhD: more REM sleep, which is when you have vivid dreams (your body is frozen during rapid eye movement, but meanwhile you could have a vivid dream), it links to high stress. Also, a low resting heart rate links to low stress, older pregnant women tend to have higher stress, and a high pre-pregnancy

63
00:15:19.010 --> 00:15:30.779
Annie Qu, PhD: BMI is also linked to high stress. On the other hand, we find some heterogeneity among subgroups. For the subgroup with

64
00:15:31.260 --> 00:16:00.510
Annie Qu, PhD: a very high emotional distress level, measured by EMA data, we show that walking more steps helps to relieve their stress level. But for the moderate and the low emotional distress groups, more walking doesn't have a significant impact on lowering their stress level. Here, for the rMSSD, high is better and low is worse for stress.

65
00:16:00.770 --> 00:16:26.439
Annie Qu, PhD: So, coming to the conclusion and some discussion. Today I really focused on stress management. We are aware that chronic stress could link to mental health issues and also to chronic disease, even cancer. And therefore, if our methodology can help us interpret

66
00:16:26.440 --> 00:16:43.879
Annie Qu, PhD: the unobserved time points where the data were missing, so that we're able to make a precise prediction of the stress, this will be useful for better stress management. And

67
00:16:43.910 --> 00:17:08.569
Annie Qu, PhD: we also talked about some statistical methodology, for example, how to address heterogeneity, multi-resolution time series data, low sample size, and informative missingness. I want to point out that data integration is key. On the one hand, we can use data integration to alleviate heterogeneity. For example,

68
00:17:08.569 --> 00:17:33.490
Annie Qu, PhD: we can extract shared information from heterogeneous sources, for example, multiple modalities and heterogeneous subjects, for better pattern discovery and interpretation. On the other hand, we can also use data integration to harness the heterogeneity. It sounds like a contradiction, but it is not. We can borrow information across heterogeneous sources

69
00:17:33.490 --> 00:17:49.730
Annie Qu, PhD: to improve the individual prediction, to enhance precision modeling. We could also do some statistical inference, such as conformal prediction, so we can quantify the uncertainty in the prediction.

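NOTE
A minimal sketch of split conformal prediction, the general technique named in
the preceding cue. The talk does not say which conformal variant the group
uses, so the function name, the residual values, and the point prediction
below are hypothetical. Given absolute errors from a held-out calibration set,
the interval has marginal coverage of at least 1 - alpha when calibration and
new points are exchangeable.
```python
import math
def split_conformal_interval(calib_residuals, prediction, alpha=0.1):
    """Return (lower, upper) around a point prediction using calibration errors."""
    n = len(calib_residuals)
    k = math.ceil((n + 1) * (1 - alpha))   # rank of the conformal quantile
    if k > n:
        return (-math.inf, math.inf)       # too few calibration points
    q = sorted(calib_residuals)[k - 1]     # k-th smallest absolute error
    return (prediction - q, prediction + q)
# Hypothetical absolute errors |y - y_hat| on held-out calibration data.
residuals = [0.3, 0.1, 0.9, 0.5, 0.2, 0.7, 0.4, 0.8, 0.6]
lower, upper = split_conformal_interval(residuals, prediction=5.0, alpha=0.1)
```
With nine calibration errors and alpha = 0.1, the quantile rank is 9, so the
half-width is the largest calibration error.
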
70
00:17:49.780 --> 00:18:16.180
Annie Qu, PhD: So this is the last slide, to talk about the current work. We also recruit UCI graduate students and postdocs from Stanfield and health science. We try to gain better insight into their stress variation, using one look like this and also the EMA mobile app we have developed. So thank you for your attention, and this last slide is the acknowledgement.

71
00:18:18.730 --> 00:18:26.170
Wendy Nilsen: Thank you so much, Dr. Qu. We're gonna hold questions till the end. And I think

72
00:18:28.360 --> 00:18:38.600
Wendy Nilsen: we'll hold some questions till the end, and unless you have a specific question about one of the presentations, I think it'll be best if we save them till the end. So,

73
00:18:39.147 --> 00:18:41.790
Wendy Nilsen: Thank you, Dr. Qu. That was fabulous,

74
00:18:41.960 --> 00:18:45.130
Wendy Nilsen: and I'm going to introduce our next speaker, and

75
00:18:45.230 --> 00:19:05.729
Wendy Nilsen: Dr. Wikle can start putting his slides up. There we go. Dr. Wikle. Christopher Wikle, PhD, is a Curators' Distinguished Professor of Statistics at the University of Missouri. He obtained his PhD from Iowa State University, and has been on the faculty at the University of Missouri for 27 years.

76
00:19:05.760 --> 00:19:15.769
Wendy Nilsen: His research specialty is spatiotemporal statistics, with applications to geophysical processes, complex biological processes, and the environment.

77
00:19:15.910 --> 00:19:41.620
Wendy Nilsen: He focuses on developing computationally efficient deep hierarchical Bayesian dynamic spatiotemporal models motivated by scientific principles, with more recent work at the interface of deep neural modeling and statistics. He's a fellow of the ASA, IMS, ISI, and AAAS, and has published two award-winning books in spatiotemporal statistics. With that, Dr. Wikle, I'll let you go.

78
00:19:42.830 --> 00:20:03.569
Christopher K. Wikle, PhD: Great. Thank you so much. It's a pleasure to be here to celebrate the 10-year anniversary of this awesome collaborative program between NSF and NIH, and for me to represent my colleagues in this interdisciplinary team. And with that let me just say this is my group, and I'm putting it up at the beginning, because

79
00:20:03.570 --> 00:20:15.209
Christopher K. Wikle, PhD: I want you to recognize that it's multiple institutions and multiple individuals across all sorts of disciplines: statistics, engineering, computer science, applied math, math,

80
00:20:15.290 --> 00:20:19.219
Christopher K. Wikle, PhD: and physicians. And

81
00:20:19.270 --> 00:20:37.169
Christopher K. Wikle, PhD: it's a real collaboration in the sense that everybody is actually contributing to everything here. So it's really wonderful. So let me just get started. So what's glaucoma? Well, it's a degeneration of your optic nerve and the loss of cells on your retina. And this is sort of what it looks like

82
00:20:37.170 --> 00:20:50.840
Christopher K. Wikle, PhD: if you have it. As it progresses, you can see your vision sort of starts decreasing from the outside in, and it is the leading cause of irreversible blindness in the U.S. and in the world,

83
00:20:50.840 --> 00:20:59.480
Christopher K. Wikle, PhD: and its prevalence changes, of course, depending on race, ethnicity, location, and age. But it's a major problem.

84
00:20:59.480 --> 00:21:12.250
Christopher K. Wikle, PhD: So how's it diagnosed? Well, you know. Obviously, if you start seeing vision problems like mentioned here, that would be a clue. But you can also test for that using visual field testing.

85
00:21:12.250 --> 00:21:36.119
Christopher K. Wikle, PhD: And that's a fairly non-invasive procedure, and many of you might have had that done before. To really solidify it, though, you would have to go in and look at the structural damage. And you would use cameras, for example, to look at the optic nerve and other things that are very specialized, and a lot of clinics would not have access to that information. So what causes glaucoma?

86
00:21:36.480 --> 00:21:59.670
Christopher K. Wikle, PhD: Well, we don't really know. We do know that it's often associated with increased ocular pressure. So intraocular pressure, or IOP, is just the pressure inside your eyeball. And when that starts pushing on the optic nerve, it can actually damage it. Often IOP is used as a surrogate for glaucoma, and that's because it's really the only treatable factor.

87
00:21:59.750 --> 00:22:02.490
Christopher K. Wikle, PhD: And so when you go to

88
00:22:02.510 --> 00:22:26.099
Christopher K. Wikle, PhD: an optometrist, they'll check you for glaucoma. And what they're really checking is your IOP. But unfortunately, it's not the only risk factor. And in fact, it's not a very good one, in the sense that people with high IOP often do not develop glaucoma, and many of the people who have glaucoma don't have elevated IOP. And, in fact, if you treat high IOP,

89
00:22:26.100 --> 00:22:35.179
Christopher K. Wikle, PhD: about a third to a fourth of the people who have glaucoma still progress to blindness. So it has poor sensitivity and poor specificity.

90
00:22:35.180 --> 00:22:58.409
Christopher K. Wikle, PhD: So one of the things we're interested in is what other easy-to-measure risk factors might play a role here that would help us, on an individualized basis, to diagnose and project the progression of glaucoma. It's a multifactorial disease, so there are many things that could be a factor. And I'm going to focus on blood pressure because it's so easy to measure.

91
00:22:58.820 --> 00:23:13.489
Christopher K. Wikle, PhD: And we know that blood pressure is a risk factor, and that is supported by many studies, but unfortunately these results are not very consistent. In some cases low blood pressure has been shown to be a factor. In other cases high blood pressure has been shown to be a factor.

92
00:23:13.620 --> 00:23:27.289
Christopher K. Wikle, PhD: So one of the main goals of what we're doing here is trying to understand the balance between IOP and blood pressure to help us with sort of individual diagnosis and treatment

93
00:23:27.400 --> 00:23:28.250
Christopher K. Wikle, PhD: plans.

94
00:23:28.430 --> 00:23:36.010
Christopher K. Wikle, PhD: So let's look at blood pressure and IOP a little bit more. And so the question is: can we sort of identify subgroups of glaucoma

95
00:23:36.280 --> 00:23:46.479
Christopher K. Wikle, PhD: levels or different stages of glaucoma through just these two measurements? And so there are studies that have looked at this.

96
00:23:46.480 --> 00:24:03.010
Christopher K. Wikle, PhD: And I'm focusing here on the Indianapolis Glaucoma Progression Study, which was kind of unique because it was one of the few longitudinal studies. And so we have seven years' worth of data, every 6 months, on 115 individuals. And so

97
00:24:03.010 --> 00:24:17.390
Christopher K. Wikle, PhD: if you look at those individuals and try to cluster them with regard to IOP and a measure of blood pressure (the mean arterial pressure is just a linear combination of systolic and diastolic),

98
00:24:17.390 --> 00:24:21.479
Christopher K. Wikle, PhD: unfortunately, there are no discernible clusters

99
00:24:22.130 --> 00:24:27.559
Christopher K. Wikle, PhD: that come out of that. And so the answer to this question seems to be, No, but

100
00:24:27.790 --> 00:24:39.650
Christopher K. Wikle, PhD: those things by themselves don't really talk about what's happening inside the vascular structure of your eyeball. And so the purpose of our

101
00:24:39.760 --> 00:25:03.369
Christopher K. Wikle, PhD: project is to look at physiology-informed machine learning to see if we can help with this. And so the basic idea is, we obtain a suite of physiologically relevant features. And we get that through a mathematical model of the ocular hemodynamics of the eye. So basically, people might call this a digital twin these days, which is, we have a

102
00:25:03.370 --> 00:25:14.040
Christopher K. Wikle, PhD: numerical model of the vascular structure of the eye. And we use information from the real world to inform that. And then that tells us something about

103
00:25:14.060 --> 00:25:33.620
Christopher K. Wikle, PhD: the model, and we kind of go back and forth on that. So our team has developed that. And then we use the features that come out of that model in a machine learning or inferential framework. And so, in particular, we can take inputs that you could easily get

104
00:25:33.660 --> 00:26:02.820
Christopher K. Wikle, PhD: at your clinic, IOP and blood pressure, run them through this mathematical digital twin, get these physiology-enhanced data sets, do some machine learning or your favorite type of clustering, and then show that that is significant in some sense in leading towards structural progression and functional progression of glaucoma. And then, finally, we want to find out if clinicians will actually use this.

105
00:26:03.200 --> 00:26:08.987
Christopher K. Wikle, PhD: So what did we find? Well, in that Indiana

106
00:26:09.700 --> 00:26:25.250
Christopher K. Wikle, PhD: study, with those data, we found that the 12-dimensional data set, this physiology-enhanced data set, does in fact lead to discernible clusters. And this is sort of projecting that 12-dimensional

107
00:26:25.778 --> 00:26:36.959
Christopher K. Wikle, PhD: cluster space into three dimensions so you can visualize it. And you can see there are three distinct clusters there. And the important thing about those clusters is that we looked into that:

108
00:26:37.100 --> 00:26:44.930
Christopher K. Wikle, PhD: in the study we actually had other measurements on subjects, so we could actually identify the stage of glaucoma they were at.

109
00:26:45.010 --> 00:27:09.050
Christopher K. Wikle, PhD: And so what we find is that those three clusters actually do correspond to significantly different vascular behavior and glaucoma behavior in these individuals. And so the nice thing about that is that it suggests a very simple way, if this holds up, that clinicians could actually evaluate this on the spot.

110
00:27:09.050 --> 00:27:31.140
Christopher K. Wikle, PhD: And so, for example, if I take this three-dimensional cluster cloud and I project it into the blood pressure-IOP plane again, you can see those clusters. Now, they're not perfectly separated here in only two dimensions, but it's pretty good. And in fact, if we do something like a support vector machine to identify

111
00:27:31.260 --> 00:27:53.479
Christopher K. Wikle, PhD: regions in that space, then we find that a clinician could actually use this quite simply. They could measure your IOP, they could measure your blood pressure, convert it to MAP real quickly, and then see where you fall in here. And then, based on that, we know what the progression is likely to be from our results for

112
00:27:53.490 --> 00:28:01.989
Christopher K. Wikle, PhD: where you fall in this group. And, for example, the green group there is kind of the best group, in the sense that they typically do not,

113
00:28:02.030 --> 00:28:06.660
Christopher K. Wikle, PhD: we find that they do not progress

114
00:28:06.890 --> 00:28:24.540
Christopher K. Wikle, PhD: in their glaucoma, whereas other ones are much worse. And so what we're doing now is, we're sort of using transfer learning to other studies here to see if this holds up. And we're also trying to understand a little bit better the progression of the disease based on our

115
00:28:25.250 --> 00:28:26.270
Christopher K. Wikle, PhD: physical model.

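NOTE
A small sketch of the quick blood-pressure-to-MAP conversion mentioned in the
preceding cues. The talk only says that mean arterial pressure is a linear
combination of systolic and diastolic pressure; the one-third and two-thirds
weights below are the common clinical approximation and are an assumption
here, not necessarily the weights used in the study. The function name is
hypothetical.
```python
def mean_arterial_pressure(systolic, diastolic):
    """Common approximation: MAP = DBP + (SBP - DBP) / 3, all values in mmHg."""
    return diastolic + (systolic - diastolic) / 3.0
# Example reading of 120/80 mmHg gives a MAP of roughly 93.3 mmHg.
map_mmhg = mean_arterial_pressure(120.0, 80.0)
```
A clinician-facing tool would pair this value with the measured IOP to locate
a patient in the two-dimensional plane described in the talk.
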
116
00:28:26.430 --> 00:28:51.070
Christopher K. Wikle, PhD: So another component of this work is about uncertainty quantification, and this is one of the parts that's dear to me, and what I work on a lot. And the reason for this is because we have a mathematical model that's deterministic. But yet we know there's all sorts of uncertainties in various places, including the inputs. And so, for example, if I'm interested in thinking about how somebody's

117
00:28:51.240 --> 00:28:51.930
Christopher K. Wikle, PhD: oh.

118
00:28:52.370 --> 00:29:07.690
Christopher K. Wikle, PhD: hemodynamics in their eye is going to change throughout the day as a function of their blood pressure and heart rate and IOP, those things change throughout the day. We know that there's a diurnal cycle. In fact, Annie's talk kind of shows that to some extent, too,

119
00:29:07.690 --> 00:29:30.700
Christopher K. Wikle, PhD: so that there's uncertainty here. And so we could say, well, let's just do a Monte Carlo analysis by running our mathematical model many times with these inputs over this diurnal cycle throughout the day, and we could see what happens. But the problem is, the mathematical model is expensive to run. So what we do is we build a surrogate statistical model to emulate the mathematical model, which is very fast to

120
00:29:31.070 --> 00:29:56.900
Christopher K. Wikle, PhD: simulate. And so, in particular, it's what I would call a hybrid statistical extreme learning machine, sort of a hybrid neural-statistical model, which is the kind of thing that my group works on. And then what it gives us (this is just one of the variables from the mathematical model) is that now not only do we get uncertainties, but we can start looking at different scenarios very quickly, like what happens under the case where you have high blood pressure

121
00:29:56.900 --> 00:30:04.270
Christopher K. Wikle, PhD: and normal IOP, or high blood pressure and extreme IOP,

122
00:30:04.270 --> 00:30:30.049
Christopher K. Wikle, PhD: and vice versa. You can look at all these different things with uncertainty. And so then we can start actually making some inference about that, and how things are likely to change. So I just want to emphasize: it would be impossible to measure these things with current technology. Something like the mean blood pressure in the central retinal artery over this many subjects over this amount of time, that would not be possible.

123
00:30:30.670 --> 00:30:48.379
Christopher K. Wikle, PhD: So the last thing that I wanted to say that we're doing, that I'm excited about, is that we also want to know how physicians would react to this. What's their take? And so we did this pilot study to understand whether

124
00:30:49.630 --> 00:31:04.969
Christopher K. Wikle, PhD: clinicians and ophthalmologists would use AI if it came about. You can see some of the questions here. And again, this is a small study with only 18 participants, but it's just to get an idea of what people would say.

125
00:31:04.970 --> 00:31:24.599
Christopher K. Wikle, PhD: And you can see by looking at some of those quotes that, if you summarize it, they do believe AI and machine learning are vital to ophthalmology and that it will inform their practice. But they still think there needs to be a balance between what the computer, or the AI, or the statistics tell them

126
00:31:24.600 --> 00:31:36.380
Christopher K. Wikle, PhD: and what they see themselves in their clinical practice. And they recognize there are still some challenges in terms of integrating this into their actual practice, even though we've shown it can be quite simple,

127
00:31:36.380 --> 00:31:50.609
Christopher K. Wikle, PhD: but also just making sure that everyone would have access to this. And so we have an ongoing study that's also trying to understand how ophthalmologists currently use blood pressure to inform their assessment of patients.

128
00:31:51.106 --> 00:32:00.480
Christopher K. Wikle, PhD: For that study, we're still wrapping up recruitment. That'll be done really quickly here, and surveys are going out.

129
00:32:00.660 --> 00:32:26.649
Christopher K. Wikle, PhD: So just to conclude, you know, this sort of physiology-enhanced, digital twin, machine learning, statistical approach (we don't have a good title for the whole thing) is very promising, and it's really exciting, because there's all this multidisciplinary collaboration going on for each one of these components. I personally knew nothing about the hemodynamics of the eye, even though I'm

130
00:32:26.650 --> 00:32:37.639
Christopher K. Wikle, PhD: kind of a dynamicist by training. I found it fascinating. And so I'm as interested in the mathematical model as I am in the machine learning and the statistics of this project.

131
00:32:37.680 --> 00:32:59.280
Christopher K. Wikle, PhD: So some of the things we have left to do: some transfer learning to other studies, getting better use of the uncertainty that comes from our fuzzy clustering mechanism, building a more complicated PDE to simulate the space-time structure of the eye mathematically, and building a statistical emulator of that,

132
00:32:59.360 --> 00:33:05.610
Christopher K. Wikle, PhD: and then finishing this last study, or this second study, on ophthalmologists' use of blood pressure.

133
00:33:05.700 --> 00:33:11.432
Christopher K. Wikle, PhD: So that's where we are. And, like I said, I just want to emphasize how

134
00:33:11.920 --> 00:33:18.609
Christopher K. Wikle, PhD: fun this project is. If you have any questions, feel free to email me, and I'm happy to send you some references on request.

135
00:33:20.830 --> 00:33:41.949
Wendy Nilsen: Dr. Wikle, thank you so much. That was fabulous. And last, but definitely not least, is Carlos Fernandez-Granda. He's an associate professor of mathematics and data science at New York University. During his PhD, he developed a mathematical theory of super-resolution methods based on convex optimization.

136
00:33:42.130 --> 00:33:55.960
Wendy Nilsen: Since joining NYU, his group has focused on the design and analysis of data science methodology, with particular emphasis on machine learning, motivated by applications in medicine, climate science, and scientific imaging.

137
00:33:56.510 --> 00:33:59.000
Wendy Nilsen: I'm going to turn it over to you now. Thank you.

138
00:33:59.270 --> 00:34:02.260
Carlos Fernandez-Granda, PhD: Thank you very much for the kind introduction. Can you see my slides?

139
00:34:05.440 --> 00:34:06.120
Wendy Nilsen: Yes.

140
00:34:06.250 --> 00:34:15.609
Carlos Fernandez-Granda, PhD: All right. Thank you. So today, I'm going to talk about a new method for anomaly detection based on model confidence that we apply to a medical application.

141
00:34:16.040 --> 00:34:21.240
Carlos Fernandez-Granda, PhD: Let me begin with the motivating application. We're interested in stroke.

142
00:34:21.510 --> 00:34:27.490
Carlos Fernandez-Granda, PhD: Stroke, as many of you probably know, corresponds to lack of blood flow or bleeding in the brain.

143
00:34:27.790 --> 00:34:39.619
Carlos Fernandez-Granda, PhD: And unfortunately, it's a very serious medical problem in the United States and worldwide. In the US, there are currently around 800,000 strokes a year.

144
00:34:40.697 --> 00:34:44.199
Carlos Fernandez-Granda, PhD: A lot of patients who suffer a stroke

145
00:34:44.310 --> 00:34:57.300
Carlos Fernandez-Granda, PhD: afterwards have to endure serious, long-term disability, which is a terrible problem for them and their families. It reduces mobility in more than half of stroke survivors aged 65 and older.

146
00:34:58.440 --> 00:35:09.959
Carlos Fernandez-Granda, PhD: A key challenge that we looked at in this study is how to quantify and/or monitor impairment in stroke patients in a practical way.

147
00:35:10.540 --> 00:35:21.380
Carlos Fernandez-Granda, PhD: So first, I'm going to tell you how impairment is quantified in stroke patients right now. The way it works for these patients is that

148
00:35:21.750 --> 00:35:24.210
Carlos Fernandez-Granda, PhD: basically there needs to be a technician

149
00:35:24.590 --> 00:35:32.469
Carlos Fernandez-Granda, PhD: or rather an expert, perhaps, who interviews these patients and sees how they move different limbs,

150
00:35:32.820 --> 00:35:40.940
Carlos Fernandez-Granda, PhD: different parts of the body, like their shoulder or their elbow, and essentially writes down the mobility for each of these joints.

151
00:35:41.470 --> 00:35:52.029
Carlos Fernandez-Granda, PhD: And this can take up to 15 minutes, and again requires a trained expert. So it's very costly in terms of human resources, and also in terms of time.

152
00:35:54.790 --> 00:36:09.279
Carlos Fernandez-Granda, PhD: Current assessment, as I said, is time-consuming and requires an expert. Our goal is to try to perform this quantification of impairment directly from video or wearable sensor data.

153
00:36:09.720 --> 00:36:38.999
Carlos Fernandez-Granda, PhD: This could enable monitoring patients at a higher time resolution, so more often, without their having to come into the clinic to have an expert do the assessment. It would perhaps be more objective, as it wouldn't depend on which expert is making the assessment, and it would be affordable, because it wouldn't require an expert to be involved or the patient to go to the clinic.

154
00:36:39.210 --> 00:37:00.490
Carlos Fernandez-Granda, PhD: This is a visualization of wearable sensor data here on the left, and also a video of a patient performing a rehabilitation task. You can see the sensors on the patient's back and on the patient's arms, and the data on the left correspond to accelerations and rotations of the sensors.

155
00:37:02.110 --> 00:37:08.400
Carlos Fernandez-Granda, PhD: The idea is to try to automatically quantify the degree of impairment of the patient from such data.

156
00:37:10.840 --> 00:37:26.459
Carlos Fernandez-Granda, PhD: Let's see if I can. Yeah, okay, so we run into a big problem when we try to apply standard machine learning methodology to solve this challenge, which is that the largest publicly available data set

157
00:37:26.740 --> 00:37:31.415
Carlos Fernandez-Granda, PhD: consists of data from 51 patients. So

158
00:37:32.210 --> 00:37:39.800
Carlos Fernandez-Granda, PhD: in order to train and test a machine learning model, this is way too little.

159
00:37:40.160 --> 00:37:53.039
Carlos Fernandez-Granda, PhD: And this is a pervasive problem: there are basically very little data available with the corresponding impairment level of the patients.

160
00:37:53.680 --> 00:37:56.049
Carlos Fernandez-Granda, PhD: Therefore, we had to get a bit creative.

161
00:37:56.580 --> 00:37:58.870
Carlos Fernandez-Granda, PhD: And we developed a framework

162
00:37:59.080 --> 00:38:08.110
Carlos Fernandez-Granda, PhD: which uses AI models that are not trained on the stroke patients, but rather they're trained on a healthy population.

163
00:38:08.720 --> 00:38:20.779
Carlos Fernandez-Granda, PhD: And then we use those AI models to quantify the deviation of a patient's movement from normal motion, and that allows us to quantify their degree of impairment.

164
00:38:21.710 --> 00:38:24.279
Carlos Fernandez-Granda, PhD: Let me explain in more detail.

165
00:38:24.500 --> 00:38:37.809
Carlos Fernandez-Granda, PhD: This is an anomaly detection problem, because again, we're measuring the deviation from normality. We want to quantify to what extent data differ from our reference population.

166
00:38:38.030 --> 00:38:45.350
Carlos Fernandez-Granda, PhD: We call our method Confidence-Based Characterization of Anomalies, or COBRA. You will see in a moment where the confidence part

167
00:38:45.480 --> 00:38:50.370
Carlos Fernandez-Granda, PhD: comes in, and that's why there's a cobra dressed as a doctor here.

168
00:38:50.490 --> 00:39:04.689
Carlos Fernandez-Granda, PhD: The idea is actually relatively simple. We train a model to perform a task that is clinically relevant to what we're interested in, which in this case is impairment caused by stroke.

169
00:39:05.470 --> 00:39:13.930
Carlos Fernandez-Granda, PhD: And we train this model on a healthy population to perform this related task, which is going to be identifying what motions people are doing.

170
00:39:14.620 --> 00:39:22.359
Carlos Fernandez-Granda, PhD: And then we use the model confidence, when applied to a new patient, to determine to what extent

171
00:39:22.560 --> 00:39:31.369
Carlos Fernandez-Granda, PhD: the movements of this new patient are anomalous, that is, to what extent they deviate from normal movement, and that allows us to quantify the degree of impairment

172
00:39:31.530 --> 00:39:32.620
Carlos Fernandez-Granda, PhD: in the patient.

173
00:39:34.060 --> 00:39:44.870
Carlos Fernandez-Granda, PhD: The basic task that our AI model is performing is identifying what movements the patients are

174
00:39:45.190 --> 00:40:11.930
Carlos Fernandez-Granda, PhD: performing during rehabilitation. Here I'm just showing you a hierarchy of rehabilitation activities, where people mimic daily activities such as dressing, bathing, and meal preparation. This involves some functional movements, such as cutting vegetables, tasting sauce, stirring the pot, etc. We are interested in more basic movements: reaching to grab an object, repositioning an object, transporting an object, stabilizing an object, and doing nothing.

175
00:40:13.450 --> 00:40:19.620
Carlos Fernandez-Granda, PhD: So these are some examples of these movements. This is a reach. This is a patient that is going to reach

176
00:40:20.160 --> 00:40:28.490
Carlos Fernandez-Granda, PhD: an object. And we're going to train an AI model to automatically identify when the patient is doing that, when they're reaching to grab an object.

177
00:40:29.360 --> 00:40:41.380
Carlos Fernandez-Granda, PhD: This is a transport, where the patient is moving an object. In this case, the arm that we see on the right is the arm that was affected by stroke, and the one we should look at. They just moved an object. This is called a transport.

178
00:40:41.520 --> 00:40:49.739
Carlos Fernandez-Granda, PhD: Stabilizing an object is keeping an object in place, without moving, while the other hand is manipulating it. Again, we have to look at

179
00:40:49.930 --> 00:40:51.340
Carlos Fernandez-Granda, PhD: the arm on the right.

180
00:40:52.120 --> 00:40:59.669
Carlos Fernandez-Granda, PhD: And this is idle. So basically, the patient is doing nothing with their paretic arm.

181
00:41:01.600 --> 00:41:13.100
Carlos Fernandez-Granda, PhD: So we trained a neural network to automatically identify which of these simple motions were being carried out by the individuals.

182
00:41:13.340 --> 00:41:15.910
Carlos Fernandez-Granda, PhD: This is an example of how our model works.

183
00:41:16.820 --> 00:41:20.950
Carlos Fernandez-Granda, PhD: So you have to look at the arm that is enclosed by this red oval.

184
00:41:21.310 --> 00:41:26.610
Carlos Fernandez-Granda, PhD: and the model essentially tries to predict which of these actions is happening at each time.

185
00:41:30.370 --> 00:41:32.210
Carlos Fernandez-Granda, PhD: It works reasonably well.

186
00:41:33.570 --> 00:41:54.329
Carlos Fernandez-Granda, PhD: Now, I'm going to get to the confidence, which is crucial for the anomaly detection process. Neural networks, or in general machine learning algorithms or statistical models that are classifying between different classes, typically assign probabilities

187
00:41:54.640 --> 00:42:00.169
Carlos Fernandez-Granda, PhD: to each class, conditioned on the data that have been observed. So in this case,

188
00:42:00.330 --> 00:42:19.560
Carlos Fernandez-Granda, PhD: when the video of this patient is observed over a short amount of time, the model might say that it thinks this is a reach with probability 0.1, transport with probability 0.05, reposition 0.1, stabilize 0.07, and idle 0.68. So in this case the probability of idle is the highest,

189
00:42:19.820 --> 00:42:40.209
Carlos Fernandez-Granda, PhD: and this is the class that would be assigned at this time. We can interpret this highest probability as the confidence of the model. If that probability is close to one, this means the model is very confident; if it's close to 0, that means that the model is not confident at all.
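
NOTE
Editor's sketch, not part of the talk. Using the class probabilities quoted in cue 188, the assigned class and the confidence described in cue 189 could be read off as below. The dictionary layout and variable names are illustrative assumptions; only the five numbers come from the talk.
```python
# Class probabilities quoted in the talk for one short window of video.
probs = {"reach": 0.10, "transport": 0.05, "reposition": 0.10, "stabilize": 0.07, "idle": 0.68}
# The assigned class is the one with the highest probability.
predicted = max(probs, key=probs.get)
# That highest probability is interpreted as the model's confidence.
confidence = probs[predicted]
print(predicted, confidence)
```
With five classes the confidence defined this way cannot fall below 0.2, so in practice "not confident at all" corresponds to values near that floor.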

190
00:42:40.720 --> 00:42:59.780
Carlos Fernandez-Granda, PhD: What we realized is that, when we trained a model on healthy subjects, looked at the confidences over a rehabilitation session for a held-out healthy subject not seen previously by the model, and compared those confidences to the ones

191
00:42:59.840 --> 00:43:11.670
Carlos Fernandez-Granda, PhD: produced by the model when the data come from an impaired patient, there was a lowering in confidence, because patients are impaired and

192
00:43:11.960 --> 00:43:25.050
Carlos Fernandez-Granda, PhD: their movements are different from those of the healthy subjects. Here you can see a histogram of the confidences for the stroke patient in red and a histogram of the confidences for the healthy individual in blue.

193
00:43:25.900 --> 00:43:40.019
Carlos Fernandez-Granda, PhD: And that's basically our method. We train a model on healthy subjects, and then we apply the model to different patients, and how much the confidence decreases gives us a measure of impairment.
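
NOTE
Editor's sketch, not part of the talk. The score described here (the average, over a session, of the model's highest class probability) might be computed along these lines. The logits are synthetic and purely hypothetical: they only mimic a classifier that is sharp on healthy-looking motion and blurrier on impaired-looking motion. No real model or patient data is involved, and the function names are made up.
```python
import numpy as np
def softmax(logits):
    z = logits - logits.max(axis=1, keepdims=True)  # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)
def confidence_score(logits):
    """Average over time windows of the highest class probability."""
    return float(softmax(logits).max(axis=1).mean())
# Hypothetical logits (500 windows x 5 motion classes) from a model trained on healthy subjects.
rng = np.random.default_rng(0)
onehot = np.eye(5)[rng.integers(0, 5, size=500)]
healthy_logits = 4.0 * onehot + rng.normal(size=(500, 5))   # sharp predictions
impaired_logits = 1.0 * onehot + rng.normal(size=(500, 5))  # blurrier predictions
healthy_score = confidence_score(healthy_logits)
impaired_score = confidence_score(impaired_logits)
print(healthy_score, impaired_score)
```
A lower average confidence is then read as a larger deviation from normal movement; no impairment labels are needed to train the classifier itself.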

194
00:43:40.880 --> 00:44:08.780
Carlos Fernandez-Granda, PhD: I'm going to finish by showing you the results (sorry, this went back) on an independent test cohort. Here you can see on the X-axis our automatic score, which uses this AI model trained exclusively on healthy subjects and is based on this confidence. It's just the average confidence for each of these subjects. And on the Y-axis, you can see the Fugl-Meyer score, which is the score that I showed you at the beginning, computed from a 15-minute interview

195
00:44:08.780 --> 00:44:13.559
Carlos Fernandez-Granda, PhD: with a trained expert. And you can see that the correlation is extremely high.

196
00:44:14.070 --> 00:44:25.459
Carlos Fernandez-Granda, PhD: This is the same for videos. In the case of videos, the correlation is a bit lower, because there are certain confounding factors such as when the patients are manipulating an object. Some objects are a little bit more difficult to see.

197
00:44:25.760 --> 00:44:44.239
Carlos Fernandez-Granda, PhD: So with that I will finish. We have developed an anomaly quantification method that is based on AI models that are trained exclusively on healthy patients, and we observe high correlation with expert-based metrics. The lessons learned are that model confidence can be very informative about deviations from normality

198
00:44:44.380 --> 00:44:59.350
Carlos Fernandez-Granda, PhD: on average, and that auxiliary labels can be very useful, even if they are only available for healthy subjects. In this case these are the labels indicating what motions these healthy subjects were doing. And that's it. These are the papers related to this project.

199
00:44:59.800 --> 00:45:01.400
Carlos Fernandez-Granda, PhD: I want to thank

200
00:45:01.540 --> 00:45:17.869
Carlos Fernandez-Granda, PhD: my co-authors, especially Heidi Schambra at the New York University School of Medicine, who led the clinical side of this project, and I really want to acknowledge the support of NIH and NSF, without whom this research would have been impossible. Thank you very much.

201
00:45:19.130 --> 00:45:32.790
Wendy Nilsen: Thank you so much, Dr. Fernandez-Granda. All right. So we have a few minutes left for some questions, so I'd ask all of my speakers to come back on their cameras.

202
00:45:33.210 --> 00:45:44.669
Wendy Nilsen: We've got a bunch of questions, but I think there are some that really cross all of these. So I'm going to start with the first question: who are your biomedical collaborators? Are you doing this all on your own? Or

203
00:45:45.420 --> 00:45:46.420
Wendy Nilsen: what are you all.

204
00:45:46.660 --> 00:45:48.489
Carlos Fernandez-Granda, PhD: Carlos, do you want to start?

205
00:45:48.490 --> 00:45:50.340
Carlos Fernandez-Granda, PhD: I just mentioned mine. So Heidi

206
00:45:50.690 --> 00:46:03.099
Carlos Fernandez-Granda, PhD: is at the NYU School of Medicine, and she's absolutely crucial for this, because she gathered the data and had the pioneering idea of applying machine learning to stroke rehabilitation. And it's really been a wonderful collaboration.

207
00:46:04.470 --> 00:46:05.160
Wendy Nilsen: Thanks.

208
00:46:05.340 --> 00:46:07.520
Wendy Nilsen: Dr. Qu.

209
00:46:07.770 --> 00:46:20.200
Annie Qu, PhD: Yeah, I collaborate with the school of nursing. So we keep track of caregivers and also pregnant women as the nursing subjects.

210
00:46:22.540 --> 00:46:23.749
Wendy Nilsen: And Dr. Wikle.

211
00:46:23.750 --> 00:46:26.261
Christopher K. Wikle, PhD: Yeah. So our collaborators are

212
00:46:28.170 --> 00:46:37.860
Christopher K. Wikle, PhD: at the Mount Sinai School of Medicine: Dr. Alon Harris's group in ophthalmology there.

213
00:46:38.960 --> 00:46:39.650
Wendy Nilsen: Great.

214
00:46:39.800 --> 00:46:40.450
Wendy Nilsen: Thank you.

215
00:46:40.690 --> 00:46:49.659
Wendy Nilsen: I guess the point here is that you can't do this alone. You've got wonderful collaborators to do what you're doing, and that brings out the best in everyone.

216
00:46:49.910 --> 00:46:50.580
Wendy Nilsen: So

217
00:46:51.520 --> 00:46:58.079
Wendy Nilsen: one of the questions that came up here: somebody said it looks like NIH research. And you've all had

218
00:46:58.360 --> 00:47:06.370
Wendy Nilsen: collaborations across both. What do you think makes this NSF research? Is that a fair question?

219
00:47:07.010 --> 00:47:12.559
Carlos Fernandez-Granda, PhD: Yeah. So in my case, an important part of this project was developing anomaly quantification

220
00:47:13.200 --> 00:47:20.154
Carlos Fernandez-Granda, PhD: methodology that is able to identify data that deviates from

221
00:47:21.500 --> 00:47:40.049
Carlos Fernandez-Granda, PhD: normal populations. And this is a very fundamental statistical question that also connects to machine learning, because we would want to do this from very high-dimensional data. And that is, in my opinion, fundamental research as opposed to applied clinical research.

222
00:47:42.290 --> 00:47:49.330
Christopher K. Wikle, PhD: Yeah, in my case, both the development of the mathematical model

223
00:47:50.490 --> 00:47:55.760
Christopher K. Wikle, PhD: and also the development of the emulator

224
00:47:56.280 --> 00:48:02.459
Christopher K. Wikle, PhD: of the mathematical model are novel and require novel

225
00:48:02.730 --> 00:48:12.639
Christopher K. Wikle, PhD: mathematical and statistical methods. And so I think in that sense it's very much NSF-oriented. It's just that the goal is really

226
00:48:13.280 --> 00:48:14.820
Christopher K. Wikle, PhD: much broader than that.

227
00:48:16.190 --> 00:48:41.089
Annie Qu, PhD: So for our research, we originally submitted to NSF, and then NSF thought it was a great project and also recommended that NCI fund us. So currently our project is funded by the National Institutes of Health, because stress, you know, chronic stress, is related to triggering cancer. So I would say mathematical

228
00:48:41.090 --> 00:49:02.760
Annie Qu, PhD: modeling and machine learning are related to basic science for NSF. And I think NIH cares more about the science, the medical discovery, and the conclusion: what's the impact? But without sound mathematical modeling and a foundation of statistics, we cannot achieve this goal.

229
00:49:04.360 --> 00:49:05.935
Wendy Nilsen: Great. Thank you.

230
00:49:06.500 --> 00:49:35.189
Wendy Nilsen: And just for our audience, I will say, even when these projects that come in through this mechanism are funded by NIH, they're picking them for the same reasons NSF is. They're looking for the same fundamental science we are, because this is the way they can bring in some new scientific ideas. So it's not like there's a separate NIH idea and an NSF idea. It's fundamental science questions driving all of it.

231
00:49:35.946 --> 00:49:49.540
Wendy Nilsen: There are many questions here about missing data. Does your analysis assume data are missing at random, or, as somebody asks, is it informative missingness? How do you all deal with that?

232
00:49:50.370 --> 00:49:50.750
Annie Qu, PhD: Yes.

233
00:49:50.750 --> 00:49:54.299
Wendy Nilsen: So, Dr. Qu, I think it started with your presentation. So.

234
00:49:54.820 --> 00:50:17.559
Annie Qu, PhD: Yeah. So first, we have to be realistic. We talk about missing data, but let's say we have a chunk of the data that's all missing. Let's say today I know how the stock market has been going previously. But if you want me to predict the stock market a month later or a year later, there isn't any information during that period of time,

235
00:50:17.560 --> 00:50:41.670
Annie Qu, PhD: so we cannot guarantee the accuracy. We have to be realistic. Here the missing data is more like this: we have some information from some measurements, but at a certain resolution they're missing, and you can borrow information from the nearby time points, so we can extrapolate or interpolate. Let's say I can predict pretty well

236
00:50:41.670 --> 00:50:54.259
Annie Qu, PhD: what will happen maybe in the next hour or the next day. So in that sense, it really has to be time dependent, and also resolution and frequency dependent. So

237
00:50:54.330 --> 00:51:20.129
Annie Qu, PhD: on the one hand, we can do it, but we also have to be aware of the limitations. So it's not just informative missingness, because when we talk about informative missingness, it's more about whether the observed data are associated with the future ones. And in reality, it's sometimes very difficult to verify the missingness mechanism.

238
00:51:24.290 --> 00:51:27.229
Wendy Nilsen: Do any of the others have comments on that one?

239
00:51:31.060 --> 00:51:34.987
Christopher K. Wikle, PhD: I mean, I don't really, for this particular study, because

240
00:51:35.810 --> 00:51:43.250
Christopher K. Wikle, PhD: what we've done has been more exploratory at this point. And it's a well-used

241
00:51:43.380 --> 00:51:52.439
Christopher K. Wikle, PhD: data set, so all those things have sort of been worked out by now. It can definitely be a problem; it's just not a problem for what we're doing right now.

242
00:51:54.600 --> 00:52:02.549
Carlos Fernandez-Granda, PhD: In our case, it's also not a problem, although the scarcity of labeled data, in terms of

243
00:52:02.780 --> 00:52:13.609
Carlos Fernandez-Granda, PhD: data from stroke patients for whom we know the impairment, was actually a main motivation for applying anomaly detection and trying to use models that are trained on healthy individuals.

244
00:52:15.260 --> 00:52:15.969
Wendy Nilsen: Thank you.

245
00:52:16.340 --> 00:52:26.179
Wendy Nilsen: So there's again a kind of collaboration question: how do you navigate the balance between computational and health contributions in your work?

246
00:52:26.370 --> 00:52:36.370
Wendy Nilsen: They were asking: do you start with a computational approach and identify an appropriate health application, or do you start with the health side and then build your computation out?

247
00:52:36.760 --> 00:52:41.390
Wendy Nilsen: I'm pretty sure I'm going to get many different answers here. So,

248
00:52:41.880 --> 00:52:43.159
Wendy Nilsen: who wants to start?

249
00:52:43.360 --> 00:53:09.190
Christopher K. Wikle, PhD: I'll start with that one. I'm actually late to this collaborative team, in the sense that Dr. Harris and Dr. Guidoboni had been collaborating, I think, for years before this, and it was very much driven by wanting to understand what's actually happening with hemodynamics in the eye with respect to glaucoma.

250
00:53:09.220 --> 00:53:23.050
Christopher K. Wikle, PhD: So it was very much driven by the scientific question and the medical question. And then over time, as data became available, it became a data question as well, and that

251
00:53:23.480 --> 00:53:28.160
Christopher K. Wikle, PhD: opened it up for machine learning and statistics to come in and play a role.

252
00:53:29.970 --> 00:53:54.099
Annie Qu, PhD: Yeah, I can follow Chris's point. I think we were first approached by the domain scientists. And then during this research, we discovered you can do some abstract thinking and abstract it into a statistics problem. So it's real data that's very messy, and you see there are some interesting statistical problems.

253
00:53:54.210 --> 00:54:18.090
Annie Qu, PhD: Then, later, I also decided to collect the data ourselves. Compared to other domain sciences, where data may be more difficult to collect, mobile health data is relatively easy to collect, and then we can do a smart design. We know what kind of data we want

254
00:54:18.090 --> 00:54:38.100
Annie Qu, PhD: to collect and what subjects we want to approach, so we have better control even at the beginning. So it's more like I got motivated by the domain science, then I got motivated to develop statistical methodology, and then later I also got involved in collecting the data ourselves. So it's a really fun

255
00:54:38.100 --> 00:54:39.260
Annie Qu, PhD: project.

256
00:54:40.450 --> 00:54:41.780
Wendy Nilsen: Great. Thank you.

257
00:54:42.501 --> 00:55:02.649
Wendy Nilsen: All right. This one started out with Dr. Fernandez-Granda, but I think you'll all get to it. Dr. Fernandez-Granda, it would be great to see how this would translate to Parkinson's patients. Have you explored other applications? And I think I'd love to hear how you all see your work evolving over time. So.

258
00:55:03.050 --> 00:55:31.160
Carlos Fernandez-Granda, PhD: So we have not looked at Parkinson's. I think this is a great suggestion, and we should definitely look into it. We did look at an application to disease severity assessment, because this framework, where you say, well, I have a related task that is relevant, and then I'm going to look at the confidence of a model trained on healthy subjects, can be applied quite broadly. In the application that we explored, we had people who suffered from knee osteoarthritis, and we had their knee MRIs,

259
00:55:31.240 --> 00:55:34.929
Carlos Fernandez-Granda, PhD: and we trained a model to segment healthy knees,

260
00:55:35.070 --> 00:55:50.880
Carlos Fernandez-Granda, PhD: and then we applied that model to these patients, and we observed again that there was a correlation between a lowering of the confidence of the model and the degree of severity of knee osteoarthritis. But we haven't looked into Parkinson's. I think it's a great suggestion.

261
00:55:54.100 --> 00:56:02.741
Christopher K. Wikle, PhD: Yeah, in our case, definitely, this methodology of using a physiology-enhanced data set,

262
00:56:03.350 --> 00:56:26.030
Christopher K. Wikle, PhD: where that enhancement comes from mathematical-model digital twins, is, I think, sort of untapped at this point. And we are actually doing this in other areas, or starting to. It requires developing a new mathematical model for these things as a component of the biology. But yeah, it's super exciting, I think.

263
00:56:26.080 --> 00:56:37.069
Christopher K. Wikle, PhD: In a way, it's just a way to expand the dimension of your input data in a way that is scientifically meaningful. And I think there are all sorts of ways to do that that we haven't even thought of yet.

264
00:56:39.600 --> 00:56:40.260
Wendy Nilsen: Great.

265
00:56:40.430 --> 00:56:42.669
Wendy Nilsen: Do you want to weigh in on this one?

266
00:56:46.620 --> 00:56:47.660
Wendy Nilsen: Dr. Qu?

267
00:56:48.070 --> 00:56:49.296
Annie Qu, PhD: Oh, sorry!

268
00:56:50.180 --> 00:56:51.820
Wendy Nilsen: Do you want to weigh in on this at all?

269
00:56:52.815 --> 00:56:54.809
Annie Qu, PhD: No, I'm okay. I can skip that.

270
00:56:54.810 --> 00:56:55.490
Wendy Nilsen: Okay.

271
00:56:56.113 --> 00:57:15.859
Wendy Nilsen: Somebody's asking about AI, and, working in the division at NSF that is its home, I have to ask the AI question. It says: AI has changed everything. Are we expected to conduct AI research, or is there a different approach to be further ahead? And I think you all have interesting ideas about that, because,

272
00:57:17.110 --> 00:57:24.309
Wendy Nilsen: much as we all live AI, it's not the only thing in the world. So who wants to start?

273
00:57:24.690 --> 00:57:25.350
Annie Qu, PhD: I can.

274
00:57:25.350 --> 00:57:25.980
Wendy Nilsen: That's true.

275
00:57:26.360 --> 00:57:51.260
Annie Qu, PhD: Yeah. So I think AI really plays a role for the future. It already plays a role in our health, for example, the smart ring. So what is AI? From a statistical way of thinking, it's about the algorithm: whether we can have an automatic algorithm that, based on the observed data, gives us some suggestion, which I hope

276
00:57:51.260 --> 00:58:16.229
Annie Qu, PhD: is a sound suggestion. And I think a lot of AI cannot take into account the precision part, the heterogeneity part. It does well with homogeneous information: it can accumulate a lot of information, a lot of data, and tell us about the general population. But for the individual, it is very, very challenging to personalize AI.

277
00:58:16.230 --> 00:58:32.979
Annie Qu, PhD: It's extremely challenging. And as statisticians, we can think about how we can even personalize a large language model and personalize machine learning. So this, I think, is a future red-hot topic.

278
00:58:36.630 --> 00:58:46.949
Carlos Fernandez-Granda, PhD: In our project, AI is absolutely fundamental to dealing with the high-dimensional wearable-sensor and video data.

279
00:58:47.380 --> 00:59:08.989
Carlos Fernandez-Granda, PhD: With the methods that we used to have before the advent of deep learning, it would be almost impossible to identify these movements from this high-dimensional time series, or from video, automatically with high accuracy. But at the same time we run into a challenge that comes with using standard AI methods, which is, we cannot just

280
00:59:08.990 --> 00:59:11.600
Carlos Fernandez-Granda, PhD: apply a machine learning method that automatically

281
00:59:11.610 --> 00:59:16.529
Carlos Fernandez-Granda, PhD: predicts impairment, because we have only 50 labels of impairment in our

282
00:59:16.550 --> 00:59:21.510
Carlos Fernandez-Granda, PhD: data set, these 50 stroke patients. Instead, we have to get creative

283
00:59:21.620 --> 00:59:29.209
Carlos Fernandez-Granda, PhD: and combine AI with statistical ideas. In this case, that is an anomaly detection, confidence-based

284
00:59:29.350 --> 00:59:32.799
Carlos Fernandez-Granda, PhD: method, in order to use it effectively.

285
00:59:35.940 --> 00:59:47.959
Christopher K. Wikle, PhD: Yeah. And just to kind of echo that, I see two things here. One: in our project, we use whatever tool is best for each component of our project. So we have

286
00:59:48.080 --> 01:00:02.429
Christopher K. Wikle, PhD: deterministic mathematical modeling. We have more traditional machine learning. We have AI components. We have everything, whatever we need to solve that component, and then to integrate them together. And I think that's

287
01:00:02.760 --> 01:00:16.889
Christopher K. Wikle, PhD: what real data analysis and data science is. My other view of it, as a statistician, is that I believe AI has a lot to teach us about modeling, and we have a lot to teach

288
01:00:18.240 --> 01:00:25.430
Christopher K. Wikle, PhD: AI practitioners about modeling as well. And I like this notion of being a hybrid thinker about how those things interact.

289
01:00:27.480 --> 01:00:28.250
Wendy Nilsen: Great.

290
01:00:28.250 --> 01:00:29.240
Wendy Nilsen: Thank you all.

291
01:00:29.726 --> 01:00:33.090
Wendy Nilsen: I think it's 4 o'clock. So I'm gonna have to

292
01:00:33.350 --> 01:00:47.200
Wendy Nilsen: shut off our questions. We'll be posting the video, so look for that, because you can watch it again and learn even more. I know I really enjoyed this, and I just want to give a rousing.

293
01:00:47.270 --> 01:01:07.359
Wendy Nilsen: It's always hard to give applause, because Zoom is gonna kill my clap. But at least my hands are clapping, and I know everybody else that's on is thrilled with this. Thank you. Thank you. Thank you for being here with us, and we look forward to learning more about all the work that you're doing. So thanks, everyone. Thanks for joining us.

294
01:01:07.890 --> 01:01:08.250
Carlos Fernandez-Granda, PhD: Very much.

295
01:01:08.250 --> 01:01:09.479
Wendy Nilsen: Thank you.

296
01:01:09.480 --> 01:01:10.180
Annie Qu, PhD: But.

297
01:01:10.180 --> 01:01:10.870
Wendy Nilsen: I.