WEBVTT

1
00:00:00.000 --> 00:00:03.550
Michael Littman: So now, now things are being recorded, according

2
00:00:03.660 --> 00:00:06.189
Michael Littman: to the helpful pop-up that I just got.

3
00:00:06.200 --> 00:00:19.790
Michael Littman: So it's great to see everybody here today. Just as a reminder, my name is Michael Littman. I'm the division director for Information and Intelligent Systems at NSF, and I went recently to the AAAI Conference,

4
00:00:19.800 --> 00:00:24.960
Michael Littman: and we presented there. We were engaged in a

5
00:00:25.010 --> 00:00:45.470
Michael Littman: a session where we were talking about shared infrastructure for AI research, and what I learned is that not a lot of people knew about the National AI Research Resource, or the NAIRR, and we decided maybe it would be helpful to kind of get the word out a little bit better. So what we decided to do today is invite to the office hours

6
00:00:45.480 --> 00:00:57.380
Michael Littman: Katie Antypas, who is the director of the Office of Advanced Cyberinfrastructure at NSF. This is roughly a sibling unit to IIS,

7
00:00:57.390 --> 00:01:08.710
Michael Littman: and traditionally her organization has been providing computing resources to scientists across disciplines. So folks like chemists and physicists and

8
00:01:08.720 --> 00:01:17.190
Michael Littman: geologists and stuff like that, and it's now becoming clear that computer scientists need some help with that, too. And so Katie was tasked with

9
00:01:17.410 --> 00:01:27.580
Michael Littman: basically spearheading this notion of a National AI Research Resource pilot as part of the executive order on AI.

10
00:01:27.660 --> 00:01:49.169
Michael Littman: And so she's now given, I think, if I have the count correct, 611 of these talks to different sub-communities that just need to know about what's going on, what the efforts are so far, where we are, and where we're going. And so I'm delighted to give her a 612th chance to talk about this.

11
00:01:49.180 --> 00:01:52.920
Michael Littman: Yeah. So let me just turn it over to Katie, and we'll take it from there.

12
00:01:52.930 --> 00:02:10.990
Katie Antypas (NSF - OAC): Okay, thank you so much, Michael. I'm so excited to be here, just to talk with you all and answer questions. You are really a key audience for the NAIRR pilot, hopefully a user and a participant. And we've been working really closely with Wendy and Michael

13
00:02:11.000 --> 00:02:14.490
Katie Antypas (NSF - OAC): on the launch and deployment of

14
00:02:14.500 --> 00:02:42.979
Katie Antypas (NSF - OAC): this pilot. What I have here today: I did put in a few slides just to set up the context for everyone. Possibly you heard our larger webinar; I'm not going to cover everything, but I did want to provide you a little bit of background, and then we really want the rest of the time to be open Q&A. What are you thinking? What are you hearing? And we want to make it very clear: this is a pilot, and the pilot

15
00:02:42.990 --> 00:02:49.490
Katie Antypas (NSF - OAC): really needs community input, and we're all going to learn as we go.

16
00:02:49.500 --> 00:02:51.700
Katie Antypas (NSF - OAC): So with that,

17
00:02:51.960 --> 00:02:59.740
Katie Antypas (NSF - OAC): let me just go forward with a couple slides here. Again, not too many, just so we can get some

18
00:03:00.050 --> 00:03:01.780
Katie Antypas (NSF - OAC): some background.

19
00:03:01.790 --> 00:03:13.539
Katie Antypas (NSF - OAC): So I just want to start and level set with everyone on the vision for the NAIRR. The vision for the NAIRR, the National AI Research Resource, is

20
00:03:14.120 --> 00:03:16.350
Katie Antypas (NSF - OAC): a national infrastructure

21
00:03:16.360 --> 00:03:34.400
Katie Antypas (NSF - OAC): that connects the research and education community to the necessary computing, data sets, training materials, models, software, and user support necessary to really power AI innovation. I'm sure you've seen many of your

22
00:03:34.410 --> 00:03:52.579
Katie Antypas (NSF - OAC): colleagues or collaborators move to industry. We want to make sure that the research community, that you all, have the resources that you need to really thrive in this AI field. So the

23
00:03:52.590 --> 00:04:10.909
Katie Antypas (NSF - OAC): number of key goals that were articulated as part of the vision for the NAIRR really are to spur innovation, increase the diversity of talent in AI, improve overall capacity for AI R&D, and advance responsible and trustworthy AI.

24
00:04:10.990 --> 00:04:24.180
Katie Antypas (NSF - OAC): For this particular group, I think you probably understand why an infrastructure like the NAIRR is needed. These resources that

25
00:04:24.190 --> 00:04:49.540
Katie Antypas (NSF - OAC): the community needs have become increasingly concentrated. They're expensive, they can be hard to use, and they can be available to only the most well-resourced institutions or the large technology companies. And so the NAIRR vision is really a first step towards bridging that gap and providing those researchers that are investigating AI for the public good

26
00:04:49.550 --> 00:05:07.230
Katie Antypas (NSF - OAC): access to resources, and also making sure that our next generation of researchers have access in their education to test and train these models and get their hands on resources.

27
00:05:08.120 --> 00:05:17.329
Katie Antypas (NSF - OAC): Okay, so I want to pause here, because this is the vision for the full NAIRR. This is articulated in a task force report.

28
00:05:17.340 --> 00:05:29.109
Katie Antypas (NSF - OAC): It's a big vision. It's a vision for a $2.6 billion initiative over six years. That is what was articulated in this report.

29
00:05:29.120 --> 00:05:54.460
Katie Antypas (NSF - OAC): I will note that no funding has yet been appropriated. Okay? But the vision is a large, national-scale infrastructure. Where we are today is that NSF was directed, along with our partners, to contribute in-kind resources, as well as in-kind contributions from the private sector and nonprofits,

30
00:05:54.660 --> 00:06:09.670
Katie Antypas (NSF - OAC): to launch a pilot for the NAIRR. So the pilot's going to be limited in scale. We want to demonstrate the potential impact of the concept of the NAIRR, reach new communities, support some amazing research,

31
00:06:09.680 --> 00:06:21.950
Katie Antypas (NSF - OAC): and then, really, as a community, gain some experience: what might we need, what are some lessons learned, what technical designs do we need to build a larger NAIRR in the future.

32
00:06:22.360 --> 00:06:39.499
Katie Antypas (NSF - OAC): All right. So we launched this pilot on January 24th with ten other agency partners and twenty-five non-governmental partners, and you can go to nairrpilot.org. That website is our lean portal at this time,

33
00:06:39.510 --> 00:06:45.489
Katie Antypas (NSF - OAC): and there were three initial opportunities that were open. The first: there's a survey

34
00:06:45.500 --> 00:07:09.270
Katie Antypas (NSF - OAC): for researchers and educators, to try to understand what your potential use cases for the NAIRR are. We would love to hear from you. The survey actually closes soon; I think it closes tomorrow. Hopefully you've heard of it. It'd be great if you can take a snapshot of that QR code and fill it out for us.

35
00:07:09.280 --> 00:07:26.310
Katie Antypas (NSF - OAC): We also had our first opportunity for the research community to apply for computing resources; it actually just closed in March, and the demand was overwhelming. We had 150 submissions in just five weeks. So it's very clear that the demand

36
00:07:26.320 --> 00:07:40.030
Katie Antypas (NSF - OAC): is out there, and that's part of the reason, of course, we're building the pilot. There are also a number of other pilot resources: some data sets that are online from government agencies, as well as some training opportunities.

37
00:07:40.460 --> 00:07:48.019
Katie Antypas (NSF - OAC): A key point to make for you here is that a second open opportunity is coming in April,

38
00:07:48.030 --> 00:08:01.900
Katie Antypas (NSF - OAC): and that second opportunity will include access to, I think I have it here, cloud computing resources. This includes access to more, and a more diverse set of,

39
00:08:01.910 --> 00:08:19.649
Katie Antypas (NSF - OAC): the agency supercomputing resources. It will also include access and credits for closed-model access, such as those from Anthropic or OpenAI, as well as collaboration opportunities. Because the

40
00:08:20.040 --> 00:08:28.350
Katie Antypas (NSF - OAC): demand is so large, we've scoped the initial themes really just to have a bit of

41
00:08:28.440 --> 00:08:44.670
Katie Antypas (NSF - OAC): a filter to prioritize some of the requests that come in. Those themes are around secure and trustworthy AI; human health; environment and infrastructure research; and AI education. You'll see more details in the open opportunity calls.

42
00:08:45.330 --> 00:08:56.240
Katie Antypas (NSF - OAC): So what we're moving towards here is: there's a set of researchers, that's you, US-based researchers. They access these resources through

43
00:08:56.250 --> 00:09:11.329
Katie Antypas (NSF - OAC): a portal, nairrpilot.org, and they get access to the set of resources. Some you have to apply for, because there's a limited amount; some are open, like data sets. Compute is something you have to apply for.

44
00:09:11.630 --> 00:09:28.649
Katie Antypas (NSF - OAC): We've also organized the pilot around some key thrust areas. One is NAIRR Open, which I've been talking a lot about, which is really connecting the research community to these open computational and data resources.

45
00:09:28.660 --> 00:09:32.320
Katie Antypas (NSF - OAC): Another thrust area is NAIRR Secure.

46
00:09:32.330 --> 00:09:51.299
Katie Antypas (NSF - OAC): The Department of Energy and NIH are leading that effort. And what that means is, if you have research or data sets that require a higher level of security or privacy preservation that needs to be taken into account, we do have some enclaves that you can use.

47
00:09:51.560 --> 00:09:59.790
Katie Antypas (NSF - OAC): Then there's NAIRR Software. This is a thrust area that will really be forward-looking, thinking about how we can make this more of a

48
00:09:59.800 --> 00:10:21.830
Katie Antypas (NSF - OAC): coherent platform. Right now, by necessity, I'd call it a bit of a potpourri: if you are on an Azure system, you're going to have that software stack; if you are accessing an NSF supercomputer, you'll have that software stack. But going forward in the future, we can imagine, and it's articulated in the NAIRR Task Force report,

49
00:10:21.840 --> 00:10:29.380
Katie Antypas (NSF - OAC): you know, a NAIRR software stack or a NAIRR container. And so this thrust area will be thinking about that.

50
00:10:29.490 --> 00:10:58.779
Katie Antypas (NSF - OAC): Finally, there's a NAIRR Classroom thrust area, and this is really in response to what I think is a really urgent need: to be able to give students hands-on access to models for classroom projects that they may be doing. We've heard from many, many institutions that just finding computing resources for their introductory machine learning class can be incredibly challenging, even at the most well-resourced universities.

51
00:11:00.220 --> 00:11:11.690
Katie Antypas (NSF - OAC): But we cannot do this without the community. This field is moving way, way too fast. And so we'll be having a number of workshops to really launch this community design process.

52
00:11:11.700 --> 00:11:18.139
Katie Antypas (NSF - OAC): And really, that's what the pilot is all about. We don't have all the answers, and in many cases I think we don't even have the questions yet.

53
00:11:19.850 --> 00:11:32.650
Katie Antypas (NSF - OAC): Okay. When I go out to the community, I hear: "Katie, you're talking a lot about compute. What about the data? The data is a huge challenge as well." And we absolutely know that is the case. And here are some of

54
00:11:32.660 --> 00:11:44.429
Katie Antypas (NSF - OAC): the key challenges that we see, and for the pilot we're going to have to really target some of the different areas that we look into here.

55
00:11:44.600 --> 00:11:55.990
Katie Antypas (NSF - OAC): The pilot is not intended to be the full vision of the NAIRR, but we want to investigate a number of these different areas with the community.

56
00:11:58.250 --> 00:12:12.720
Katie Antypas (NSF - OAC): Finally, I think I mentioned this with NAIRR Classroom, but there's been so much interest in NAIRR Classroom. I think we've put up our email list; we have nairr_pilot@nsf.gov.

57
00:12:12.730 --> 00:12:30.690
Katie Antypas (NSF - OAC): A huge majority of the questions that are coming in are really around NAIRR Classroom and how individual universities can get involved. So we're working on our strategy here, and we just say: stay tuned. We'll hopefully have something out fairly soon on this.

58
00:12:31.520 --> 00:12:52.860
Katie Antypas (NSF - OAC): And so really, what's next after the launch? We have a ton of work to do. Outreach to the community needs to happen; as I mentioned, we're going to be having a set of community meetings and technical workshops. We're also building the governance structure, too. I mean, we have twenty-five non-governmental partners and ten agencies we're working with, so there's a lot of work there.

59
00:12:52.870 --> 00:13:08.199
Katie Antypas (NSF - OAC): We're integrating partner resources for this next open opportunity. And then we also have to build out our user support capabilities, because we want the research community that comes to have a good experience working on these platforms.

60
00:13:09.220 --> 00:13:34.470
Katie Antypas (NSF - OAC): So then, finally, for the Q&A portion, I'm going to stop on this page that has the big QR code, so hopefully you can fill out our survey. It came out as a Dear Colleague Letter about five or six weeks ago, and again, the survey closes tomorrow, so we would love it if you could

61
00:13:34.480 --> 00:13:40.079
Katie Antypas (NSF - OAC): fill this out and get your friends and neighbors involved as well.

62
00:13:40.290 --> 00:13:46.190
Michael Littman: So, the link: you don't have to do the QR code if you're not that kind of person. The link is in the chat.

63
00:13:46.200 --> 00:13:54.240
Katie Antypas (NSF - OAC): The link is in there, too, so maybe I will keep this up. But I really wanted to keep it brief, since I know

64
00:13:54.570 --> 00:14:03.989
Katie Antypas (NSF - OAC): it's just more helpful, I think, to have a lot of Q&A. So please, Wendy, do you want to serve me up with questions? And we have.

65
00:14:04.000 --> 00:14:05.090
Michael Littman: I'm a member of the program.

66
00:14:05.100 --> 00:14:17.619
Michael Littman: I think maybe Wendy and Catherine will serve me, and then I'll pass things along to you. And just to kick things off: I was wondering, just listening to the way that you were presenting this,

67
00:14:17.630 --> 00:14:32.590
Michael Littman: how unique of a thing is this? Have there been efforts like this in the past? Or are there other analogs you can point to, to say: yeah, these are communities that have done this sort of thing, and this is how it worked out? How are you thinking about this?

68
00:14:32.600 --> 00:14:35.219
Katie Antypas (NSF - OAC): Yeah, sure. So I will say that.

69
00:14:35.770 --> 00:14:54.290
Katie Antypas (NSF - OAC): we have a long history at NSF of supporting the research community on advanced computing and data platforms, supercomputing resources. I think most of the time these have been targeted towards the domain sciences. Tons of physicists and chemists,

70
00:14:54.300 --> 00:15:11.709
Katie Antypas (NSF - OAC): biologists, geologists, you know, use these resources, and clearly the computing sciences and AI research community has seen a huge increase in the resources it needs. We absolutely have some of those communities on our platforms today,

71
00:15:11.720 --> 00:15:37.850
Katie Antypas (NSF - OAC): but we need to do a lot more outreach. So that's one: there is a baseline of knowing how to support the science community. But secondly, I want to point to the COVID-19 High Performance Computing Consortium. This was a public-private partnership during the pandemic that brought together resources from cloud computing providers as well as our agency supercomputers

72
00:15:37.860 --> 00:15:43.470
Katie Antypas (NSF - OAC): to very quickly provide resources to

73
00:15:43.540 --> 00:15:55.980
Katie Antypas (NSF - OAC): researchers that were investigating COVID-19 and needed computational resources. And so we're using some of that know-how

74
00:15:56.060 --> 00:16:12.329
Katie Antypas (NSF - OAC): to launch the pilot. I know this is a long answer, but what I'll note is that we really focused on computing access before, and I think what's new for the NAIRR is really thinking also about data access and model access.

75
00:16:12.340 --> 00:16:17.690
Katie Antypas (NSF - OAC): And that's, I think, where we need a lot of community input. On compute, we have a

76
00:16:17.700 --> 00:16:26.809
Katie Antypas (NSF - OAC): pretty good sense about how to federate that. But the software for AI, the data sets, the models: that's going to require a lot of discussion.

77
00:16:30.140 --> 00:16:45.699
Michael Littman: Awesome, thanks. I know your team has really worked hard so far trying to figure out what sort of issues are going to come up and how to address them, so it's really great to hear your thinking on this. One of the questions in the chat has been:

78
00:16:45.930 --> 00:16:50.079
Michael Littman: how is the NAIRR different from existing programs for research infrastructure?

79
00:16:50.470 --> 00:16:53.580
Michael Littman: So can you kind of articulate that distinction there?

80
00:16:53.890 --> 00:16:56.010
Katie Antypas (NSF - OAC): Um, right.

81
00:16:56.130 --> 00:16:58.880
Katie Antypas (NSF - OAC): a couple ways. So

82
00:17:00.240 --> 00:17:19.609
Katie Antypas (NSF - OAC): I would say the vision for the NAIRR is much larger. The vision is more of an integrated platform. We are right now using some of our existing platforms. So one question I've had is: should I apply to ACCESS, or should I apply to the NAIRR?

83
00:17:19.619 --> 00:17:30.399
Katie Antypas (NSF - OAC): And the answer right now is: the NAIRR calls are fairly targeted on certain topical areas. So that's one. The second is, right now,

84
00:17:30.410 --> 00:17:49.099
Katie Antypas (NSF - OAC): the first call was computationally focused, but the resources that will be available in the NAIRR pilot in this next April call will include access to models. So API credits for OpenAI

85
00:17:49.120 --> 00:18:04.580
Katie Antypas (NSF - OAC): resources, for example, or through partnerships with Anthropic. And there are other collaboration opportunities as well. So it's a more diverse set of resources that will be available than just compute.

86
00:18:09.670 --> 00:18:11.220
Michael Littman: Thanks. So

87
00:18:11.230 --> 00:18:38.380
Michael Littman: I actually interpreted the question a little bit differently, so let me see if I can prompt you in a slightly different direction, or maybe just cover this. So NSF offers programs for research infrastructure, and typically what that means is: we would like the community to tell us what research resources you're going to provide for the community. Right? That makes sense. So some people submit proposals where they say, we want to build a telescope for,

88
00:18:38.390 --> 00:18:52.169
Michael Littman: let's say, the astronomy community (because I don't think the computer scientists need that quite so much), but they propose that they're going to build and provide this piece of research infrastructure. So a lot of NSF programs that mention research infrastructure are about,

89
00:18:52.200 --> 00:19:06.629
Michael Littman: you know, us helping you provide those. The NAIRR is not that. Here, we're providing a piece of research infrastructure to the community. And so when you apply for the NAIRR, you're actually applying to use the NAIRR for the research that you do.

90
00:19:06.850 --> 00:19:08.590
Michael Littman: Does that make sense

91
00:19:08.600 --> 00:19:27.110
Katie Antypas (NSF - OAC): That's correct. But I would say, in the vision for the full NAIRR, those opportunities to provide infrastructure will get bid out; there will be solicitations around that.

92
00:19:27.120 --> 00:19:31.069
Katie Antypas (NSF - OAC): If that makes sense. So, in the vision for the full NAIRR,

93
00:19:31.080 --> 00:19:55.039
Katie Antypas (NSF - OAC): there is funding for resource providers, members of the community. This is how it works in the Office of Advanced Cyberinfrastructure community: people bid on our solicitations to build a supercomputer for the research community. Here, we've been talking about how to be a user of the NAIRR.

94
00:19:55.260 --> 00:19:56.590
Katie Antypas (NSF - OAC): Does that help?

95
00:19:56.600 --> 00:19:58.489
Michael Littman: Yeah, that clarified it for me. Thanks.

96
00:19:58.500 --> 00:20:07.329
Katie Antypas (NSF - OAC): And I would just say: anyone from OAC that is on, you hopefully have microphone access, and you can jump in as well.

97
00:20:08.700 --> 00:20:24.819
Michael Littman: Right, just to point out to folks in case you don't know this: Katie brought some of her team along for this presentation. So we've got a deep bench of very knowledgeable and expert people to make sure that these questions get answered correctly,

98
00:20:24.910 --> 00:20:44.629
Michael Littman: or in the most useful possible way; let's say that we're always interested in correct answers. All right. So another question from the chat is: when submitting a proposal, do we need to specify one resource provider, or can we list several preferred providers and let the providers decide who might be interested to support the proposed research?

99
00:20:44.640 --> 00:20:57.629
Katie Antypas (NSF - OAC): Oh, great question! I actually even like the latter better. I'm going to let Sharon answer it. You don't have to specify if you don't know, or if you don't

100
00:20:57.650 --> 00:21:06.309
Katie Antypas (NSF - OAC): care, if you don't have a preference. And Sharon, I saw you turn your camera on; Sharon is our lead on this. So go ahead.

101
00:21:06.320 --> 00:21:21.670
Sharon Broude Geva/NSF: Yeah, thanks, Katie. You do not have to specify which resource you would like from the list of resources that are possibilities. We actually have a check box for you to say,

102
00:21:21.680 --> 00:21:35.369
Sharon Broude Geva/NSF: "I don't know which resource I want; choose for me." The other possibility is to choose more than one resource, but to indicate in your submission that you would like

103
00:21:35.380 --> 00:21:54.819
Sharon Broude Geva/NSF: to just be matched with whatever the best resource is. And in any case, every submission, every request, is going to go through a matching process. So if you get it wrong and ask for a resource that isn't suited for your request,

104
00:21:54.830 --> 00:22:06.420
Sharon Broude Geva/NSF: it will still be able to be matched to the resource that it does match best with.

105
00:22:07.060 --> 00:22:23.060
Katie Antypas (NSF - OAC): Yeah. And I can also note: just because you request a particular resource, it doesn't guarantee you will get it, because there's a capacity limit on each of those resources. So in the matching process we will do our best to match you to

106
00:22:24.410 --> 00:22:26.790
Katie Antypas (NSF - OAC): the best available resource.

107
00:22:29.240 --> 00:22:44.699
Michael Littman: All right, another question: are there other resources that you all, including Katie, might recommend that I access to ensure that I stay on top of what you all are doing with the NAIRR pilot? So I guess: how do you stay up to date with news and information about what's going on?

108
00:22:45.240 --> 00:23:13.429
Katie Antypas (NSF - OAC): Oh, this feels like I was just handed a softball. So you can subscribe to our mailing list; it's kind of an old-school mailing list. I'm sure Bill is putting it in the chat as we speak (Bill, give me a thumbs up). That is the best way to stay in tune, and, you know, larger announcements we will

109
00:23:13.870 --> 00:23:27.670
Katie Antypas (NSF - OAC): echo out through the typical and appropriate channels, whether it's a DCL or other means. And, Bill, you're welcome to jump in if there's anything more there.

110
00:23:31.940 --> 00:23:39.589
Michael Littman: All right. The next question uses an acronym that I'm unfamiliar with, so let me try to look it up real fast and make sure it's actually relevant.

111
00:23:39.600 --> 00:23:53.010
Michael Littman: Oh, of course, that makes a ton of sense. All right. This person says: I have been working with OSDF, which I now know is the Open Science Data Federation, on integrating our big data for AI.

112
00:23:53.020 --> 00:23:59.940
Michael Littman: Is there an effort or opportunity to integrate OSDF data origins and caches with the NAIRR?

113
00:24:01.880 --> 00:24:08.899
Katie Antypas (NSF - OAC): So that is absolutely something we're looking into right now:

114
00:24:09.530 --> 00:24:24.219
Katie Antypas (NSF - OAC): making sure that the data sets are close to the necessary computing resources is a really key priority. Actually, I don't know who wrote that, but I think if you write us

115
00:24:24.310 --> 00:24:36.010
Katie Antypas (NSF - OAC): at nairr_pilot@nsf.gov, I'd love to understand your particular use case. But we certainly are looking at

116
00:24:36.340 --> 00:24:53.909
Katie Antypas (NSF - OAC): the data layer. Now, I would say we needed to get something out quickly (the executive order directed us to launch in ninety days), and so we moved really quickly on the compute. And I think now we're moving on some of those other fronts, which include data and software. But, yeah,

117
00:24:54.330 --> 00:24:55.890
Katie Antypas (NSF - OAC): Great question.

119
00:24:57.210 --> 00:25:05.160
Michael Littman: All right. I really like this next question because it gets into a whole bunch of stuff that I guess I haven't heard you talk about as much, but I know that you've thought about

120
00:25:05.410 --> 00:25:20.250
Michael Littman: it. It seems you are seeing AI researchers (and I'm going to take "you" to mean all of us at NSF and the other folks who are involved in the NAIRR), seeing AI researchers only as consumers, but not as contributors.

121
00:25:20.260 --> 00:25:39.960
Michael Littman: Community input through surveys and design workshops is good, but it seems to me it would be useful to consider how the AI community should be part of the development of the NAIRR. This is not clear from the messaging. I also think that using AI to create a new generation of infrastructure should be central to the NAIRR and a broader benefit for all sciences.

122
00:25:41.130 --> 00:25:44.979
Katie Antypas (NSF - OAC): Yeah, great question.

123
00:25:45.090 --> 00:26:03.210
Katie Antypas (NSF - OAC): We've heard that many folks want to contribute data sets to the pilot. So, one, I'll say we have an internal working group that is working on a process for how to

124
00:26:03.260 --> 00:26:15.830
Katie Antypas (NSF - OAC): accept those data sets: what is the process, and what are the criteria. I would say stay tuned;

125
00:26:16.370 --> 00:26:33.620
Katie Antypas (NSF - OAC): we had to launch very quickly, and we absolutely have the intention to have contributions from the community. It takes a little while to get the wheels in motion. Maybe I can talk about the

126
00:26:33.630 --> 00:26:37.660
Katie Antypas (NSF - OAC): grand vision for the NAIRR.

127
00:26:37.770 --> 00:26:39.070
Katie Antypas (NSF - OAC): Ah,

128
00:26:39.330 --> 00:26:44.679
Katie Antypas (NSF - OAC): I'm going to put this slide back up. I think it's important to note this here.

129
00:26:55.050 --> 00:27:11.610
Katie Antypas (NSF - OAC): I think it's important to note: what is this operating entity here? This operating entity is envisioned to be a non-governmental operating entity that is running the day-to-day operations of the NAIRR.

130
00:27:11.620 --> 00:27:27.780
Katie Antypas (NSF - OAC): So the longer-term vision is that this would be bid out to the community: running the day-to-day operations and the strategy for the NAIRR. And so this wouldn't be NSF.

131
00:27:27.870 --> 00:27:32.109
Katie Antypas (NSF - OAC): How do I say it? NSF would be,

132
00:27:32.420 --> 00:27:47.680
Katie Antypas (NSF - OAC): you know, the funding agency, but a competitive process would select this operating entity. We're at a stage right now where we're just working to have really early working groups.

133
00:27:47.690 --> 00:28:09.069
Katie Antypas (NSF - OAC): So I would love to have whoever submitted that fill out the survey and tell us a little bit more. We're moving really quickly here, but we absolutely intend this to be a community design and to have community input into different elements of the NAIRR. Absolutely.

134
00:28:10.310 --> 00:28:21.589
Katie Antypas (NSF - OAC): I don't know if other people from OAC want to weigh in on that. But that is definitely our direction, and we're just barely out of the starting gate, is what I would say right now.

135
00:28:25.170 --> 00:28:38.970
Michael Littman: Yeah, I don't know if other people want to chime in. But I did. Okay, so I heard you say there's the opportunity for people to contribute, for example, data. I think the question, maybe, was about contributing to the actual design, and maybe even the implementation.

136
00:28:38.980 --> 00:28:50.290
Michael Littman: And in particular, the little plug at the end was: what role could AI play in helping to make cyberinfrastructure better generally, and should that be a part of the NAIRR mission

137
00:28:50.300 --> 00:28:53.970
Michael Littman: explicitly? Or is that something the NAIRR would support at all?

138
00:28:54.500 --> 00:28:57.049
Katie Antypas (NSF - OAC): Oh, that's kind of interesting.

139
00:28:57.060 --> 00:28:58.989
Katie Antypas (NSF - OAC): Yeah, no, um.

140
00:28:59.000 --> 00:29:16.090
Katie Antypas (NSF - OAC): I think we're open to all ideas right now, you know. I would say, Bill Miller reminds me often that the F in the National Science Foundation is for Foundation, and so our goal is to get opportunities out to the community

141
00:29:16.100 --> 00:29:27.800
Katie Antypas (NSF - OAC): in this regard. And so I think we're just at a really early stage, but we absolutely expect the community to be part of, and critical to, the design.

142
00:29:29.180 --> 00:29:38.450
Michael Littman: So people doing research on things like how to use AI for cyberinfrastructure: where do they even get traction? Is that a...

143
00:29:38.480 --> 00:29:41.230
Michael Littman: Is there a research conference that

144
00:29:41.240 --> 00:29:56.759
Michael Littman: does that? Do you know? I mean, I'm asking partly because I do feel like the question sort of falls in the cracks between the two of us. There might be other folks on the webinar who've thought about this. I know Yolanda Gil is here,

145
00:29:56.770 --> 00:30:18.670
Michael Littman: but the role, yeah, the role that AI can play. And, you know, presumably you don't want to just start off saying, oh, you've got this crazy new AI algorithm, sure, we'll use it to do all the allocations on the infrastructure. That would probably be a mistake. But presumably there's some kind of build-up of results through a research process that could eventually get incorporated into the infrastructure itself.

146
00:30:19.460 --> 00:30:34.919
Katie Antypas (NSF - OAC): Yeah, no, this is really interesting. And I mean, I think part of what we are finding is that, you know, OAC and IIS have started to work a lot more closely through the NAIRR than we

147
00:30:34.930 --> 00:30:54.490
Katie Antypas (NSF - OAC): perhaps have in the past. And I think this is a potential gap that we need to address. I don't have a good answer for you right now. And I know people that are working on high-performance computing systems are incorporating AI more and more into their operations, right? Whether it's

148
00:30:54.500 --> 00:31:10.519
Katie Antypas (NSF - OAC): managing the temperature of the systems and the facilities, or the scheduling. So I know that research is already underway, and I can imagine, with the NAIRR, that something at that scale would be even more important.

149
00:31:11.970 --> 00:31:13.090
Michael Littman: Awesome

150
00:31:13.360 --> 00:31:15.939
Michael Littman: Thanks. All right. Another question. Uh,

151
00:31:17.330 --> 00:31:26.059
Michael Littman: I feel like another softball; I feel like I've heard you answer this question before. Are the resources limited to STEM? Will people outside STEM get involved?

152
00:31:26.880 --> 00:31:30.249
Katie Antypas (NSF - OAC): Yeah, great question. So, um,

153
00:31:31.390 --> 00:31:38.009
Katie Antypas (NSF - OAC): the research area, the full vision of the NAIRR, is that

154
00:31:38.150 --> 00:31:39.510
Katie Antypas (NSF - OAC): the

155
00:31:39.730 --> 00:31:45.879
Katie Antypas (NSF - OAC): it can support the research and education community, not only in

156
00:31:45.890 --> 00:32:02.319
Katie Antypas (NSF - OAC): core AI research, but also in research from the domain sciences or social sciences that are applying or using AI. So that is in scope for the full vision of the NAIRR. I will note that, because

157
00:32:02.330 --> 00:32:10.790
Katie Antypas (NSF - OAC): we're in a pilot stage, we don't have a ton of resources right now. The initial scoped calls have been focused on STEM areas.

158
00:32:10.800 --> 00:32:23.939
Katie Antypas (NSF - OAC): So, kind of two parts to that question. Yes, the greater vision is to support a really wide range of science areas. Initially, while we have a number of limited resources, those first areas are STEM-related.

159
00:32:24.770 --> 00:32:41.619
Michael Littman: And certainly areas like... So one of the early priorities for the NAIRR is trustworthy, responsible, ethical AI. And there are definitely people kind of outside the traditional STEM disciplines that have contributions to make there,

160
00:32:41.630 --> 00:32:49.720
Michael Littman: so presumably they would be welcome in some form, right? It has to be the case that it's clear what the research

161
00:32:49.730 --> 00:33:05.160
Michael Littman: is being proposed to do, that people need these resources, and that they're going to do some relevant research. But yeah, they don't have to be STEM people per se. That said, I guess, Katie, as you're pointing out, the goal is not to say, okay, well,

162
00:33:05.460 --> 00:33:19.189
Michael Littman: you need computing resources, you're not doing AI, you're not in STEM, that's not what this is designed for. It really is supposed to be an AI research resource. And so if you're a non-STEM person who's doing AI research and needs the resources, then it seems like it would be a...

163
00:33:19.200 --> 00:33:22.259
Katie Antypas (NSF - OAC): Yeah. And you're absolutely in scope.

164
00:33:23.830 --> 00:33:36.480
Michael Littman: All right. As I understand it, we only had one other question, and I think Bill just posted saying that it was already answered through the Q&A. So maybe I'll ask a question, in case other people need a minute or two to think of their own questions.

165
00:33:36.520 --> 00:33:42.620
Michael Littman: As I said, you've done now six hundred and twelve of these presentations.

166
00:33:43.020 --> 00:33:54.659
Michael Littman: So what's a great question that you've gotten that we didn't think to ask? Like, has there been any particularly interesting, exciting insight that people have had, where you feel like, oh yeah, that's actually really important?

167
00:33:56.220 --> 00:34:03.590
Michael Littman: I feel like I'm going to give myself a hard question. Well, you can ask it of me if you don't want to answer it. I'm just curious as to what the question is.

168
00:34:03.600 --> 00:34:14.390
Katie Antypas (NSF - OAC): Yeah, I think the question is: how is the pilot itself going to enable the support of trustworthy AI, right? So that's a question where,

169
00:34:14.550 --> 00:34:31.450
Katie Antypas (NSF - OAC): one, the research areas that we're focusing on in the calls are in that theme. But there's another question there, which is: what policies does the infrastructure, the pilot, need to adopt?

170
00:34:31.460 --> 00:34:45.969
Katie Antypas (NSF - OAC): Should users have to take a training? How do we accept a data set into the pilot, or a model? What's our process if something is found in that model that is,

171
00:34:46.270 --> 00:34:54.389
Katie Antypas (NSF - OAC): you know, illegal? Like, what are all these processes? And I think really bringing the community together

172
00:34:54.400 --> 00:35:03.110
Katie Antypas (NSF - OAC): around these questions is something we want to do in the pilot. You know, these are new questions, and

173
00:35:03.120 --> 00:35:16.809
Katie Antypas (NSF - OAC): um, so that's a question nobody asked. It's also a harder question for me to answer, because that's part of the process of the pilot: figuring some of these things out, and learning what would work and what would

174
00:35:17.390 --> 00:35:18.789
Katie Antypas (NSF - OAC): uh, maybe you want to answer the

175
00:35:18.800 --> 00:35:22.559
Michael Littman: Yeah, I really like that one. I was going to... I would add:

176
00:35:22.570 --> 00:35:52.080
Michael Littman: so when I think about what it means to do trustworthy AI research: in the beginning, when people started using the phrase, I scoffed, because I'm like, what were we doing before, untrustworthy AI research? But the idea of trustworthy AI research is to try to understand: how do we understand what these systems are doing, and how do we make sure that they do more of what we want and less of what we don't want? And my sense is that the biggest issue there is the one we've been exploring as a research field lately.

177
00:35:52.090 --> 00:36:21.950
Michael Littman: We've been trying to explore the ramifications of scale: what happens when you make these models really, really big, and you put in lots of data, more data than any individual human being, and possibly any group of human beings of reasonable size, could vet, right? So you're putting all kinds of things in these models, and then we don't really understand the size of the scope of the models. The only way to answer those kinds of research questions is to be able to do work at scale, right? You don't actually answer the question if you do it on your laptop. Because, well, you might get lucky. It might be that there's some

178
00:36:21.960 --> 00:36:28.999
Michael Littman: fundamental principle that you can actually explore locally. But my sense is that a lot of the real questions involve

179
00:36:29.010 --> 00:36:45.990
Michael Littman: understanding what these models are doing at scale, and individual researchers haven't had the opportunity to do that. So, like, I don't know how to do this without the NAIRR. And how is the NAIRR supporting it? Well, it is making it possible for people to look at these models at scale, or at least that's the vision.

180
00:36:46.000 --> 00:36:48.469
Michael Littman: That's... I don't know. That's how I think about it, anyway.

181
00:36:51.080 --> 00:36:53.759
Michael Littman: All right. Um, speaking of scale.

182
00:36:54.080 --> 00:37:10.949
Michael Littman: So sometimes I think to myself, this is maybe too hard of a question, but it is something that's been on my mind. If this works the way we hope it works, which is to say it really provides a way for people to do the cutting-edge AI research that society needs, and that we need to understand...

183
00:37:10.960 --> 00:37:20.509
Michael Littman: And we have a very big Ai research community these days, because not only do. We have more people studying Ai explicitly, but more people who are studying other topics.

184
00:37:20.540 --> 00:37:38.209
Michael Littman: It now includes AI: you can be a geologist and be, like, an AI researcher who's studying the use of AI in geology. So we've got this explosion of community size, and also of the needs per community member. How will this possibly ever scale,

185
00:37:38.220 --> 00:37:42.009
Michael Littman: right? Like, how would this not collapse under its own success?

186
00:37:42.630 --> 00:37:44.810
Katie Antypas (NSF - OAC): So I think, um,

187
00:37:45.660 --> 00:37:57.230
Katie Antypas (NSF - OAC): right now, right, we are on a huge curve, which is exploding in terms of the requirements from the research community.

188
00:37:58.450 --> 00:38:05.770
Katie Antypas (NSF - OAC): I think we can't underestimate the impact of providing, you know,

189
00:38:06.140 --> 00:38:07.330
Katie Antypas (NSF - OAC): some

190
00:38:07.470 --> 00:38:13.870
Katie Antypas (NSF - OAC): starting this effort. And so I think it's critical that we start, and we show

191
00:38:13.880 --> 00:38:30.339
Katie Antypas (NSF - OAC): and demonstrate the potential and the need and the value. I think the largest models that are trained will probably still have to happen on our biggest supercomputers, or in partnership with industry. And so I think we have to think as a community:

192
00:38:30.550 --> 00:38:32.069
Katie Antypas (NSF - OAC): You know how

193
00:38:32.120 --> 00:38:34.359
Katie Antypas (NSF - OAC): you know, what tools do you

194
00:38:34.370 --> 00:39:01.579
Katie Antypas (NSF - OAC): need to train at the larger scale, and then what can be reused or refined by other members of the community? I think we also need to keep pushing on basic research into the methods that might help bend the curve, basic research into the systems and the technologies and the processors that would allow us, long term, to bend the curve

195
00:39:02.560 --> 00:39:11.740
Katie Antypas (NSF - OAC): in terms of energy-efficient architectures. So it's clearly a big challenge. I think what we want to do is take a first step

196
00:39:11.750 --> 00:39:36.299
Katie Antypas (NSF - OAC): and begin to show the impact that the research community can have. And right now, you know, it's really hard for researchers to even get access to these resources. So I think it's a good question. I think we've got to step forward and start to wade into these waters, and, you know, we'll learn as we go.

197
00:39:37.450 --> 00:39:41.719
Michael Littman: Yeah, I like that answer a lot. The...

198
00:39:41.840 --> 00:39:53.659
Michael Littman: The head of the CISE directorate, who hired both me and Katie... I asked her a question like this once, because this was pre-NAIRR, but during the CloudBank era.

199
00:39:53.670 --> 00:40:15.630
Michael Littman: She's like, you should get more people to use CloudBank, because they need the resources and we have the resources. And I said, well, what if everybody uses the resources? Then they're all going to get used up. She's like, that's a different problem that we can work on solving then; at the moment we're not in that regime. And so I feel like it's a similar thing, except kind of another order of magnitude up.

200
00:40:16.060 --> 00:40:32.919
Michael Littman: All right. Well, this gave a bunch of other people a chance to ask some good questions. So the next question is: would there be future opportunities to support longer-term projects? It seems the first call supports research for six months, while some research, for example new foundation model building, may take longer.

201
00:40:34.200 --> 00:40:36.529
Katie Antypas (NSF - OAC): Thanks for that. Thank you.

202
00:40:36.660 --> 00:40:48.739
Katie Antypas (NSF - OAC): We will take that back. Part of the six months was driven by the fact that one of our resources is getting retired. But, Sharon, do you want to say anything more about that?

203
00:40:51.880 --> 00:40:54.609
Sharon Broude Geva/NSF: Sorry, unmuting is hard.

204
00:40:54.750 --> 00:40:56.399
Sharon Broude Geva/NSF: The

205
00:40:56.420 --> 00:41:11.819
Sharon Broude Geva/NSF: Yes, we are going to continue, as we add in resources, to see what makes sense. We are aware, and we have heard from the community, that six months may not be

206
00:41:11.830 --> 00:41:22.209
Sharon Broude Geva/NSF: enough time to get something up and running. So, as Katie mentioned, the decision for the six months was, first of all, because it was

207
00:41:22.330 --> 00:41:39.449
Sharon Broude Geva/NSF: the first call, and we wanted to learn from it, but also because of a specific resource. One of the things we'll look at is possibly rolling in resources, which would make it possible to do longer

208
00:41:39.460 --> 00:41:46.769
Sharon Broude Geva/NSF: amounts of time for an allocation without having to be constrained by any specific resource.

209
00:41:46.950 --> 00:41:51.810
Katie Antypas (NSF - OAC): Yeah, I'll also note that what is pretty typical on agency systems now is, like,

210
00:41:51.890 --> 00:42:08.989
Katie Antypas (NSF - OAC): renewals. So lots of projects, you know, will have been users of our systems for, like, decades, but they send in kind of an annual review and renewal. But we'll take that question; I think it's a good one. Bill, you have your hand up.

211
00:42:09.770 --> 00:42:10.490
Katie Antypas (NSF - OAC): You.

212
00:42:10.500 --> 00:42:17.749
Bill Miller/NSF: Yeah, that's a great question. We have actually discussed this a lot, as Sharon was alluding to.

213
00:42:18.020 --> 00:42:30.360
Bill Miller/NSF: We have to also remember that this is a pilot, and, you know, we've envisioned a two-year activity in which we're trying to do things that are new, all the things that Katie laid out,

214
00:42:30.840 --> 00:42:41.040
Bill Miller/NSF: you know, in anticipation of the potential for a real full-scale NAIRR, should Congress decide that it would like to provide funding for that.

215
00:42:41.050 --> 00:42:47.389
Bill Miller/NSF: But since AI is so important, and we also mentioned that we have

216
00:42:47.410 --> 00:43:09.910
Bill Miller/NSF: the regular way for the research community to come in and ask for computing resources, through what we call ACCESS, if you're familiar with that. Actually, Sharon is the cognizant program officer, so she's here with us. But that also allows funding of AI projects through ACCESS.

217
00:43:09.920 --> 00:43:19.649
Bill Miller/NSF: Those are typically longer, right, Sharon? It's a year-long allocation, right? So...

218
00:43:20.010 --> 00:43:31.330
Bill Miller/NSF: So just be aware that there's also that pathway, if you're thinking of or needing a longer-term project than what you might see through the pilot; for whatever reason, that avenue is available to you.

219
00:43:34.130 --> 00:43:42.709
Michael Littman: I'm going to do the next question slightly out of order, because I feel like it connects with what we were just talking about. If everything works perfectly, what do you expect to see in five years?

220
00:43:44.630 --> 00:43:45.709
Katie Antypas (NSF - OAC): Okay,

221
00:43:45.760 --> 00:43:54.430
Katie Antypas (NSF - OAC): So, as Bill said, the pilot is scoped right now for two years. I mean, ideally, we show...

222
00:43:54.650 --> 00:44:03.560
Katie Antypas (NSF - OAC): you know, what are the outcomes we're looking for in a pilot? So I'm going to go shorter: in two years we would like the pilot

223
00:44:03.570 --> 00:44:29.319
Katie Antypas (NSF - OAC): to demonstrate that we can reach new communities. Clearly, we're not going to reach all communities. But can we show, hey, we can go into certain regions of the country or certain types of institutions, and really get some new users? We also want to, I think, highlight some of the amazing science and research that might not otherwise have happened if the NAIRR pilot had not started,

224
00:44:29.330 --> 00:44:33.849
Katie Antypas (NSF - OAC): and also get some of these lessons learned from the community.

225
00:44:34.000 --> 00:44:48.659
Katie Antypas (NSF - OAC): If it was five years, I would hope that we certainly had demonstrated that, and that we would be moving on to the full NAIRR by that point. But that gets into the hands of, I think, what

226
00:44:48.670 --> 00:45:06.059
Katie Antypas (NSF - OAC): Congress would do. But I think our job right now is to demonstrate that this two-year pilot is really needed by the research community, and that we want to make sure that what is being created and built

227
00:45:06.070 --> 00:45:21.280
Katie Antypas (NSF - OAC): has the fingerprints of the research community on it, and is something that's, you know, popular and useful to the community. And the only way that we can do that is with community participation.

228
00:45:24.430 --> 00:45:36.780
Michael Littman: Awesome, thanks. All right, a couple more questions that actually cover some holes in the space that we hadn't talked about so much. One is: how to make the NAIRR more accessible to students for education purposes?

229
00:45:37.830 --> 00:45:57.480
Katie Antypas (NSF - OAC): Great question. We are working hard on this right now. I think one thing we really want to do is pilot bringing resources to, you know, some classrooms, so that instructors, lecturers, professors

230
00:45:57.490 --> 00:46:09.510
Katie Antypas (NSF - OAC): can use NAIRR resources in their classrooms. I think we know that, to get started, even the most advanced users might just start with a Jupyter-like notebook.

231
00:46:09.700 --> 00:46:22.679
Katie Antypas (NSF - OAC): And so if we can provide, you know, that simple resource, with an all-included software and hardware environment, I think that's one way we could go forward.

232
00:46:22.690 --> 00:46:32.760
Katie Antypas (NSF - OAC): I will note the scope of the NAIRR: you know, we're the office of advanced cyberinfrastructure; we provide that infrastructure to classrooms. But

233
00:46:32.770 --> 00:46:49.180
Katie Antypas (NSF - OAC): we're working with IIS and the other parts of CISE to think more about how you could couple a curriculum to those classrooms, which, you know, may not have a lecturer that

234
00:46:49.190 --> 00:46:55.300
Katie Antypas (NSF - OAC): has been trained in some of these areas. I think we do have a couple opportunities out

235
00:46:55.440 --> 00:47:08.999
Katie Antypas (NSF - OAC): right now. There's the EducateAI DCL that, I think, is closing pretty soon; there's a tie-in there to the NAIRR pilot. So there are some opportunities that are open in this space currently.

236
00:47:11.720 --> 00:47:13.959
Michael Littman: All right. Great. I've got.

237
00:47:14.400 --> 00:47:22.089
Michael Littman: Hmm, what I want to do next... There are two questions about data. It would be great if we could do those together. But here's a...

238
00:47:22.800 --> 00:47:42.010
Michael Littman: Yeah, well, I'm going to focus on this one. All right. So one is: in astronomy, we have multiple federal funding agencies, for example NSF, NASA, and NCAR, each with siloed data archives and access patterns and capabilities. Would you consider proposals that seek to use the NAIRR as a lever to address these important data silos?

239
00:47:43.510 --> 00:47:46.829
Katie Antypas (NSF - OAC): Um, great. So, Bill, can you answer that one?

240
00:47:47.310 --> 00:48:05.060
Bill Miller/NSF: Yeah, I was just typing the answer, but I wasn't fast enough. Yeah, it's a great question. Actually, that is absolutely in the vein of what we're doing. We're actually doing that two ways. One, we have a separate investment called the National Discovery Cloud for Climate.

241
00:48:05.070 --> 00:48:22.740
Bill Miller/NSF: Our colleague Marlon Pierce, in OAC, is coordinating that, and it involves a number of awards that were made in two thousand and twenty-three, and may also involve the awards this year

242
00:48:22.790 --> 00:48:40.760
Bill Miller/NSF: that actually tie in a number of those resources, and look at developing pathways from data to food and so on in that space. And that is serving as one of the demonstration aspects of the NAIRR pilot. So for those

243
00:48:40.770 --> 00:48:43.989
Bill Miller/NSF: parts of the National Discovery Cloud for Climate that are involved in

244
00:48:44.000 --> 00:48:49.240
Bill Miller/NSF: AI, that's incorporated. In addition, we have partners

245
00:48:49.250 --> 00:49:15.189
Bill Miller/NSF: in the NAIRR pilot itself, like NASA, like NOAA, who have those kinds of data sets, and they have contributed those data sets as part of the AI-ready data sets of the pilot, but are also working with us sort of one-on-one to think about the workflow pathways that make those data much more accessible and usable in the AI context. So that is definitely part of it. Right now we're not

246
00:49:15.200 --> 00:49:19.710
seeking other proposals per se for projects. But stay tuned.

247
00:49:20.060 --> 00:49:24.819
Michael Littman: Would it be appropriate for people to submit proposals to OAC for that kind of stuff?

248
00:49:26.270 --> 00:49:39.940
Bill Miller/NSF: Well, we have the CSSI program, a software and data infrastructure program, and we haven't specifically called for NAIRR proposals for that. But, that said,

249
00:49:39.950 --> 00:49:53.390
Bill Miller/NSF: although the NAIRR is specifically about AI, the underlying plumbing, the data infrastructure, the storage, the compute, all the usage pathways for that, are

250
00:49:53.800 --> 00:50:09.159
Bill Miller/NSF: part of what OAC funds. So if at some point we call for specific proposals for NAIRR infrastructure, we will definitely let you know. I should turn it back to Katie for the final answer.

251
00:50:09.440 --> 00:50:24.260
Michael Littman: I think that's a good answer. So we have some opportunities that are up right now that people could apply to. As Bill said, they're not specifically NAIRR, but they're

252
00:50:25.330 --> 00:50:30.480
Katie Antypas (NSF - OAC): in the general space of supporting data and software infrastructure.

253
00:50:33.190 --> 00:50:40.569
Michael Littman: All right, cool. A related question on the data side of the NAIRR: are there any official efforts

254
00:50:40.690 --> 00:50:58.970
Michael Littman: that we are making as a community toward archiving the ever-growing list of large language models, or the enormous data sets they are trained on? I have bought some external drives to download data sets that some generative models are trained on, and model weights. But are there existing resources within the NSF duplicating my personal efforts?

255
00:51:01.660 --> 00:51:10.289
Katie Antypas (NSF - OAC): I haven't heard of anyone duplicating your efforts. Um, Michael, I don't know if you know if there is any effort in the...

256
00:51:10.300 --> 00:51:27.719
Michael Littman: I mean, there are tons of websites where people say, look, here are the language models that I'm familiar with, and various properties of them. Some of them list, for example, how open they were, because in many cases the data, the text that they're trained on, is just not known, right? It's just information that's not shared.

257
00:51:27.870 --> 00:51:29.330
Michael Littman: Um,

258
00:51:29.560 --> 00:51:37.720
Michael Littman: I don't know... resources like Hugging Face, where there are just lots of models actually archived in one place.

259
00:51:37.990 --> 00:51:41.449
Michael Littman: Yeah, to the extent that they exist, that's what I'm aware of.

260
00:51:42.290 --> 00:51:45.489
Michael Littman: But I mean, yeah... Oh, sorry! Was that you? Go ahead.

261
00:51:45.500 --> 00:52:10.799
Bill Miller/NSF: I was actually just going to follow up, Michael. I think Hugging Face was a good reference. We're not here to advertise commercial things at all, but the NAIRR involves a lot of in-kind contributions from a huge array of partners. We have twenty-five partners and counting now. They're all listed on the NSF website with what they're contributing. It just came to mind that Hugging Face is one example of a place where,

262
00:52:10.810 --> 00:52:19.700
Bill Miller/NSF: for both data and models, they are part of the NAIRR pilot, making contributions. So we just encourage people to take a look at

263
00:52:19.710 --> 00:52:39.290
Bill Miller/NSF: that. We put the link in the website; the NSF website has a list of the partners who have actually signed agreements and who are engaging on the pilot, and their contributions, including things like data and model resources.

264
00:52:39.840 --> 00:52:58.360
Michael Littman: Well, one thing that's really helpful, one thing I've heard Katie say in some of these presentations, is that discoverability is actually part of the goal of the NAIRR. It's not necessarily creating the data sets, but, to the extent that they exist, making it easier for researchers to have kind of a one-stop shop for finding them. And that seems relevant to that.

265
00:52:58.370 --> 00:52:59.990
Katie Antypas (NSF - OAC): Right. And so,

266
00:53:00.000 --> 00:53:02.140
Katie Antypas (NSF - OAC): you know, I think we're very aware that

267
00:53:02.180 --> 00:53:14.360
Katie Antypas (NSF - OAC): the experts, the domain experts, have to be kind of the ones curating these data sets. But the role of the NAIRR, as the infrastructure, can be to provide kind of the incentives

268
00:53:14.370 --> 00:53:33.440
Katie Antypas (NSF - OAC): to make these data sets more available, accessible, and FAIR for community reuse. And so with that, yeah, it is envisioned that there would be, like, a data discovery service. Clearly, what we have on the portal right now is not a data discovery service; it's a list of links. And

269
00:53:33.450 --> 00:53:42.620
Katie Antypas (NSF - OAC): you know, this is just the stage we are at right now, but that's certainly in the vision for the full NAIRR.

270
00:53:44.030 --> 00:54:13.509
Michael Littman: All right. So I want to say there are possibly a couple of other questions out there, but I encourage you folks to stay engaged with this; send your questions by email if you need to. I want to let everybody go, because I always forget to factor in travel time to whatever two o'clock is, or whatever time we're at on the hour. But I did want to take just a moment, and I don't say it enough, just to thank Katie for her efforts. I've said it to Katie before, but not as often as I should,

271
00:54:13.520 --> 00:54:43.359
Michael Littman: but just her whole team, the folks who came from the OAC squad who are engaged in this: what you're doing on behalf of the science and research community is just fantastic, and I'm personally very grateful. I'm sure that many of the AI folks that we've got here would also like the chance to just say thanks for this effort, and hopefully it'll all turn out well. So, everybody, thanks for being here today to share these ideas with us, and thanks for all the work that you put in, creating this

272
00:54:43.370 --> 00:54:45.219
Michael Littman: opportunity in the first place.

