Inside Task Force Lima’s exploration of 180-plus generative AI use cases for DOD
Task Force Lima continues to gain momentum across a variety of pursuits in its ambitious, 18-month plan to ensure the Pentagon can responsibly adopt, implement and secure powerful, still-maturing generative artificial intelligence technologies.
Department of Defense leadership formed that new hub in August within the Chief Digital and AI Office’s (CDAO) Algorithmic Warfare Directorate. Its ultimate mission is to set and steer the enterprise’s path forward with the emerging field of generative AI and associated large language models, which yield (convincing but not always correct) software code, images and other media following human prompts.
Such capabilities hold a lot of promise, but also complex challenges for the DOD — including many that remain unseen.
“Task Force Lima has three phases: the ‘learn phase,’ an ‘accelerate phase’ and a ‘guide phase.’ The ‘learn phase’ is where we are performing, for lack of a better word, inventories of what is the demand signal for generative AI across the department. That includes projects that are ongoing, to projects that we think should go forward, to projects that we would like to learn more about. And so, we submitted that as an inquiry to the department — and we’ve received a volume of use cases around 180 that go into many different categories and into many different mission areas,” Task Force Lima Mission Commander Navy Capt. M. Xavier Lugo told DefenseScoop.
In a recent interview, the 28-year naval officer turned AI acceleration lead briefed DefenseScoop on what’s to come with those under-review use cases, a recent “Challenge Day,” and future opportunities and events the task force is planning.
180-plus instances
During his first interview with DefenseScoop back in late September, Lugo confirmed that the task force would be placing an explicit emphasis on enabling generative AI in “low-risk mission areas.”
“That is still the case. However, some of what has evolved from that is they’re not all theoretical. For some of these use cases, there are units that have already started working with those particular technologies and they’re integrating [them] into their workflows. That’s when we’re going to switch from the ‘learn phase’ into the ‘accelerate phase,’ which is where we will partner with the use cases that are ongoing,” Lugo told DefenseScoop in the most recent interview.
At a Pentagon press briefing about the state of AI last week, Deputy Defense Secretary Kathleen Hicks confirmed that the department launched Task Force Lima because it is “mindful of the potential risks and benefits offered by large language models” (LLMs) and other associated generative AI tools.
“Candidly, most commercially available systems enabled by large language models aren’t yet technically mature enough to comply with our DOD ethical AI principles — which is required for responsible operational use. But we have found over 180 instances where such generative AI tools could add value for us with oversight like helping to debug and develop software faster, speeding analysis of battle damage assessments, and verifiably summarizing texts from both open-source and classified datasets,” Hicks told reporters.
The deputy secretary noted that “not all of these use cases” that the task force is exploring are notional.
Some Defense Department components started looking at generative AI even before ChatGPT and similar products “captured the world’s attention,” she said. And a few department insiders have “even made their own models,” by isolating and fine-tuning foundational models for a specific task with clean, reliable and secure DOD data.
“While we have much more evaluating to do, it’s possible some might make fewer factual errors than publicly available tools — in part because, with effort, they can be designed to cite their sources clearly and proactively. Although it would be premature to call most of them operational, it’s true that some are actively being experimented with and even used as part of people’s regular workflows — of course, with appropriate human supervision and judgment — not just to validate, but also to continue improving them,” Hicks said.
Lugo offered an example of one of the non-theoretical generative AI use cases that have already been maturing within DOD.
“As you can imagine, the military has a lot of policies and publications, [tactics, techniques, and procedures, or TTPs], and all sorts of documentation out there for particular areas — let’s say in the human resources area, for example. So, one of those projects would be how do I interact with all those publications and policies that are out there to answer questions that a particular person may have on how to do a procedure or a policy?” he told DefenseScoop.
Among the many responsibilities CDAO leadership has assigned to Task Force Lima is developing acceptability criteria and a maturity model for each use case, or group of use cases, involving generative AI.
“So, if we say we need an acceptability criteria of a particular value for a capability of summarization for LLMs, let’s say just as an example, then we need a model that matches that and that has that type of maturity in that particular capability. This is analogous to the self-driving vehicle maturity models and how you can have a different level of maturity in a self-driving vehicle for different road conditions. So, in our case the road conditions will be our acceptability criteria, and the model being able to meet that acceptability will be that maturity model,” Lugo explained.
‘Put me in, coach!’
Soon, the Lima team will start collecting information needed to inform its specific deliverables, including new test-and-evaluation frameworks, mitigation techniques, risk assessments and more.
“That output that we get during the ‘accelerate phase’ will be the input for the ‘guide phase,’ which is our last phase where we compile the deliverables to the CDAO Council so they can then make a determination into policy,” Lugo explained.
The task force does not have authority to officially publish guidance on generative AI deployments in DOD, but members previously made recommendations to the CDAO’s leadership that were approved to advise defense components in their efforts. The task force drafted that interim LLM guidance, but due to its classification level it has not been disseminated widely.
“That guidance [included that] any service can publish its own guidance that is more restrictive than the one that [the Office of the Secretary of Defense] publishes,” Lugo said.
The Navy offered its version of interim guardrails on generative AI and LLMs in September. Shortly after that, the Space Force transmitted a memo that put a temporary pause on guardians’ use of web-based generative AI tools like ChatGPT, specifically citing data security concerns.
“Did I learn about the Space Force guidance before it went out? Yes. Would I have had any reason to try to modify that? No,” Lugo told DefenseScoop.
“Space Force — like any other service — has the right to pursue guidance that is even more restrictive than the guidance that is provided by the policy. So, I just want to be clear that they have autonomy to publish their own guidance. At Task Force Lima, we are coordinating with the services — and they understand our viewpoints, and we understand our viewpoints, and there is no conflict on viewpoints here,” he added.
And although it might make sense for one military branch to temporarily ban certain uses to address data and security concerns, Lugo noted that the task force should still cautiously experiment with publicly accessible models in order to learn more about them.
In his latest interview with DefenseScoop, the task force chief also stated that his team is “not trying to do this in a vacuum.”
“We are definitely not only working with DOD, but we are working with industry and academia — and actually any organization that is interested in generative AI, they can reach out to us. There’s plenty of work, and there’s plenty of areas of involvement,” Lugo said.
“Also, I want to make sure that just because we are interacting with industry, that doesn’t take us out of the industry-agnostic, technology-agnostic hat. I am always ensuring that we keep that, because that’s what keeps us as an honest broker of this technology,” he added.
Lugo’s currently leading a core team of roughly 15 personnel. But he’s also engaging with a still-growing “expanded team” of close to 500 points of contact associated with the task force’s activities and aims. To him, those officials are essentially serving on secondary duty, or in a support function to his unit.
“We’re getting more people interested. Now, those 500 people — I’ve got everything from people watching from the bleachers, to personnel saying, ‘Hey, put me in coach!’ So, I’ve got a broad spectrum,” Lugo said.
Nearly 250 people attended a recent “Challenge Day” that the CDAO hosted to connect with industry and academic partners about the challenges associated with implementing generative AI within the DOD.
“There’s a lot of interest in the area, but there’s not that many companies in it. So what we saw was that it’s not just the normal names that you would hear on a day to day basis — but there’s also a lot of companies interested in integrating models. There’s companies that are not necessarily known for LLMs or generative AI, but they are known for other types of integration in the data space and in the AI space. So that was good, because that means that there’s a good pool of talent that will be working on the challenges that we have submitted to industry,” Lugo said.
According to Lugo, the cadre has received more than 120 responses to its recent public request for information, which seeks input on existing generative AI use cases and the critical technical obstacles that accompany the technology’s emergence.
The RFI is about learning “what are the insights out there, what are the approaches to solving these particular challenges that we have. And as we compile that information, we will then go ahead and do a more formal solicitation through the proper processes,” he said.
On Nov. 30, industry and academic partners will have an additional opportunity this year to meet with Task Force Lima at the CDAO Industry Day. And down the line during the CDAO’s first in-person symposium — which is set to take place Feb. 20-22 in Washington — an entire track will be dedicated to Task Force Lima and generative AI.
Attendee registration opened in October, and the office is now accepting submissions for potential speakers at that event.
“I’m very optimistic that the challenges that we have submitted will be addressed — and hopefully corrected — by some innovative techniques,” Lugo told DefenseScoop.