Pentagon may take a page out of Tesla’s playbook and run AI in ‘shadow mode’

Self-driving vehicles take part in a test ride on the A2 motorway in Zijderveld, on March 16, 2016. (Photo credit KOEN VAN WEEL/AFP via Getty Images)

In its pursuit of trusted artificial intelligence and autonomy, the Defense Department is looking at running additional algorithms on its platforms to test and monitor performance, as Tesla has done with its self-driving cars.

The concept, which Tesla calls “shadow mode,” helps innovators develop and test new software by having algorithms run in the background on a vehicle without actually taking operational control of the platform. The data collected shows how the autonomy technology running in “shadow mode” would perform in real-world operational settings if it were given control, so that developers can measure its safety and effectiveness and ensure it is trustworthy.
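As a rough illustration of the idea described above, the Python sketch below runs a candidate policy alongside the one that actually controls a platform, logs what each would have done, and returns only the primary decision. It is a toy model, not Tesla's or the DOD's implementation: the `ShadowRunner` class, the two distance-based policies, and the agreement-rate figure are all hypothetical names and choices made for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ShadowRunner:
    """Hypothetical harness: the primary policy acts, the shadow policy is only logged."""
    primary: Callable[[float], str]  # policy that actually controls the platform
    shadow: Callable[[float], str]   # candidate policy, evaluated but never executed
    log: List[dict] = field(default_factory=list)

    def step(self, observation: float) -> str:
        action = self.primary(observation)
        proposed = self.shadow(observation)
        # Record both decisions so developers can compare them offline.
        self.log.append({
            "obs": observation,
            "primary": action,
            "shadow": proposed,
            "agree": action == proposed,
        })
        return action  # only the primary decision reaches the platform

    def agreement_rate(self) -> float:
        """Fraction of steps where the shadow policy matched the primary one."""
        if not self.log:
            return 0.0
        return sum(entry["agree"] for entry in self.log) / len(self.log)


# Assumed toy policies: brake when an obstacle is closer than a threshold (meters).
def primary_policy(distance_m: float) -> str:
    return "brake" if distance_m < 10.0 else "cruise"


def shadow_policy(distance_m: float) -> str:
    return "brake" if distance_m < 15.0 else "cruise"


runner = ShadowRunner(primary_policy, shadow_policy)
actions = [runner.step(d) for d in (5.0, 12.0, 20.0, 30.0)]
print(actions)                  # decisions actually executed
print(runner.agreement_rate())  # how often the shadow policy agreed
```

In this sketch the shadow policy disagrees only at 12 meters, where it would have braked earlier. Disagreements like that one are the kind of data developers could review before giving a new algorithm control.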

“From my perspective, when I thought about for what we want to do in the trusted autonomy framework that we’re building [at the Defense Department] is really build in these ethics and responsible AI pieces as 1s and 0s into the machine intelligence. And so you just think about what Tesla is doing with the shadow mode AI, when you’re driving your car there’s an AI — a core AI that’s operating your car. In the background, there’s a whole AI that’s running that’s for development,” Jaret Riddick, principal director for autonomy in the Office of the Undersecretary of Defense for Research and Engineering, said Sept. 22 at the AUVSI Defense conference.

He continued: “We’re seeing that even now. So, imagine in the future what we can do with building in the machine intelligence into the architecture and … imagine layers of the stack that are dedicated to trust that are separate AIs that are focused just on that.”

The DOD is pursuing a variety of unmanned systems with varying levels of autonomy, including drones, robotic combat vehicles and uncrewed ships, as it tries to maintain a competitive edge against advanced adversaries like China. But the Pentagon also wants to make sure the technology doesn’t fail in combat or in other scenarios, such as by attacking the wrong targets. That has led to the push for “responsible AI” and trusted autonomy.

However, some officials and other observers are concerned that the DOD might be a slow adopter of artificial intelligence, and they expect that China and Russia won’t be slowed down by concerns about ethics and trust when developing these types of technologies.

But Riddick sees “shadow mode” processes as a way to help keep the movement toward trusted AI on track.

“This question comes up all the time, you know, when I talk about constant monitoring” of the artificial intelligence technology, Riddick said. “The notion of the state of engineering [and] state of trust, people say it’s going to slow down the autonomy. I wholeheartedly disagree, because we already see folks like at Tesla, you know, running AIs in the background. Imagine those AIs in the future being an AI in the back of this embedded in these structures, that’s totally there to control trusted behavior of the autonomy.”

Riddick is spearheading the Pentagon’s Operational Trust in Mission Autonomy (OPTIMA) initiative. The aim is to deliver trusted autonomy to give U.S. military forces a leg up in the complex, multi-domain battlefields of the future.

DOD officials expect these cutting-edge machines to operate as teammates to service members, not simply to replace them. They often talk about the need to achieve a level of comfort where commanders and their troops feel like they can rely on autonomous platforms to do what they’re supposed to do.

However, Riddick’s team isn’t looking at trusted AI from a purely psychological perspective. They want numbers to back it up and aid the development of new tech.

It’s “not as an emotion or a subjective or something that will just happen over time, but trust as a product that we can look at as a quantifiable asset on the battlefield,” he said. “In the objectives for the optimization, I’m focusing on quantifiable metrics.”

These metrics will be system-dependent and mission-dependent, he noted.

Some observers are skeptical about this approach.

“Many people have told me directly to my face, ‘You can’t measure trust, you don’t know what to measure.’ Particularly neuroscientists. I’ve upset a lot of neuroscientists, right. And other folks have told me, ‘Well, there are hundreds of definitions for trust. What are you talking about?’ I’m talking about operational trust and I’m talking about mission autonomy or … objective tactical autonomy,” Riddick said.

Once the Defense Department determines what the quantifiable metrics are, they are expected to guide technology developers.

“In the engineering sense, an engineering state of trust can be characterized [and] parameterized,” Riddick said. “Once we have an understanding what those quantifiable metrics are, we have a very powerful tool to use within program development … And that link between quantifiable metrics and operational effectiveness, give us then tools that we can use to design process that we can use to motivate the acquisition of systems, ensuring that we’re buying trusted autonomy.”

Riddick also sees other benefits.

“It also gives a link then to the research side, right. So, if we have guidance within the acquisition lifecycle, too, for PMs and PEOs of the world to understand by virtue of quantifiable metrics how to ensure that they’re buying trusted autonomy. Over time, you know, these quantifiable metrics then become commercial opportunities for small businesses to come in and say, ‘Here are the quantifiable metrics that would, you know, ensure you’re buying trusted autonomy for your system.’ But it also creates a bridge to research for researchers to continually deliver pathways to quantifiable metrics for trust,” Riddick added.

The DOD is currently in the “exploration phase” of its trusted autonomy endeavors, but eventually, it will transition to “exploitation.”

“Once we’re able to sort of pick out the places where we can put some real guidance for folks in the acquisition lifecycle, we can transition,” Riddick said. “We will evolve into exploitation in terms of standards that will lead to the type of scalability that we want.”