Insider Q&A: CIA’s chief technologist’s cautious embrace of generative AI

Knowledge advantage can save lives, win wars and avert disaster. At the Central Intelligence Agency, basic artificial intelligence – machine learning and algorithms – has long served that mission. Now, generative AI is joining the effort.

CIA Director William Burns says AI tech will augment humans, not replace them. The agency’s first chief technology officer, Nand Mulchandani, is marshaling the tools. There’s considerable urgency: Adversaries are already spreading AI-generated deepfakes aimed at undermining U.S. interests.

A former Silicon Valley CEO who helmed successful startups, Mulchandani was named to the job in 2022 after a stint at the Pentagon’s Joint Artificial Intelligence Center.

Among agency projects: A ChatGPT-like generative AI application that draws on open-source data (meaning unclassified, public or commercially available). Thousands of analysts across the 18-agency U.S. intelligence community use it. Other CIA projects that use large-language models are, unsurprisingly, secret.

This Associated Press interview with Mulchandani has been edited for length and clarity.

Q: You recently said generative AI should be treated like a “crazy, drunk friend.” Can you elaborate?

A: When these generative AI systems “hallucinate,” they can sometimes behave like your drunk friend at a bar who can say something that pushes you outside your normal conceptual boundary and sparks out-of-the-box thinking. Remember that these AI-based systems are probabilistic in nature, so they are not precise (they are prone to fabrication). So for creative tasks like art, poetry, and painting, these systems are excellent. But I wouldn’t yet use these systems for doing precise math or designing an airplane or skyscraper – in those activities “close enough” doesn’t work. They can also be biased and narrowly focused, which I call the “rabbit hole” problem.

Q: The only current use of a large-language model at enterprise scale I’m aware of at CIA is the open-source AI, called Osiris, that it created for the entire intelligence community. Is that correct?

A: That’s the only one we have disclosed publicly. It’s been an absolute home run for us. We should broaden the discussion beyond just LLMs though — as an example, we process huge amounts of foreign language content in multiple media types including video, and use other AI algorithms and tools to process that.

Q: The Special Competitive Studies Project, a high-powered advisory group focused on AI in national security, is out with a report saying U.S. intelligence services must rapidly integrate generative AI — given its disruptive potential. It sets a two-year timeline for getting beyond experimentation and limited pilot projects and “deploying Gen AI tools at scale.” Do you agree?

A: CIA is all in 100% on utilizing these technologies and scaling them. We are taking this as seriously as we’re taking probably any technology issue. We think we’ve beaten the initial timeline by a big margin, as we’re already using Gen AI tools in production. The deeper answer is that we’re on the early side of a huge number of additional changes, and a large part of the work is to integrate the technology more widely into our applications and systems. These are early days.

Q: Can you name your large-language model partners?

A: I’m not sure naming the vendors is interesting right now. There is an explosion of LLMs available on the market now. As a smart customer, we are not tying our boat to a specific set of LLMs or a specific set of vendors. We are evaluating and using practically all the high-runner LLMs out there, both commercial-grade and open source. We are not viewing the LLM market as a singular one where a single lab is better than the others. As you’re noting in the market, models are leapfrogging one another with each new release.

Q: What are the most important use cases at CIA for large-language models?

A: Primary is summarization. It’s impossible for an open-source analyst at CIA to digest the firehose of media and other information we collect every day. So this has been a game-changer for insights into sentiment and global trends. Analysts then dig into specifics. They must be able – with full certainty – to annotate and explain data they cite and how they reach conclusions. Our tradecraft has not changed. The additional pieces give analysts much broader perspective – both the classified and open-source pieces we gather.

Q: What are the biggest challenges of adopting generative AI at the agency?

A: There isn’t a lot of cultural resistance internally. Our employees deal with AI on a daily basis competitively. Obviously, the whole world is on fire with these new technologies and the amazing productivity gains. The trick is grappling with constraints we have on information compartmentalization and how systems are built. In many cases, the separation of data is not for security but legal reasons. How do we efficiently connect systems to get the benefits of AI while keeping all that intact? Some really interesting technologies are emerging to help us think this through – and combine data in ways that maintain encryption and privacy controls.

Q: Generative AI is currently about as sophisticated as an elementary school student. Intelligence work, by contrast, is for grown-ups. It’s all about trying to pierce an adversary’s deception. How does Gen AI fit into that work?

A: First, let’s emphasize that the human analyst has primacy. We have the world’s leading experts in their domains. And in many cases of incoming information, a huge amount of human judgment is involved to weigh its importance and significance – including of the individuals who may be providing it. We don’t have machines replicate any of that. And we’re not looking for computers to do the jobs of domain experts.

What we are looking at is the co-pilot model. We think Gen AI can have a huge impact in brainstorming, coming up with new ideas. And in boosting productivity – and insight. We have to be very deterministic about how we do it because, wielded properly, these algorithms are a force for good. But wielded incorrectly, they can really hurt you.