A conversation from the chat logs of Kurt DelBene, the Department of Veterans Affairs Assistant Secretary for Information and Technology and Chief Information Officer, and Charles Worthington, Chief Technology Officer and Chief Artificial Intelligence Officer.

KURT: Hi folks, and welcome to our first (and maybe last? … we’ll see how this goes) chat on interesting IT topics at VA. Today, we’re talking about the ever-expanding topic of AI and how we’re thinking about it at VA. I’d like to welcome Charles Worthington to the chat. He’s our Chief Technology Officer and Chief AI Officer at VA. Welcome, Charles!

… Charles, you there?

CHARLES: Sorry, I had to put on some appropriate techno-futuristic music for this conversation.

I’m excited to talk AI!

KURT: I’m old school so I went with Baroque. Let’s start things off with a question on many people’s minds: Is AI overhyped right now?

CHARLES: I think it is obviously overhyped right now, at least within tech circles. Every tech company in America is becoming an AI company (or at least branding their product as an AI product). So yes, I think it is definitely overhyped.

However, unlike some of the other recent big “hype cycles,” I am convinced there is a lot more substance to this hype than there was for, say, the blockchain hype.

KURT: Oh blockchain. Gotta agree with you there. My bike now has its own blockchain.

How do you think it'll play out over the next ~5 years?

CHARLES: I would divide AI into two large categories: 1) analytical AI, which is the more traditional type of AI that uses models trained on a specific task to do things like make risk predictions or classify items into categories, and 2) generative AI, which is based on this emerging generation of large foundation models that are capable of generating new creative output like writing, images, and audio. Let’s start with the first bucket.

KURT: Ok, first bucket. What do you think are the best use cases for government, or for VA, specifically?

CHARLES: VA has been using the first type of AI for several years. We have several models that have been trained on a large amount of data and can do things like predicting the risk of a specific health condition, or the risk that a given financial transaction is fraudulent. I think that we will see more of these types of models incorporated into more software, where inferences could be useful. There are many cases like this in health care.

KURT: Do you see any particular analytic models as being highest impact at VA, or the first to be used more broadly?

And do they look like AI algorithms, or should we just expect them to be sprinkled everywhere?

CHARLES: Health care has a lot of situations in which these sorts of analytical AI tools might be useful. For example, VA has deployed tools that highlight potentially concerning areas of an image, such as an endoscopy result, and increase the chance that the doctor can identify an area that needs follow-up.

Since so much health data has been digitized in the past few decades, there is a large opportunity to use that data to identify risks more quickly or help surface specific potential risks based on an individual patient’s situation.

KURT: Makes a ton of sense. Do you see us doing innovation ourselves on these sorts of analytics, or mostly working with industry?

CHARLES: I think we will see it both ways. VA has long been a leader in health care innovation, based on our unique public service mission and our large population of patients and health care data. But we are also investing in tools that will allow us to take models and other innovations from outside VA and apply them to our own systems where that makes sense.

Also, I have resorted to using voice recognition to respond more quickly to your questions here.

KURT: Even that is influenced by AI these days. I know that speech-to-text models are now powered by AI, which has resulted in huge improvements in accuracy. A friend of mine is a researcher in the field, and it has totally changed the game for them.

CHARLES: Part of what is leading the current hype cycle is this new generation of AI tools broadly defined as generative AI. Things like ChatGPT and Claude. This category of tools is expanding what we believe computers could or should be capable of — and it’s leading to a dramatic rethinking about software of all kinds.

KURT: Before we go there, how do you think having these tools available changes the physician's job? Anything we need to worry about?

CHARLES: As we roll out new tools like analytical AI — tools that are fundamentally used to generate predictions — there are some unique issues we need to account for. We need to monitor the effectiveness and accuracy of these tools, both before they are launched and over time.

With traditional software, monitoring is a more straightforward task. There are a lot of tools that can tell us, “Is the system up or down?” or measure things like error rates or latency. In the AI space, it’s more complicated than “up or down” or “slow or fast.” We also need to monitor, “How good was the AI at performing the prediction or other task?”

Fortunately, our health care system is already well-suited to assessing the effectiveness of new technologies before they are rolled out, and it has methods for measuring effectiveness of various practices. But any time you’re dealing with health care, it is extremely important that you’re sure that the tools are accurate, and that the clinicians understand how to use them and how not to use them.

Ultimately, I think that these sorts of tools not only have the opportunity to increase the effectiveness of our health care system, but to also help our clinicians get better outcomes for their patients.

KURT: Makes sense. Is part of your job as Chief AI Officer to think through things like this? What is a Chief AI Officer?

Pretty cool title.

CHARLES: I was hoping you knew!

KURT: You’re fired. 😊

CHARLES: All kidding aside, the Chief AI Officer is a new position that every agency was required to create under the recent Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence.

There are two main responsibilities: First, ensuring that each agency is using AI in a responsible and trustworthy manner, especially for use cases that have the potential to impact someone’s health, rights, or safety. But second, the role is also about ensuring that VA is seizing the opportunities presented by AI to be as effective as possible at executing against our important mission.

KURT: So, getting us focused on the highest impact use cases and making sure we're using it responsibly.

Let’s switch gears and talk about large language model use cases. Where does that have the most potential?

CHARLES: This generative AI trend is the new category of AI that I think is leading to much of the recent hype. And it's easy to understand why if you've used any of these generative AI tools. The aperture for what computers are capable of — and what we should expect them to be able to do — has expanded significantly since the general public gained access to large language models.

We’re in the early stages of figuring out which generative AI use cases will be most powerful inside of VA, but we have a few early ideas.

KURT: I used it this weekend to ask it to characterize the NY Giants roster. Unfortunately, it didn’t tell me how the Seahawks could beat them. ☹

You talk about early use cases. Are there some you think show the most potential?

CHARLES: One where we have seen early success is in software development. Large language models have proven to be quite effective at understanding source code and helping developers with tasks like debugging, writing automated tests, and even generating new code based on the existing source code in a repository.

Another place where we’re looking at using this technology is in helping reduce clinician burnout by providing assistance with tasks like creating a medical note based on an encounter with a patient. Doctors spend a lot of time typing data into electronic medical records, based on discussions they have with their patients in an appointment. There is a new generation of tools that are designed to generate these clinical notes. They work by listening to an encounter, creating a transcription, and then turning that into a medical note that is ready for the clinician to review.

KURT: That’s one of our AI Tech Sprint use cases, right? What are we doing there?

CHARLES: That’s right. Veterans Health Administration (VHA) identified this as one of their top priorities, and earlier this year we ran a comprehensive tech sprint to assess over 100 different implementations of this type of technology.

The results were promising, and we decided to proceed with a pilot in a few medical centers, which we’re working to set up now.

KURT: So, we’re actually testing some real use cases that will have positive impacts on patient care — that’s exciting!

CHARLES: What I like about this approach is that we are focusing on a small number of use cases that have a real potential to help VA clinicians and patients, but we're moving deliberately and measuring outcomes as we go.

KURT: One thing I’ve wondered about: If AI is everywhere in the future, how does that change our approach to planning and software testing at VA moving forward? How do we hope to find potential errors and biases in the algorithms when they are already so pervasive?

CHARLES: In my mind, this is the biggest challenge that AI presents to IT organizations like ours.

KURT: So, you don’t know, either? You’re fired again.

I kid.

CHARLES: So much of what we do is based on measuring the effectiveness and availability of systems that act deterministically; that is, when given one set of inputs, they generate the same set of outputs, every single time. We can examine the reason they generate those outputs by looking at the source code and the way the system is built.

With AI systems, it is not enough to simply measure whether the system is up and monitor system errors. We need to measure the accuracy of the outputs of that system over time, some of which we won’t be able to determine until after we see the result. This implies a much different type of technology management approach — one that will require close coordination with the groups that measure business outcomes, and one that requires a high degree of statistical competency to be done accurately.
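The shift described here, from asking "is the system up?" to asking "how accurate were its predictions over time?", can be sketched as a rolling accuracy monitor. This is a minimal illustrative sketch, not any actual VA tool; the class name, window size, and alert threshold are all assumptions made for the example:

```python
from collections import deque

class AccuracyMonitor:
    """Track the rolling accuracy of a model's predictions as
    ground-truth outcomes arrive, and flag possible drift when
    accuracy falls below a chosen threshold.

    Illustrative only: names and thresholds are assumptions,
    not part of any real monitoring system.
    """

    def __init__(self, window=500, alert_threshold=0.90):
        # Each entry is 1 (prediction matched outcome) or 0 (it did not).
        self.window = deque(maxlen=window)
        self.alert_threshold = alert_threshold

    def record(self, prediction, outcome):
        # Outcomes often arrive well after the prediction was made,
        # so this is called only once the true result is known.
        self.window.append(1 if prediction == outcome else 0)

    @property
    def accuracy(self):
        # Accuracy over the most recent window; None until data arrives.
        if not self.window:
            return None
        return sum(self.window) / len(self.window)

    def drifting(self):
        # True when recent accuracy has dropped below the threshold.
        acc = self.accuracy
        return acc is not None and acc < self.alert_threshold
```

The key design point is that, unlike an uptime check, this metric can only be computed after the real-world outcome is observed, which is why this kind of monitoring needs close coordination with the teams that measure business outcomes.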

KURT: Totally agree with you there. And what happens if there's drift in the behavior over time? Whoa, complicated world.

Charles, thanks for taking the time to talk about this fascinating topic that has such huge potential for impact for VA and the care and services we deliver. We’ll have to come back in a number of years and see what happened — that is, unless both of us have been replaced by AI chatbots.

CHARLES: I enjoyed it! I am truly excited about the next decade of software work. I think that incorporating the capabilities of these new large language models has the potential to dramatically improve the way our software works and what it will be capable of. If we’re successful, I think every VA employee will be even more empowered to deliver excellent care and benefits to Veterans, which is what we’re all here to do!
