
Episode 256: Recursive Pollution? Data Feudalism? Gary McGraw On LLM Insecurity

In this episode of The Security Ledger Podcast (#256), Paul speaks with Gary McGraw of the Berryville Institute of Machine Learning (BIML) about that group's latest report: an Architectural Risk Analysis of Large Language Models. Gary and Paul talk about the many security and integrity risks facing large language model machine learning and artificial intelligence, and how organizations looking to leverage AI and LLMs can insulate themselves from those risks.

[Video Podcast] | [MP3] | [Transcript]


Four years ago, I sat down with Gary McGraw in the Security Ledger studio to talk about a report released by his new project, The Berryville Institute of Machine Learning. That report, An Architectural Risk Analysis of Machine Learning Systems, included a top 10 list of machine learning security risks, as well as some security principles to guide the development of machine learning technology.

Gary McGraw is the co-founder of the Berryville Institute of Machine Learning

Back then, the cyber risks linked to machine learning and AI were mostly hypothetical. Artificial intelligence was clearly advancing rapidly, but – with the exception of cutting-edge industries like high tech and finance – its actual applications in everyday life (and business) were still matters of conjecture.

An update on AI risk

Four years later, A LOT has changed. With the launch of OpenAI's GPT-4 large language model (LLM) in March 2023, the uses and applications of AI have exploded. Today, there is hardly any industry that isn't looking hard at how to apply AI and machine learning technology to enhance efficiency, improve output and reduce costs. In the process, the issue of AI and ML risks and vulnerabilities – from "hallucinations" and "deep fakes" to copyright infringement – has also moved to the front burner.

Back in 2020, BIML's message was one of cautious optimism: while threats to the integrity of machine learning systems were real, there were things that the users of those systems could do to manage the risks. For example, they could scrutinize critical components like data set assembly (where the data set that trained the model came from), the data sets themselves, the learning algorithms used, and the evaluation criteria that determine whether or not the machine learning system that was built is good enough to release.

AI security: tucked away in a black box

By controlling for those factors, organizations that wanted to leverage machine learning and AI systems could limit their risks. Fast forward to 2024, however, and all those components are tucked away inside what McGraw and BIML describe as a “black box.”

So in 2020 we said: there's a bunch of things you can do around these four components to make stuff better and to understand better what you're working on from a security perspective. In 2024, when it comes to LLMs, you don't get to look in that black box anymore.

-Gary McGraw, BIML

In BIML's latest report, "An Architectural Risk Analysis of Large Language Models: Applied Machine Learning Security," McGraw and his colleagues did a deep dive into LLMs and the dangers of the "black boxes" that Google, OpenAI, Microsoft and others are building their AI systems on. The report finds that those black box foundation models hide critical risks from the users who apply and blindly trust the AI tools built on top of them. Those risks include so-called "recursive pollution," the 'echo chamber' that happens when LLMs begin consuming training data produced by other LLMs. There are also legal risks, such as the unwitting use of copyrighted material in the output of LLMs, and "poison data" with "harmful associations" finding its way into the data used to train LLMs.

The biggest risk posed by large language model AI like ChatGPT? "It's this: large language models are often wrong," McGraw told me. "And they're very convincingly wrong and very authoritatively wrong."

Check out my full interview with Gary above, or view a video of our conversation below!

Video Podcast and Transcript

Video Podcast

You can watch a video of my interview with Gary below. Check out more Security Ledger podcast interviews on our YouTube channel!

Transcript

[00:00:00]

Paul Roberts (Security Ledger): Welcome back to the Security Ledger podcast. I'm Paul Roberts, your host, and happy to welcome back to the podcast Mr. Gary McGraw, lately of the Berryville Institute of Machine Learning. Gary, welcome back.

Gary McGraw, BIML: Hi. Great to be here again, Paul.

Paul Roberts (Security Ledger): Great to see you. You are a legend in cybersecurity. You literally wrote the book on software security, but lately you've turned your attention to machine learning and artificial intelligence and the kinds of cybersecurity risks therein. Remind us a little bit about the Berryville Institute of Machine Learning. What is it? And since we talked to you last, what have you guys been up to?

Gary McGraw, BIML: I retired and I am extremely bad at retirement. So if you like what I did in software security, you can certainly borrow that idea and the philosophy, but do [00:01:00] not borrow the way I do retirement. So one of the things that I did in a previous life before software security and Cigital was to be a student of Doug Hofstadter's at Indiana University, where I wrote an AI program like in 1995.

When I retired, I thought, gosh, what's really going on in AI? What is all this deep learning stuff? And have we made any progress since I used to write about this in the eighties, publishing science stuff? And the answer was really "No." We haven't made much progress at all. It's the same algorithms, pretty much the same architectures.

The only thing that's changed is the scale. We have unbelievable piles of data and we have more cycles than ever. And so some of these old ideas are applying in spades. And after reading the edge of science for a few months, I formed a little research group and we started reading more [00:02:00] seriously in machine learning at the edge, and we found out, gosh, somebody needs to think about applying the "Building Security In" philosophy to machine learning itself. So one important thing to realize is that what we do at BIML, the Berryville Institute of Machine Learning, is we apply security engineering to the very technology of machine learning, as opposed to using machine learning to do security. So this is the security of machine learning itself. So when we think about things like what's going on with LLMs, there's been a lot of surprising progress. Some people think that LLMs can pass naive versions of the Turing test, for example. And so we took our generic risk analysis of machine learning systems in general, which we published [00:03:00] in January 2020, and we took the very same idea and we applied that to large language models. The result of that was published last Wednesday, January 24, 2024.

And it is a risk analysis of large language models in the spirit of "Building Security In." That's what we've been up to.

Paul Roberts (Security Ledger): And it's been about, it's been almost exactly four years since your first report came out, which was January 2020. This version focused on large language model AI, and the first report was focused on what?

Gary McGraw, BIML: The first report was focused on generic machine learning of any sort. It was not focused in on large language models. It was zoomed out to include large language models. So everything we had to say in 2020 is entirely relevant. But what we found out when we looked more closely at LLMs [00:04:00] is that some of the components of our generic process model had been placed in a black box by the vendors. In fact, four of the very important components which we identified in 2020 are no longer something that users get to think about or control the risks of, because they buy the black box or rent it, as the case may be, from OpenAI or Microsoft or Google or Meta or Anthropic or whoever.

Paul Roberts (Security Ledger): So the gist of the report in 2020 – and I think your approach in general – is that transparency is critical here, particularly around the kind of core architecture of these machine learning models, right? That you can't have obscurity, because that prevents you as the consumer of this information from really knowing where it came from or how it came about?

Gary McGraw, BIML: A little bit. It's slightly less political than that. And...

Paul Roberts (Security Ledger): Me, me, political? [00:05:00] How dare you!

Gary McGraw, BIML: Not that there's anything wrong with that, Paul, but the idea is this: security is not the same size and shape for everyone. One person's "secure enough" is another person's "what the hell are you doing?"

And so you have to make these risk management decisions and trade-offs. Yourself or your organization has to. The problem with the black box model of large language foundation models – LLM foundation models – is that those risk management decisions were made by the vendor and not by you, and you just have to abide by them.

Now, that's not so bad, as long as you knew what they were and you knew exactly how those guys built that, exactly what they used to build it out of – most importantly, when we talk about the internet scrape, for example. And the problem is that they're not being very forthcoming about exactly how those things are built, what data they were built on, [00:06:00] what exact algorithms they use – those sorts of things that can help you to make better risk management decisions for your organization.

Paul Roberts (Security Ledger): So I think you just ran down this list, but what you mentioned there are four critical controls that you identified in your 2020 report that have now been brought into the black box, so that consumers or customers of OpenAI or whomever no longer have a say or ability...

Gary McGraw, BIML: They’re not controls, Paul. They’re four components in our generic model of machine learning, and those components are data set assembly. Where did the data set that you did the training on come from? The data sets themselves, including validation and test sets, but most importantly for LLMs, the base training set, the learning algorithms that are used and the evaluation criteria that are used to determine whether or not the machine learning system that you’ve built is good enough to release.

Those are all inside the black box. [00:07:00] So in 2020 we said: there's a bunch of things you can do around these four components to make stuff better and to understand better what you're working on from a security perspective. In 2024, when it comes to LLMs, you don't get to look in that black box anymore.

Paul Roberts (Security Ledger): Could you give us an example of how differences or – what's the word I'm looking for? – tainting of any of those factors might really skew or influence how the large language model works and what comes out of it?

Gary McGraw, BIML: Sure. So one of the things that we did was we identified what we believe at BIML are the top 10 risks, from a security and privacy and kind of just general "oh shit" perspective, that there are in LLMs. Number six makes a good example. That one's called poison in the data, and it's directly related to this black box notion.

And here's what we say in our report. Data [00:08:00] play an outsized role in the security of a machine learning system and have a particularly tricky impact on LLMs. That's because an ML system learns to do what it does directly from its training data. Remember, it's becoming the training data. Sometimes data sets include poison by default.

For example, you can think about what the Stanford guys recently identified as CSAM in existing training sets. So if an attacker can intentionally manipulate the data being used by a machine learning system in a coordinated way, the whole system can be compromised. But in the case of LLM foundation models, the huge internet scrape that they use to get enough data to make this work is full of poison and garbage and nonsense and noise by default, much of which is difficult or impossible to scrub out. And so what that means is that we don't need an attacker to poison the data. There's already [00:09:00] poison in the data that we grabbed off the internet.

Paul Roberts (Security Ledger): One of the things that has come up around AI and large language model AI like ChatGPT are cases like – you hear about the attorney, right? Who wrote their brief using ChatGPT, and it just invented a whole bunch of precedent cases out of thin air, or invented the Washington Post article that it cites to support itself.

Is that – are those manifestations of the types of things you're thinking about, that you're talking about?

Gary McGraw, BIML: They are. And in fact, model trustworthiness is our number nine risk. These models are stochastic in nature and they really are ultimately auto-associative predictive generators, so they don't know anything. All they're doing is predicting what the next utterance ought to be in the stream of text.

So it's surprising that they do what they do. Like, it's amazing to me that they can maybe pass [00:10:00] naive versions of the Turing Test and that some very silly people believe that they're sentient and so on, because what they're really doing is predicting what the next word ought to be and what next word will make you super happy and think that they're having a cool conversation with you. Generative models…

Paul Roberts (Security Ledger): But Gary, how’s that different from what we do as people?

Gary McGraw, BIML: That's a really good question. I don't... I would hope that humans have a better cognitive capability and a...

Paul Roberts (Security Ledger): And suddenly we’re having an ontological discussion, Gary.

Gary McGraw, BIML: As a cognitive scientist, I will tell you that I believe that large language models do not do any understanding or reasoning the way we do as humans.

So the one cognitive system whose workings we know the best – ours – is not how this thing works. Trustworthiness is problematic for a couple reasons. One is it's got a stochastic output sampling algorithm built into it. So it picks which word out of a possible pile of words to utter [00:11:00] next.

And both the input, which is in the form of slippery natural language prompts – language is not exactly precise; it's got a big kind of meaning cloud around it when we use it – and the generated output are the same way. It's also in the form of natural language that's just wildly unstructured.

This auto-associative predictive generation seems to have some understanding behind it. But actually, it's auto-associative predictive generation of text. Now, why you should use that to do lawyering is beyond me – or history, or, I don't know, anything that really counts. Medical stuff. We've built some really cool technology that's really fun to play with, but it's not clear that we should be using it to do some of the things we're thinking about using it for now.
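To make the mechanics behind Gary's point concrete: "auto-associative predictive generation" boils down to sampling the next token from a probability distribution over a vocabulary. The sketch below is a minimal illustration of that stochastic output sampling – the vocabulary, scores and temperature are made-up assumptions for illustration, not code from the BIML report or from any production model.

```python
import math
import random

def sample_next_token(scores, temperature=1.0, rng=random):
    """Draw the next token from a temperature-scaled softmax over candidate words.

    This is the 'stochastic output sampling algorithm' in miniature: every
    candidate word gets a probability, and one is picked at random in
    proportion to it -- the model never 'knows' anything, it just predicts."""
    scaled = [s / temperature for s in scores.values()]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(list(scores.keys()), weights=weights, k=1)[0]

# Hypothetical scores for the word following "The capital of France is"
next_word_scores = {"Paris": 9.1, "Lyon": 5.2, "Berlin": 4.7, "cheese": 1.0}

for _ in range(3):
    print(sample_next_token(next_word_scores, temperature=0.8))
```

Because each word is drawn at random in proportion to its score, the same prompt can yield different continuations on different runs – fluent, confident, and occasionally flatly wrong, which is the trustworthiness problem BIML ranks as risk number nine.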

Paul Roberts (Security Ledger): Your background, certainly [00:12:00] for the last 20, 25 years, is in information security, cybersecurity. What, in your mind, are the biggest cyber threats to machine learning and artificial intelligence?

Gary McGraw, BIML: I'm gonna, I'm gonna turn that on its head. Instead of answering the exact question you ask, I'm gonna use the slippery nature of language and the questions that Paul asked to answer what I wanna answer, which is, I think, the biggest risk that we talk about in our paper surrounding machine learning and large language models. It's this: large language models are often wrong, and they're like very convincingly wrong and very authoritatively wrong.

Now, a lot of people like to call this hallucination, as if it's a cutesy little thing, but it's not hallucination. It's just freaking wrong.

Wrong is bad in a lot of contexts, and it’s really bad in a large number of contexts. Now here’s the deal. Here’s the number one risk. Imagine [00:13:00] that we produce some wrongness with our LLMs and we publish it on the internet.

Now, when it comes time for the next generation to find a bunch of data to build an LLM out of, where do they get it? The internet. This is a feedback loop, and what we call recursive pollution, which is an unbelievable risk to society at large and science and real facts and all sorts of things.

Paul Roberts (Security Ledger): And in fact, I think about this – obviously, as a writer and a journalist, I think about this – in the context of scraping blogs and web pages, taking that content to create AI-generated facsimiles of that content. But of course, any writer would tell you, boy, that AI-generated version is a real step down.

And as you say, then, as you continue that loop into succeeding generations, right? So you take that really poor quality AI content, you generate some [00:14:00] AI content based on that. It's almost like, in a few iterations, maybe it's not even English or whatever language you're writing...

Gary McGraw, BIML: Let me explain why this happens. Think about the use of language or a particular word as a bell curve. Now what a machine learning algorithm does ultimately is it focuses the bell curve. It builds a statistical generalization over a set of inputs that cuts off the tails on both sides of the bell curve.

So the Gaussian kind of shrinks up and gets taller. If you have Gaussians like that, which are artificial – they're not like how people use the thing, it's how LLMs use those same tokens – if you eat that a lot, the tails disappear. So all of the subtlety inside of our language use, all of the creative who's-he-what's-it that makes Paul Roberts such a great writer as opposed to some LLM on the street.

All that [00:15:00] sparkle and all that subtle nuance goes away and that is a huge problem. That’s what recursive pollution leads to for mathematical reasons.
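Gary's bell curve argument is easy to see in a toy simulation: let each "generation" learn a Gaussian from samples produced by the previous generation, but keep only the middle of the distribution when it learns. The sketch below is a made-up illustration of that tail-cutting – the 80 percent keep-fraction and the one-dimensional Gaussian are assumptions for demonstration, not numbers or methods from the BIML report.

```python
import random
import statistics

def truncate_tails(samples, keep_fraction=0.8):
    """Drop the extreme values on both sides -- the learned model 'cuts off
    the tails' of the distribution it was trained on."""
    samples = sorted(samples)
    drop = int(len(samples) * (1 - keep_fraction) / 2)
    return samples[drop:len(samples) - drop]

# Generation 0: 'human' language use, a wide bell curve.
mean, spread = 0.0, 1.0

for generation in range(6):
    # Each generation trains on text produced by the previous generation...
    samples = [random.gauss(mean, spread) for _ in range(10_000)]
    # ...but the statistical generalization favors the middle of the curve.
    kept = truncate_tails(samples, keep_fraction=0.8)
    mean, spread = statistics.fmean(kept), statistics.stdev(kept)
    print(f"generation {generation}: standard deviation = {spread:.3f}")
```

Run it and the standard deviation shrinks every generation – the Gaussian "shrinks up and gets taller," and the tails where the sparkle and nuance live are exactly what recursive pollution eats away.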

Paul Roberts (Security Ledger): So one of the points that you make – and BIML wrote a press release to go along with this report and a blog post, and you were on Google's podcast and all kinds of cool stuff, now you've been on the Security Ledger podcast – but one of the warnings you have to the private sector, to enterprise basically, or corporations, is: hey, you all are throwing yourselves into these black box large language model platforms, environments.

In many cases, laying off hundreds or thousands of workers because you anticipate replacing them with AI, and you have no idea how these things work, because they're black boxes, right? And so you don't really fully understand the risk that you've taken on by getting that OpenAI API [00:16:00] and going to town with it. What would your advice – talk about that, and then also, what would your advice be to some C-suite person who is under a lot of pressure to figure out a way to leverage AI to reduce costs and increase productivity?

Gary McGraw, BIML: Yeah, I think the answer is to understand that, though we do identify these risks in our report – the black box risks, twenty-three that we identified as associated directly with the black box we've been talking about – those risks have been managed by Google and Meta and OpenAI and Microsoft.

It's not like they didn't make a decision. They did, but what you need to do when you think about using an LLM foundation model is ask: is that risk decision that those guys made appropriate for my use? If I build an application on top of that, does that meet my risk management requirements? So the number one, I don't know, communication to CISOs is actually included on the [00:17:00] first page of our report, where we built an executive summary.

And the reason that's there is because Jim Routh, who used to be a very well-known CISO – he was both the AppSec guy at JPMorgan Chase and also the CISO of Aetna – read our report early and said, sure would be nice to have a one-page kind of impact statement for CISOs that explains what you're doing here.

And so we put that in, and that is, in fact, page three. So check out page three if you wanna know what it is. But ultimately, those people that are responsible for building an LLM application, they're gonna be held accountable both for the choice of which LLM foundation model to use and for the proper construction of the LLM application on top of that.

So the complete system requires accountability. That means the CISO is gonna be holding the black box bag. Don't forget that you're holding that.

Paul Roberts (Security Ledger): So there's been a lot of discussion [00:18:00] of legislation and regulations regarding AI and large language models, both at the federal level, and a lot of states have either passed stuff or at least talked about passing stuff. One of your kind of recommendations – BIML's, that is – is: focus on this black box stuff.

Don't go down the rabbit hole of, we're gonna outlaw using AI here in this context and in that context, but we'll allow it here and there. Rather, focus on the black box, the foundation of...

Gary McGraw, BIML: Focus on the construction of the black box. How'd you build it? What'd you build it out of? And we actually have very clear and explicit calls for regulation by saying: make them tell us what data sets they used, where they got 'em, how they cleaned them, why it's not full of poison and garbage. And that should be a good thing.

You should be proud of where your data came from, not so embarrassed that you try to hide it. But ultimately, I wanna say this: we do believe that [00:19:00] LLMs should be regulated, but regulation should first target the LLM foundation models and only then target the downstream use of those models. So we don't think you should not regulate the use.

We just think you gotta do first things first.

Paul Roberts (Security Ledger): So you also point out that a lot of the data – the foundational data that feeds these models, and you said they need huge amounts of data in order to learn – that's becoming a balkanized environment as well. Siloed. And so folks who have bodies of clean data or valuable data are cutting off access to it, as opposed to maybe where things were five or 10 years ago.

First of all, is there a fix for that? Second of all, we know that quality sausage is sausage that doesn't have E. coli in it. But do we have a similar measure for data? I'm not sure we do.

Gary McGraw, BIML: I love that analogy, Paul. And we actually use that inside of BIML – the USDA and [00:20:00] meat – a lot. And we haven't expressed it that way, but you hit the nail right on the head. The problem that you are describing, that dividing up of the data pile, is what we refer to as data feudalism.

Paul Roberts (Security Ledger): Yeah, feudalism, right.

Gary McGraw, BIML: That's a term that I coined a while ago. And the feudalism is happening because a large language model requires a ridiculous amount of data. So 14 trillion data points is so big that people who talk about big data flippantly don't understand what big means. That's really big, and one of the questions that scholars are actually asking today, without being silly, is: are we gonna run outta data? And then we're calculating things like how much human utterance happens in a day on the whole planet. Is that enough data? Those are real questions. So the feudalism is happening because Google has all their search data.

X, or what [00:21:00] used to be Twitter and is now a disaster, has a whole bunch of historical data that we gave 'em for free. Meta has a whole bunch of data. All of those data sets they used to share amongst themselves. Now they don't, or they come up with backroom deals. So if you're an upstart who wants to build an LLM and you don't have access to those enormous data sub-oceans, what are you supposed to do?

Is there enough data even for you to do that? We don't really know the answer to that right now, because where we are in the state of things is we have this scaling model and we know what seems to be working, which is immense – so enormous that it's hard to think about how big it is. And not only that, in order to get a data set that big, we had to just include a lot of garbage and poison and crap and painosity in there.

Here's a question: if we cleaned all that out of there, would the remaining data set be big enough? And how would we do that? Nobody knows the answer to that yet.

Paul Roberts (Security Ledger): Yeah, because you talk about clean datasets and [00:22:00] then you talk about X, and it's like, what's on X is – holy cow. Yeah, there is some valuable data in there and there's also just a lot of conspiracies and spam and bots and everything else. Yeah.

Gary McGraw, BIML: Sure. And even that's not that much data. So the question is: is there enough clean data to build an LLM the way that we did it this time? Or do we have to come up with new ways? So that's a, that's a challenge that we all have to think about as scientists. The other problem is this: it costs sixty-five million dollars to build a big LLM, the last time anybody wrote it down for OpenAI. Schmidt said in the Wall...

Paul Roberts (Security Ledger): How much is that? Sixty-five million? Hold on.

Gary McGraw, BIML: Yeah. Yeah. Schmidt said the other day in the Wall Street Journal, a hundred million, so let's just squint and call it a hundred million. That is so much money that your usual university computer science department doesn't have that kind of money to build one of these things.

So when we try to do testing [00:23:00] to try to understand, for example, whether the transformer representation methodology is the right way to do it or not – we just do it that way, but nobody's done enough science to know whether that's the best way, and it's too expensive for big science to take on right now.

So we've got a real challenge, and in fact, that's another one of our top 10 risks: this idea of economic – what do we call it – reproducibility economics. The...

Paul Roberts (Security Ledger): Sounds like a job for the federal government. Gary, they got a hundred million dollars to throw around. They got a hundred billion dollars to throw around

Gary McGraw, BIML: Yeah. I just – but then again, you read what NIST puts out and you're like, this is not about an attack list, guys. So...

Paul Roberts (Security Ledger): Talk about that. Talk about the work that NIST has done and how we should be looking at that or thinking about it.

Gary McGraw, BIML: Okay, I have two thoughts about that. Number one thought is: talking about attacks is a great idea and you should do it. And I'm the guy who wrote Exploiting Software – don't forget [00:24:00] that – which I thought was gonna get me in big-ass trouble, and I didn't get arrested. So I'm very much in favor of talking about attacks. However, you cannot just go through a laundry list of attacks in order to secure a system. Building security in is not just testing with a set of known attacks against the system. It is not simply red teaming from the outside in, or prompt manipulation by pizza guys, is the way I would put it.

And unfortunately, so far the NIST work seems to be relegated to piling up the attacks in an only moderately useful taxonomy. We produced a taxonomy at BIML in 2019, which I think is better than what came out last week. And, not to say that's the be-all end-all, but here's the deal: we have got to talk about things like recursive pollution and data debt and reproducibility [00:25:00] economics, not prompt injection by pimply-faced pizza guys.

And that’s, no offense to pizza guys, but that’s the problem. So that’s what I think about it.

Paul Roberts (Security Ledger): So, first of all: one of the things that you do – you personally and BIML as a group – is consult with different organizations that are either working on large language models or interested in learning more about them.

So I’d be interested in what sense you get just from those engagements of where people are at in terms of appreciating these risks and actually trying to do things to manage ’em.

Gary McGraw, BIML: Yeah, the answer is that there are a lot of large organizations, especially in financial services as a vertical, that have been using machine learning for decades. And they have entire data science groups that are now grappling with this new kind of data science stuff. And they're finding that the controls that they built over the years are being ignored or, I don't know, [00:26:00] circumvented accidentally by lines of business that are running out and buying LLM stuff and just standing it up without thinking through the implications. So it's like when software was doing an end-run around network security and we had to create software security groups.

Guess what we need now? Machine learning security groups. And so that's what I'm seeing. What I'm seeing is that the level of sophistication among some of the people that I talk to is very high, but the level of understanding what to do about this from an organizational perspective is still something that we're working out properly. And really smart CISOs have been working on this for a good couple of years. We don't have all the answers, but this is not something that's coming as a surprise to a lot of people. Rather, it's: gosh, how are we gonna control this in a reasonable fashion without pretending we can stop it?

Now, I do want to tell you a little side story if I got time. [00:27:00] That side story is this: one of the companies that I advise is called Legit Security. And the Legit guys are very good at looking around in a software development situation and finding software security stuff like, I don't know, Jira piles and different kinds of static analysis tools and different kinds of testing tools.

And they're good at finding what's actually out there, as opposed to what people think they have. So we were talking about that and I said: guys, what happens if you look for AI? And they said, oh – and they built it. And you know what happens when you look for AI with some CISOs? They say the following: we don't have any machine learning in our organization because we outlawed it by policy.

So our organization doesn't have that. And you say, that's interesting. Can we just run this little tool and see what we see? And the answer is...

Paul Roberts (Security Ledger): Yeah, we also don’t use Telnet. We outlawed that

Gary McGraw, BIML: Oh. Oh. Yeah, some of the CISOs that did have a handle on it have less of a handle than they think. So the answer is: early days.

Paul Roberts (Security Ledger): And how is it creeping in? Or how is it evading detection? Like in what form?

Gary McGraw, BIML: It's not creeping in. Ah, it's being used by their own people inside the organization. It's lines of business doing something, the marketing people writing copy, the developers using code generation-capable copilots – like, all over the place. Just everywhere.
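For readers who want a feel for what "looking for AI" in an organization's code can mean, here is a hypothetical sketch that scans a repository's dependency manifests for well-known ML and LLM packages. It is not Legit Security's product, and the package watchlist is an assumption made up for illustration – just one crude way shadow AI tends to surface even where policy says it doesn't exist.

```python
"""Toy shadow-AI scan: flag ML/LLM packages declared in dependency manifests."""
import json
import pathlib

# Hypothetical watchlist; a real tool would use a much larger, curated inventory.
AI_PACKAGES = {"openai", "anthropic", "transformers", "langchain", "torch", "tensorflow"}

def find_ai_dependencies(repo_root="."):
    """Walk the repo and report (manifest, package) pairs that look like AI usage."""
    hits = []
    root = pathlib.Path(repo_root)
    for manifest in root.rglob("requirements.txt"):
        for line in manifest.read_text(encoding="utf-8", errors="ignore").splitlines():
            name = line.split("==")[0].split(">=")[0].strip().lower()
            if name in AI_PACKAGES:
                hits.append((str(manifest), name))
    for manifest in root.rglob("package.json"):
        text = manifest.read_text(encoding="utf-8", errors="ignore") or "{}"
        deps = json.loads(text).get("dependencies", {})
        hits.extend((str(manifest), name) for name in deps if name.lower() in AI_PACKAGES)
    return hits

if __name__ == "__main__":
    for manifest, package in find_ai_dependencies():
        print(f"{manifest}: declares {package}")
```

Even a scan this crude tends to contradict "we don't have any machine learning because we outlawed it by policy."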

Paul Roberts (Security Ledger): Yeah. It's like Slack – this is just stuff that you can deploy as an individual employee or as a group. You don't need IT to sign off on it, right?

Gary McGraw, BIML: Yeah. The term is shadow

Paul Roberts (Security Ledger): Shadow AI. Shadow AI. Yeah. Okay. Final question: one of your notable accomplishments – of many in your life – is developing the Building Security In Maturity Model, or BSIMM, which is just a pillar, a foundation of the application security space. Is there a BSIMM for AI? And if so, what would that look like?

Gary McGraw, BIML: So when we built the BSIMM at first – I think this was in 2008, maybe, [00:29:00] I can't remember the year – there were a lot of people doing a lot of stuff in software security and application security, and my question was: are we all doing the same thing? So I called up all of my friends, like Steve Lipner over at Microsoft, and said, hey Steve, we're gonna do a study and see what people are actually doing, as opposed to what they ought to do or what they think they're doing.

We wanna find out what they're actually doing, and we just kinda wanna write that down for everybody. Oh, he said, I'm in. Everybody was in: Goldman Sachs was in, Google was in, Microsoft was in, Wells Fargo was in. A whole bunch of us started it in the beginning, and the answer that we found out was everybody was doing similar things, and we could do a union of all the things and we could build a measurement tool for that. Now, here's what I think in machine learning security: we don't know enough people that are really doing it yet. In terms of, at the time, Microsoft was maybe spending, I don't know, eight to 10 million a year on that, maybe even [00:30:00] more.

And this was in 2009, on software security. It's such early days that I think if we took the union of all the activity, we wouldn't have an interesting data set yet. Would I like to see a BSIMM for machine learning? Absolutely, yes, I would. But it's too early. We don't have enough data.

And remember, BSIMM was a descriptive model, not prescriptive. It was not what you ought to do. It's what everybody's actually doing.

Paul Roberts (Security Ledger): And there's enough kind of balkanism that it's not really clear that companies would be willing to participate in the way that those companies did back in 2008 and share what they're doing.

Gary McGraw, BIML: Oh, sure. You mean in terms of... If I went to OpenAI – that's not who I would go to. I would go to the banks that are using their stuff.

Paul Roberts (Security Ledger): Yeah. Yeah. Last question, last question. If the president – either the current or future president, whoever that may be – said, we need a, an AI...

Gary McGraw, BIML: Better be Titan. That’s all I have to say.

Paul Roberts (Security Ledger): We need an AI czar. This is actually [00:31:00] totally plausible that this would happen, actually. We need an AI czar to lead, to be the point of the spear in federal efforts to manage this evolving space, and Gary McGraw – that is the person. Which, frankly, also would be a really good idea.

What would you do? Where, how would you tackle this beast?

Gary McGraw, BIML: That's a tricky question. I was once asked by a president whose last name starts with an O – by their people – about the cyber czar thing. And what I said was: does that person report to the president, and do they have their own budget? And the answer has to be right for both of those questions.

Report directly to the president has to be a yes. And has own budget – not part of the national security budget – has to be a yes. Before I'd even think about it for me. Now, Howard Schmidt was happy to do it. So maybe we'll find another Howard.

Paul Roberts (Security Ledger): Gary McGraw, the Berryville Institute of Machine Learning. It's been so great having you back on [00:32:00] the show, and thank you for all the work that you've been doing to help us understand this critical new technology and the implications of it. I really appreciate it.

Gary McGraw, BIML: It is my pleasure, Paul. It's always a great pleasure chatting with you about this.

Paul Roberts (Security Ledger): Same here. We’ll do it again.
