Technology Now

In this episode we are looking at an emerging area that brings together two transformative fields in tech: edge computing and AI.

Traditionally, the intense compute requirements of AI have made it difficult to implement on edge devices: cell phones, laptops, or microelectronics. However, that is slowly changing, and the global edge AI market is set to expand at a compound annual growth rate of 21% from 2023 to 2030.

So, to find out more about edge AI, and the challenges and opportunities it can bring to our organizations, we’re joined today by Peter Moser, Senior Distinguished Technologist at Hewlett Packard Enterprise.

This is Technology Now, a weekly show from Hewlett Packard Enterprise. Every week we look at a story that's been making headlines, take a look at the technology behind it, and explain why it matters to organizations and what we can learn from it.

Do you have a question for the expert? Ask it here using this Google form:

About the expert:

Sources and statistics cited in this episode:
Scope of the edge AI industry:
Denver Police to begin drone-based 911 call response:

Creators & Guests

Aubrey Lovell
Michael Bird

What is Technology Now?

HPE news. Tech insights. World-class innovations. We take you straight to the source — interviewing tech's foremost thought leaders and change-makers who are propelling businesses and industries forward.

Aubrey Lovell (00:09):
Hey everyone and welcome back to Technology Now, a weekly show from Hewlett Packard Enterprise where we take what's happening in the world and explore how it's changing the way organizations are using technology. I'm your host, Aubrey Lovell, and Michael is on family duties this week, so it's just me.

And in this episode we are looking at an emerging field, which is bringing together two transformative areas in tech: edge and AI. Traditionally, the intense compute requirements of AI have made it difficult to implement on edge devices, cell phones, laptops, or even microelectronics. However, that is slowly changing as new optimized chipsets emerge, allowing machine learning on smaller devices. That's accompanied by better, more efficient architectures and foundation models that reduce the amount of compute needed to perform AI-based tasks. But what does AI on the edge look like? What will it actually be used for? And why does it matter to our organizations? Well, if you're the kind of person who needs to know why what's going on in the world matters to your organization, this podcast is for you, and hopefully you've subscribed in your podcast app of choice so you don't miss out on all the great conversations we're having. All right, let's get into it.

So what exactly is edge AI? Well, traditionally, complicated compute and analysis of large data sets have happened in data centers. That's particularly true when it comes to AI, which requires specialist computer systems to function optimally, in particular GPUs to tackle large parts of a workload. Doing everything in data centers isn't efficient, though. Data isn't created in data centers; it's actually created here in the real world, at the edge. So moving it to a specialist center for analysis and storage is inherently inefficient in terms of time and energy. It's what we call data gravity, and we've spoken about it on this podcast before; of course, there's a link in the show notes. So over the last decade or so, as computers have gotten more powerful, the ability to perform some of the calculation and analysis close to where the data is created has improved. If you can do the calculations on site, you get answers more quickly, and that makes everything more energy efficient and cheaper.

Now, adding AI to the mix creates some complications because most edge-based devices have no or limited GPU facilities or AI processing capacity. However, it's also a massive opportunity area. According to a study by Grand View Research, the Global Edge AI market was valued at $14.7 billion in 2022, but is set to expand at a compound annual growth rate of 21% from 2023 to 2030. We've linked to that in the show notes too. So to find out more about edge computing and AI and the challenges or opportunities they can bring to your organizations, we're joined today by Peter Moser, senior distinguished technologist at Hewlett Packard Enterprise. Hey, thanks Peter for joining us. Super excited about this. So let's start with the big one, why would we want to bring AI to the edge?
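Those growth figures compound quickly. As a rough illustration, here is a back-of-the-envelope sketch using only the numbers quoted above; the 2030 endpoint is our own extrapolation from the $14.7 billion base and 21% rate, not a figure stated in the episode:

```python
# Back-of-the-envelope projection of the edge AI market from the
# figures quoted in the episode: $14.7B in 2022, 21% CAGR thereafter.
def project_market(base_value: float, cagr: float, years: int) -> float:
    """Compound a starting value at a fixed annual growth rate."""
    return base_value * (1 + cagr) ** years

# Compound the 2022 base through 2030 (8 years of growth).
size_2030 = project_market(14.7, 0.21, 2030 - 2022)
print(f"Projected 2030 market: ${size_2030:.1f}B")  # roughly $67B
```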

Peter Moser (03:19):
Let's think about what the edge is first. So the edge is where a lot of businesses generate revenue and engage their customers: in retail stores, in healthcare if you're a care provider, or in manufacturing, where you're engaging your workers at the edge. So when we think about what the edge is, this is also where, according to studies, over 50% of all new data will be generated.

So now let's start compounding this thought process about the edge and why AI at the edge. One, this is where you have a lot of workers. This might be where you build your product. This might be where you engage your customers. This is also where most of your data is being generated going forward. So the need to analyze that data close to where it's created, where possible, becomes important, because some decisions require millisecond response times. You can't afford, for latency purposes, security reasons, a lot of reasons, to send that data someplace else, to a data center, to then analyze it and give you an answer back, a prediction or whatever type of insights you're looking for. So that's why AI at the edge is so critical.

Aubrey Lovell (04:38):
Is this more about inference on the edge or is it about training on the edge essentially?

Peter Moser (04:44):
So right now most of the focus has been around inferencing at the edge. In my opinion, training at the edge is going to start growing in popularity, and the reason we haven't seen more of it is, first, the resources required to train a model, and secondly, the lack of knowledge around federated learning, which is the ability to basically push a model, its parameters and weights, to the edge, train it locally on the local data, and then re-aggregate the trained weights, parameters, and model back to a host that then averages it in with the other edge locations to give you a more fully trained model. I think as customers mature in their AI journeys and become more familiar with this type of training technique, you'll see more of that being done at the edge.
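The aggregation step Peter describes, local training followed by re-aggregation at a host, can be sketched in a few lines. This is an illustrative weighted-average step over made-up per-site weight lists, not HPE's implementation; a real system would ship full model parameter tensors:

```python
# Minimal sketch of federated averaging: each edge site trains
# locally, then a host averages the returned weights, weighted by
# how many samples each site trained on.
def federated_average(site_weights: list[list[float]],
                      site_samples: list[int]) -> list[float]:
    """Average per-site weights, weighted by each site's sample count."""
    total = sum(site_samples)
    n_params = len(site_weights[0])
    averaged = []
    for i in range(n_params):
        averaged.append(
            sum(w[i] * n for w, n in zip(site_weights, site_samples)) / total
        )
    return averaged

# Three edge sites return locally trained weights; the host re-aggregates.
sites = [[0.2, 0.5], [0.4, 0.7], [0.3, 0.6]]
samples = [100, 100, 100]
global_weights = federated_average(sites, samples)  # approx. [0.3, 0.6]
```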

Aubrey Lovell (05:42):
Can we bring AI to the edge right now? And if so, how do we do that?

Peter Moser (05:47):
We can bring AI to the edge, absolutely. How we do that is kind of multi-faceted. AI is not a point-in-time exercise; you don't just train a model, deploy it, and you're done. What happens over time is that the model accuracy will drift, it'll go down over time, and this is attributable to new outliers or anomalies in the data. So you have to monitor your model for accuracy, and then as that accuracy declines, capture those outliers to then retrain the model so that it recognizes them and its accuracy goes back up.
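The monitor-and-retrain loop Peter outlines can be sketched as follows; the accuracy threshold and window size here are illustrative choices, not values from the episode:

```python
# Sketch of drift monitoring: track the model's recent accuracy over
# a sliding window and flag a retrain once it falls below a threshold.
from collections import deque

class DriftMonitor:
    def __init__(self, threshold: float = 0.9, window: int = 100):
        self.threshold = threshold
        self.outcomes = deque(maxlen=window)  # 1 = correct, 0 = wrong

    def record(self, correct: bool) -> None:
        """Log whether the latest prediction matched ground truth."""
        self.outcomes.append(1 if correct else 0)

    def accuracy(self) -> float:
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self) -> bool:
        # Only judge once the window has filled with recent predictions.
        if len(self.outcomes) < self.outcomes.maxlen:
            return False
        return self.accuracy() < self.threshold

monitor = DriftMonitor(threshold=0.9, window=50)
```

In practice the flagged window would also be mined for the new outliers Peter mentions, so the retraining set teaches the model to recognize them.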

Aubrey Lovell (06:30):
Is part of this about treating edge devices like nodes? So for example, in a traditional AI HPC solution, each working on part of a wider problem and distributing the workload? Or are we talking more about each device being a self-contained entity?

Peter Moser (06:45):
So for many customers that are just starting on their AI journey, they may just have a single server that they scale, so to speak. Now, for the more mature customers that might be training multiple models and inferencing multiple models simultaneously, you might see more expansion into clusters where you have to manage those resources, because now the investment has grown exponentially in compute, maybe accelerators like GPUs or FPGAs, and so the environment becomes much more complex because of the scale at which you're now developing AI models and retraining them and inferencing them and the like.

But for many customers that are just starting out on their journey, this might be a single server with some GPUs in it, and, not that it doesn't need to be managed like a cluster might, but this is for customers that are just kind of stepping into the AI waters and don't necessarily have the HPC, High Performance Computing, expertise and all that goes with that. So this is where we start seeing separation in maturity with customers in their environments.

Aubrey Lovell (08:03):
Efficient AI requires a whole lot of compute, GPUs and specialist optimized equipment. How do you bring that to the edge devices, which are usually somewhat simpler and optimized for other uses?

Peter Moser (08:15):
That's really been one of the bigger impediments to the advancement of AI at the edge: lack of resources. Because if you think about the edge, this might be a retail store, it might be a hospital, it might be a factory, and we typically don't have IT resources there, let alone AI resources. So for customers to use AI successfully, either for inferencing or training or both, you're going to need a degree of expertise locally. And this is where customers are pausing, because they have to make the decision: do they want to invest in these skill sets at these distributed locations, or do they want to buy, so to speak, the outcome from a company that can provide the full solution and support for the outcome? And I think we're starting to see that customers are really looking at partners to provide a turnkey solution versus having to invest and figure this out for themselves, as opposed to the larger companies, or the much more data-driven, mature companies, that have already invested in these resources and skill sets.

Aubrey Lovell (09:29):
What kind of use cases need AI at the edge rather than say cloud or data center-based inference?

Peter Moser (09:36):
This is where we're starting to see data sovereignty, the governance around the data, what we call data rights, where the data can't go to the public cloud. It might be patient data, it may be proprietary company information, it may be client information, etc. Then there's the latency aspect; the laws of physics still apply. Whenever you send something over a wire, be it fiber or copper, there's a degree of latency in sending it someplace else for processing and then getting an answer back. And the more data volume you have, the more it compounds this problem, say video data or large image files. That's why we see that there are certain attributes around data and AI that really drive businesses to deal with it at the edge, closest to where the data is created: data sovereignty reasons, privacy reasons, compliance, and a lot of other reasons where they don't want the data leaving that edge site, so they have to deal with it there for both inferencing and training.
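The point about data volume compounding latency is simple arithmetic: transfer time scales with payload size over a fixed link. A rough sketch, with illustrative payload sizes and link speed rather than figures from the episode:

```python
# Rough transfer-time arithmetic behind the data-volume point: the
# larger the payload, the longer it takes to ship off-site before
# any processing can even begin. All figures are illustrative.
def transfer_seconds(payload_mb: float, bandwidth_mbps: float) -> float:
    """Seconds to move a payload (MB) over a link (Mbit/s)."""
    return (payload_mb * 8) / bandwidth_mbps

# A 2 KB sensor reading vs. a 500 MB video clip over a 100 Mbit/s link.
sensor = transfer_seconds(0.002, 100)  # ~0.00016 s, negligible
video = transfer_seconds(500, 100)     # 40 s each way, before processing
```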

Aubrey Lovell (10:44):
Do you think this is something that's going to be of use to most business consumers and everyday organizations, or is this going to be kind of specialized technology for at least the near future?

Peter Moser (10:56):
One of the things around generative AI and the use of large language models like chatbots and things like that, text to speech, speech to text, these types of models have evolved rapidly in the last 12 months. And I think that most customers, in fact, all, have been exposed to the power that these tools can provide them from a productivity standpoint, from a worker standpoint, from a customer experience standpoint, that there's a myriad of use cases that are proving out to deliver maximum business value that they cannot achieve through traditional means. So I think regardless of what industry you're in and regardless of your size, I think we're seeing, at least from the customers I talked to and the partners that I work with this movement towards, how can I, me, my company, where we are in our maturity model with our resources, access the power of, in particular, generative AI and large language models, how can I capture that power and use that to help my business?

Aubrey Lovell (12:12):
Thanks so much, Peter. It's really fascinating to hear how quickly this relatively new technology is taking off. And we'll be right back with Peter in a moment, so don't go anywhere.

All right, now it's time for Today I Learned, the part of the show where we take a look at something happening in the world we think you should know about. Police in Denver, Colorado have announced that they will be using drones as first responders for 911 calls. Very interesting. Several other agencies across the state already use drones for surveillance and search and rescue, but for the first time in the state, drones are set to be the first on the scene to gather information ahead of traditional two-legged officers. It's hoped it will give them better situational awareness of potentially riskier call-outs while also assessing more mundane calls, such as broken traffic lights, without the need to immediately dispatch officers. That's pretty cool. There's still some way to go with the program to create drone first responders, though; the police will be looking for public consultation. But it could be an interesting program. A similar pilot running in Chula Vista, California since 2018 has avoided some 4,000 physical police call-outs while also responding to incidents almost twice as quickly as human officers.

All right, now it's time to return to our guest, Peter Moser, to talk more about how organizations can leverage the emerging power of edge computing and AI.

So Peter, the focus with AI at the moment seems to still be on the data center and HPC side of things. Do you see that changing? And if so, when?

Peter Moser (13:50):
The data center is generally where most people start, because that's what they know best and it's where things are most mature. What we're seeing now is this rotation back to on-premises, and that could be on-premises at the edge or a closely located co-location facility. There's a lot of reasons for that. Cost is one; the latency aspect I mentioned earlier is another. Data sovereignty, data privacy, all these reasons are forcing companies to rethink a data-center-only approach.

And latency, for instance, can create or taint the experience for the end user. So let's say you have a chatbot solution and you're rolling it out to your workers, where they're out there asking the AI questions to get answers to help them do their jobs. Well, there's a certain degree of expectation that the AI model is going to give them an answer in less than, let's say, 200 milliseconds. If they don't get that, then it's a dissatisfier and they won't use the tool, because it just takes so long. Now imagine that end user is your customer, and you're tainting that customer's experience of your product or your service.
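That 200 millisecond expectation can be treated as a latency budget when deciding where inference should run. A minimal sketch, with hypothetical timings rather than measurements from the episode:

```python
# Illustrative latency budget for the chatbot example: if network
# round trip plus inference time blows the ~200 ms expectation,
# the workload is a candidate for edge placement.
def within_budget(network_rtt_ms: float, inference_ms: float,
                  budget_ms: float = 200.0) -> bool:
    """Check whether total response time fits the user's expectation."""
    return network_rtt_ms + inference_ms <= budget_ms

# Remote data center: 120 ms round trip + 110 ms inference -> too slow.
remote_ok = within_budget(network_rtt_ms=120, inference_ms=110)  # False
# Local edge server: ~1 ms round trip + 110 ms inference -> fits.
edge_ok = within_budget(network_rtt_ms=1, inference_ms=110)      # True
```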

So this is why businesses are rethinking where they do this AI, where they do the training and the inferencing: because of latency and because of cost. When you start moving data around, there's a cost, and there's a risk around security, because when that data leaves a premises and goes to another premises over fiber or copper, you've now exposed that data to a potential cyber threat. So there's a lot of reasons to keep the movement of data as minimal as possible; it consumes bandwidth as well, with the cost associated with that. And we're starting to see more and more customers pivot back to keeping everything as close to the data source as they can.

Aubrey Lovell (16:05):
Last question, Peter, let's bring it home. Why should organizations be watching the evolution of AI at the edge?

Peter Moser (16:12):
That is really the question. And what I would say is: because of the rapid evolution of AI, especially large, medium-size, and small language models over just the last 18 months, the maturing of federated learning, and the awareness that more businesses have because of some of the public domain solutions out there that they have access to, we're seeing this rapid evolution in AI model availability. All these resources are now much more accessible to businesses of any size and scale, and with the low-code, no-code tools that are available, where they just need to know the data and the software generates or fine-tunes the model for them, all these advancements are helping customers accelerate their use of AI. So in my opinion, there's going to be an acceleration in AI far more rapid than any of us ever anticipated.

One of the things that I would advise is that it's not just about technology; it's as much about people, process, governance, and compliance as it is anything else. Because without data you don't have AI, and data has rights. And this is where you have to really have, as a company, a good understanding of AI ethics, great data governance, and policies and management around your data before you go bounding down the AI path.

Aubrey Lovell (18:02):
Thanks so much, Peter. It's been great to talk. And you can find more on the topics discussed in today's episode in the show notes.

All right. Well, we're getting towards the end of the show, which means it's time for This Week in History, a look at monumental events in the world of business and technology that have changed our lives. And since Michael is not here, I am your lucky host for this one again. So the clue last week was: it's 1822, and this design really made a difference. Did you guess it? Well, it was the unveiling of the design of Charles Babbage's Difference Engine in a paper to the Royal Astronomical Society in London. The Difference Engine was the first design for a mechanical computer, which used a hand crank and rotating drums of numbers to complete equations and convert units, making it a potentially revolutionary device.

The British government quickly gave Babbage a 1,700 pound grant, around 270,000 pounds or $340,000 today, to build a machine designed to complete conversion tables. But the cost and complexity ballooned, and 20 years later, by 1842, over 10 times that amount had been sunk into the machine, which still didn't work, despite a partially completed prototype of a design that would have had more than 25,000 moving parts and weighed four tons. Wow. That said, Babbage's incredible engineering skills spurred a new wave of thinking about calculation around the world and inspired similar devices. And in the 1980s, two difference engines were built to Babbage's original designs, both of which worked. A bit before my time, but very cool.

And the clue for next week: it's 1675, and it's about time to set your clocks. Know what it is? Oh, you know what, producer Sam, I actually think I do on this one, but don't tell.

And that brings us to the end of Technology Now for this week. Thank you so much to our guest, Peter Moser, Senior Distinguished Technologist at Hewlett Packard Enterprise. And to our listeners, thank you all so much for joining us. Technology Now is hosted by myself, Aubrey Lovell, and Michael Bird. This episode was produced by Sam Datta Pollen and Sidney Michelle Perry with production support from Harry Morton, Zoe Anderson, Alicia Kempson, Alison Paisley, Alyssa Mitri, Camilla Patel, and Chloe Suewell. Our social editorial team is Rebecca Wissinger, Judy Ann Goldman, and Katie Guarino, and our social media designers are Alejandra Garcia, Carlos Alberto Suarez, and Anbar Maldonado. Technology Now is a Lower Street production for Hewlett Packard Enterprise, and we'll see you next week. Cheers.