Data Dialogues

Learn how data scientists are teaming up with other risk professionals to fuel open source software with differentiated data, which helps companies maximize their business. And hear how some companies are using it to improve people’s lives around the world. In this inspiring episode, David Ferber, SVP of Analytical Capabilities & Solutions at Equifax, interviews Sri Ambati, Founder and CEO of H2O.ai.

Show Notes

David Ferber, SVP of Analytical Capabilities & Solutions at Equifax, and Sri Ambati, Founder and CEO of H2O.ai, discuss the democratization of AI and how companies need to bring together multi-faceted teams to get the most out of their data. 

Skip ahead to these topics:

1:40 - About H2O
4:30 - Data is at the heart of all machine learning
7:25 - Trying to make decision-making cheaper, faster, easier
9:22 - H2O customer stories
12:11 - How H2O has recruited brilliant minds from around the world
16:45 - Examples of AI for good
20:40 - H2O wins award for its good works in India during COVID-19 pandemic
22:30 - Looking to our younger generation for inspiration
25:00 - The horizon for innovation has shifted since COVID-19

What is Data Dialogues?

A podcast where innovative business leaders discuss data: how to think about it, how to use it and how it can help us all make better business decisions every day. As they tell their stories of trials and triumphs, you’ll gain key insights to leverage in your own day-to-day operations.

Data Dialogues Episode 10 (Length: 34:34)
Data Science is a Team Sport

David Ferber, SVP of Analytical Capabilities & Solutions at Equifax, interviews Sri Ambati, Founder and CEO of H2O.ai, about the importance of fueling open source software with differentiated data to help companies maximize their business – and to improve the day-to-day lives of people around the world.

David Ferber:
Hello everybody. My name is David Ferber and welcome to Data Dialogues with Equifax. I am a vice president of our analytical solutions and capabilities at Equifax, and my team of data scientists and data engineers support the delivery of our Ignite solutions in the United States at Equifax. I am honored and privileged to be joined here today by Sri Ambati. He is the founder and CEO of H2O.ai analytics, a machine learning and analytics company. Sri, welcome to Data Dialogues with Equifax.

Sri Ambati:
Thank you for having us, David. Super excited. Our journey started around the middle of the last decade. So super thrilled to be here.

David Ferber:
Thank you. Yeah, it's an honor for me, too. I remember when we met the first time. I think we were at a Money 20/20 conference. And you rushed me into the booth and you were super excited to show me the latest and greatest that you were building at H2O. And I was floored to be honest with you. I was excited to see the passion and enthusiasm that you have for your company and the mission that you were on. And watching the CEO actually drive the technology and use the tools that you were building was pretty fascinating. And I love that. So with that, I'd love to talk to you about your journey. Like how did you end up as the CEO and founder of a machine learning AI company?

Sri Ambati:
Thanks for memorializing that incredible first-time chemistry that we had. And since then, we've been on stage together at H2O World and several conferences. Many partnerships and customers that we've created and made successful together. So super excited for the partnership Equifax has had with us, both from a co-creation standpoint, but also co-innovation to help create great value for our customers and our community. H2O is a movement of data scientists, data engineers, mathematicians, physicists, like you mentioned. And one of the key things that we all know is data fuels AI. And our focus has been to historically be that AI software powerhouse that makes every one of our customers AI superpowers, AI companies. And toward that, democratization of AI was a core theme for us.
I started when I began building machine learning models and quickly found that there was not enough machine learning math software that could handle billions of data points and make AI really accessible to every audience. And I think the lack of those tool chains specifically was highlighted when my mom was diagnosed with cancer, and I was trying to look for a good tool chain to understand whether to use lumpectomy versus mastectomy. And the tools that the doctors, the physicians and the researchers were using at the time worked on very, very small data. So essentially that triggered me to go start building a tool set that's accessible to every researcher on the planet to be able to start doing science at scale.

David Ferber:
Wow. That's amazing and what a great application to apply it on, the health and wellbeing of your family. And I love the democratization of AI, which we talked about and the tools and the platforms you're building that support that, right? And we're seeing that data science is a team sport now, right? We're bringing more people into the fold that may not be traditional data scientists, but giving them the tools that enable them to do more. If you're knowledgeable of the data and the business problem, I think, you know, talking about the things you're building, you’re allowing more people to take advantage of the data and the tools out there and getting more out of that data. So with that, I talk about data a lot, right? It's a passion of mine being at Equifax for so long. But then layering on the capabilities that you're building. So can you talk about how that relationship with data and the importance of it is with what you're trying to accomplish?

Sri Ambati:
Data is actually at the heart of all machine learning models, right? So you fuel a lot of innovation in AI with data. Data has gravity as well. And most of the time, our customers' data is sitting in their data centers or in their cloud. But they need to have alternative datasets to truly spice up their signal in their analysis. And that's where I think our work together, customers love bringing in new data sets that have a different angle on the problem they're trying to solve. And the team sport part of data is really important. Adoption of AI is limited by culture, right? Most often agile cultures are able to resonate deeply to newfound insights in their data, and react to them, and use them.
And in their ability to use them, they need to be very very fast learning organizations. And the domain expertise shouldn't be too far away from the data science. Neither should the data engineering capabilities be too far away from the science. And the business applicability, which I call the business pain for it, should drive analysis. And I think that ability to go from strategy to data, and data to insight again, and back and forth - I think that the ability to go back and monetize your data assets becomes really important. And that's really a team sport and the best teams we've seen across our customer base are working very closely with their business teams, working very closely with the design teams to get that work that responds to a scientific finding in the data. Then apply them to their businesses. I think this kind of ability to bring those multi-faceted teams together is super important. And I think we found, really, an example of that at both Equifax and our common customers.

David Ferber:
Yeah, I think having those common customers, and we're all trying to work towards that common goal of getting more out of our data. And you know, I like to look at it like this, with H2O and tools like that, we're working on that development front end, right? Developing the insights that you want. But I also see the phases that happen after development. That seamless deployment to production to execute on that and get the monetization of those insights. And then that third phase, monitoring that. Is it working? Is it performing the way you want it to? Is that strategy working, and doing the whole process over again. So the development, execution and monitoring of things, and H2O is obviously a capability that allows that to happen and gives you all those amazing insights and way more predictability in a higher performing way.

Sri Ambati:
Explaining the machine learning method. Machine learning and AI are usually associated with black box models or large models. And proxy variables can sneak into those models relatively easily. So I think that's kind of where explaining these models and preventing accidental bias, attacking these models with ways to test and stabilize your methods, I think those are all very appropriate techniques. And now of course we have a great tool chain for it. But a tool chain is really as good as the person using it still. And I think companies and organizations are able to build good, safe guardrails in deploying AI at scale and start building a strong core competency in this phase. A key KPI for organizations will be how fast they can learn. How fast they can deploy pipelines, data pipelines. It's not just the size of your data in your data centers. It’s how quickly, how they can flow and create rich monetization for the assets. But also in general, trying to make decision-making cheaper, faster, easier. When we say democratization, we're really calling for faster, cheaper, easier, so that you can do more experiments and not be afraid of failure. Because failure is no longer an option. It's a must-have. It's a feature, not a bug.

David Ferber:
I love that. That's exactly right. We were talking about that co-innovation briefly. We touched on a little bit with some of our mutual clients that buy data from Equifax and use H2O for other predictive analytics. Can you talk a little bit about some of your favorite innovation or co-innovation projects that you've worked on recently? And how some of that has really changed the direction of your business?

Sri Ambati:
Lots of really great examples here. Some of the more interesting ones are when Brexit was about to happen. One of our customers came to us and said, what all data sets can we bring in to see what kind of scenario analysis can be done. For example, to predict corporate bonds in the fixed income space. We looked at another customer more close to recent times during COVID. As the COVID onset happened, we were trying to rebuild these models. It was super important for supply chain optimization and figuring out how much Lysol will be needed in different parts of the US because people had left Manhattan and parked themselves in large and small towns across the nation. They were expecting similar service and quality in the local Walgreens and Walmarts of these states.
So figuring out the new distribution center model delivery method. Every mile saved for a UPS truck driver is up to $50 million saved for them. And so rerouting all of those delivery packages and routes was again, a very data-driven approach. We had to bring in a lot of alternative data sets to bear looking at fraud prevention. A lot of our online customers were selling in e-business more for the first time, and they were more prone to fraud than traditional banking and cards. So they had to bring in a whole new way of looking at data because FICO scores themselves were a little perturbed by stay-at-home lockdowns. People were saving more, so their FICO scores were going up, but not necessarily reflective of credit.
So there are a lot of new ways of looking at the same problem with data. And I think there is a lot of co-creation in business value. One recent piece that we were co-innovating is with one of our largest telco customers, AT&T. We were co-innovating to create a feature store and delivering that to our customers. We're working on a LIBOR application with one of our large banking customers to see if, as LIBOR goes away, can we redraft all the contracts to help other customers who are in similar situations. Again, entity resolution here, between NLP and using Equifax data, a lot of powerful use cases there. Your APIs. There are a lot of APIs at Equifax, so a call-out for them. Co-creating rich AI-powered applications with the data assets that Equifax has can have far-reaching effects for our customers.

David Ferber:
Yeah, definitely. That API portal has been very handy for our customers to get a hold of that and test it out like you said, and this partnership is to do more with that data. I've been focused my whole career on getting our data in the hands of people who can do more with it and building these big data platforms and those capabilities. And what I love about what you have done, and what you have talked about is some of that math that you've changed, right? You're bringing on a ton of the best and brightest data scientists around the world at H2O to rewrite some of these algorithms and get them in a way that can be distributed and run faster. Do you want to tell me a little bit about that, and how you've gone out there and recruited some of these brilliant data scientists from literally around the world?

Sri Ambati:
Yeah. It started with partnering very closely with the universities for structure between Stanford, which is where a bulk of the core creation of the company had been. At Berkeley and Purdue, we had early tutelage from the folks who wrote the documents and the algorithms and the books in statistics and machine learning. And we quickly found out that the best mathematicians are the ones who are willing to be open to a distributed version of their methods, a much more scalable version of a gradient boosting machine, or a GLM resident starting with the basics. We managed to create the world's fastest calculator at high-scale. So the distance calculation between two, large high-dimensional vectors, we've perfected that and made that really a very small problem. Once we managed to do that tensor distance calculation easily, it quickly made our platform solve larger problems very quickly.
Open source is at the heart of our movement. And so every Thursday or Friday, we would have meetups, continuously educating the audience, bringing awareness about data science in general and H2O in particular, but different methods in a very neutral fashion. We took a very open approach because open source breeds both innovation and freedom for our customers. And then we started making science auto-generate software. So our inference engines, the scoring engines that come out of the training, have been embedded now in 20,000 companies that use that open source. And all the payment systems, whether it's far off, like Paytm or M-Pesa, or PayPal or Alipay, Apple Pay, all these payment systems have H2O models embedded in them. And that brought grassroots transformation for data science adoption across the industry. And in return brought a great bit of awareness of our project and of our movement.
One thing led to the other and we started realizing that the grandmasters were defining the best standards, the best innovation in the space. So we sought out the top data science grandmasters, and the fastest growing talent in this space and put them together and started building the place for the world's best machine learning and data scientists. And creating environments for them to be very empowered, very self-directing of what they want and where they really want to use their time. Some of them are working on AI to protect endangered species. Some others are working on AI to fight wildfires or predict the behavior of fires in the wild. Still some others are working on improving biodiversity or AI for water in the oceans and so on and so forth. So there's a lot of high purpose driven work that they're doing, but the biggest purpose that drives them is our customers. Our customers have real problems and our data scientists are looking for those puzzles continuously. So when they work with you and your customers, they're really super motivated and excited to solve those problems of high value and meaningful outcomes for our customers.

David Ferber:
Yeah. And that's a perfect segue into the reason I think we work so well together. I think your company's motto is AI for good. And Equifax's, you know, our goal is to help people live their financial best, right? How can we help consumers get that student loan they need, or the mortgage for their new home or their new car to get them to work. And those things are, we're enabling that. And I think to your point earlier, like layering new data sets that maybe weren't thought of before, or didn't realize they were that predictive in solving some of these challenges and really opening up the credit world to people who may previously not have been able to get credit because we didn't have enough data to make those decisions. So, do you want to talk about some of the good that's being created with AI in the industry around the world? I think there's some things I know you've worked on that are very amazing and near and dear to your heart. So I'd love to get that out into the public and know more about it.

Sri Ambati:
I’ll start with democratizing credit. You almost touched upon it. With democratizing credit, one of the interesting aspects, and I learned it through talking to our customers like Capital One in the subprime space and with some of the senior leaders at Equifax, is what's the percentage of American workers who are daily wage workers? And the definition of a nurse is very different than if it's an ER nurse or a nurse working on a pediatrics ward, or nurses working in different parts of the hospital. And for these people democratizing credit is meaningful. But one of the top reasons for people to lose their credit in the US is still poor health and healthcare. The cost of healthcare. And I think these are all very tightly connected. Being healthy makes you automatically capable of keeping a job longer, with tenure.
And then of course, as a result be able to be creditworthy in education, which is another very core theme of how people can grow up in their careers. So democratizing health, democratizing education and democratizing credit, they're all quite related. Some of our customers are in life insurance where they are naturally aligned to the long-term wellbeing of our people. I think of the community-first mindset. We all talk about how to build great companies for the long term. Great companies also enhance and create great communities. And in our view, we think that community building is at the heart of all company building and over time that has the most long-term potential for change. That brought us very closely in touch with the hospitals who are our customers. During COVID in predicting, we used to predict flu and do flu shot prediction for Walgreens corner stores, and to predict California flu with Kaiser.
So one thing led to another. We got pulled into helping hospitals react quickly to the demands of COVID. And as a result, most of our data scientists and our product culture, we allowed them to build some rapid prototyping culture software called H2O Wave, which allows them to start producing a rich set of apps in this space. And these apps are reusable. Completely extendable as well. So people can take those apps, start building AI for good app stores. And the vision of democratizing and digitizing philanthropy, if you will, or change, it makes it so that software can truly transform the world. Not just eat the world, right? Some of it can bring real change. And I think in that sense, AI is eating software. As a result, it has the power to change the world.
One of the works we were looking at was the pandemic supply chain, and that led us to looking at the rising prices of oxygen concentrators when things were under a lot of demand under the Delta variant impact in India. And our team managed to put some time together, so we could actually go and help rural and tier-two, tier-three towns in India, remote places where one or two concentrators could still save hundreds of lives. And that was a very fulfilling experience for the whole team. And we brought that back and started improving our understanding of how impactful this technology can be for the world.

David Ferber:
Yeah, and that’s an amazing story. And the benefit is that, I don't know, hundreds of thousands of people, maybe more than that were benefiting from that effort that you led in getting the oxygen concentrators to India. I heard you won an award for that. Did the Indian government recognize you and H2O for those efforts, right? Did I hear that correctly?

Sri Ambati:
It was a team effort. A lot of friends, family and folks from the team were very devoted to that mission. They finished my quarter early, so I could actually spend a good week to two to three weeks of intense effort on bringing that together. H2O and AI is borderless. We've actually built bridges across all continents as part of this and other efforts. The key piece I would see in that particular way was I was actually inspired by my two daughters, nine and 11. They came back from their upbringing in the local American school to not sit on the sidelines and go do something about things that need help. And that almost got us into action to help remotely.
And the world has truly become a small village. Individuals now possess the same power that kings possessed in the 1600s and 1700s. So it's very inspiring to see so many individuals now able to bring change at global scale. And I think we should essentially look to the next generation of young transformers and change bringers to kind of... They have such good clarity of what needs to be done as opposed to the talk about what can be done. And I think in that sense they are quite a huge inspiration for me as well.

David Ferber:
For me as well. It's great to see how natural some of that becomes for the children of today. I don't know if you've gotten your children coding Python yet, are they rolling up their sleeves and working on some code?

Sri Ambati:
Yeah, I think that one of the things that was interesting is that there's code.org. And during the pandemic the kids spent a lot more time generally with adults, and digital education means that the whole world has become much more easily accessible to them. Programming is being literate about data, being literate about finances, being literate about code. It is as important as literacy in English and in Arts.
And I think that means that they're more likely to take things into form. They put up the websites for the contributions themselves. The donation GoFundMe updates would be coming consistently from the kids. And so that just really helped them see that if they could imagine something, it can be done quickly. And the whole community rallied. We had a fourth grader take her personal savings and do a GoFundMe. We had another neighbor, someone from next door who just couldn't get online, but wanted to help somehow and wanted to come home and give contributions. So you can see the level of global universality that COVID has exposed. I mean, we all hear about the kind of localization and the division that COVID has impacted, but what we experienced was yoga centers in the Midwest supporting, or the Tibetan monasteries or the Middle East. We had folks from all walks of life wanting to help and save communities across India.

David Ferber:
And I guess there are silver linings in the disaster of COVID. A lot of the innovation has come out to help fight this. And I know at Equifax, we had these weekly calls, and we were refreshing our data to get the latest and greatest. So, companies had better data to make those decisions, those hard decisions. And how to support the community and their customers. So I've seen some amazing innovations come out of this. And, the community bringing things together to do the things you were talking about. I love that.

Sri Ambati:
Yeah. The horizon for innovation suddenly shifted by five, ten years. So that's the way to look at it because what used to be, we need cloud, we need to be digital. We need to do this podcast remotely. We need to meet occasionally, but continuously innovate together. And I think that kind of collaboration, distributed collaboration is going to just make innovation go on an exponential curve. And I think we are beginning to see that happen. A lot more emphasis on both personalization and the data assets that Equifax is able to offer now are much, much more practical, useful for businesses to take them and start rebuilding their models. All models are wrong, but some are useful, right, is sort of the classic adage. But now I think all models have to be rebuilt because COVID stress tested every machine learning method, every statistical model of the past. And I think that means that continuous learning containers, continuously learning models and from data changes and drifting data, all of that has to be automated to the fullest extent possible. And I think that's a great place for both Equifax and H2O to be at the service for the customers.

David Ferber:
Yeah. Interesting point. And you're right on, the continuous learning and the ability to shift quickly when new data exposes itself. I feel like there's still another shoe to fall. We're not there yet. Right. Things are, there's more to come on this pandemic and the impacts it's having. You know, as the laws change and the banks adjust their policies, we're going to start to see more impact. So I feel like there's more to come here to this story. It's only going to get harder, unfortunately in my mind, but we'd love to hear your thoughts on that.

Sri Ambati:
We’re probably halfway to 60%, the glass is more than half full. But I think another nine to 12 months of general recovery. We looked at the models or data from the 1300s, the 1349 pandemic, the Black Plague, to 1666 to 1918. I think most of the cases are at least three years long. I think we have better science on our side this time. We have better ways to communicate. I think COVID also exposed the power of misinformation. These black box models have essentially been pretty difficult to kind of explain the effects and interpret models in a way that ordinary citizens can consume simple science. So going back to the basics where we don't bombard the receiver with too many signals.
Sort of the right amount of message and clarity. I think that we should be able to overcome. I think in general, we are in a place where businesses are in the choppiness of recovery and it will be more difficult to adapt. The less uncertain things are, I think the more predictability comes into the system, then businesses can adapt rapidly. You're seeing a lot more deconstruction of the traditional bank. A lot of fintechs. So you're beginning to see that in many ways in each industry, one at a time, and on every continent, there are now emergent large moments for a little bit of decentralization, but a little bit more of applying AI and data to power the transmission. A classic example: one of our banking customers has found that data science can give them just as good interest rate prediction or predictions for underwriting as teams on their side with many years, decades, of experience.
So then they are like, how do we kind of automate this? How do we take this to the next level? So you've already seen core pockets of extreme competitive innovation in larger companies. And that's actually also very powerful. We're seeing that entrepreneurship is transforming large companies. And likewise, small companies are able to reach the audience of a very large company as well. So you begin to see true cooperation happening in the market. And just like 20 years ago, the internet brought giants like Google, eBay, Amazon, right into our midst. And I think that's happening with AI as the new electricity. It's powering those large organizations to be formed today. There are certainly a few great candidates for that and through our customer partnerships and true customer centricity, we hope to create a few of those ourselves. I call our team the Sherpas for others climbing their Mount Everest. And so as with your partnership together with your data and our AI, we could definitely bring that true scale for our customers' efforts to become the world's best in whatever they're trying to do.

David Ferber:
Yeah. Great analogy. I love the Sherpas analogy. That's exactly right. And you guys are there helping lead the way every time. You know, going back to the overall theme of this discussion, right, the AI, plus the data, and the industry adopting this in this growth. It's just been an amazing journey. Sri, I appreciate you and your company and the partnership that you've had. It's been a great journey so far, and I look forward to the future and doing more great things together. So thank you so much for being here and taking the time to talk to us about all these amazing things that are happening.

Sri Ambati:
Thank you for having us. And like your product’s culture, we want to ignite discussion and we want to ignite that healthy data-driven culture in our customers and our communities. And even the data and purpose can use AI as their co-founders and build incredible value very quickly. The future, as we see it, is that trillion dollar companies will be created more often, and many of them will very likely have AI at the heart of them. And I think the size of these trillion dollar companies is going to be much smaller, right? So, you're now beginning to see the future where anyone with incredible purpose and drive can go and bring that change they're wanting to see in the world. And I think we're super excited to be partners in that and data fuels that growth. And AI is a willing partner in that monetization. So looking forward to a strong partnership with Equifax and your audience in the years ahead.

David Ferber:
Thank you again for taking the time to talk to us. This has been a great discussion. I hope a lot of people find it very useful and enlightening. If you guys want to find out more about H2O or our customers, H2O.ai is their URL. Go check them out. And again, thank you for being on today for Data Dialogues, and I look forward to our future partnership.