Venture Step

Summary

In this episode, Dalton breaks down Google's latest announcements from their IO event, focusing on the new AI tools and models. He explores VideoFX, ImageFX, MusicFX, and the Synth ID watermarking technology, and highlights the capabilities of Notebook LM, a tool for creating personalized, interactive audio conversations from your own documents. He also covers the upgrades to Gemini 1.5 Pro, Gemini Flash, Gemini Nano, Gemini's integration into Google Workspace, AI Overviews in search, Android integration, Google's code editor IDX, and Google's commitment to responsible AI development and the wide, often free, accessibility of its AI tools. Dalton closes with his thoughts on how useful these AI features are for everyday users, his plans to discuss Alpha Fold, and his current reading on scaling processes.

Keywords

Google, IO, AI tools, models, VideoFX, ImageFX, MusicFX, Synth ID, Notebook LM, Gemini 1.5 Pro, responsible AI development, AI advancements, Gemini, Gemini Flash, Gemini Nano, Google Workspace, AI search, Android integration, AI overview, Google's code editor, Alpha Fold, scaling processes

Takeaways

Google introduced new AI tools like VideoFX, ImageFX, and MusicFX
The Synth ID technology embeds digital watermarks into AI-generated content to combat misinformation
Notebook LM is a powerful tool for creating personalized and interactive audio conversations and managing unstructured knowledge bases
Google is committed to responsible AI development and offers free access to many of its AI tools
Gemini 1.5 Pro received an upgrade, with its context window increased to two million tokens, and Google introduced Gemini Flash for real-time responses.
Gemini Nano is integrated into the Pixel 8 Pro and will be available on more phones in the future, offering private and secure on-device AI capabilities.
Gemini has been integrated into Google Workspace, providing enhanced productivity features such as email summarization and assistance with spreadsheets and note-taking.
AI search has been implemented in Chrome, allowing users to ask complex questions and receive AI-generated summaries with cited sources.
Google has introduced its own code editor, IDX, which uses Flutter and integrates with Gemini APIs, Firebase, and Google Cloud.
Dalton emphasizes the importance of making AI accessible and highlights the potential benefits of AI features for everyday users.
Dalton plans to discuss Alpha Fold and scaling processes in future episodes.

Sound Bites

"Google introduced new AI tools"
"Notebook LM: A Game Changer for Knowledge Management"
"Synth ID combats misinformation with digital watermarks"
"Perot now has a two million context window up from a million within Gemini website."
"Gemini Flash is going to be used for real-time response."
"Gemini Nano is what is used on the Pixel Pro 8 with Google's AI chip."

Chapters

00:00 Introduction to Google IO and the new AI tools
02:51 Discussion on the misleading video demo
03:13 Exploring VideoFX, ImageFX, and MusicFX
08:06 Loop Daddy's pre-show concert for MusicFX
09:30 Introduction to Synth ID and its role in responsible AI development
23:40 Overview of Gemini 1.5 Pro and its upgraded context window
32:42 Google's commitment to responsible AI development and AI for all
38:31 Gemini: Upgrades and New Features
41:36 Gemini Nano: Private and Secure AI on Pixel Phones
42:44 Google Workspace Integration: Enhanced Productivity with Gemini
50:38 AI Overview in Chrome: Summaries and Answers
58:58 Scam Detection Alerts: Protecting Pixel Phone Users
01:01:22 Trillium: Improved Performance with Cloud TPUs
01:05:12 IDX: Google's Code Editor for Easy Development

Creators & Guests

Host
Dalton Anderson
I like to explore and build stuff.

What is Venture Step?

Venture Step Podcast: Dive into the boundless journey of entrepreneurship and the richness of life with "Venture Step Podcast," where we unravel the essence of creating, innovating, and living freely. This show is your gateway to exploring the multifaceted world of entrepreneurship, not just as a career path but as a lifestyle that embraces life's full spectrum of experiences. Each episode of "Venture Step Podcast" invites you to explore new horizons, challenge conventional wisdom, and discover the unlimited potential within and around you.

Dalton (00:00)
Welcome to the VentureStep podcast, where we discuss entrepreneurship, industry trends, and the occasional book review. Today, you're going to have a front-row seat to Google's latest announcements from their most recent IO. We're going to explore the groundbreaking tools and models that you might find useful. We'll cover everything from text-to-video magic and enhanced image generation to revolutionary learning models and responsible AI development.

Join us as we uncover the exciting possibilities. Of course, before we dive in, I'm your host, Dalton. I've got a bit of a mixed background in programming, data science, and insurance. Offline, you can find me running, building my side business, or lost in a good book. You can listen to the podcast in both video and audio format on YouTube. If audio is more your thing, you can find the podcast on Spotify,

Apple Podcasts, YouTube, or wherever else you get your podcasts. Today, we're going to be discussing all of the, I think, important items that were demoed slash unveiled at Google IO. I think that I'm better positioned to speak about these topics, as I've been a beta tester for the last

five or so months for some of these items. So I've had access for a long time before the general public. So I've been testing these tools for a good amount of time. So I feel like I'm well positioned to speak about them. That being said, for the things that I don't know and I haven't been able to try myself, I will be a little less excited about it. I didn't talk about it on the podcast because it was before I started redoing the episodes.

But about five months ago, Google unveiled their Google Ultra model, or Gemini Ultra, and it's supposed to be their most advanced, expensive model, capable of doing insane things. And they demoed a video split 50-50: one half was a human drawing things on a piece of paper in messy handwriting, and the other half was

Google Gemini Ultra answering the questions. And they made it seem like it was a real time interaction. But really what it was is they had a video that they took and then they uploaded it for Gemini Ultra. But in the video that they uploaded to YouTube, they didn't specify that this isn't real time. And they made it seem and they alluded to it being real time.

So I gave them a lot of compliments on X and...

I mean, no one sees my stuff, but I felt like I got bamboozled, because I think a couple of months later it came out that that actually wasn't real time. And yeah, so that's kind of why I'll be less excited about things I haven't tried myself, but overall Google does a good job with their stuff. I just think that that video was misleading.

Okay, so let's get into it. So Google introduced these new AI tools to the general public. We got VideoFX, which is going to be text to video. I haven't tried this, as I think they just released it, so I don't have any experience with VideoFX. ImageFX, I applied to the Imagen waitlist.

But I'm also pretty sure that if you have Gemini Pro, Imagen is also used as the multimodal function for Gemini Pro. So when you have an image request in Gemini, you can

generate images with Gemini, but Gemini would probably use this Imagen, or have some version of Imagen within its model base, to be able to generate images. The new update with Imagen 3 allows for higher quality image generation, and it includes these editing tools, which are nice. So you can edit the image. And then also, I think the most important point,

is it can now generate images with text with much better accuracy than before. And if you are new here, my podcast cover art, or I don't know what it's called, my podcast profile photo or cover art, whatever you want to call it, that was generated by Gemini with Imagen,

probably Imagen 1. So Imagen 1 and OpenAI's image generation model, they both have issues, or both had issues, because Imagen 3 doesn't have that issue as much (I did some testing before this podcast, and we can do it live later on). They had issues generating images with text. So it can make a really cool image for you,

whatever you want, as long as you didn't ask for any text. If you ask for text on the image, it's gonna be misspelled. I was asking for this cool sci-fi explorer image, and I wanted Venture Step to be at the top, and it might put, all right, so the cool sci-fi image would be great, and then it would say something like "step vintage," or

it wouldn't say Venture Step, it would say whatever it wanted, I don't know, it was really difficult to control. Or it would be Venture Step but all misspelled, or it could be venture spelled right and step spelled wrong, or vice versa. Or it would just be a collection of nonsense put on the image, and I had no idea what it was trying to do.

After some testing, it doesn't seem like that is much of an issue anymore. So we'll do some testing of this, which I like. There's also MusicFX, which is something I tried a couple of months ago, but I'm not an artist in that regard. It generates these beats for you, and you can make a soundtrack, and from that soundtrack it helps you make a song kind of on the fly, where,

like, maybe you're an artist and producer, you're working on something, and you're like, well, you know, if we added this one thing, if we got these bongos in here at this moment, I feel like it would change the vibe or something. I don't know how artists and producers talk, but I think that's something that they would say. And if you don't have bongos at the studio, you could just go to MusicFX and type in, like, bongo sounds,

and then it would generate a beat for you, and then you could download that beat and add it into your song. Google did a pre-show concert with this person called Loop Daddy, and I think his name is Marc Rebillet. He goes by, his artist name is Loop Daddy, and Loop Daddy did a pre-show

of this MusicFX. So he asked the audience, okay, what beats do you want? And he gave some selections, like gave them some choices. But pretty much everything was generated on the fly. So he's like, you know, here are the suggested prompts, and he shared a screen of the suggested instruments. He's like, which ones do you guys want? And they voted on it. And then they made a song from it live. Let's see

if I can get that pulled up for us, which would be pretty gnarly. So MusicFX is something that you use if you're an artist, to help you supplement creating a song slash beat. I don't think it's something you could just make a whole song with on its own; it's not something that you generate lyrics with, it's just beats. And so you can ask it to do

instruments and different types of beats, and then you could fade them in, but it's not that complex; it's just used for beat creation. I tried it. My beats are garbage, garbage. It was horrible. I was like, this tool is so hard to make anything good with. But Loop Daddy did it right, I guess. But I guess he also has, like, 20 years of knowledge and

does this every day, for his whole life. And then they also introduced the Synth AI, or not AI, Synth ID. Synth ID is this digital watermark for all AI-generated content. Meta has a similar approach, but they actually put a visible watermark on their images. I don't think that is as good, because somebody could just crop out the watermark; it's, like, at the bottom right of the screen.

So they could just crop out the watermark. With Google's approach, anything that's generated by these Google products will have an ID associated with it. And it will, I guess, know what it looks like. It will store in the database or something. And then from there, or maybe it gets stored in the metadata file. And maybe, I'm just thinking about how they connect the two keys. I'm not sure. They didn't really explain it, but.

It's supposed to link anything that's generated with Google products to allow them to prevent people from doing bad things with the images. All right. So let me, I was talking, but let me go and go on YouTube real quick and type in Google pre -concert Google I O, Google I O pre.

concert.

Okay. All right, there it is, Loop Daddy.

Okay, so let me share my screen. Share. I gotta cough for one second, let me step away. All right, back. Okay, share the pre -show, I'm sharing this. We're gonna include the audio. We got this turned up, I don't know how loud. Okay, so here it is. So someone's doing the votes and I'm gonna fast forward.

This is crazy. The show was done at like, I think like eight 30 in the morning and he's he looks like he's been out for days. OK, so they picked this beat. It's making this beat. He's adding this guitar.

and then he makes a full song that he uses.

and now he downloaded it and now he's making the song.

Kind of a banger, honestly. I was surprised.

Okay, that's where it'll end.

and he makes a full like EDM track.

Yeah, I just think it's crazy. So he uses all from the MusicFX app where he asked the audience what to put in the song and then he did that. I stopped sharing, close this tab. Okay, so then I'm gonna go back to sharing and I'm gonna share my screen to share ImageFX. So I have access to the test.

Where's this at? Getting lost in the sauce. Okay. Okay, share, share, share.

screen. AI Test Kitchen. So if you want to look at this yourself, you'll have to sign up. But if you go to aitestkitchen.withgoogle.com, you'll come to this homepage. It kind of has some examples of things that you could do; this first one is like "Renaissance vampire king, flower-studded hat, flared nostrils." This one says "dreamy, eternal Sith."

my gosh, music.

That's the song that they made, a little beat. Okay, so I'm gonna go to the Music FX. All right, so I'm just gonna use what they gave me.

And so it's gonna make a song for me. A major key melody that evokes a sense of brightness and optimism, catchy and full of joy. So it's making a song, generating the music.

So this is the prompt mode. The original version that they provided was like, I guess the DJ mode where it was only beats.

This is a song that we just made on the fly.

Pretty cool. This is the first track and they made a second track.

What if we do...

to like a dark to.

us.

Maybe catchy. I don't know what these mean honestly, some of this stuff. I don't know anything about music creation. Let's generate this. Thanks mom and dad. Okay, so we're generating this song here in a second. I'm gonna let this generate.

This is the new song that we made. This is being full of darkness, sense of darkness. This doesn't feel very dark honestly, but...

Okay, so that was MusicFX. And now we're gonna switch over to ImageFX. ImageFX is similar: it's text to image, versus MusicFX, which is text to music. So it's pretty straightforward. You have the ability to edit these images.

You could ask AI to edit the image and add something. So let's do, I'm going to put a masked area right here and I'll say, can you add, can you add baby?

I don't know. Baby birds, baby birds.

Baby, fire.

Creativity is off the charts right now. Can't type either. All right, generate edit. So let's see if it works.

So in this image, it's like a dragon out of the clouds. And this edit that I'm doing, I put a mask on the image and I don't think it went through. So let's, I'll do this. Underwater coral reefs with schools of fish swimming. Let's try it.

generating.

It's taken honestly a long time, much longer than I would think.

these tools.

Still loading?

Alright, these aren't these honestly aren't that bad. Some of these fish look really odd though, especially when you get into the details. This one looks okay. Looks like kind of a...

I think the further out, I mean this coral looks weird. The further out you are, it's better. Like this is a great image right here. This is perfect. But if anything that's up close and detailed, it really shows that something's weird. It just looks, you may not be able to tell what it is, but you can just tell it like it just looks off, the image looks off. Okay, so that's image. Let's see if video will work. Nope, it says I have to join the wait list.

So we did text FX, we did, what is it? this is.

TextFX lets you make acronyms, you can make chains, it's like for poetry and songwriting. I don't know anything about this; I can't really speak about it. I've used it before, but I don't really get the whole purpose. So I'm gonna stop sharing. Okay, so that was the introduction of these new tools, and the Synth AI,

or, I keep saying AI, it's stuck in my brain. Synth ID is the watermarking, but it's a digital watermark, so I can't really show that. The next thing, which is really cool and I think is something that everyone should be using at their work, is this learning LM, or LearnLM. It was originally called Project Tailwind, and they renamed it to Notebook LM.

Notebook LM is something that was launched, I don't know, I think like six months ago or something like that. I used it to help me study for some of my insurance exams. And so basically I'm able to put in these books that I have. I put the books in, it reads the books, and then I can ask it questions like, okay, I'm struggling with this chapter, can you make me a study guide? And then it will allow you to do that

on upwards of, I think, 500,000 words. And I think they just increased it as well, but I don't have confirmation. They said they upgraded Notebook LM and increased its capacity, but they didn't say by how much. And the current context window right now is a million for Gemini Pro, so I think it might be a million. It could be a million, because they just, sorry, they just increased Gemini Pro to

two million. So it could be a million, or up to 500,000, I'm not sure. Well, they did say they increased it, but they didn't specify. That being said, you're able to drop in these files. The limitation is you can still only drop in five, no, 10, 10 files. I don't know why I said five; I've got the 500,000 in my mind. You can drop 10 files into the product, and Google's Notebook LM

will go in and read the files, and it will become an expert in what you give it. So you can give it stuff that is difficult to look up on the internet. Something that would be useful would be compliance stuff. What about things that you have at your company that are only related to your company or your group, but you still have questions about them occasionally, or someone asks you and you don't necessarily know and you have to go and look in the file? You can just copy and paste what that person said

and ask the Notebook LM. The Notebook LM would know where it's at, tell you where it is in the file, which file it's from, the exact section, and then give you the answer, or what it thinks to be the answer. And then you can verify it yourself, because it tells you the exact location where it's getting this information from. It shows its sources, and it's pretty straightforward. And I think it's a game changer for studying or

managing unstructured knowledge bases like documents. If you have a corpus of documents, Notebook LM can go in and read those documents and make educated answers, or guesses, and it also cites all of the locations it pulls the information from.
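To make that idea concrete, here is a minimal sketch of the same "answer only from these sources, and say where it came from" pattern using the public Gemini API (the google-generativeai Python package). This is not how Notebook LM is actually implemented, and Notebook LM itself doesn't expose a public API that I know of; the model name, file names, question, and prompt wording below are all illustrative assumptions.

```python
# Rough sketch of grounded Q&A over your own documents, in the spirit of
# Notebook LM: put the sources into the (long) context, ask the model to
# answer ONLY from them, and to cite which source each claim came from.
# Assumes: `pip install google-generativeai`, a GEMINI_API_KEY env var,
# and two illustrative local text files. Not Notebook LM's real internals.
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-pro")  # long-context model

# Load the "notebook" sources (hypothetical file names).
sources = {}
for path in ["underwriting_guidelines.txt", "state_compliance_notes.txt"]:
    with open(path, encoding="utf-8") as f:
        sources[path] = f.read()

question = "Is a wind mitigation inspection required for this policy type?"

# Build one big prompt: instructions, then each source labeled so the model
# can cite it by name, then the question.
parts = [
    "Answer the question using ONLY the sources below. "
    "Cite the source name for every claim. "
    "If the sources do not contain the answer, say you don't know."
]
for name, text in sources.items():
    parts.append(f"--- SOURCE: {name} ---\n{text}")
parts.append(f"QUESTION: {question}")

response = model.generate_content("\n\n".join(parts))
print(response.text)
```

The "say you don't know" instruction is the same behavior Dalton describes below: if the answer isn't in the uploaded material, the tool refuses instead of guessing.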

It also gives you a snippet of what it used. So I'm gonna share my screen. I already set up an example for us, because I'm so kind. And I think that this is a really cool tool. And what they demoed blew my mind, because in the next feature update, they're gonna make it where you set up your notebook, your Notebook LM, you put in your information, and it will generate an FAQ, it will generate a study guide.

And then you're also going to have audio, like you make your notebook an audiobook, but it's not only an audiobook, it's interactive, to where you can ask the notebook questions. And in the demo, they said, here's an open-source math textbook; my son is studying for this class and he really likes basketball. Can you make analogies with math using basketball?

And so there was two AI people that are talking. It's a woman and a man, and they're having a conversation about how like they would solve this problem using math and how math or basketball is related to math and explaining things that are related into the textbook, into this analogy. And then you can stop.

and ask a question like raise your hand and then they pause and they say, it looks like someone has a question. And then they stop their conversation. You ask the question and it processes the question that you have and it integrates that question that you ask within their conversation. The voices are very human like. It's not robotic whatsoever. And it is super duper cool.

And I think it's a game changer for studying, or anything related to managing unstructured knowledge bases. And I've used this product for

I think six months or so already. So I set up this notebook for us, Google IO Test. So I took a corpus of all the information that was posted from Google IO. Google typically posts a blog per item they discuss, so each of these things that I'm talking about in the podcast, they all got their own little blog post

that explains how to use the tool, a video of the tool, all sorts of things. And so let me ask it, I'll say what.

How, okay, well, let's see. What is a learning notebook?

Geez, it's hard to type when you can't see. I try to ghost type, but sometimes it's difficult. Learning notebook. That was demoed.

I have.

And from when I was using it to study, I could ask it complex math problems and it would give me the formulas and how it went about solving them. So, learning notebook, yeah, that's not right. Learning notebook's not the name. It's a notebook, sorry,

I meant notebook.

LM.

is processing.

Okay, so I asked it, what is the learning notebook that was demoed in Google IO? It said, I don't know what you're talking about, because I said the wrong name. And I said, sorry, I meant Notebook LM. It says Notebook LM is an experimental tool that uses Gemini 1.5 Pro to create audio overviews. It can take source material and generate a personalized and interactive audio conversation. The sources highlight Google's focus on multimodality,

enabling its AI models to handle various input formats like text, images, video, and code. They explain that multimodality allows for a wider range of questions and more comprehensive answers. The sources also emphasize long context, which allows these models to process large amounts of information, making them more intelligent and capable of deeper analysis. And so in this

question that it answered, it has 10 citations. And right here it pulls up citation one; it pulls up the exact location where this information was gathered. It's kind of rough to read, because I just copied and pasted all these blogs into one file and didn't use any formatting, so it's kind of hard to read, but the machine can read it just fine. But in this blog, it discusses

everything and then there's a snippet that's highlighted exactly what was pulled out. And it does this 10 times. So it shows you the file, what is used and then where in the file is used at. And that gives you a purple highlight. It's pretty cool. I think it's huge. If I can ask it, I could ask another question. Okay. So how does the, the,

Synth AI, I keep saying Synth AI, Synth ID, enhance the responsible use of AI? Synth ID embeds digital watermarks into AI-generated images, audio, text, and video. This technology combats misinformation by helping identify AI-generated content. And then it shows the citations.

Pretty cool, I think. Let's see if they have anything. What did?

What about, can you tell me about IDX? What changes were made?

see if I have information on.

And one thing that's really good about this form of large language model: large language models have issues with rejection, and they have issues with hallucinations. And there are some questions about whether a hallucination is an attempt to deceive or an actual, true positive hallucination. I don't know.

No one knows. Maybe the AI companies know, but that was kind of the issue with some of the image generation that Google had in the beginning until they fixed it. Those two problems, hallucination and rejection, aren't an issue when you create your own LM. So this is an easy way to create your own LM where the LM will only answer things that are within

the files you upload; it becomes an expert in what you give it. And if you ask it something that it doesn't have any information on, like I asked earlier, because I put in the wrong name, it says: I don't have any information; there are no provided sources that mention a learning notebook being demoed at Google IO. However, there are sources that do discuss LearnLM, a new family of AI models designed to enhance learning processes. And so,

it kind of, if you get close, will tell you, like, hey, is this what you mean? But for what you asked, I don't have information on that and I can't tell you. So it prevents you from getting bad information. And that's an issue when you're looking at some of these models, where they will just give you an answer super confidently, but it's not necessarily right. With Notebook LM,

what you put in is what you get out. And there is no extra that's added in there, hallucinations. It might give you, if you give it a math problem, sometimes the math gets wonky, I think, but for the most part, I would say 90 % of the time, the math is right. And it really depends on what you're trying to get done, but.

It doesn't allow for hallucinations and the rejections are low. I hope that you're excited about this tool as it's now in general use. I think it's super cool and very useful for my job. It would be really great if we put in our underwriting guidelines, our manual and some of the compliance related requirements from

state to state into an LM and have that LM being expert in those things for us. And when questions come up, we can just ask it. Okay.

what about this or like, is this something that is required with this state? And then it'd be able to tell you, no, as of, you know, this date, no, it's not. So you're good to go. And you could use it for studying. You can use it for your work. HR stuff I think would be cool. Like, you know, if you're at the HR, I don't know, you work in HR, you could use the company handbook and.

I don't know the other rules, but not a big rule guy. Not like I break rules, but I'm not out here making rule books. If you could use the company handbook, you could upload that as an LM. Then you can have that LM make a frequently asked questions. And then you can upload to the website or whenever someone asks you a question, you could put their question into your LM that you created.

and then the LM would output your

answer and then it would cite where it got it from. So not only could you tell them, like, hey, this is the answer. You could also find it in this document. But of course, if you're doing these things, like you need to have approval and make sure that your company is cool with it and that you're not just going rogue. If you are going rogue, do that on your own accord. But I think it'd be good to.

Whenever you're saving time, it's great, as long as you're doing it safely. I do know that these notebooks are only shared with your account. It's tied to your account and it's not shared outside of your account. So Google doesn't know what you put in here; it's encrypted. And so there isn't some way of having contamination of data, like what you give the LM getting sent out on the internet. It's

No, so you're protected there. So just throwing that out there if you were thinking about it and you're like, well, where's all the data go? That's where it goes. It just stays with your account. Okay, I'm gonna stop sharing. I can't wait until the conversational aspect of these notebook LMs come about. There is also LearnLM. I don't know necessarily how this is gonna work, because I don't have access to it, but it's gonna be with...

Google search YouTube, where you watch a YouTube video, say a math video or some kind of coding video, or maybe a home building video. I'm not sure what you're watching, but I think it's most related to educational videos. So how to's or lessons. And this LM will be created from, I guess, the transcript of the video.

because Google transcribes their videos. The transcript would be fed into this LM, every video would have an LM, and then during the video, it would ask you questions about the video to

understand your level of comprehension, and if you didn't understand, it would help you understand.

I don't know how that's necessarily gonna work. They only demoed it for a few minutes. It'd be like a pop-up below the video, above the comments, that would help you learn, I guess. Not completely sure how that all works out. Responsibility in AI development: we're moving past LearnLM and Notebook LM,

unfortunately; that's probably my favorite segment of Google IO, or the Google AI IO, it's all AI now, 'cause that's all I talked about. So as I said, they're doing the responsibility piece. They're taking it seriously: they did the Synth AI, I've said this Synth AI thing so many times this episode, it's Synth ID, it's not AI, it's ID. Anyways, they're doing that for text and video,

they're launching LearnLM to make a more accessible and impactful AI enhancement to people's lives, to help everyone learn. And a lot of these tools are free, which is great. Like this Notebook LM that I was talking about earlier, that's completely free. It does everything for you for free. And it's using one of the

second most advanced models that Google offers, and probably the third or second best model on the market. So it's very generous of Google to allow you to use those tools for free. I'm sure eventually they'll have a monetary requirement at some point, maybe. Who knows? I could see it being used a lot in education, though. That's free.

The ImageFX is free, the VideoFX is free, MusicFX is free. You can use Gemini for free. And a lot of those models are open source. So their kind of motto is AI for all, and they want to have a safe way of deploying AI to everyone. And a lot of these things that I talk about are a little bit more niche,

And the things that open AI launches are a little bit more niche and not everyone's going to have access or, or be able to use them. But Google has, I think 3 billion users for Chrome and 2 billion users for Android. I don't know how much cross pollination there is between the two users, but we'll just assume 3 billion people, 3 billion people have.

You know, free access to these other tools that we'll be talking about later on, which will be integrated into Chrome and Gmail and all these other places, and into Android.

I think they're launching what they said they're going to launch. They're launching everything that they're doing, and they're doing it at scale. OpenAI has these niche models where you could download an app for them, or you can do what you do, but Google AI, for the size of their company and the velocity that they're moving at, is incredible. They are launching industry-best models.

Not only are they doing that, they're also launching innovative products like Alpha Fold, which I talked about last week, Alpha Fold 3. They have many other products like that from DeepMind. They have a foundational agent model, SIMA, and compared to

purpose-built AI models built for a single game, like, say, Minecraft, Google's foundational model is able to

outperform them with zero experience on the game, just because it's a generalist and it's able to do everything well. This is the magic sauce. Same thing with Alpha Fold 3.

Alpha Fold 3 outperforms purpose-built, physics-based models that are made to understand one piece of biochemistry, while Alpha Fold 3 can understand pretty much all of it. They said that they are able to understand and predict almost all of living life.

So yeah, it's hard for me to say that they're not delivering with this AI for all. And I think about the controls they put on their products, with the ID, and if they run into issues, they close it down; they lock it down quick, within a couple of days. When they were having that image generation issue, they locked it down pretty quickly. So they

are doing their best, and with the size that they are, I think it's great. Moving on to model updates. So Gemini 1.5 Pro, which is the model that you would use if you went to gemini.google.com and you had Gemini Advanced enabled. Gemini Advanced is enabled when you subscribe to Gemini Premium, I think that's what they call it. If you subscribe to Gemini Premium,

you get access to Gemini Advanced. Gemini Advanced.

is Gemini 1.5 Pro.

Gemini 1.5 Pro got an upgraded context window. It used to be one million, now it's two million. And that being said, earlier the blog post said that Notebook LM uses Gemini 1.5 Pro. So that being said, I would assume that Notebook LM

would be able to handle 2 million.

context window.

I didn't say that, but you can make that assumption. Because I was talking about earlier and I kind of got reminded.

So, Pro now has a two million context window, up from a million, within the Gemini website. The Gemini website is gemini.google.com. Before, it was only available via the API, but they upgraded it to be able to do not only one million but two million. So, I don't think you were able to do one million outside of the API; you were only able to do the one million context window with the API. Now, you can do

two million within the website, which is nice. There are a lot of people listening, or just in general, who are not going to make some API integration to ask Gemini questions. They also discussed Gemini Flash. Gemini Flash is, I think, Gemini 1.5 Flash, so it's like the less meaty version of

Gemini 1.5 Pro, sorry, I had a cough. Gemini 1.5 Flash is going to be used for real-time responses. So their thought process was, okay, if you need something very complex, you can use Gemini 1.5 Pro or Ultra. If you need something more interactive and

conversational, and you don't want to wait long, like, users probably don't want to wait a long time to get a response, I would say like a chatbot, you don't want to wait 20 seconds for the chatbot to respond, it's annoying. But Gemini Flash, apparently, and I don't know because I haven't used it, it flashes right on there. It's quick. It's Flash. It's the Flash. So Gemini 1.5 Pro is also cheaper too, or, Pro, it's,

Flash is cheaper than Pro. Pro and Ultra, I think, are like two to four dollars, and then Flash is like 50 cents or something. It's much cheaper. So every time you hit it, it's like 50 cents, versus when you were hitting it before, when you're using Pro, it's like over a buck each time, which will get expensive. So Flash is supposed to be really fast and still be able to handle

somewhat complex tasks, but in a very real-time manner. Whereas Pro is more: you say something, it calculates the answer, and it comes back to you. The feedback loop is not as fast as with Flash.
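For listeners who do want to hit the API, here's a minimal sketch of what picking between the two models looks like with the google-generativeai Python package. The model names are the published ones, but the prompt, the timing comparison, and the token counting are just illustrative; check Google's current docs for exact model IDs and pricing.

```python
# Rough sketch: send the same prompt to Gemini 1.5 Pro (bigger, slower,
# pricier) and Gemini 1.5 Flash (lighter, faster, cheaper), and count the
# prompt's tokens to see how much of the context window it actually uses.
# Assumes `pip install google-generativeai` and a GEMINI_API_KEY env var.
import os
import time
import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])

prompt = "Summarize Google IO's AI announcements in three bullet points."

for model_name in ["gemini-1.5-pro", "gemini-1.5-flash"]:
    model = genai.GenerativeModel(model_name)
    tokens = model.count_tokens(prompt).total_tokens  # prompt size in tokens
    start = time.time()
    response = model.generate_content(prompt)
    elapsed = time.time() - start
    print(f"{model_name}: {tokens} prompt tokens, {elapsed:.1f}s")
    print(response.text[:200], "\n")
```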

And then they talked about Gemini Nano. Gemini Nano is what is used on the Pixel 8 Pro with Google's AI chip. And I think the AI chip is coming to Samsung and a couple of other devices, so they'll have Google's Gemini Nano integrated with those phones. And Nano is a model that will be on your phone. It is already on your phone, but in the future it'll be on more phones.

It's on the phone, it doesn't need that much power, and it doesn't need access to the internet. And so everything that you look up with Gemini Nano is private

and secure. Which is nice, because people are concerned about data privacy issues. I don't think you should be that concerned about these things, because if people want to know your stuff, they're going to know your stuff. But anyways. So, Gemini Nano is on devices. It originally was only text, but now it's multimodal.

It has multimodality now, which is great. Integrating Gemini into Google Workspace for enhanced productivity: so they have integrated Gemini into your Workspace. I don't know if you were a beta user or when they rolled it out, but I know they recently rolled it out to the general public. They'll do summations,

like a summary of your emails, they'll write emails for you. They'll help you make spreadsheets or

help you write down notes or format your notes for you, all within Google Workspace or any of your non-Workspace apps. You don't necessarily need Google Workspace. Google Workspace, if you're not familiar, is the paid subscription with, I guess, not a Google domain, but a domain hooked up to it. It's like when you have a company and you need

a company's share drive or company workspace. That's Google, Google workspace. That same feature is on your Gmail and your Excel, your Word doc, or not your Word doc, but your Google doc.

So let me share my screen and I'll show you what I'm talking about. We'll get right into it. I'm gonna share my email. Hopefully I can hide my whole email, but let's see. Presentation? No, I don't want to present. I don't want to present.

I want to share my screen. Okay, so I was working on this light. I was working on this light thing from Nano. Who is it from? Nanoleaf, Nanoleaf. And I'm having trouble finding the email. I know the email's there, but in this example I'm having trouble finding the email. I'll say, can you help me find the email?

that talks about, in this scenario, I have many emails from Nanoleaf, like I'm on their newsletter, I am conversing with the community group, I've got comments, all sorts of information related to Nanoleaf, but email that talks about how to fix my lights.

that aren't working.

Okay, so I couldn't find the Nanoleaf email where this person, Michael, was helping me fix my lights. So I asked Gemini, and, what? This is our first failed live demo. I could not find the email related to how to get your lights fixed in Gmail. Okay, can you find me?

Send me emails.

Nanolive.

related.

Nanoleaf.

Nano leaf.

that.

A ticket. I don't know, we'll see if it knows.

Come on, Gemini. I can't believe you're doing this to me live.

Okay, so I found a Nanoleaf ticket. It cites the sources just like the Notebook LM, similar, where it will cite what emails it's taking the information from. Here are the emails with Nanoleaf, blah, blah, talking about the buttons not responding to touch control processor, no LEDs are flashing on the control processor. And then I could say, what did...

Michael, tell me.

to do.

and email.

So in this scenario, I'm on the go. I can't read the email. I need someone to summarize it for me. So what did Michael tell me to do in the email? So let's see what it knows. It should know. I've done this stuff before in my free time. OK, so Michael is suggesting you do the following. Check the cable connection. OK. Power cycle the controller. Camera positioning. Make sure the camera captures the entire screen. Calibration.

And this original email from him is kind of all over the place, not necessarily super organized. So this is nice: what Gemini gave me is better than what I got via email. So that would be how you would find stuff or do those things. And I'll say, can you...

Can you write? I'll just put in the reply, hold on. So I'll go to this Michael guy. I'll go to.

default and I'll say hello.

I say, I don't want to write it, you know, so say, hey, can you write a reply to this person about.

saying us saying that

We are saying that.

I couldn't, I say I still can't get the lights.

Which is unfortunate. This is a legitimate issue that I'm having. And so I didn't even say his name. It read the email above. I said, in my thing, I said, hey, can you write, can you reply to this person saying, I still can't get the lights to work? It said, hey, Michael, I just wanted to follow up on the previous email. Unfortunately, I'm still having trouble getting the lights to work. I've tried all the troubleshooting steps in the manual, but nothing seems to be working. Is there anything else I should try?

or should I contact a professional?

So I think it's pretty spot on, except "steps in the manual," I would probably take that out, and then I would take out the second sentence. But it's not perfect; I didn't really give it that much direction. But it gets you pretty close, and if you didn't like it, you could formalize it, you could shorten it, you can elaborate, press elaborate, re-create the email, add more information, blah, blah, blah. So I showed you Gmail, but

the other pieces work just the same. If you want to ask Gemini to do stuff, it's in the top right corner of your screen, two items to the left of your profile photo. On emails, it's in the bottom left.

Back to editing. I guess I'll insert this. It's in the bottom left; it's like this little Gemini emblem and it kind of glows. "Help me write with Workspace Labs," that's what it's called. Okay, stop sharing. Okay, so that was the integrations into Google Workspace.
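The Gmail side of this is all built into the UI, but the underlying pattern is easy to sketch with the public Gemini API: hand the model the raw email text and ask for a summary or a reply draft. This is only an illustration of the pattern, not how Workspace actually wires it up; the email text, model name, and prompts are assumptions.

```python
# Rough sketch of the "summarize this email / draft a reply" pattern using
# the public Gemini API. Not the Workspace integration itself; this just
# shows the general idea with invented email text.
# Assumes `pip install google-generativeai` and a GEMINI_API_KEY env var.
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")  # fast model for quick replies

email_text = """Hi Dalton, sorry the panels still aren't responding.
Please check the cable connection, power cycle the controller, and make
sure the camera captures the entire screen before recalibrating. - Michael"""

summary = model.generate_content(
    "Summarize this support email as a short action list:\n\n" + email_text
)
print("Summary:\n", summary.text)

reply = model.generate_content(
    "Write a brief, polite reply to this email saying the steps did not "
    "work and asking what to try next:\n\n" + email_text
)
print("Draft reply:\n", reply.text)
```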

Next up: AI search and Android. So AI was implemented into your search recently, for you; it's been around for a good amount of time for beta testers. And it's pretty useful if you're asking questions, not necessarily things like, well, what brand do I like if I like the color purple? It's not going to answer, or it's not going to trigger, if you ask things like that. But if you ask it a legitimate question about Earth or science or

things of that nature, you'll have an opportunity to use the summarization, or what was I saying, summarization, the AI summary, or, as they call it, AI Overview. An AI overview of

what you ask, and it will give you answers, and it will cite its sources just like the Notebook LM. There's kind of a trend going on here. But let me share my screen. We're going to ask, and I already have the question pulled up: what is the tallest animal in the world? We know the answer, or should, right? Let's see. So let's go share our screen. What is the tallest animal in the world? And so this is what AI Overview looks like, and it says giraffes are the tallest

land animals, standing up to 18 feet, or 5.5 meters, tall. And it's cited one, two, three, four, four sources: National Geographic, BBC, the International Fund for Animal Welfare, and Business Insider. One of the issues with this, and why people are really upset about it, especially developers, is it basically scrapes the data off of your

website that you created that you worked hard on your SEO, your blogs, whatever. And then it makes it an LLM or I guess not LLM, but it makes a notebook LM with your source material that you worked on. And the users don't click on the link anymore. They're just going to read the answer right here. And maybe if I didn't have an answer here or

I didn't have an answer from the international fund. I would just click on the link and they would get traffic to their website, but this is going to decrease traffic because a lot of people are just asking questions or trying to find stuff. If people can find things a lot faster and not necessarily have to click through websites to find the right answer, then people aren't clicking on people's sites. So developers are pretty upset about this feature.

I don't think it's a big deal. You've got to adapt with the times; times are changing pretty quick nowadays. Change is stressful for some people, especially if you're financially anchored into a certain way of living and you weren't realizing that change was afoot. This AI Overview feature has been around since, I think, I was in my master's; I think that's close to a year ago. So they've been demoing this for a year, and they just announced it to

the general public now. So it's not like, well, it just launched yesterday and they just announced it to everyone; it's been around for a long time. And if they didn't see that this could be an issue, and maybe find a different way of getting traffic or being useful to the internet, that's on them.

Okay, that was the searchability that you have. You can ask it complex questions. It used to have a feature where you could ask follow-up questions, but it seems like they took that out. I guess they changed or altered the way that they do it, because if you're asking follow-up questions, it'd be similar to just logging into gemini.google.com and asking the same question and getting similar results. So

I'm assuming that's why they did that, but you used to be able to ask like some pretty cool follow up questions and like, you know, you could trigger AI overview for products about, I don't know, tool sheds. And then you could ask, okay, like what about, you know, what are some things that I should look at with tool sheds and then how are these products related to these answers that you have below? And they would find like the best product for you, which was pretty interesting and useful for myself, but they don't do that anymore. Unfortunately.

I thought it was pretty cool though. So they have the same thing with Android. Android, you switched over from Google Assistant to Gemini. Gemini is now the new Google Assistant. The only issue is that they made Google Assistant good at doing tasks. Like, set a timer.

or make some tasks for me, or make a shopping list, or, you get the point. Gemini is good at answering questions, but as far as making timers or

doing general assistant things, it's not that good at it. Maybe over time they'll increase the features, but they basically cut out something that had a lot of features. And then they're like, okay, now you're gonna use this. And the user base is like, okay, well, what about the features that we had? And Google's like, I don't know. They come when they come. So I think a similar experience with YouTube.

Music, where they got rid of Google Play, I think it's not Google Play, but Google Play Music, I think it's called. And then they had Google Podcasts, which had a lot of features, and they got rid of Google Podcasts and made it into Apple Music, no, wow, I mean YouTube Music. They made YouTube Music both a podcast app and a music app, but it doesn't have as many features for podcast stuff. So,

Kind of a similar trend there.

Android will have, or has, access to Gemini as its task assistant now, and then in the future it'll get higher usability. Right now I don't find it as useful. And I don't like the fact that when you ask it questions, like if I ask, so, blah, blah, blah, beep, Google, can you tell me about

why lava is hot or something, it would answer the question for you. But the only issue is, on the Gemini website, every question you ask gets its own entry in your list of conversations. So if you ask, and I ask a lot of questions, fortunately or unfortunately, but unfortunately for my UI on Gemini, I was asking so many questions and it was just

killing my inbox. I had like hundreds of entries in it, and it's just a mess to try to find the stuff that you were using and that you were interested in, and projects, because you can only pin so many things on there. So I think you have up to like five pins. If you go over five, I would say, all the threads that you need on a daily basis get buried. If you're constantly asking questions with your Google

Gemini assistant, it just adds those entries onto your Gemini website, and the whole thing is a mess. So I stopped doing that, so I ask it no questions. I have my own Gemini thread where I'll ask questions, like daily random questions, and that's where I'll ask my questions. I used to do that with my assistant, though.

So it's just the way it is. You know, you've got to change with the times; I've got to change too. But I wasn't too happy about that. They have this Circle to Search, which is what they said they launched, but it's a previous feature from a couple of years ago, so it's not necessarily new. They have scam detection alerts. So they demoed live on a call that if you're calling someone, or you're speaking with someone, they're not, you're not necessarily calling them, but

if you're talking to them, and this is if they get past the scam detection, which is phenomenal on Pixel phones, by the way. On a Pixel phone, you don't get any scam calls or text messages; it just doesn't allow it, because if someone calls you that you haven't interacted with before, you have this Google Assistant answer the phone call for you, and then it asks them, why do you need to speak with Dalton? Like, is it urgent? And then

It will say, like, what are you calling about today? And it freaks out the scammers and they never call. But a lot of times they won't even get there and they'll just get absolutely forwarded. And if they make it through that, then they'll get screened by the AI assistant. And if they make it through that, then I screen the call. So I have. I have, you know, these two fences that the scammer needs to jump over to get to me and then they get to the final boss, which is me. You know.

but I had never had to interact with scammers since I got a Pixel phone. And then they also are launching scam detection alerts. So if you're in a phone call and they are talking, whatever,

And I've had to cough a couple of times this episode, I don't know. You're talking to this person and then they start bringing up scammy stuff, like, can you send me your social, or can you text me your social, or can you send me an email with, like, your bank account numbers? When it hears those things, and it's not necessarily listening to your conversation, it's listening for keywords that would trigger it,

it will give you an alert on your screen, like, probably, high likelihood this is a scam, you know, think about this before you act, to help people out, because a lot of people get taken advantage of. There's this emotional thing that scammers do to their potential victims, where they add a bit of urgency, and if you stopped for a moment and thought about what your actions were about to be, you'd think, well, this doesn't really make sense.

What's going on here?

I think that's useful for everyone who gets scam calls, but I don't really get them on a Pixel. Not bragging or anything, but I just don't have that much interaction with them.
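Just to illustrate the "it listens for trigger keywords, not your whole conversation" idea Dalton describes, here's a deliberately tiny toy sketch. The real feature runs on-device with Gemini Nano and is far more sophisticated; the phrase list and matching below are invented purely for illustration.

```python
# Toy illustration of keyword-triggered scam alerts: scan a chunk of a call
# transcript for suspicious phrases and warn when one appears.
# NOT how Google's on-device scam detection works internally; the phrase
# list is made up for this example.
SCAM_PHRASES = [
    "send me your social",
    "text me your social",
    "bank account number",
    "wire the money today",
    "gift card codes",
]

def scam_alert(transcript_chunk: str) -> list[str]:
    """Return any suspicious phrases found in this chunk of the call."""
    chunk = transcript_chunk.lower()
    return [phrase for phrase in SCAM_PHRASES if phrase in chunk]

hits = scam_alert("Okay, just text me your social and your bank account number.")
if hits:
    print("Possible scam. Triggered on:", ", ".join(hits))
```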

Google launched Veo, which is their high-definition video model. It's able to be used today on Vertex AI, which is Google Cloud's platform for AI tools. I don't have access to it because I don't have Google Cloud set up on my personal account, and I'm not going to pay money for it because it's probably expensive. So,

no trial today, but I'm on the waitlist for VideoFX, which probably uses the previous generation, possibly. I'm not sure.

They talked about the, you know, the music AI sandbox for, for collaborations with musicians and songwriters. I didn't get access to that. So I can't really speak about it, but they talked about it in Google IO. They also, you know, once again, emphasize, you know, having a responsible development and addressing challenges with the AI and how things are moving very quickly. They introduced Trillium, which is the sixth generation.

of Google cloud TPUs. TPUs are used for training. They're basically like, you know, Nvidia chips.

So they introduced their new TPU. It's much faster, better, and more energy efficient, which is probably the biggest aspect, because we're going to get to a point soon where all these companies are trying to train their AI, and you'll hit a limit on computational throughput, and then you'll hit a limit on electricity.

And right now it seems like throughput is going up. Electricity supply is not necessarily. So they're trying to get solutions for more power. And one of the ways to get more power is to use less. So because it's easier to innovate with your hardware than it is to innovate with the energy grid and get approvals for new power stations, nuclear power plants, clean energy.

That stuff takes years to get set up and integrated into the energy grid. Might take seven years or something. Whereas, you know, making a new chip might take two. So that's kind of the approach that many companies are taking.

There was also Gemma 2. Gemma 2 was released; Gemma is kind of the smaller family of models that aren't necessarily flagship but are still used and are still important. Gemma 2 was launched with a new architectural breakthrough, so it has improved performance and efficiency.

Gemma 2 models are outperforming models that are two times bigger than Gemma.

Gemma 2 is a 22 billion parameter model.

So it's pretty small. It's not that large. So I think it would be a quarter of the size of Facebook's middle model. And remember, Meta still doesn't have their largest model. They're still training it. I think their second biggest model is 75 billion. So pretty close to a quarter. They talked about integrations with

you know, Vertex AI, Google Cloud, Hugging Face models, all being integrated into Google Cloud, which is really nice. And Gemma is available on all those platforms and Hugging Face models are available on Google Cloud.
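If you want to poke at Gemma yourself, the open weights are on Hugging Face, and a minimal sketch with the transformers library looks roughly like this. The model ID is my assumption of the instruction-tuned Gemma 2 checkpoint; the weights are gated, so you have to accept Google's license and log in with a Hugging Face token first, and you'll want a GPU with enough memory or a smaller/quantized variant.

```python
# Rough sketch: run an open Gemma model locally via Hugging Face transformers.
# Assumes: `pip install transformers accelerate torch`, that you've accepted
# the Gemma license on Hugging Face and run `huggingface-cli login`, and that
# the instruction-tuned checkpoint ID below is correct (it's an assumption).
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="google/gemma-2-9b-it",   # assumed instruction-tuned Gemma 2 ID
    device_map="auto",              # put layers on GPU if one is available
)

out = generator(
    "In one sentence, what is a context window in a language model?",
    max_new_tokens=60,
)
print(out[0]["generated_text"])
```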

And the last thing that I want to talk about is Google's code editor, called IDX, at idx.google.com. So I'm going to share my screen. It uses Flutter, it's integrated into Google Cloud, and it's integrated with Firebase, Google Cloud, and the Gemini APIs, so you can hit the API pretty easily. Let's see.

And so when you start up, it makes an app for you. This is the base package. And then you can code with Gemini in here as well. So I tried, to no avail, and I don't know that much about the structure of Flutter. And Flutter is used for the web, it makes a web app, it makes an Android app, and then it also makes an Apple app.

So it makes all these apps for you and it's supposed to be an easy way to maintain your application and website where you can kind of integrate everything all into one code base. And typically when you are looking at an app, they have a Swift app with Swift is the language for the Apple apps. And then they have the Android app.

Forgot the language, honestly, it's eluding me right now. Because they have a couple that you could use, but there's a main language. But Apple only has Swift. So anyways, Flutter is kind of like a place in between those two, where you can have one code base for many apps. So it saves time when you have to do updates and all these other things, and you don't need as many developers, and it's simpler, apparently. I've never used it before, but here we are. We're using it.

First time ever. So I was trying to get Gemini to do some coding for you, but what I can do is highlight this piece of code and ask Gemini to explain what's going on. So if I go over here to the Gemini panel: so, explain this code.

So it would break down the code pretty much function by function in this method that it's using on the website. So it's breaking down this test widgets. So I have a widget on there that it made. I didn't make it. And it will give you a count of the amount of times it's clicked. So if I go to the website that I'm sharing,

Probably open this in big screen here. Open.

It's loading up.

sharing.

I guess it didn't go full screen for me, whatever. Okay, so I'm gonna start clicking this button a couple of times. See how it's going up, so I'm at 10, 11, 12. Pretty nice. So let's ask Gemini, let's say, okay, this is pretty simple to do, just editing the colors: okay, can you write code to edit the colors

to red.

So it's importing all this stuff for me, all this code. It's telling me to modify the code: go to the MyApp widget, go to the MaterialApp widget, then you can set the theme of your MaterialApp to a ThemeData object. Okay, and then change the primary color to red. Okay.

So let's copy this code. Kotlin is the language? Kotlin, Kotlin. All right, so we need to go where?

Pretty sure, I'll just make sure where we need to go. Where do I put this code?

All right, so we want to go to test or test within your test directory. If you don't already have a test file, okay, have a test right here. I don't see anything about colors though.

No way it knows it's gotta be purple. So I'll just put that in here, copy.

paste it. All right, so I'm getting a lot of errors; it's no good. Quick fix, let's see. Change color to hintColor. All right, so I had an error in my code and I asked it for a quick fix, and it fixed my code for me. It gave me a selection of options if you have a problem. Main is already defined, one of the duplicate declarations.

this is where I need to put it. Got it. Just delete all this stuff.

So let's go like this.

Copy, get rid of that. Go up, we're gonna change the main.

All right.

I'm going to delete these imports, put them at the top.

Okay, so now let's reload the page see if it works.

I have to do a hard reload. Hard reset. All right, so it's going to rebuild. Why is this rebuilding? As earlier, we spoke about making AI accessible to everyone and trying to make things helpful. I think out of the current companies, Google is doing this at scale. They're launching these AI features that were talked about in Chrome and on.

your Google apps to three billion users. Potentially, I'm not sure if every one of those people are Google users, but I do know that they, three billion people use Chrome. And our code change did not work, but I don't really know anything about Flutter, so I can't really help out here. I would have to look at it, but I'm doing everything on the fly. So that was a trial and a fail.

Two fails there for us today. The LLM, no, the Gmail question failed and the code change today failed. I'm gonna stop sharing. Interesting. But as I was saying, they're launching these features to many users, three billion, and they are allowing...

Allowing all these people to interact with AI and get more comfortable with it. Same thing with Meta. Meta is launching those AI agents that we spoke about a couple of weeks ago to billions of users. I think those things are a little bit more special than...

the kind of crazy AI models. Because a lot of the people that you meet day to day aren't gonna be the people that would interact with those crazy models. But everyone could interact with those AI models, AI agent models on Facebook or WhatsApp or Instagram. Everyone's gonna get use out of this AI overview on Chrome. Every...

Google user could use the summarize my email or find stuff in my email box for me.

Those things, I feel like, improve people's lives. And they don't necessarily improve them so much, like six -fold. But it's a nice nudge in the right direction where, you know, AI is helpful and it's something good and don't be scared of this change.

That's my perspective. Let me know what you think. I know today's episode was pretty long. If you're still tuning in, I'm, I'm happy for you. I'm happy for myself too. I'm surprised people listen to these episodes anyways. As I said, it, you know, it's, it's a fun, it's a fun thing I like to do. And I like to share the stuff that I learned in my, in my free time and, and stay disciplined and hopefully to keep, keep learning every week and keep pushing myself.

and pushing yourself as well. If you're here, you're learning. If I'm here, I'm learning. We're all learning together.

Next week, I will hopefully be discussing Alpha Fold in a little bit more detail. Last week, I couldn't discuss Alpha Fold as much as I wanted to because I didn't know that much about it. Not that I didn't know about Alpha Fold and how it worked. I just don't know how it interacts with like the human body and how does someone go about drug discovery or are these things that it's used for? I don't really understand it.

as much as I should. And I want to discuss how you would use Alpha Fold if you were an AI researcher. I think that'd be really cool. I picked up a new book. If you're watching a video, the little blue one in the back, I picked up.

I think it's called the growth handbook, going from 10 to 10,000, and it's supposed to talk about scaling processes and making sure you set up processes for your company that are scalable. I think it's fitting, because I am working in a startup-type situation. So we're trying to grow, and I'm in charge of creating processes for the company in automation and

risk management and program development and.

rating. So if I can get a better process, a process that's scalable for everyone, and integrate these frameworks from the book that other unicorn founders use, that would be great. So that's what I'm working on. That's our next book review. Hopefully everyone is excited and giddy. But anyways, have a great day. I hope all is well. Have a good morning, a good night, a good afternoon, and

I'll see you next week. Thanks for tuning in and I hope you tune in again. Bye. See you.