UpNext AI

A quick catch-up on the biggest AI stories for June 24, 2026: OpenAI broadens its cybersecurity push with a new bug-fixing initiative, MoEngage bets that customer marketing will be run by AI agents, and a new research paper questions whether AI judges are actually good at evaluating subtle speech differences.

Covered in this episode:
- OpenAI unveils an improved GPT-5.5-Cyber model and its Patch the Planet effort for open-source security work
- MoEngage acquires Aampe to push toward customer-by-customer AI agent marketing
- New research: ParaPairAudioBench tests whether audio-language models can judge subtle speech differences the way humans do
- Anthropic launches Claude Tag in research preview inside Slack
- OpenAI says GPT-5 Pro helped immunologist Derya Unutmaz solve a three-year-old T cell mystery
- Prime Day brings broad discounts on robot vacuums from brands including Roborock, Dreame, and Shark

Source links:
- https://www.wired.com/story/openai-launches-full-scale-effort-to-patch-open-source-bugs-as-it-takes-on-anthropics-mythos/
- https://techcrunch.com/2026/06/23/indias-moengage-bets-marketings-future-on-millions-of-ai-agents/
- https://arxiv.org/abs/2606.24648v1
- https://www.reuters.com/technology/anthropic-launches-claude-tag-research-preview-slack-users-2026-06-23
- https://openai.com/index/gpt-5-immunology-mystery
- https://www.theverge.com/gadgets/951081/robot-vacuum-mop-deals-amazon-prime-day-2026

What is UpNext AI?

Daily AI news and research, distilled. UpNext AI breaks down the most important developments in artificial intelligence—from major industry moves to cutting-edge papers.

Welcome to the UpNext AI podcast. It's Wednesday, June 24th, 2026, and here's what matters in AI today.

First up, OpenAI is making a broader move into cybersecurity. Wired reports the company announced an improved version of GPT-5.5-Cyber along with a new initiative called Patch the Planet, focused on helping fix bugs in open-source software.

The bigger point here is that this goes beyond a model update. According to Wired, OpenAI is pairing the model with a more operational security effort: expanding trusted access to its cyber-focused models, releasing its Codex Security scanner as an app plug-in, and working with partners including Trail of Bits, HackerOne, and Calif on the open-source side.

Patch the Planet is aimed at a real pain point. Open-source maintainers already struggle to keep up with vulnerability reports, and AI-generated bug reports can add even more noise to that queue. Wired says more than 30 open-source projects are participating, and that the effort uncovered hundreds of bugs and produced dozens of patches in its first week.

The competitive backdrop also matters. Wired frames this as part of a broader race over AI cybersecurity capabilities, and cites benchmark figures showing GPT-5.5-Cyber at 85.6 percent versus 83.8 percent for Anthropic’s Mythos 5. Those numbers are in the reporting, but the key takeaway is narrower: OpenAI is positioning this as a practical security program, not just a benchmark win.

So the headline is simple: OpenAI wants to be seen not only as building powerful cyber models, but as putting them to work on real-world software defense.

Next, TechCrunch reports that MoEngage is betting the future of marketing will be millions of AI agents working customer by customer.

The company has acquired San Francisco startup Aampe in an all-cash deal. MoEngage did not disclose terms, but TechCrunch reports a source familiar with the matter said the deal was worth tens of millions of dollars. Aampe’s pitch is that instead of grouping people into broad audience segments, brands can assign a dedicated AI agent to each customer and personalize messaging based on that individual’s behavior.

That is a more ambitious vision than the usual AI marketing story. This is not just drafting copy faster. It is software making decisions about who gets contacted, what they see, and when they see it.

TechCrunch says Aampe has more than 30 customers across the U.S., Europe, and Asia-Pacific. MoEngage says it serves more than 1,350 consumer brands across 75 countries. And this comes just over six months after MoEngage raised $280 million through a mix of primary and secondary transactions. Aampe, for its part, has raised about $28 million since its 2020 founding.

MoEngage also told TechCrunch it sees this as a competitive wedge against larger platforms like Salesforce and Adobe. So if the lead story was about AI moving deeper into security operations, this one is about AI moving deeper into enterprise decision-making at the customer level.

For the research section, a paper published earlier this week takes aim at a growing habit in AI evaluation: using audio-language models as judges for generated speech and assuming broad quality scores are enough.

The paper is called ParaPairAudioBench: Paralinguistic Pairwise Audio Benchmark for LALM-as-a-Judge. The researchers argue that prior methods mostly focus on holistic naturalness, basically whether a clip sounds good overall, while missing finer distinctions in how speech is delivered.

Their benchmark uses 5,175 audio pairs across five dimensions: style, rate, emphasis, age, and gender. Instead of asking whether one clip sounds generally natural, it asks whether a judge model can tell which of two clips better matches a subtle human-relevant property. The benchmark also includes both same-transcript and cross-transcript conditions to help separate sensitivity to words from sensitivity to the audio itself.

The headline result is that current large audio-language model judges lagged human judgments by 32 percentage points on average. The paper also says these models showed severe calibration failures, especially in tie cases where the right answer is to abstain.

Bottom line: if you use AI to grade speech systems, broad naturalness scores may hide important weaknesses. This paper suggests the judge models can still miss the finer cues that people actually notice.
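For readers following along in the notes, here is a minimal sketch of how a pairwise judge comparison like this could be scored. It is not the paper's code or its metric: the label encoding ("A", "B", "tie"), the function name, and the sample data are all assumptions made for illustration. It reports how often a judge model matches the human verdict overall, and how often it correctly abstains when humans called a tie.

```python
def score_judge(human_labels, judge_labels):
    """Compare judge verdicts with human verdicts on audio pairs.

    Each label is "A", "B", or "tie" (a hypothetical encoding, not
    taken from the paper). Returns overall agreement and the share of
    human-labeled ties on which the judge also answered "tie".
    """
    if len(human_labels) != len(judge_labels):
        raise ValueError("label lists must have the same length")
    if not human_labels:
        raise ValueError("need at least one labeled pair")

    total = len(human_labels)
    agree = sum(h == j for h, j in zip(human_labels, judge_labels))

    # Tie cases are scored separately: the right answer is to abstain.
    tie_judgments = [j for h, j in zip(human_labels, judge_labels) if h == "tie"]
    tie_recall = (
        sum(j == "tie" for j in tie_judgments) / len(tie_judgments)
        if tie_judgments
        else None
    )
    return {"agreement": agree / total, "tie_recall": tie_recall}


# Made-up example: five pairs, two of which humans judged as ties.
human = ["A", "B", "tie", "A", "tie"]
judge = ["A", "A", "A", "A", "tie"]
result = score_judge(human, judge)
print(result)
```

Scoring ties on their own line is a deliberate choice in this sketch: a judge that always picks a side can still post decent overall agreement while never abstaining, which is the kind of weakness a single aggregate number would hide.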

Are you building apps with voice? Elevate your app's voice capabilities with ElevenLabs. Their API is a game changer for embedding dynamic, responsive voice interactions in your applications, providing unprecedented realism, flexibility, and low latency. In fact, you're listening to one of their voices right now. If you are a developer looking to elevate user experience with natural voice interfaces, this is your solution. Visit up next dot fm slash eleven to check out their latest offerings.

Reuters reports Anthropic has launched Claude Tag in research preview for Slack users. The new AI agent works inside Salesforce’s Slack and can operate alongside employees in group chats, adding another front in the enterprise AI battle.

OpenAI says GPT-5 Pro helped immunologist Derya Unutmaz solve a three-year-old immunology mystery, offering insights into T cell behavior that the company says could support cancer and autoimmune research.

And if you want a non-frontier-lab note to end on, The Verge has a roundup of Prime Day robot vacuum deals. The specific discounts move around, but the broader theme is that brands like Roborock, Dreame, and Shark are discounting models across a wide range of price points.

Before we wrap up, a quick note: this podcast is generated with the assistance of AI and is intended for informational purposes only. All referenced articles, research, and commentary remain the property of their original authors and publishers.

If you enjoyed this episode, don't forget to subscribe, rate, and leave us a review! And that's your briefing for today. Full source links are in the episode notes, and we'll be back tomorrow with what's up next!
