{"type":"rich","version":"1.0","provider_name":"Transistor","provider_url":"https://transistor.fm","author_name":"Crazy Wisdom","title":"The Art of Artificial: Synthetic Data and the Shaping of AI with Fabian Schonholz","html":"<iframe width=\"100%\" height=\"180\" frameborder=\"no\" scrolling=\"no\" seamless src=\"https://share.transistor.fm/e/683e1812\"></iframe>","width":"100%","height":180,"duration":3109,"description":"In this episode of the Crazy Wisdom podcast, I, Stewart Alsop, sit down with Fabian Schonholz, a seasoned technology and operations executive, to explore the intriguing world of synthetic data. We discuss its pivotal role in training AI models, particularly large language models (LLMs), and delve into the nuances of data behavior, the challenges of ensuring realism without real-world ties, and the potential of synthetic data to mitigate biases in AI training. For those interested in learning more about Fabian or reaching out for consultations, visit his LinkedIn profile or check out his consulting services at FESSEXconsulting.com. Check out this GPT we trained on this conversation. Timestamps: 05:00 - Challenges of modeling nuanced behaviors in synthetic data and its implications for AI model training. 10:00 - Applications of synthetic data in different types of models (e.g., churn models, conversion models) before the emergence of LLMs. 15:00 - The role of synthetic data in accelerating AI model production and enhancing data density. 20:00 - Discussion on the influence of nuanced behaviors on AI models, specifically within the context of LLMs and their ability to capture the subtleties of human language. 25:00 - Exploration of the improvement in model performance when retrained with real data after initial training with synthetic data. 30:00 - Considerations on bias in model training, the impact of synthetic data on reducing bias, and the broader implications for AI accuracy and fairness. 
35:00 - The process of creating synthetic data, including the use of data from real-world scenarios as a base for generating synthetic datasets. 40:00 - The utility of synthetic data in operational contexts, specifically in AI model training, and the feedback loops involved in improving these models over time. 45:00 - Final thoughts on the potential risks and philosophical aspects of synthetic data usage, particularly in relation to its impact on the quality of AI...","thumbnail_url":"https://img.transistorcdn.com/UZbrDrlO5VTfDNcq188THwbv0T09vcmLyzx3BcPI9bs/rs:fill:0:0:1/w:400/h:400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS81Y2Rj/OGFiMTYyMGFkNTM5/N2NjOWI2MWM5YzQ1/YTc2Ny5qcGc.webp","thumbnail_width":300,"thumbnail_height":300}