Apache Spark committer Holden Karau discusses distributed data processing, the evolution from MapReduce to Spark, and how Spark handles modern ML workloads with GPUs. The conversation covers common mistakes in distributed computing (like ignoring data skew), resource profiles for GPU optimization, and the interplay between data …

41m watch time
