
Top 7 ODSC East Sessions You Can’t Afford to Miss

Alexandra Quinn | April 18, 2023

ODSC East is one of the leading data science conferences. This year, it’s taking place in Boston and virtually, between May 9 and 11. The event will be packed with valuable content, with 250 speakers across tracks like Machine Learning, Hands-on Training, Deep Learning, NLP, Responsible AI, and many more. Don’t miss this opportunity to stay up-to-date on the latest technological advancements, build your job skills, and network with peers.

With so many options, identifying the sessions you’d like to attend is challenging. To help, we gathered this list of our top seven recommended sessions. We chose them based on their ability to provide practical, yet innovative, ways to solve challenges encountered by data professionals.

Here are our top seven recommended sessions:

1. The Data Cards Playbook: A Toolkit for Transparency in Dataset Documentation

Wed., May 10, 2:20pm - 3:50pm

Andrew Zaldivar, Senior Developer Relations Engineer, Google Research

Data Cards are transparency artifacts that provide structured summaries of ML datasets, explanations of the processes and logic that shape the data, and descriptions of how the data can be used to train and evaluate ML models. In this session, Andrew Zaldivar from Google Research shares the Data Cards Playbook, a toolkit that provides activities, frameworks and guidance designed to help teams and organizations in their dataset transparency efforts. This workshop will guide participants through using this open-source, self-service kit. It will also provide evidence-based patterns to help predict and avoid common challenges.

Read the full abstract here.

2. Keynote: Infuse Generative AI in your apps using Azure OpenAI Service

Tue., May 9, 9:40am - 10:10am

Eve Psalti, Principal Group Program Manager, Microsoft

Azure OpenAI Service can help organizations apply AI models like DALL-E 2, GPT-3.5, Codex and ChatGPT to the applications they are building. The platform can help improve efficiency and mitigate risks through capabilities like security, privacy controls, geo-diversity, content filtering and responsible AI. Eve Psalti from Microsoft will explain how.

Read the full abstract here.

3. Synthetic Data in Healthcare: Methods, Challenges, and Use Cases

Tue., May 9, 12:30pm - 1:15pm

Ahmed Alaa, Assistant Professor, UC Berkeley and UCSF

ML benefits the healthcare industry greatly by enhancing clinical workflows and supporting decision-making for clinical practitioners. But healthcare researchers lack access to high-quality data at scale. Generative modeling can help develop tools for synthesizing realistic multi-modal clinical data without compromising patient privacy. In this session, Ahmed Alaa from UC Berkeley and UCSF will explore ML methods for synthesizing healthcare data, discuss the challenges of generative modeling in a clinical context and share potential use cases.

Read the full abstract here.

4. Containers + GPUs In Depth

Wed., May 10, 3:35pm - 4:20pm

Emily Curtin, Staff MLOps Engineer, Intuit Mailchimp

Emily Curtin from Intuit Mailchimp tackled the challenge of connecting abstract containerized processes to hardware and scaling that process across people and projects, and she’s here to show how. Her talk walks through the lower-level system libraries involved in GPU computing, covering each layer between a data scientist's ML application in a container and the GPU that backs it. She will also share her take on how to balance data scientists' human needs with heavy system optimization.

Read the full abstract here.

5. Powering Millions of Real-time Decisions with Distributed Model Serving

Thu., May 11, 12:00pm - 12:45pm (lightning talk series)

Hakan Baba, Staff Software Engineer, Lyft, Inc.

Lyft relies on millions of critical real-time decisions made each day by ML models. To enable efficient and accurate decision-making at scale, the team built LyftLearn Serving, a distributed online model serving system. The system was designed to perform model inference with single-digit millisecond latency at a throughput of 1,000,000+ requests per second, to support model sizes from a few kilobytes to gigabytes, and to handle model update periods as short as a couple of minutes. It also lets teams working on use cases such as fraud detection, pricing, safety and ETAs use any modeling library, so they can ship effective models quickly and without constraints. In this talk, Hakan Baba from Lyft will provide an overview of LyftLearn Serving’s online model serving requirements, showcase the various techniques used to build the system and present the team's design decisions.

Read the full abstract here.

6. The Impact of Various Accuracy Metrics When Modeling a Broken Supply Chain

Thu., May 11, 12:00pm - 12:45pm (lightning talk series)

Matthew Dzugan, Director of Data Science, project44

The way an organization leverages performance metrics has far-reaching business implications. What’s the right way to choose and measure metrics and ensure they are accurately mapped to real-world use cases? Matthew Dzugan from project44 will share how his company tracks the shipment of cargo on planes, trains, boats and trucks across the globe by creating models for metrics like Estimated Time of Arrival. In the talk, he will cover the pros and cons of various metrics, as well as of the techniques his team uses to sample production data and ensure metrics are accurate.

Read the full abstract here.

7. MLOps in the Era of Generative AI

Wed., May 10, 11:00am - 11:45am

Yaron Haviv, Co-Founder & CTO, Iguazio (acquired by McKinsey & Company)

Generative AI provides exciting opportunities to be explored. But it also presents new AI operationalization challenges, like handling massive amounts of data, large scale computation and memory, complex pipelines, transfer learning, extensive testing, monitoring, and more. In this session, Yaron Haviv, co-founder and CTO of Iguazio (acquired by McKinsey), will share MLOps orchestration best practices for automating the CI/CD of foundation models and transformers, along with the application logic, in production. He will also discuss pipeline monitoring, show how to use GPUs to maximize application performance and will share how to make the whole process efficient, effective and collaborative.

Read the full abstract here.

Meet Iguazio at ODSC East

If you’re attending ODSC East, be sure to stop by booth #3 to say hello to the Iguazio team and discuss MLOps. For a more in-depth session with us, let’s book a 1:1.