Mark Ramsey, Ph.D.

Title: Using Bots, Machine Learning & Pipelines to create a modern data management environment

Abstract:

The application of AI and machine learning to tasks such as medical diagnosis, portfolio management or help desk automation is a popular media topic. An area that receives much less coverage is the application of these technologies to the creation of a modern data management environment. This session will highlight how a pharmaceutical company implemented a large-scale, production-class big data & analytics platform in less than a year by leveraging bots, machine learning and pipelines. Learn how the technologies were applied to the data sources, ingestion and rationalization processes to accelerate the implementation of an analytics-ready data management environment.

Bio:

Dr. Ramsey holds a bachelor's degree in Computer Science, an MBA with specialization in Computer & Information Security, and a Ph.D. specializing in Applied Computer Science. Mark is the R&D Chief Data & Analytics Officer for GSK. He leads the Data & Computational Sciences team responsible for the transformation of R&D through the application of artificial intelligence, machine learning and analytics on internal and external data to enable data-driven decision making. Mark led the development of the R&D Information Platform (RDIP), which serves as the foundation of the data strategy. Integrating an ecosystem of nearly two dozen technologies, RDIP provides a production, large-scale environment to support data consolidation, rationalization, and analytics on complex data ranging from genetics to bioassays. Mark has been recognized as one of the Top 100 Innovators in data & analytics.

Karin Strauss, Ph.D. and Luis Ceze, Ph.D.

Title: DNA Data Storage and Near-Molecule Processing for the Yottabyte Era

Abstract:

DNA data storage is an attractive option for digital data storage because of its extreme density, durability and eternal relevance. This is especially attractive when contrasted with the exponential growth in worldwide digital data production. In this talk, we will present our efforts in building an end-to-end system, from the computational component of encoding and decoding to the molecular biology component of random access, sequencing and fluidics automation. We will also discuss some early efforts in building a hybrid electronic/molecular computer system that can offer more than just data storage, for example, image similarity search.

Bio:

Karin Strauss is a Senior Researcher at Microsoft and an Affiliate Professor in the Paul G. Allen School of Computer Science and Engineering at the University of Washington. Her research lies at the intersection of computer architecture, systems, and biology. Lately, her focus has been on DNA data storage. In the past, she has studied other emerging memory technologies and hardware accelerators for machine learning, among others. Previously, she worked for AMD, and before that she received her Ph.D. in 2007 from the Department of Computer Science at the University of Illinois at Urbana-Champaign.

Luis Ceze is a Professor in the Paul G. Allen School of Computer Science and Engineering at the University of Washington and a Venture Partner at Madrona Venture Group. His research focuses on the intersection of computer architecture, programming languages, machine learning and biology. His current focus is on approximate computing for efficient machine learning and DNA-based data storage. He co-directs the Molecular Information Systems Lab (MISL), the Systems and Architectures for Machine Learning lab (SAML) and the Sampa Lab for HW/SW Co-design. He has co-authored over 100 papers in these areas, and has had several papers selected as IEEE Micro Top Picks and CACM Research Highlights. His research has been featured prominently in the media, including the New York Times, Popular Science, MIT Technology Review and the Wall Street Journal, among others. He is a recipient of an NSF CAREER Award, a Sloan Research Fellowship, a Microsoft Research Faculty Fellowship, the IEEE TCCA Young Computer Architect Award and the UIUC Distinguished Alumni Award.