Stanford Data Science

July 31, 2025 onAir Curators

Source | Stanford Data Science

Summary

In the decades to come, our ability to advance discovery, create new knowledge, and provide insights that suggest solutions to the world’s most pressing problems will increasingly rely on our ability to learn from data.

Stanford Data Science (SDS) convenes a community of the world’s best data scientists with scholars and practitioners from diverse fields who rely on accurate, dependable, large data sets and modern data science techniques to advance their work.

At SDS, research, application, and education thrive in a mutually supportive culture by cross-pollinating ideas, questions, and solutions among engineering, business, the humanities, law, medicine, natural sciences, social sciences, and sustainability experts. Together we are developing new methods, revealing fresh insights, and educating the next generation of leaders and citizens who will harness data science and benefit from its responsible application.

OnAir Post: Stanford Data Science

About

Stanford Data Science Goals

The goal of Stanford Data Science is to enable data-driven discovery at scale and expand data science education — across Stanford and beyond.

To achieve our goals, we are investing in:

Source: Stanford Data Science

Community

Stanford Data Science is helping to weave data science research into the fabric of the university and connect with our peer universities, not-for-profit organizations, and corporate partners.

One way we see this happening is by bringing researchers at all levels together around shared thematic areas, such as data types and data science methods. We also build community among students, faculty, and stakeholders outside the university, including non-profits and industrial partners, through sponsored research, social good programs, and exchanges.

Global community

Women in Data Science (WiDS) – a global movement started at Stanford/ICME to “inspire and educate data scientists worldwide, regardless of gender, and to support women in the field”.
Center for Open and REproducible Science (CORES) – focused on developing and nurturing transparency and reproducibility in the collection, analysis, and dissemination of data across all domains of scientific activity.
Stanford Causal Science Center (SC²) – focused on providing an interdisciplinary community for scholars interested in causality and causal inference.
COVID-19 Data Forum – a series of multidisciplinary, online meetings for topic experts to focus on data-related aspects of the scientific response to the pandemic, including data access and sharing, essential data resources for analysis, and how we can best support decision-making.

Stanford Community

NSF Frameworks grant 2019 – 2021: Data Science Collaboratory, weekly meeting in Wallenberg, Wednesdays at 3 PM.
Data-Driven Wildland Fire Research Seminar Series: a weekly series of talks with discussions from Stanford faculty, researchers, and Ph.D. students on the intersection of wildland fire research and data science. Mondays at 12:30 PM.

Industrial Partnerships

Stanford Data Science Industry Affiliates Program

Social Good and other not-for-profit stakeholders

Data Science for Social Good
Summer extension of the Stanford Big Earth Hackathon on Wildland Fires
Planning underway for Data Science capstone courses at Stanford. Contact datascience@stanford.edu to help!

Promote your data science event

Data science is practiced across campus, with workshops, research, and other events happening all the time. We are happy to help promote your data science event. Please drop us a line at datascience@stanford.edu to share.

Source: Stanford Data Science

Research Areas

The world is being transformed by data, and data-driven analysis is rapidly becoming an integral part of science and society. Stanford Data Science is a collaborative effort across many departments in all seven schools. We strive to unite existing data science research initiatives and create interdisciplinary collaborations, connecting data science and related methodologists with the disciplines that are being transformed by data science and computation.

Our work supports research in a variety of fields where major advances are being made by facilitating meaningful collaborations between domain researchers, who have deep expertise in societal and fundamental research challenges, and methods researchers, who are developing next-generation computational tools and techniques. These fields include:

Data Science for Wildland Fire Research

In recent years, wildfire has gone from an infrequent and distant news item to a center-stage issue spanning many consecutive weeks for urban and suburban communities. Frequent wildfires are changing everyday life in California in numerous ways, from public safety power shutoffs to hazardous air quality, that seemed inconceivable as recently as 2015. Moreover, elevated wildfire risk in the western United States (and similar climates globally) is here to stay for the foreseeable future. Many problems in the wildland fire arena need solutions, and many of them are well suited to a data-driven approach.

Seminar Series

Data Science for Physics

Astrophysicists and particle physicists at Stanford and at the SLAC National Accelerator Laboratory are deeply engaged in studying the Universe at both the largest and smallest scales, with state-of-the-art instrumentation at telescopes and accelerator facilities.


Data Science for Economics

Many of the most pressing questions in empirical economics are causal, such as the short- and long-run impact of educational choices on labor market outcomes, and of economic policies on distributions of outcomes. This makes them conceptually quite different from the predictive questions that many recently developed machine learning methods are primarily designed for.


Data Science for Education

Educational data spans K-12 school and district records, digital archives of instructional materials and gradebooks, and student responses on course surveys. Data science applied to actual classroom interaction is also of growing interest and is increasingly a reality.


Data Science for Human Health

It is clear that data science will be a driving force in transitioning the world’s healthcare systems from reactive “sick-based” care to proactive, preventive care.


Data Science for Humanity

Our modern era is characterized by massive amounts of data documenting the behaviors of individuals, groups, organizations, cultures, and indeed entire societies. This wealth of data on modern humanity is accompanied by massive digitization of historical data, both textual and numeric, in the form of historic newspapers, literary and linguistic corpora, economic data, censuses, and other government data, gathered and preserved over centuries, and newly digitized, acquired, and provisioned by libraries, scholars, and commercial entities.


Data Science for Linguistics

The impact of data science on linguistics has been profound. All areas of the field depend on having a rich picture of the true range of variation, within dialects, across dialects, and among different languages. The subfield of corpus linguistics is arguably as old as the field itself and, with the advent of computers, gave rise to many core techniques in data science.


Data Science for Nature and Sustainability

Many key sustainability issues translate into decision and optimization problems and could greatly benefit from data-driven decision-making tools. So far, however, the impact of modern information technology has been highly uneven, mainly benefiting large firms in profitable sectors, with little or no benefit to the environment. Our vision is that data-driven methods can, and should, play a key role in increasing the efficiency and effectiveness of the way we manage and allocate our natural resources.


Ethics and Data Science

With the emergence of new techniques of machine learning, and the possibility of using algorithms to perform tasks previously done by human beings, as well as to generate new knowledge, we again face a set of new ethical questions.


The Science of Data Science

The practice of data analysis has changed enormously. Data science needs to find new inferential paradigms that allow data exploration prior to the formulation of hypotheses.


Source: Stanford Data Science

Careers

Stanford Data Science is hiring!

We are seeking Research Data Scientists! Research Data Scientists will play a critical role in a new strategic investment in “Marlowe”, a GPU-based computational instrument designed to enable large-scale, data-intensive research. They will apply their expertise in data science, machine learning/computation, and data-intensive research to develop and optimize workflows and applications that unlock Marlowe’s capabilities for everyone on campus.

https://careersearch.stanford.edu/jobs/research-data-scientist-27446

___

We are seeking a Program Manager to help with the application and review process for “Marlowe”, a new GPU-based computational instrument, and assist in the coordination and administration of other Stanford Data Science programs and activities. This position will work closely with key stakeholders across a wide variety of roles including faculty, researchers, staff, and trainees.

https://careersearch.stanford.edu/jobs/program-manager-26870

Please join our email list for future announcements.

Source: Stanford Data Science

People

Faculty Director

Guido Imbens, Economics

Associate Directors

Emmanuel Candes, Statistics & Mathematics
Ramesh Johari, Management Science and Engineering
David Lobell, Earth System Science
Russell Poldrack, Faculty Director, CORES; Albert Ray Lang Professor of Psychology
Chiara Sabatti, Biomedical Data Science & Statistics
Risa Wechsler, Physics; Particle Physics & Astrophysics
James Zou, Biomedical Data Science

Staff Directors

Craig Kapfer, Senior Director of Research Data Science

Chris Mentzel, Executive Director, Stanford Data Science

Elizabeth Wilsey, Director, Engagement and Partnership

Source: Stanford Data Science

Web Links

Programs

Stanford Data Science (SDS) believes that developing the next generation of early-career researchers is at the core of the University’s mission. We support PhDs and postdocs through the Stanford Data Science Scholars and Postdoctoral Fellows programs, bringing together a multidisciplinary cohort of scholars to learn, share, and collaborate on cutting-edge topics. Our early-career researchers advance data science, machine learning, and AI and apply these techniques to drive new scientific discoveries in disparate fields, including biology and medicine, astrophysics, sustainability, and a lot more!

Resource Type	Non-preemptible Jobs (Medium/Large Projects)	Preemptible Jobs (Basic Access)
GPU Usage	$1.25 per GPU-Hour	$0.75 per GPU-Hour
CPU Usage	$0.050 per CPU-Hour	$0.025 per CPU-Hour
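
As a rough illustration of how the rates in the table above translate into a job cost, the sketch below multiplies GPU-hours and CPU-hours by the listed rates. The function name, the dictionary keys, and the assumption that GPU and CPU charges are simply added together are hypothetical and are not taken from the source; actual billing rules may differ.

```python
# Rates copied from the table above, in USD per hour.
RATES = {
    "non_preemptible": {"gpu": 1.25, "cpu": 0.050},  # Medium/Large Projects
    "preemptible": {"gpu": 0.75, "cpu": 0.025},      # Basic Access
}


def estimate_cost(job_type, gpu_hours, cpu_hours=0.0):
    """Estimate a job's cost in USD.

    Assumption (not from the source): GPU and CPU usage are billed
    independently and the two charges are summed.
    """
    rate = RATES[job_type]
    return gpu_hours * rate["gpu"] + cpu_hours * rate["cpu"]


if __name__ == "__main__":
    # Example: 8 GPUs for 10 hours is 80 GPU-hours.
    gpu_hours = 8 * 10
    for job_type in RATES:
        print(job_type, estimate_cost(job_type, gpu_hours))
```

Under these assumptions, the same 80 GPU-hours cost less as a preemptible job, at the price of possible interruption.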
