Affiliate Positions
- Adjunct Associate Professor, University of Washington Computer Science & Engineering
- Adjunct Associate Professor, University of Washington Department of Electrical Engineering
- Program Director and Faculty Chair, University of Washington Data Science Masters Degree
Specializations
- Data Management
- Data Science for Social Good
- Scientific Databases and Visualization
Research Area
Courses
- INFO 330 - Databases And Data Modeling
Biography
I am an Associate Professor in the Information School, Adjunct Associate Professor in Computer Science & Engineering, and Associate Director and Senior Data Science Fellow at the UW eScience Institute. I am a co-founder of Urban@UW, and with support from the MacArthur Foundation and Microsoft, I lead UW's participation in the MetroLab Network. I created a first MOOC on Data Science through Coursera, and I led the creation of the UW Data Science Masters Degree, for which I serve as the first Program Director and Faculty Chair. I serve on the Steering Committee of the Center for Statistics in the Social Sciences.
My group's research aims to make the techniques and technologies of data science dramatically more accessible, particularly at scale. Our methods are rooted in database models and languages, though we sometimes work in machine learning, visualization, HCI, and high-performance computing. We are an applied, systems-oriented group, frequently sourcing projects through collaborations in the physical, life, and social sciences.
Education
- PhD, Computer Science, Portland State University, 2007
- BS, Industrial and Systems Engineering, Georgia Institute of Technology, 1999
Awards
- Best Paper Runner Up, Experiment, Analysis & Benchmark Track - VLDB 2023, 2023
- Research Highlights - ACM SIGMOD Record, 2023
- Innovation of the Month - MetroLab Network, 2021
Publications and Contributions
- Conference Paper: Does a Fair Model Produce Fair Explanations? Relating Distributive and Procedural Fairness (2024). Proceedings of the 57th Hawaii International Conference on System Sciences, pp. 6868-6877
- Conference Paper: Geospatial Imputation of Urban Mobility Data with Self-Supervised Learning (2024). Proceedings of the 57th Hawaii International Conference on System Sciences, pp. 5619-5628
- Conference Poster: Label-Efficient Group Robustness via Out-of-Distribution Concept Curation (2024). Conference on Computer Vision and Pattern Recognition (CVPR 2024)
- Conference Paper: Contrastive Language-Vision AI Models Pretrained on Web-Scraped Multimodal Data Exhibit Sexual Objectification Bias (2023). Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, pp. 1174–1185
- Conference Paper: Epistemic Parity: Reproducibility as an Evaluation Metric for Differential Privacy (2023). Proceedings of the VLDB Endowment (VLDB), 16(11), pp. 3178-3191
- Conference Poster: Regularizing Model Gradients with Concepts to Improve Robustness to Spurious Correlations (2023). Fortieth International Conference on Machine Learning Workshop on Spurious Correlations, Invariance, and Stability (ICML SCIS 2023)
- Journal Article, Academic Journal: Integrative urban AI to expand coverage, access, and equity of urban data (2022). The European Physical Journal Special Topics, pp. 1-12
- Conference Paper: Ontologue: Declarative Benchmark Construction for Ontological Multi-Label Classification (2022). Conference on Neural Information Processing Systems (NeurIPS), pp. 14
- Conference Paper: Responsible Data Management (2022). Communications of the ACM, 65(6), pp. 64-74
- Conference Paper: Surj: Ontological Learning for Fast, Accurate, and Robust Hierarchical Multi-label Classification (2022). Companion Proceedings of the Web Conference (WWW), pp. 1106–1114
- Invited Paper Review: Technical Perspective: Imperative or Functional Control Flow Handling: Why not the Best of Both Worlds? (2022). ACM SIGMOD Record, 51(1), pp. 59
- Invited Paper Review: Technical perspective: Visualization search: from sketching to natural language (2022). Communications of the ACM, 65(7), pp. 84
- Journal Article, Academic Journal: Covid-19 brings data equity challenges to the fore (2021). Digital Government: Research and Practice, 2(2), pp. 1-7
- Conference Paper: EquiTensors: Learning Fair Integrations of Heterogeneous Urban Data (2021). International Conference on Management of Data (SIGMOD)
- Conference Paper: JECL: Joint Embedding and Cluster Learning for Image-Text Pairs (2021). International Conference on Pattern Recognition (ICPR)
- Conference Paper: SPORES: Sum-Product Optimization via Relational Equality Saturation for Large Scale Linear Algebra (2021). Proceedings of the VLDB Endowment (VLDB), 13(12), pp. 3474-3488
- Conference Paper: Technical Perspective: From Sketching to Natural Language: Expressive Visual Querying for Accelerating Insight (2021). SIGMOD Record 2021
- Workshop Paper: The many facets of data equity (2021). EDBT/ICDT Workshops
- Report: CUAC Program Report (2020)
- Conference Paper: Database Repair Meets Algorithmic Fairness (2020). ACM SIGMOD Record, 49(1), pp. 34-41
- Conference Paper: Digital Government: Research and Practice (2020). Digital Government: Research and Practice
- Conference Paper: Fairness-Aware Demand Prediction for New Mobility (2020). The AAAI Conference on Artificial Intelligence (AAAI)
- Conference Paper: Responsible data management (2020). Proceedings of the VLDB Endowment (VLDB), 13(12), pp. 3474-3488
- Conference Paper: Beyond Open vs. Closed: Balancing Individual Privacy and Public Accountability in Data Sharing (2019). ACM Conference on Fairness, Accountability, and Transparency (ACM FAT*)
- Conference Paper: Capuchin: Causal Database Repair for Algorithmic Fairness (2019). Proceedings of the 2019 International Conference on Management of Data (SIGMOD '19)
- Technical Report: Data Management for Causal Algorithmic Fairness (2019). pp. 12
- Journal Article, Academic Journal: Data Management for Causal Algorithmic Fairness (2019). IEEE Data Eng. Bull., 42(3)
- Conference Paper: Database-Agnostic Workload Management (2019). Conference on Innovative Data Systems Research (CIDR)
- Technical Report
- Conference Paper: FairST: Equitable Spatial and Temporal Demand Prediction for New Mobility Systems (2019). Proceedings of the 27th ACM International Conference on Advances in Geographic Information Systems (SIGSPATIAL), pp. 10
- Journal Article, Academic Journal: Fairness in Practice: A Survey on Equity in Urban Mobility (2019). IEEE Data Eng. Bull., 42(3)
- Conference Demonstration Paper: GraviTIE: Exploratory Analysis of Large-Scale Heterogeneous Image Collections (2019). The World Wide Web Conference, pp. 3605-3609
- Conference Paper: Identifying the Central Figure of a Scientific Paper (2019). International Conference on Document Analysis and Recognition (ICDAR), pp. 1063-1070
- Technical Report: In Defense of Synthetic Data (2019). pp. 3
- Conference Paper: Interventional Fairness: Causal Database Repair for Algorithmic Fairness (2019). International Conference on Management of Data (SIGMOD)
- Conference Demonstration Paper: Mithralabel: Flexible dataset nutritional labels for responsible data science (2019). Proceedings of the 28th ACM International Conference on Information and Knowledge Management (CIKM), pp. 2893–2896
- Technical Report: MultiDEC: Multi-Modal Clustering of Image-Caption Pairs (2019). pp. 9
- Journal Article, Academic Journal: Nutritional Labels for Data and Models (2019). IEEE Data Eng. Bull., 42(3)
- Opinion Piece: Protect the public from bias in automated decision systems (2019). Seattle Times
- Journal Article, Academic Journal: The principles of tomorrow's university (2019). F1000Research, 7:1926
- Conference Workshop Paper: Classifying digitized art type and time period (2018). Workshop on Data Science for Digital Art History
- Conference Demonstration Paper: A Nutritional Label for Rankings (2018). ACM Conference on Management of Data (SIGMOD)
- Conference Paper: Delineating Disciplines Using Visual Information in Scientific Literature (2018). KDD 2018 BigScholar 2018: The 5th Workshop on Big Scholarly Data
- Conference Paper: EZLearn: Exploiting Organic Supervision in Large-Scale Data Annotation (2018). International Joint Conference on Artificial Intelligence (IJCAI)
- Conference Paper: Formalizing Visualization Design Knowledge as Constraints: Actionable and Extensible Models in Draco (2018). IEEE Information Visualization (InfoVis)
- Conference Workshop Paper: MobilityMirror: Bias-Adjusted Transportation Datasets (2018). Workshop on Big Social Data and Urban Computing (BiDU)
- Technical Report
- Conference Demonstration Paper: DataSynthesizer: Privacy-Preserving Synthetic Datasets (2017). ACM Scientific and Statistical Database Management Conference (SSDBM)
- Conference Workshop Paper: Deep Mapping of the Visual Literature (2017). Proceedings of the 26th International Conference on World Wide Web Companion (WWW): Big Scholar Workshop, pp. 1273-1277
- Conference Workshop Paper: EZLearn: Exploiting Organic Supervision in Large-Scale Data Annotation (2017). Learning with Limited Labeled Data: Weak Supervision and Beyond, 2017 NIPS Conference
- Conference Paper: Fides: Towards a Platform for Responsible Data Science (2017). ACM Conference on Scientific and Statistical Database Management (SSDBM)
- Conference Workshop Paper: LaraDB: A Minimalist Kernel for Linear and Relational Algebra Computation (2017). BeyondMR workshop, 2017 ACM SIGMOD conference
- Conference Workshop Paper: Profiling a GPU database implementation: a holistic view of GPU resource utilization on TPC-H queries (2017). Thirteenth International Workshop on Data Management on New Hardware, SIGMOD
- Journal Article, Academic Journal: Scalable and Efficient Flow-Based Community Detection for Large-Scale Graph Analysis (2017). ACM Transactions on Knowledge Discovery from Data (TKDD), 11(3)
- Workshop Paper: Synthetic Data for Social Good (2017). Bloomberg Data for Good Exchange
- Conference Paper: The Myria Big Data Management and Analytics System and Cloud Services (2017). Conference on Innovative Data Systems Research (CIDR)
- Journal Article, Academic Journal: Viziometrics: Analyzing Visual Information in the Scientific Literature (2017). IEEE Transactions on Big Data, PP(99)
- Conference Poster: Viziometrics: Identifying Central Figures in Scientific Papers (2017)
- Conference Paper: Voyager 2: Augmenting Visual Analysis with Partial View Specifications (2017). ACM Human Factors in Computing Systems (CHI)
- Journal Article, Academic Journal: Wide-Open: Accelerating public data release by automating detection of overdue datasets (2017). PLOS Biology, 15(6)
- Conference Paper: Data Cleaning in the Wild: Reusable Curation Idioms from a Multi-Year SQL Workload (2016). Proceedings of the 11th International Workshop on Quality in Databases (QDB'16)
- Journal Article, Academic Journal: Deciphering Ocean Carbon in a Changing World (2016). Proceedings of the National Academy of Sciences, 113(12), ISSN: 0027-8424
- Conference Paper: From NoSQL Accumulo to NewSQL Graphulo: Design and utility of graph algorithms inside a BigTable database (2016). Proceedings of the High Performance Extreme Computing Conference (HPEC 2016), pp. 1-9
- Conference Workshop Paper: High Variety Cloud Databases (2016). Proceedings of the 2016 IEEE Cloud Data Management Workshop
- Conference Paper: MusicDB: Relational Approach for Numeric Longitudinal Music Analytics (2016). Proceedings of the 17th International Society for Music Information Retrieval Conference (ISMIR 2016), pp. 702-708
- Conference Paper: SQLShare: Results from a Multi-Year SQL-as-a-Service Experiment (2016). SIGMOD '16: Proceedings of the 2016 International Conference on Management of Data, pp. 281-293
- Journal Article, Academic Journal: Scalable clustering algorithms for continuous environmental flow cytometry (2016). Bioinformatics, 32(3), pp. 417–423
- Conference Paper: VizioMetrix: A Platform for Analyzing the Visual Information in Big Scholarly Data (2016). BigScholar Workshop (Third WWW Workshop on Big Scholarly Data: Towards the Web of Scholars)
- Conference Paper: Voyager: Exploratory analysis via faceted browsing of visualization recommendations (2016). IEEE Transactions on Visualization and Computer Graphics, 22(1), pp. 649–658
- Journal Article, Academic Journal: A Demonstration of the BigDAWG Polystore System (2015). Proc. Very Large Database Endowment (PVLDB), 8(12)
- Conference Proceeding: Big Data Science Needs Big Data Middleware (2015). CIDR 2015, Seventh Biennial Conference on Innovative Data Systems Research (lightning talk)
- Conference Proceeding: Building an Urban Data Science Summer Program at the University of Washington eScience Institute (2015)
- Conference Proceeding: Detecting and Dismantling Composite Visualizations in the Scientific Literature (2015). Pattern Recognition Applications and Methods - 4th International Conference, ICPRAM 2015, Lisbon, Portugal, January 10-12, 2015, Revised Selected Papers, pp. 247–266
- Conference Proceeding: Dismantling Composite Visualizations in the Scientific Literature (2015). 4th International Conference on Pattern Recognition Applications and Methods (ICPRAM)
- Conference Proceeding: Gaussian mixture models use-case: in-memory analysis with Myria (2015). Proceedings of the 3rd VLDB Workshop on In-Memory Data Management and Analytics, pp. 3
- Conference Proceeding: GossipMap: a distributed community detection algorithm for billion-edge directed graphs (2015). Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, Supercomputing 2015, Austin, TX, USA, November 15-20, 2015, pp. 27:1–27:12
- Conference Proceeding: Perfopticon: Visual query analysis for distributed databases (2015). Computer Graphics Forum, 34(3), pp. 71–80
- Journal Article, Academic Journal: Query-based data pricing (2015). Journal of the ACM (JACM), 62(5), pp. 43
- Journal Article, Academic Journal: The BigDAWG Polystore System (2015). SIGMOD Record, 44(2), pp. 11–16
- Conference Proceeding: Time-varying clusters in large-scale flow cytometry (2015). IAAI Conference
- Conference Proceeding: Towards automated prediction of relationships among scientific datasets (2015). Proceedings of the 27th International Conference on Scientific and Statistical Database Management, SSDBM ’15, La Jolla, CA, USA, June 29 - July 1, 2015, pp. 35:1–35:5
- Conference Proceeding: Helping scientists reconnect their datasets (2014). Proceedings of the 26th International Conference on Scientific and Statistical Database Management, pp. 29
- Conference Proceeding: Should we all be teaching intro to data science instead of intro to databases? (2014). Panel at the 2014 ACM SIGMOD international conference on management of data, pp. 917–918
- Journal Article, Academic Journal: The database group at the University of Washington (2014). SIGMOD Record, 43(1), pp. 39–44
- Book, Chapter in Scholarly Book-New: A discussion on pricing relational data (2013). In Search of Elegance in the Theory and Practice of Computation, pp. 167–173
- Journal Article, Academic Journal: Collaborative science workflows in SQL (2013). Computing in Science and Engineering, 15(3), pp. 22–31
- Conference Proceeding: Compiled Plans for In-Memory Path-Counting Queries (2013). Proceedings of the 1st International Workshop on In Memory Data Management and Analytics, IMDM 2013, Riva Del Garda, Italy, August 26, 2013, pp. 25–37
- Journal Article, Academic Journal: Hadoop’s adolescence: an analysis of Hadoop usage in scientific workloads (2013). Proceedings of the VLDB Endowment, 6(10), pp. 853–864
- Journal Article, Academic Journal: Managing Skew in Hadoop (2013). IEEE Data Eng. Bull., 36(1), pp. 24–33
- Conference Proceeding: Massive scale cyber traffic analysis: a driver for graph database research (2013). First International Workshop on Graph Data Management Experiences and Systems, pp. 3
- Conference Proceeding: Real-time collaborative analysis with (almost) pure SQL: a case study in biogeochemical oceanography (2013). Proceedings of the 25th International Conference on Scientific and Statistical Database Management, pp. 28
- Journal Article, Academic Journal: SQLShare: Scientific workflow via relational view sharing (2013). Computing in Science and Engineering, Special Issue on Science Data Management, 15(2)
- Conference Proceeding: Scalable Flow-Based Community Detection for Large-Scale Network Analysis (2013). Proceedings of IEEE International Conference on Data Mining Workshops (ICDMW 2013)
- Conference Proceeding: Stop That Query! The Need for Managing Data Use (2013). CIDR
- Conference Proceeding: The power of data use management in action (2013). Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, pp. 1117–1120
- Conference Proceeding: Toward practical query pricing with QueryMarket (2013). Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, pp. 613–624
- Conference Proceeding: VizDeck: Streamlining exploratory visual analytics of scientific data (2013). iConference
- Conference Proceeding: Hadoop’s Adolescence; A Comparative Workloads Analysis from Three Research Clusters (2012). SC Companion, pp. 1452
- Conference Proceeding: Optimizing large-scale semi-naive datalog evaluation in hadoop (2012). Proceedings of the Second International Conference on Datalog in Academia and Industry
- Conference Proceeding: Query-based data pricing (2012). Proceedings of the 31st symposium on Principles of Database Systems (PODS)
- Journal Article, Academic Journal: QueryMarket demonstration: Pricing for online data markets (2012). Proceedings of the VLDB Endowment, 5(12), pp. 1962–1965
- Conference Proceeding: Skewtune: mitigating skew in mapreduce applications (2012). Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, pp. 25–36
- Journal Article, Academic Journal: The HaLoop approach to large-scale iterative data analysis (2012). The VLDB Journal (The International Journal on Very Large Data Bases), 21(2), pp. 169–190
- Journal Article, Academic Journal: Virtual appliances, cloud computing, and reproducible research (2012). Computing in Science and Engineering, 14(4), pp. 36–41
- Conference Proceeding: VizDeck: A Card Game Metaphor for Fast Visual Data Exploration (2012). CHI ’12 Extended Abstracts on Human Factors in Computing Systems, pp. 1667–1672, ISBN: 978-1-4503-1016-1
- Conference Proceeding: Vizdeck: Self-organizing dashboards for visual analytics (2012). Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data (demo), pp. 681–684
- Journal Article, Academic Journal: A study of skew in mapreduce applications (2011). Open Cirrus Summit
- Journal Article, Academic Journal: Astronomy in the cloud: using mapreduce for image co-addition (2011). Publications of the Astronomical Society of the Pacific, 123(901), pp. 366
- Conference Proceeding: Automatic example queries for ad hoc databases (2011). Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2011, Athens, Greece, June 12-16, 2011, pp. 1319–1322
- Journal Article, Academic Journal: Bioinformatics and data-intensive scientific discovery in the beginning of the 21st century (2011). Omics: a journal of integrative biology, 15(4), pp. 199–201
- Journal Article, Academic Journal: Data markets in the cloud: An opportunity for the database community (2011). Proc. of the VLDB Endowment, 4(12), pp. 1482–1485
- Conference Proceeding: Database-as-a-service for long-tail science (2011). Scientific and Statistical Database Management, pp. 480–489
- Conference Proceeding: Parallel visualization on large clusters using MapReduce (2011). Large Data Analysis and Visualization (LDAV), 2011 IEEE Symposium on, pp. 81–88
- Journal Article, Academic Journal: Astronomy in the Cloud: Using MapReduce for Image Coaddition (2010). CoRR, abs/1010.1015
- Conference Proceeding: Client+ cloud: evaluating seamless architectures for visual data analytics in the ocean sciences (2010). Scientific and Statistical Database Management, pp. 114–131
- Journal Article, Academic Journal: HaLoop: efficient iterative data processing on large clusters (2010). Proceedings of the VLDB Endowment, 3(1-2), pp. 285–296
- Conference Proceeding: SQL is dead; long live SQL: Lightweight query services for ad hoc research data (2010). 4th Microsoft eScience Workshop
- Conference Proceeding: Scalable clustering algorithm for N-body simulations in a shared-nothing cluster (2010). Scientific and Statistical Database Management, pp. 132–150
- Conference Proceeding: Skew-resistant parallel processing of feature-extracting scientific user-defined functions (2010). Proceedings of the 1st ACM symposium on Cloud computing, pp. 75–86
- Conference Proceeding: Analyzing massive astrophysical datasets: Can Pig/Hadoop or a relational DBMS help? (2009). Cluster Computing and Workshops, 2009. CLUSTER’09. IEEE International Conference on, pp. 1–10
- Conference Proceeding: Embracing Uncertainty in Large-Scale Computational Astrophysics (2009). MUD, pp. 63–77
- Conference Proceeding: Query-driven visualization in the cloud with mapreduce (2009). Proceedings of the Fourth Annual Workshop on Ultrascale Visualization
- Conference Proceeding: Scientific mashups: runtime-configurable data product ensembles (2009). Scientific and Statistical Database Management, pp. 19–36
- Conference Proceeding: End-to-end escience: Integrating workflow, query, visualization, and provenance at an ocean observatory (2008). eScience, 2008. eScience’08. IEEE Fourth International Conference on, pp. 127–134
- Conference Proceeding: Quarrying dataspaces: Schemaless profiling of unfamiliar information sources (2008). Workshop on Information Integration Methods, Architectures, and Systems (IIMAS 08), pp. 270–277
- Journal Article, Academic Journal: SciDB Examples from Environmental Observation and Modeling (2008). Center for Coastal Margin Observation and Prediction
- Conference Proceeding: Scientific Mashups: Runtime-Configurable Data Product Ensembles (2008). eScience, 2008. eScience’08. IEEE Fourth International Conference on, pp. 442–443
- Journal Article, Academic Journal: Scientific exploration in the era of ocean observatories (2008). Computing in Science and Engineering, 10(3), pp. 53–58
- Conference Proceeding: Smoothing the ROI Curve for Scientific Data Management Applications (2007). CIDR, pp. 185–195
- Conference Proceeding: The Ocean Appliance: Complete Platform Provisioning for Low-Cost Data Sharing (2007). OCEANS 2007, pp. 1–10
- Ph.D. Thesis: Gridfields: model-driven data transformation in the physical sciences (2006)
- Conference Proceeding: Managing the Forecast Factory (2006). Data Engineering Workshops, 2006. Proceedings. 22nd International Conference on, pp. 64–64
- Journal Article, Academic Journal: Algebraic manipulation of scientific datasets (2005). The VLDB journal, 14(4), pp. 397–416
- Journal Article, Academic Journal: GridFields: Model-Driven Query Services for Simulation Results in the Physical Sciences (2005)
- Conference Proceeding: Querying and Visualizing Gridded Datasets for e-Science (2005). Proceedings of the 21st International Conference on Data Engineering, ICDE 2005, 5-8 April 2005, Tokyo, Japan, pp. 1106–1107
- Conference Proceeding: Retrofitting a Data Model to Existing Environmental Data (2005). SSDBM, pp. 3–13
- Book, Chapter in Scholarly Book-New: Emergent semantics: Towards self-organizing scientific metadata (2004). Semantics of a Networked World. Semantics for Grid Databases, pp. 177–198
- Journal Article, Academic Journal: Logical and Physical Data Independence for Native Scientific Data Repositories (2004). IEEE Data Eng. Bull., 27(4), pp. 29–36
- Journal Article, Academic Journal: A language for spatial data manipulation (2003). Journal of Environmental Informatics, 2(2), pp. 23–37
- Conference Proceeding: Modeling data product generation (2002). Workshop on Data Derivation and Provenance, Chicago
- Conference Proceeding: Representing, exploiting, and extracting metadata using metadata++ (2002). Proceedings of the 2002 annual national conference on Digital government research, pp. 1–7
Presentations
- Applied AI in High-Expertise Settings, or Curation as Programming (2023). Engineering IDBE Seminar Series, NYU Abu Dhabi - Virtual
- Data Curation as Programming (2023). 15th Alberto Mendelzon International Workshop on Foundations of Data Management - Santiago, Chile
- Applied AI in High-Expertise Settings, or Curation as Programming (2022). AI2 - Tahoma, WA (Virtual)
- Ethical AI in the Public Sector: Towards A Semi-Synthetic Data Fabric for AI Evaluation (2022). Cisco Systems, Inc. - Virtual
- Data-Centric AI: Reuse, Integration, and Synthesis of Weakly Structured Data (2021). Northeastern - Boston, MA
- Equitensors: Learning Fair Integration of Urban Mobility Data (2021). Berkeley Institute for Transportation Studies - Berkeley, CA
- Introspection and Interventions in Data Equity Systems (2021). Provenance and Visualisation Workshop - Virtual
- Data Equity Systems (2020). DEEM Workshop - Virtual
- Public interest research in data management and machine learning (2020). NYU - New York, NY
- Special Session: A Technical Research Agenda in Data Ethics and Responsible Data Management (2018). SIGMOD - Houston, TX
- Bias and Ethics in City Services Data Science (2017). Bloomberg Data for Good Exchange - New York, NY
- Big Data + Big Sim: Query Processing over Unstructured CFD Models (2017). ISIM Research Workshop - Durham, UK
- Data Analysis and Visualization Workshop (2017). Schloss Dagstuhl – Leibniz-Zentrum für Informatik - Dagstuhl, Germany
- Data in the Humanities Panel (2017). 2017 ICDE (IEEE International Conference on Data Engineering) - San Diego, CA
- Data, Responsibly: The Next Decade of Data Science (2017). iSchool Founding Board - Seattle, WA
- Deep Curation (2017). Tandon School of Engineering, New York University - New York, NY
- Epistemic Issues in Data Science (2017). University of Massachusetts, Amherst - Amherst, Massachusetts
- Fake Data for Social Good (2017). Bloomberg Data for Good Exchange - New York, NY
- Responsible Urban Data Science (2017). Redondo Beach, CA
- The Information War: Fake News, Privacy, and Big Data (2017). eScience Institute, University of Washington - Seattle, WA
- The Next Decade of Data Science (2017). Distinguished Colloquium, University of Maryland - College Park, Maryland
- Viziometrics: Mining the Visual Literature (2017). Visualization Seminar, Scientific Computing Institute, University of Utah - Salt Lake City, UT
- Workshop on Science and Technology for Washington State: Advising the Legislature (2017). Seattle, WA
- Democratizing Data in the Cloud (2016). Workshop on Cloud Data Management (CloudDM) (co-located with ICDE) - Helsinki, Finland
- Going "Deep" with Computational Data Curation (2016). NSF-JST Meeting on Big Data, AI, IoT, and Cybersecurity for a New Society - USA
- Responsible Data Science and Reproducibility (2016). Dagstuhl Seminar on “Data, Responsibly,” Schloss Dagstuhl – Leibniz Center for Informatics - Germany
- Urban Analytics and Responsible Data Science (2016). SciTech Northwest - Seattle, WA
- Data Equity Systems. Northwest Database Society - Virtual