Publications
J. Sivaloganathan, A. Trišović, N. Thompson. (2025). 21st IEEE International Conference on e-Science (eScience 2025)
Description
We analyze scientific publications using Foundation Models to assess their alignment with the UN Sustainable Development Goals, revealing a concentration of work on a few goals and large gaps in others.
A. Fogelson, A. Trišović, N. Thompson. (2025). ACM Conference on Reproducibility and Replicability (ACM REP)
Description
We investigate the use of LLMs for multi-class citation intent classification, highlighting striking inter-model disagreement among state-of-the-art systems and revealing key challenges in the robustness, transparency, and reproducibility of LLM-based research.
J. K. Hu*, A. Trišović*, A. Bakshi, D. Braun, F. Dominici, J. A. Casey. (2025). Science Advances
Description
We analyze 15 years of daily census tract–level data to quantify coexposure to extreme heat, wildfire burn zones, and smoke across 11 Western U.S. states. Coexposures—especially between heat and smoke—increased over time and disproportionately affected vulnerable and Indigenous populations.
A. Trišović, G. Miller, D. Bertsimas, J. K. Hu. (2025). Tackling Climate Change with Machine Learning at ICLR
Description
We introduce a framework to identify a state-of-the-art multi-task model for jointly predicting heatwaves, droughts, and wildfires, capturing shared risk factors across these climate extremes.
(α-β) J. A. Rising, A. Hussain, K. Schwarzwald, A. Trisovic. (2024). Journal of Open Source Education (JOSE)
Description
We present a free and open-source tutorial on the practical aspects of climate econometrics, which includes data collection, analysis design, and result presentation (available at climateestimate.net).
M. Tec, A. Trisovic, M. Audirac, S. Woodward, J. K. Hu, N. Khoshnevis, F. Dominici. (2024). The 12th International Conference on Learning Representations (ICLR)
Description
We introduce SpaCE (the Spatial Confounding Environment), the first toolkit to provide realistic benchmark datasets and tools for systematically evaluating causal inference methods designed to alleviate spatial confounding.
M. Tec, A. Trisovic, M. Audirac, F. Dominici. (2023). Causal Learning and Reasoning (CLeaR)
Description
The article introduces SpaCE datasets as a benchmarking tool to assist in developing novel methods to address outstanding challenges in spatial and network causal inference.
W. Lee, X. Wu, S. Heo, J. M. Kim, K. C. Fong, J. Son, M. B. Sabath, A. Trisovic, D. Braun, J. Y. Park, Y. C. Kim, J. P. Lee, J. Schwartz, H. Kim, F. Dominici, Z. Al-Aly, M. L. Bell. (2023). Environmental Health Perspectives
Description
The article investigates the association between short-term exposure to air pollution and acute kidney injury (AKI) in the US Medicare population.
D. Bouquin, A. Trisovic, O. Bertuch, E. Colón-Marrero. (2023). Software Citation Workshop 2022 (ArXiv)
Description
Software's pivotal role in progress is not mirrored in traditional acknowledgments. This report captures insights from 51 global experts on unresolved software citation issues. It aims to pinpoint and tackle these challenges, benefiting the GLAM community, repository managers, software developers, and publishers.
D. Garijo, H. Ménager, L. Hwang, A. Trisovic, M. Hucka, T. Morrell, A. Allen, Task Force on Best Practices for Software Registries, SciCodes Consortium. (2022). PeerJ Computer Science
Description
As the FORCE11 Software Citation Implementation Working Group, we describe the best practices for software repositories and registries which include defining the scope, policies, and governing rules, along with the background, examples, and collaborative work that went into their development.
A. Trisovic. (2023). International Journal of Digital Curation
Description
The article presents a cluster analysis of 1,000+ open research datasets from the Harvard Dataverse repository to identify the most common replication metadata elements.
A. Trisovic, T. Pasquier, M. K. Lau, M. Crosas. (2022). Nature Scientific Data
Description
This paper presents a large-scale study of the quality, programming literacy, and reproducibility of over 2100 datasets that contain research code in R from the Harvard Dataverse data repository.
(α-β) S. Soiland-Reyes, P. Sefton, M. Crosas, L. J. Castro, F. Coppens, J. M. Fernández, D. Garijo, B. Grüning, M. La Rosa, S. Leo, E. Ó Carragáin, M. Portier, A. Trisovic, RO-Crate Community, P. Groth, C. Goble. (2021). Data Science
Description
We introduce RO-Crate, an open, community-driven, lightweight approach to packaging research artifacts with metadata, including their identifiers, provenance, relations, and annotations.
N. Miljković, A. Trisovic, L. Peer. (2021). Conference on Application of Free Software and Open Hardware (PSSOH)
Description
We elaborate on the complexity of disseminating and reusing open hardware, present examples of its unique demands, and propose leveraging the FAIR principles to make open hardware findable, accessible, interoperable, and reusable.
(α-β) K. Blumenthal, A. Goeva, S. Stoudt, A. Trisovic, P. Trisovic. (2021). Harvard Data Science Review
Description
Explainer Zine for the article "Designing for interactive exploratory data analysis requires theory of graphical inference."
(α-β) A. Goeva, P. Jones, S. Stoudt, A. Trisovic. (2021). Harvard Data Science Review
Description
We propose a handful of connector courses for data science, inspired by the article "Interleaving Computational and Inferential Thinking: Data Science for Undergraduates at Berkeley."
A. Trisovic, K. Mika, C. Boyd, S. Feger, M. Crosas. (2021). Data
Description
We propose three approaches based on computational reproducibility, data curation, and gamified design elements that can be used to indicate and improve the quality of shared data and code in data repositories.
(α-β) S. Frost, A. Goeva, J. Pombra, W. Seaton, S. Stoudt, A. Trisovic, C. Wang, C. Zucker. (2021). Harvard Data Science Review
Description
Early-Career Board members of the Harvard Data Science Review discuss the acquisition of practical data science skills and share their experiences from a number of disciplines.
(α-β) A. Goeva, S. Stoudt, A. Trisovic. (2020). Harvard Data Science Review
Description
This paper discusses the National Academies' report "Reproducibility and Replicability in Science," advocating for reusability and the need for actionable and hierarchical steps for researchers.
(α-β) S. Frost, A. Goeva, W. Seaton, S. Stoudt, A. Trisovic. (2020). Harvard Data Science Review
Description
Early-Career Board members of the Harvard Data Science Review present their view of top research challenge areas in data science.
A. Trisovic, P. Durbin, T. Schlatter, G. Durand, S. Barbosa, D. Brooke, M. Crosas. (2020). 3rd International Workshop on Practical Reproducible Evaluation of Computer Systems (P-RECS)
Description
The Dataverse repository software has undertaken integrations with the platforms Code Ocean, Whole Tale, Renku, and Jupyter Binder, which will help capture research code dependencies and advance reproducibility.
A. E. Woodard, A. Trisovic, Z. Li, Y. Babuji, R. Chard, T. Skluzacek, B. Blaiszik, D. S. Katz, I. Foster, K. Chard. (2020). 24th International Conference on Computing in High Energy & Nuclear Physics (CHEP)
Description
We present how the function-as-a-service paradigm can address CERN's computing challenges with efficient and scalable experimental data processing on heterogeneous resources.
A. Trisovic, C. R. Jones, B. Couturier, M. Clemencic. (2020). Computing in Science & Engineering (CISE)
Description
We argue that reproducibility needs to be incorporated into the existing infrastructure and present a new functionality in the CERN software that captures all information within a resulting dataset necessary to reproduce it.
(α-β) X. Chen, S. Dallmeier-Tiessen, R. Dasler, S. Feger, P. Fokianos, J. B. Gonzalez, H. Hirvonsalo, D. Kousidis, A. Lavasa, S. Mele, D. R. Rodriguez, T. Šimko, T. Smith, A. Trisovic, A. Trzcinska, I. Tsanaktsidis, M. Zimmermann, K. Cranmer, L. Heinrich, G. Watts, M. Hildreth, L. Lloret Iglesias, K. Lassila-Perini, S. Neubert. (2019). Nature Physics
Description
The platforms CERN Analysis Preservation and Reusable Analyses (REANA) were created to facilitate reproducible research for the LHC experiments at CERN. The CERN Open Data project disseminates particle-physics data that can be used for research.
A. Trisovic. (2018). 7th International Symposium on Industrial Engineering
Description
The paper presents a number of challenges, questions, and use cases that can be addressed by exploring and analyzing the LHCb graph database that captures its data and software.
A. Trisovic, B. Couturier, V. Gibson, C. Jones. (2017). 22nd International Conference on Computing in High Energy and Nuclear Physics (CHEP)
Description
We present the design and development of the LHCb graph database that captures the scientific software stack, its software and hardware dependencies, and its products, which are simulation and experimental data.
T. Pasquier, M. K. Lau, A. Trisovic, E. R. Boose, B. Couturier, M. Crosas, A. M. Ellison, V. Gibson, C. R. Jones, M. Seltzer. (2017). Nature Scientific Data
Description
A lack of formalism hampers reporting in computational research, which in turn hinders reproducibility. Data provenance can help address this problem, as showcased in two use cases: physics (CERN) and ecology (Harvard Forest).
A. Trisovic. (2016). 37th International Conference on High Energy Physics (ICHEP)
Description
The paper presents the design of a stand-alone educational application that displays proton-proton collisions in the LHCb experiment created for the International Masterclass in Physics.