Tec, M., Trisovic, A., Audirac, M., & Dominici, F. (2023). SpaCE: The Spatial Confounding (Benchmarking) Environment. CLeaR (Causal Learning and Reasoning). |
The article introduces SpaCE datasets as a benchmarking tool to assist in developing novel methods to address outstanding challenges in spatial and network causal inference. |
Garijo, D., Ménager, H., Hwang, L., Trisovic, A., Hucka, M., Morrell, T., Allen, A., Task Force on Best Practices for Software Registries, & SciCodes Consortium. (2022). Nine Best Practices for Research Software Registries and Repositories. PeerJ Computer Science. |
As the FORCE11 Software Citation Implementation Working Group, we describe the best practices for software repositories and registries which include defining the scope, policies, and governing rules, along with the background, examples, and collaborative work that went into their development. |
Trisovic, A., Pasquier, T., Lau, M. K., & Crosas, M. (2022). A Large-Scale Study on Research Code Quality and Execution. Nature Scientific Data. |
This paper presents a large-scale study of the quality, programming literacy, and reproducibility of over 2100 datasets that contain research code in R from the Harvard Dataverse data repository. |
Soiland-Reyes, S., Sefton, P., Crosas, M., Castro, L. J., Coppens, F., Fernández, J. M., Garijo, D., Grüning, B., La Rosa, M., Leo, S., Ó Carragáin, E., Portier, M., Trisovic, A., RO-Crate Community, Groth, P., & Goble, C. (2021). Packaging Research Artefacts with RO-Crate. Data Science. |
We introduce RO-Crate, an open, community-driven, lightweight approach to packaging research artifacts with metadata, including their identifiers, provenance, relations, and annotations. |
Miljković, N., Trisovic, A., & Peer, L. (2021). Towards FAIR Principles for Open Hardware. Conference on Application of Free Software and Open Hardware (PSSOH). |
We elaborate on open hardware dissemination and reuse complexity, present examples of unique demands, and propose leveraging FAIR principles to make it findable, accessible, interoperable, and reusable. |
Blumenthal, K., Goeva, A., Stoudt, S., Trisovic, A., & Trisovic, P. (2021). Why Do We Plot Data? Harvard Data Science Review. |
Explainer zine for the article "Designing for Interactive Exploratory Data Analysis Requires Theories of Graphical Inference." |
Goeva, A., Jones, P., Stoudt, S., & Trisovic, A. (2021). Recipes for Connector Courses From the Early-Career Board Kitchen. Harvard Data Science Review. |
We propose a handful of connector courses for data science, inspired by the article "Interleaving Computational and Inferential Thinking: Data Science for Undergraduates at Berkeley." |
Trisovic, A., Mika, K., Boyd, C., Feger, S., & Crosas, M. (2021). Repository Approaches to Improving the Quality of Shared Data and Code. Data. |
We propose three approaches based on computational reproducibility, data curation, and gamified design elements that can be used to indicate and improve the quality of shared data and code in data repositories. |
Rising, J. A., Hussain, A., Schwarzwald, K., & Trisovic, A. (2021). A Practical Guide to Climate Econometrics: Navigating Key Decision Points in Weather and Climate Data Analysis. Accepted in Journal of Open Source Education (JOSE). |
We present a free and open-source tutorial on the practical aspects of climate econometrics, which includes data collection, analysis design, and result presentation (available at climateestimate.net). |
Frost, S., Goeva, A., Pombra, J., Seaton, W., Stoudt, S., Trisovic, A., Wang, C., & Zucker, C. (2021). Kaleidoscopic Perspectives on Practicum-Based Data Science Education. Harvard Data Science Review. |
Early-Career Board members of the Harvard Data Science Review discuss the acquisition of practical data science skills and share their experiences from a number of disciplines. |
Goeva, A., Stoudt, S., & Trisovic, A. (2020). Toward Reproducible and Extensible Research: From Values to Action. Harvard Data Science Review. |
This paper discusses the National Academies' report "Reproducibility and Replicability in Science," advocating for reusability and the need for actionable and hierarchical steps for researchers. |
Frost, S., Goeva, A., Seaton, W., Stoudt, S., & Trisovic, A. (2020). Early-Career View on Data Science Challenges: Responsibility, Rigor, and Accessibility. Harvard Data Science Review. |
Early-Career Board members of the Harvard Data Science Review present their view of top research challenge areas in data science. |
Trisovic, A., Durbin, P., Schlatter, T., Durand, G., Barbosa, S., Brooke, D., & Crosas, M. (2020). Advancing Computational Reproducibility in the Dataverse Data Repository Platform. 3rd International Workshop on Practical Reproducible Evaluation of Computer Systems (P-RECS). |
The Dataverse repository software has undertaken integrations with the platforms Code Ocean, Whole Tale, Renku, and Jupyter Binder, which will help capture research code dependencies and advance reproducibility. |
Woodard, A. E., Trisovic, A., Li, Z., Babuji, Y., Chard, R., Skluzacek, T., Blaiszik, B., Katz, D. S., Foster, I., & Chard, K. (2020). Real-Time HEP Analysis With FuncX – a High-Performance Platform for Function as a Service. 24th International Conference on Computing in High Energy & Nuclear Physics (CHEP). |
We present how the function-as-a-service paradigm can address CERN's computing challenges with efficient and scalable experimental data processing on heterogeneous resources. |
Trisovic, A., Jones, C. R., Couturier, B., & Clemencic, M. (2020). Provenance Tracking in the LHCb Software. Computing in Science & Engineering (CISE). |
We argue that reproducibility needs to be incorporated into the existing infrastructure and present a new functionality in the CERN software that captures all information within a resulting dataset necessary to reproduce it. |
Chen, X., Dallmeier-Tiessen, S., Dasler, R., Feger, S., Fokianos, P., Gonzalez, J. B., Hirvonsalo, H., Kousidis, D., Lavasa, A., Mele, S., Rodriguez, D. R., Šimko, T., Smith, T., Trisovic, A., Trzcinska, A., Tsanaktsidis, I., Zimmermann, M., Cranmer, K., Heinrich, L., Watts, G., Hildreth, M., Lloret Iglesias, L., Lassila-Perini, K., & Neubert, S. (2019). Open is not enough. Nature Physics. |
The CERN Analysis Preservation and Reusable Analyses (REANA) platforms were created to facilitate reproducible research for the LHC experiments at CERN. The CERN Open Data project disseminates particle-physics data that can be used for research. |
Trisovic, A. (2018). Graph Mining at the High-Energy Physics Experiment LHCb. 7th International Symposium on Industrial Engineering. |
The paper presents a number of challenges, questions, and use-cases that can be addressed by exploring and analyzing the LHCb graph database that captures its data and software. |
Trisovic, A., Couturier, B., Gibson, V., & Jones, C. (2017). Recording the LHCb Data and Software Dependencies. 22nd International Conference on Computing in High Energy and Nuclear Physics (CHEP). |
We present the design and development of the LHCb graph database that captures the scientific software stack, its software and hardware dependencies, and its products, which are simulation and experimental data. |
Pasquier, T., Lau, M. K., Trisovic, A., Boose, E. R., Couturier, B., Crosas, M., Ellison, A. M., Gibson, V., Jones, C. R., & Seltzer, M. (2017). If These Data Could Talk. Nature Scientific Data. |
The lack of formalism hampers reporting in computational research, which in turn hinders reproducibility. Data provenance can help address this problem, as showcased in two use cases: physics (CERN) and ecology (Harvard Forest). |
Trisovic, A. (2016). Measuring the D0 Lifetime at the LHCb Masterclass. 37th International Conference on High Energy Physics (ICHEP). |
The paper presents the design of a stand-alone educational application that displays proton-proton collisions in the LHCb experiment created for the International Masterclass in Physics. |