<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Wei Zhang | UCSC OSPO</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/author/wei-zhang/</link><atom:link href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/wei-zhang/index.xml" rel="self" type="application/rss+xml"/><description>Wei Zhang</description><generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><image><url>https://deploy-preview-1007--ucsc-ospo.netlify.app/author/wei-zhang/avatar_hu839d51eba455e551c49a45dd5f875db7_483515_270x270_fill_q75_lanczos_center.jpg</url><title>Wei Zhang</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/author/wei-zhang/</link></image><item><title>Exploration of I/O Reproducibility with HDF5</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/project/osre25/pnnl/h5_reproducibility/</link><pubDate>Wed, 19 Feb 2025 09:00:00 -0700</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/project/osre25/pnnl/h5_reproducibility/</guid><description>&lt;p>Parallel I/O is a critical component in high-performance computing (HPC), allowing multiple processes to read and write data concurrently from a shared storage system. &lt;a href="https://github.com/HDFGroup/hdf5" target="_blank" rel="noopener">HDF5&lt;/a>—a widely adopted data model and library for managing complex scientific data—supports parallel I/O but introduces challenges in I/O reproducibility, where repeated executions do not always produce identical results. This lack of reproducibility can stem from non-deterministic execution orders, variations in collective buffering strategies, and race conditions in metadata and dataset chunking operations within HDF5’s parallel I/O hierarchy. Moreover, many HDF5 operations that leverage &lt;a href="%28https://www.hdfgroup.org/wp-content/uploads/2020/02/20200206_ECPTutorial-final.pdf%29">MPI I/O&lt;/a> require collective communication; that is, all processes within a communicator must participate in operations such as metadata creation, chunk allocation, and data aggregation. These collective calls ensure that the file structure and data layout remain consistent across processes, but they also introduce additional synchronization complexity that can impact reproducibility if not properly managed. In HPC scientific workflows, consistent I/O reproducibility is essential for accurate debugging, validation, and benchmarking, ensuring that scientific results are both verifiable and trustworthy. Tools such as &lt;a href="https://github.com/hpc-io/h5bench" target="_blank" rel="noopener">h5bench&lt;/a>—a suite of I/O kernels designed to exercise HDF5 I/O on parallel file systems—play an important role in identifying these reproducibility challenges, tuning performance, and ultimately supporting the overall robustness of large-scale scientific applications.&lt;/p>
&lt;h3 id="workplan">Workplan&lt;/h3>
&lt;p>The proposed work will include (1) analyzing and characterizing parallel I/O operations in &lt;a href="https://www.hdfgroup.org/wp-content/uploads/2020/02/20200206_ECPTutorial-final.pdf" target="_blank" rel="noopener">HDF5&lt;/a> with &lt;a href="https://github.com/hpc-io/h5bench" target="_blank" rel="noopener">h5bench&lt;/a> miniapps, (2) exploring and validating potential reproducibility challenges within the parallel I/O hierarchy (e.g., MPI I/O), and (3) implementing solutions to address parallel I/O reproducibility.&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Topics:&lt;/strong> &lt;code>Parallel I/O&lt;/code> &lt;code>MPI-I/O&lt;/code> &lt;code>Reproducibility&lt;/code> &lt;code>HPC&lt;/code> &lt;code>HDF5&lt;/code>&lt;/li>
&lt;li>&lt;strong>Skills:&lt;/strong> C/C++, Python&lt;/li>
&lt;li>&lt;strong>Difficulty:&lt;/strong> Medium&lt;/li>
&lt;li>&lt;strong>Size:&lt;/strong> Large (350 hours)&lt;/li>
&lt;li>&lt;strong>Mentors:&lt;/strong> &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/luanzheng-lenny-guo/">Luanzheng &amp;quot;Lenny&amp;quot; Guo&lt;/a> and [Wei Zhang]&lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/wei-zhang/">Wei Zhang&lt;/a>&lt;/li>
&lt;/ul></description></item></channel></rss>