<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>SoR | UCSC OSPO</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/category/sor/</link><atom:link href="https://deploy-preview-1007--ucsc-ospo.netlify.app/category/sor/index.xml" rel="self" type="application/rss+xml"/><description>SoR</description><generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><lastBuildDate>Mon, 01 Sep 2025 00:00:00 +0000</lastBuildDate><image><url>https://deploy-preview-1007--ucsc-ospo.netlify.app/media/logo_hub6795c39d7c5d58c9535d13299c9651f_74810_300x300_fit_lanczos_3.png</url><title>SoR</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/category/sor/</link></image><item><title>Final Report: MPI Appliance for HPC Research on Chameleon</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/uchicago/mpi/20250901-rohan-babbar/</link><pubDate>Mon, 01 Sep 2025 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/uchicago/mpi/20250901-rohan-babbar/</guid><description>&lt;p>Hi Everyone, This is my final report for the project I completed during my summer as a &lt;a href="https://ucsc-ospo.github.io/sor/" target="_blank" rel="noopener">Summer of Reproducibility (SOR)&lt;/a> student.
The project, titled &amp;ldquo;&lt;a href="https://ucsc-ospo.github.io/project/osre25/uchicago/mpi/" target="_blank" rel="noopener">MPI Appliance for HPC Research in Chameleon&lt;/a>,&amp;rdquo; was undertaken in collaboration with Argonne National Laboratory
and the Chameleon Cloud community, under the mentorship of &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/ken-raffenetti/">Ken Raffenetti&lt;/a>.
This blog details the work and outcomes of the project.&lt;/p>
&lt;h2 id="background">Background&lt;/h2>
&lt;p>Message Passing Interface (MPI) is the backbone of high-performance computing (HPC), enabling efficient scaling across thousands of
processing cores. However, reproducing MPI-based experiments remains challenging due to dependencies on specific library versions,
network configurations, and multi-node setups.&lt;/p>
&lt;p>To address this, we introduce a reproducibility initiative that provides standardized MPI environments on the Chameleon testbed.
This is set up as a master–worker MPI cluster. The master node manages tasks and communication, while the worker nodes perform the computations.
All nodes have the same MPI libraries, software, and network settings, making experiments easier to scale and reproduce.&lt;/p>
&lt;h2 id="objectives">Objectives&lt;/h2>
&lt;p>The aim of this project was to create an MPI cluster that is reproducible, easily deployable, and efficiently configurable.&lt;/p>
&lt;p>The key objectives of this project were:&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Pre-built MPI Images: Create ready-to-use images with MPI and all dependencies installed.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Automated Cluster Configuration: Develop Ansible playbooks to configure master–worker communication, including host setup, SSH key distribution, and MPI configuration across nodes.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Cluster Orchestration: Develop an orchestration template to provision resources and invoke the Ansible playbooks for automated cluster setup.&lt;/p>
&lt;/li>
&lt;/ol>
&lt;h2 id="implementation-strategy-and-deliverables">Implementation Strategy and Deliverables&lt;/h2>
&lt;h3 id="openstack-image-creation">Openstack Image Creation&lt;/h3>
&lt;p>The first step was to create a standardized pre-built image, which serves as the base image for all nodes in the cluster.&lt;/p>
&lt;p>Some important features of the image include:&lt;/p>
&lt;ol>
&lt;li>Built on Ubuntu 22.04 for a stable base environment.&lt;/li>
&lt;li>&lt;a href="https://spack.io/" target="_blank" rel="noopener">Spack&lt;/a> + Lmod integration:
&lt;ul>
&lt;li>Spack handles reproducible, version-controlled installations of software packages.&lt;/li>
&lt;li>Lmod (Lua Modules) provides a user-friendly way to load/unload software environments dynamically.&lt;/li>
&lt;li>Together, they allow users to easily switch between MPI versions, libraries, and GPU toolkits.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;a href="https://github.com/pmodels/mpich" target="_blank" rel="noopener">MPICH&lt;/a> and &lt;a href="https://github.com/open-mpi/ompi" target="_blank" rel="noopener">OpenMPI&lt;/a> pre-installed for standard MPI support and can be loaded/unloaded.&lt;/li>
&lt;li>Three image variants for various HPC workloads: CPU-only, NVIDIA GPU (CUDA 12.8), and AMD GPU (ROCm 6.4.2).&lt;/li>
&lt;/ol>
&lt;p>These images have been published and are available in the Chameleon Cloud Appliance Catalog:&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://chameleoncloud.org/appliances/127/" target="_blank" rel="noopener">MPI and Spack for HPC (Ubuntu 22.04)&lt;/a> - CPU Only&lt;/li>
&lt;li>&lt;a href="https://chameleoncloud.org/appliances/130/" target="_blank" rel="noopener">MPI and Spack for HPC (Ubuntu 22.04 - CUDA)&lt;/a> - NVIDIA GPU (CUDA 12.8)&lt;/li>
&lt;li>&lt;a href="https://chameleoncloud.org/appliances/131/" target="_blank" rel="noopener">MPI and Spack for HPC (Ubuntu 22.04 - ROCm)&lt;/a> - AMD GPU (ROCm 6.4.2)&lt;/li>
&lt;/ul>
&lt;h3 id="cluster-configuration-using-ansible">Cluster Configuration using Ansible&lt;/h3>
&lt;p>The next step was to create scripts/playbooks to configure these nodes and set up an HPC cluster.
We assigned specific roles to different nodes in the cluster and combined them into a single playbook to configure the entire cluster automatically.&lt;/p>
&lt;p>Some key steps the playbook performs:&lt;/p>
&lt;ol>
&lt;li>Configure /etc/hosts entries for all nodes.&lt;/li>
&lt;li>Mount Manila NFS shares on each node.&lt;/li>
&lt;li>Generate an SSH key pair on the master node and add the master’s public key to the workers’ authorized_keys.&lt;/li>
&lt;li>Scan worker node keys and update known_hosts on the master.&lt;/li>
&lt;li>(Optional) Manage software:
&lt;ul>
&lt;li>Install new compilers with Spack&lt;/li>
&lt;li>Add new Spack packages&lt;/li>
&lt;li>Update environment modules to recognize them&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Create a hostfile at /etc/mpi/hostfile.&lt;/li>
&lt;/ol>
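&lt;p>To give a flavor of the automation, here is a minimal, hypothetical playbook excerpt covering the /etc/hosts, key-generation, and key-distribution steps. Group names, users, and variables are invented for illustration and do not match the repository exactly:&lt;/p>

```yaml
# Hypothetical excerpt of the cluster-configuration playbook.
- hosts: all
  become: true
  tasks:
    - name: Add every cluster node to /etc/hosts
      ansible.builtin.lineinfile:
        path: /etc/hosts
        line: "{{ hostvars[item]['ansible_host'] }} {{ item }}"
      loop: "{{ groups['all'] }}"

- hosts: master
  tasks:
    - name: Generate an SSH key pair on the master node
      community.crypto.openssh_keypair:
        path: ~/.ssh/id_rsa
      register: master_key

- hosts: workers
  tasks:
    - name: Authorize the master's public key on each worker
      ansible.posix.authorized_key:
        user: "{{ ansible_user }}"
        key: "{{ hostvars[groups['master'][0]]['master_key']['public_key'] }}"
```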
&lt;p>The code is publicly available and can be found on the GitHub repository: &lt;a href="https://github.com/rohanbabbar04/MPI-Spack-Experiment-Artifact" target="_blank" rel="noopener">https://github.com/rohanbabbar04/MPI-Spack-Experiment-Artifact&lt;/a>&lt;/p>
&lt;h3 id="orchestration">Orchestration&lt;/h3>
&lt;p>With the image now created and deployed, and the Ansible scripts ready for cluster configuration, we put everything
together to orchestrate the cluster deployment.&lt;/p>
&lt;p>This can be done in two primary ways:&lt;/p>
&lt;h4 id="python-chijupyter--ansible">Python CHI(Jupyter) + Ansible&lt;/h4>
&lt;p>&lt;a href="https://github.com/ChameleonCloud/python-chi" target="_blank" rel="noopener">Python-CHI&lt;/a> is a python library designed to facilitate interaction with the Chameleon testbed. Often used within environments like Jupyter notebooks.&lt;/p>
&lt;p>This setup can be put up as:&lt;/p>
&lt;ol>
&lt;li>Create leases, launch instances, and set up shared storage using python-chi commands.&lt;/li>
&lt;li>Automatically generate inventory.ini for Ansible based on launched instances.&lt;/li>
&lt;li>Run Ansible playbook programmatically using &lt;code>ansible_runner&lt;/code>.&lt;/li>
&lt;li>Outcome: fully configured, ready-to-use HPC cluster; SSH into master to run examples.&lt;/li>
&lt;/ol>
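&lt;p>Steps 2 and 3 can be sketched in a few lines of Python. Everything here is illustrative (the notebook in the Trovi artifact is the authoritative version); the default user &lt;code>cc&lt;/code> is the one Chameleon images typically use:&lt;/p>

```python
# Sketch of steps 2-3: build an Ansible inventory from launched instances,
# then run the playbook with ansible_runner. All names are illustrative.

def build_inventory(master_ip, worker_ips, user="cc"):
    """Render an inventory.ini with a master group and a workers group."""
    lines = ["[master]", f"{master_ip} ansible_user={user}", "", "[workers]"]
    lines += [f"{ip} ansible_user={user}" for ip in worker_ips]
    return "\n".join(lines) + "\n"

def run_playbook(inventory_path, playbook_path):
    """Invoke the playbook programmatically (requires the ansible-runner package)."""
    import ansible_runner  # imported lazily so the sketch runs without it installed
    result = ansible_runner.run(playbook=playbook_path, inventory=inventory_path)
    return result.status  # e.g. "successful" or "failed"

if __name__ == "__main__":
    print(build_inventory("192.5.87.10", ["192.5.87.11", "192.5.87.12"]))
```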
&lt;p>If you would like to see a working example, you can view it in the &lt;a href="https://chameleoncloud.org/experiment/share/7424a8dc-0688-4383-9d67-1e40ff37de17" target="_blank" rel="noopener">Trovi example&lt;/a>.&lt;/p>
&lt;h4 id="heat-orchestration-template">Heat Orchestration Template&lt;/h4>
&lt;p>A Heat Orchestration Template (HOT) is a YAML-based configuration file that defines a stack to automate
the deployment and configuration of OpenStack cloud resources.&lt;/p>
&lt;p>&lt;strong>Challenges&lt;/strong>&lt;/p>
&lt;p>We faced some challenges while working with Heat templates and stacks, particularly on Chameleon Cloud:&lt;/p>
&lt;ol>
&lt;li>&lt;code>OS::Nova::KeyPair&lt;/code> (new version): In the latest OpenStack version, the stack fails to launch if the &lt;code>public_key&lt;/code> parameter is not provided for the keypair,
as auto-generation is no longer supported.&lt;/li>
&lt;li>&lt;code>OS::Heat::SoftwareConfig&lt;/code>: Deployment scripts often fail, hang, or time out, preventing proper configuration of nodes and causing unreliable deployments.&lt;/li>
&lt;/ol>
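&lt;p>The first issue can be worked around by accepting the user's public key as a stack parameter and passing it through to the keypair resource. A minimal sketch (parameter names are illustrative):&lt;/p>

```yaml
# Sketch of the workaround for issue 1: require the user's public key
# as a template parameter instead of relying on auto-generation.
heat_template_version: 2018-08-31

parameters:
  key_name:
    type: string
  public_key:
    type: string
    description: Contents of the user's SSH public key (e.g. id_rsa.pub)

resources:
  cluster_keypair:
    type: OS::Nova::KeyPair
    properties:
      name: { get_param: key_name }
      public_key: { get_param: public_key }
```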
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Heat Approach" srcset="
/report/osre25/uchicago/mpi/20250901-rohan-babbar/heatapproach_hua2bf48ad20dec386c348c909fcaf7111_39548_05fca9fb65271d31e3fd79f2e7b58a53.webp 400w,
/report/osre25/uchicago/mpi/20250901-rohan-babbar/heatapproach_hua2bf48ad20dec386c348c909fcaf7111_39548_19399eb0dbf598de84852723f8d60783.webp 760w,
/report/osre25/uchicago/mpi/20250901-rohan-babbar/heatapproach_hua2bf48ad20dec386c348c909fcaf7111_39548_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/uchicago/mpi/20250901-rohan-babbar/heatapproach_hua2bf48ad20dec386c348c909fcaf7111_39548_05fca9fb65271d31e3fd79f2e7b58a53.webp"
width="760"
height="235"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>To tackle these challenges, we designed an approach that is both easy to implement and reproducible. First, we launch instances
by provisioning master and worker nodes using the HOT template in OpenStack. Next, we set up a bootstrap node, install Git and Ansible,
and run an Ansible playbook from the bootstrap node to configure the master and worker nodes, including SSH, host communication, and
MPI setup. The outcome is a fully configured, ready-to-use HPC cluster, where users can simply SSH into the master node to run examples.&lt;/p>
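&lt;p>Under the same assumptions, the bootstrap node can be expressed as a server resource whose cloud-init &lt;code>user_data&lt;/code> installs Git and Ansible and then runs the playbook. The image name, flavor, and file names below are placeholders, not the exact values in the published template:&lt;/p>

```yaml
# Hypothetical bootstrap-node resource for the HOT template.
bootstrap_node:
  type: OS::Nova::Server
  properties:
    image: CC-Ubuntu22.04        # placeholder image name
    flavor: baremetal
    key_name: { get_param: key_name }
    user_data: |
      #!/bin/bash
      apt-get update
      apt-get install -y git ansible
      git clone https://github.com/rohanbabbar04/MPI-Spack-Experiment-Artifact.git
      # inventory.ini and site.yml are placeholder names
      ansible-playbook -i inventory.ini MPI-Spack-Experiment-Artifact/site.yml
```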
&lt;p>Users can view/use the template published in the Appliance Catalog: &lt;a href="https://chameleoncloud.org/appliances/132/" target="_blank" rel="noopener">MPI+Spack Bare Metal Cluster&lt;/a>.
For example, a demonstration of how to pass parameters is available on &lt;a href="https://chameleoncloud.org/experiment/share/7424a8dc-0688-4383-9d67-1e40ff37de17" target="_blank" rel="noopener">Trovi&lt;/a>.&lt;/p>
&lt;h2 id="conclusion">Conclusion&lt;/h2>
&lt;p>In conclusion, this work demonstrates a reproducible approach to building and configuring MPI clusters on the Chameleon testbed. By using standardized images,
Ansible automation, and orchestration templates, we ensure that every node is consistently set up, reducing manual effort and errors. The artifact, published on Trovi,
makes the entire process transparent, reusable, and easy to implement, enabling users/researchers to reliably recreate and extend the cluster environment for their own
experiments.&lt;/p>
&lt;h2 id="future-work">Future Work&lt;/h2>
&lt;p>Maintaining these images and possibly creating a script to reproduce MPI and Spack on a different image base environment.&lt;/p></description></item><item><title>Final Update(Mid-Term -> Final): MPI Appliance for HPC Research on Chameleon</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/uchicago/mpi/20250831-rohan-babbar/</link><pubDate>Sun, 31 Aug 2025 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/uchicago/mpi/20250831-rohan-babbar/</guid><description>&lt;p>Hi everyone! This is my final update, covering the progress made every two weeks from the midterm to the end of the
project &lt;a href="https://ucsc-ospo.github.io/project/osre25/uchicago/mpi/" target="_blank" rel="noopener">MPI Appliance for HPC Research on Chameleon&lt;/a>, developed
in collaboration with Argonne National Laboratory and the Chameleon Cloud community.
This blog follows up on my earlier post, which you can find &lt;a href="https://ucsc-ospo.github.io/report/osre25/uchicago/mpi/20250803-rohan-babbar/" target="_blank" rel="noopener">here&lt;/a>.&lt;/p>
&lt;h3 id="-july-29--august-11-2025">🔧 July 29 – August 11, 2025&lt;/h3>
&lt;p>With the CUDA- and MPI-Spack–based appliances published, we considered releasing another image variant (ROCm-based) for AMD GPUs.
It is primarily intended for CHI@TACC, which provides AMD GPUs. We have successfully published a new image on Chameleon titled &lt;a href="https://chameleoncloud.org/appliances/131/" target="_blank" rel="noopener">MPI and Spack for HPC (Ubuntu 22.04 - ROCm)&lt;/a>,
and we also added an example to demonstrate its usage.&lt;/p>
&lt;h3 id="-august-12--august-25-2025">🔧 August 12 – August 25, 2025&lt;/h3>
&lt;p>With the examples now available on Trovi for creating an MPI cluster using Ansible and Python-CHI, my next step was to experiment with stack orchestration using Heat Orchestration Templates (HOT) on OpenStack Chameleon Cloud.
This turned out to be more challenging due to a few restrictions:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>OS::Nova::KeyPair (new version)&lt;/strong>: In the latest OpenStack version, the stack fails to launch if the public_key parameter is not provided for the keypair, as auto-generation is no longer supported.&lt;/li>
&lt;li>&lt;strong>OS::Heat::SoftwareConfig&lt;/strong>: Deployment scripts often fail, hang, or time out, preventing proper configuration of nodes and causing unreliable deployments.&lt;/li>
&lt;/ol>
&lt;p>To address these issues, we adopted a new strategy for configuring and creating the MPI cluster: using a temporary bootstrap node.&lt;/p>
&lt;p>In simple terms, the workflow of the Heat template is:&lt;/p>
&lt;ol>
&lt;li>Provision master and worker nodes via the HOT template on OpenStack.&lt;/li>
&lt;li>Launch a bootstrap node, install Git and Ansible on it, and then run an Ansible playbook from the bootstrap node to configure the master and worker nodes. This includes setting up SSH, host communication, and the MPI environment.&lt;/li>
&lt;/ol>
&lt;p>This provides an alternative method for creating an MPI cluster.&lt;/p>
&lt;p>We presented this work on August 26, 2025, to the Chameleon Team and the Argonne MPICH Team. The project was very well received.&lt;/p>
&lt;p>Stay tuned for my final report on this work, which I’ll be sharing in my next blog post.&lt;/p></description></item><item><title>[Final Blog] Distrobench: Distributed Protocol Benchmark</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/umass/edge-replication/20250830-panjisri/</link><pubDate>Sat, 30 Aug 2025 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/umass/edge-replication/20250830-panjisri/</guid><description>&lt;h2 id="introduction">Introduction&lt;/h2>
&lt;p>This is the final blog for our contribution to the &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/project/osre25/umass/edge-replication/">Open Testbed for Reproducible Evaluation of Replicated Systems at the Edges&lt;/a> project under the mentorship of &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/fadhil-kurnia/">Fadhil Kurnia&lt;/a> for the OSRE program.&lt;/p>
&lt;p>&lt;a href="https://github.com/fadhilkurnia/distro" target="_blank" rel="noopener">Distrobench&lt;/a> is a framework to evaluate the performance of replication/coordination protocols for distributed systems. This framework standardizes benchmarking by allowing different protocols to be tested under an identical workload, and supports both local and remote deployment of the protocols. The frameworks tested are restricted under a key-value store application and are categorized under different &lt;a href="https://jepsen.io/consistency/models" target="_blank" rel="noopener">consistency models&lt;/a>, programming languages, and persistency (whether the framework stores its data in-memory or on-disk).&lt;/p>
&lt;p>All the benchmark results are stored in a &lt;code>data.json&lt;/code> file which can be viewed through a webpage we have provided. A user can clone the Git repository, benchmark different protocols on their own machine or on a cluster of remote machines, then view the results locally. We also provide a &lt;a href="https://distrobench.org" target="_blank" rel="noopener">webpage&lt;/a> showing our own benchmark results, which ran on three Amazon EC2 t2.micro instances (a fourth instance acted as the YCSB client).&lt;br>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="" srcset="
/report/osre25/umass/edge-replication/20250830-panjisri/image_hu785d614b38f6808c04fc85bf3c31eb36_153748_2eb41220c4287bdc730b38c76a5643f8.webp 400w,
/report/osre25/umass/edge-replication/20250830-panjisri/image_hu785d614b38f6808c04fc85bf3c31eb36_153748_789a9a55850eed73f3a681f8423873cf.webp 760w,
/report/osre25/umass/edge-replication/20250830-panjisri/image_hu785d614b38f6808c04fc85bf3c31eb36_153748_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/umass/edge-replication/20250830-panjisri/image_hu785d614b38f6808c04fc85bf3c31eb36_153748_2eb41220c4287bdc730b38c76a5643f8.webp"
width="760"
height="381"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;h2 id="how-to-run-a-benchmark-on-distrobench">How to run a benchmark on Distrobench&lt;/h2>
&lt;p>Before running a benchmark with Distrobench, the protocol to be benchmarked must first be built. This allows the script to initialize the protocol instance for a local benchmark, or to send the binaries to the remote machines. The remote machine running the protocol does not need to store the code for the protocol implementations, but it does require the dependencies for running that specific protocol, such as Java, Docker, or rsync. The following commands build the &lt;a href="https://github.com/ailidani/paxi" target="_blank" rel="noopener">ailidani/paxi&lt;/a> project, which needs no additional dependencies to run on a remote machine:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sh" data-lang="sh">&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># Clone the Distrobench repository &lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">git clone git@github.com:fadhilkurnia/distro.git
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># Clone the Paxi repository and build the binary &lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nb">cd&lt;/span> distro/sut/ailidani.paxi
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">git clone git@github.com:ailidani/paxi.git
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nb">cd&lt;/span> paxi/bin/
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">./build.sh
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># Go back to the Distrobench root directory &amp;amp; run python script &lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nb">cd&lt;/span> ../../../..
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">python main.py
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>By default, the script will start 3 local instances of a Paxi protocol implementation that the user chooses through the CLI. The user can modify the number of running instances, and whether they are deployed locally or on remote machines, by changing the contents of the &lt;code>.env&lt;/code> file inside the root directory. The following is the content of the default &lt;code>.env&lt;/code> file:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">NUM_OF_NODES=3
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">SSH_KEY=ssh-key.pem
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">REMOTE_USERNAME=ubuntu
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">PUBLIC_IP1=127.0.0.1
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">PUBLIC_IP2=127.0.0.1
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">PUBLIC_IP3=127.0.0.1
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">PRIVATE_IP1=127.0.0.1
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">PRIVATE_IP2=127.0.0.1
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">PRIVATE_IP3=127.0.0.1
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">CLIENT_IP=127.0.0.1
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">OUTPUT=data.json
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>When running a remote benchmark, an SSH key should also be added to the root directory to allow the use of ssh and rsync from within the Python script. All machines must also allow TCP connections on ports 2000-2300 and 3000-3300, the port ranges used for communication between the running instances and for the YCSB benchmark. Running the benchmark requires at least 3 nodes, the minimum needed to support most protocols (5 nodes are recommended).&lt;/p>
&lt;p>To view the benchmark result in the web page locally, move &lt;code>data.json&lt;/code> into the &lt;code>docs/&lt;/code> directory and run &lt;code>python -m http.server 8000&lt;/code>. The page is then accessible through &lt;code>http://localhost:8000&lt;/code>.&lt;/p>
&lt;h2 id="deep-dive-on-how-distrobench-works">Deep dive on how Distrobench works&lt;/h2>
&lt;p>The following is the project structure of the Distrobench repository:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">distro/
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">├── main.py // Main python script for running benchmark
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">├── data.json // Output file for main.py
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">├── README.md
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">├── .env // Config for running the benchmark
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">├── docs/
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">│ ├── index.html // Web page to show benchmark results
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">│ ├── data.json // Output file displayed by web page
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">│ ├── README.md
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">├── src/
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">│ ├── utils/
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">│ └── ycsb/ // Submodule for YCSB
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">└── sut/ // Systems under test
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> ├── ailidani.paxi/
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> └── run.py // Protocol-specific benchmark script called by main.py
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> ├── apache.zookeeper/
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> ├── etcd-io.etcd/
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> ├── fadhilkurnia.xdn/
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> ├── holipaxos-artifect.holipaxos/
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> ├── otoolep.hraftd/
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> └── tikv.tikv/
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;code>main.py&lt;/code> will automatically detect directories inside &lt;code>sut/&lt;/code> and will call the main function inside &lt;code>run.py&lt;/code>. The following is the structure of &lt;code>run.py&lt;/code> written in pseudocode style:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">FUNCTION main(run_ycsb: Function, nodes: List of Nodes, ssh: Dictionary)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> node_data = map_ip_port(nodes)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> SWITCH user\_input
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> CASE 0:
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> start()
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> RETURN
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> CASE 1:
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> stop()
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> RETURN
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> CASE 2:
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> client_data = []
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> FOR EACH item IN node_data
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> ADD item.client_addr TO client_data
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> END FOR
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> run_ycsb(client_data)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> RETURN
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> END SWITCH
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">END FUNCTION
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">FUNCTION start()
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> // Start the protocol instance (local or remote)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">END FUNCTION
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">FUNCTION stop()
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> // Stop the protocol instance (local or remote)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">END FUNCTION
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">FUNCTION map_ip_port(nodes: List of Nodes) -&amp;gt; List of Dictionary
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> // Generate port numbers based on the protocol requirements
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">END FUNCTION
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The &lt;code>.env&lt;/code> file provides both public and private IP addresses for versatility when running a remote benchmark. The private IP is used for communication between remote machines when they are in the same network group. For our own benchmark, four t2.micro EC2 instances were deployed in the same network group: three of them ran the protocol, and the fourth acted as the YCSB client. It is also possible to use your local machine as the YCSB client instead of another remote machine by setting &lt;code>CLIENT_IP&lt;/code> in the &lt;code>.env&lt;/code> file to &lt;code>127.0.0.1&lt;/code>. We chose a remote machine as the YCSB client to keep the impact of network latency between the client and the protocol servers to a minimum.&lt;/p>
&lt;p>The main tasks of the &lt;code>start()&lt;/code> function can be broken down into the following:&lt;/p>
&lt;ol>
&lt;li>Generate custom configuration files for each remote machine instance (this may differ between implementations: some do not require a config file because they support flag parameters out of the box, while others require multiple configuration files per instance)&lt;/li>
&lt;li>rsync the binaries to the remote machines (if running a remote benchmark)&lt;/li>
&lt;li>Start the instances&lt;/li>
&lt;/ol>
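&lt;p>As a rough Python sketch of these tasks (the flags, paths, and launch logic are illustrative, not the exact code in &lt;code>run.py&lt;/code>; the default user matches the &lt;code>.env&lt;/code> file shown earlier):&lt;/p>

```python
import subprocess

def rsync_cmd(binary, node_ip, user="ubuntu", ssh_key="ssh-key.pem"):
    """Build the rsync command that copies a built binary to a remote node."""
    return [
        "rsync", "-az", "-e", f"ssh -i {ssh_key}",
        binary, f"{user}@{node_ip}:~/",
    ]

def start(nodes, binary="paxi/bin/server", remote=True):
    """Copy binaries to each node (remote runs only), then launch the instances."""
    for node in nodes:
        if remote:
            subprocess.run(rsync_cmd(binary, node["public_ip"]), check=True)
        # Launching the protocol instance (over SSH, or locally when remote=False)
        # is protocol-specific and omitted from this sketch.
```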
&lt;p>The &lt;code>stop()&lt;/code> function is a lot simpler since it only kills the process running the protocol and optionally removes the copied binary files in the remote machine. The &lt;code>run_ycsb()&lt;/code> function passed onto &lt;code>run.py&lt;/code> is defined in &lt;code>main.py&lt;/code> and currently supports two types of workload:&lt;/p>
&lt;ol>
&lt;li>Read-heavy: A single-client workload with 95% read and 5% update (write) operations&lt;/li>
&lt;li>Update-heavy: A single-client workload with 50% read and 50% update (write) operations&lt;/li>
&lt;/ol>
&lt;p>A new workload can be added inside the &lt;code>src/ycsb/workloads&lt;/code> directory. Both workloads above run only 1000 operations, which may not be enough to properly evaluate the performance of the protocols. It should also be noted that while YCSB supports a &lt;code>scan&lt;/code> operation, we never use it in our benchmark because none of our tested protocols implement it.&lt;/p>
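&lt;p>For reference, a YCSB workload is just a small properties file; the read-heavy workload described above corresponds roughly to the following (the 1000-operation count comes from the text, the record count is illustrative):&lt;/p>

```properties
# Read-heavy workload: 95% reads, 5% updates, 1000 operations.
workload=site.ycsb.workloads.CoreWorkload
recordcount=1000
operationcount=1000
readproportion=0.95
updateproportion=0.05
scanproportion=0
insertproportion=0
requestdistribution=zipfian
```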
&lt;h3 id="how-to-implement-a-new-protocol-in-distrobench">How to implement a new protocol in Distrobench&lt;/h3>
&lt;p>Adding a new protocol to Distrobench requires implementing two main components: a Python integration script (&lt;code>run.py&lt;/code>) and a YCSB database binding for benchmarking.&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Create the protocol directory structure&lt;/p>
&lt;ul>
&lt;li>Create a new directory under &lt;code>sut/&lt;/code> using the format &lt;code>yourrepo.yourprotocol/&lt;/code>.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Write &lt;code>run.py&lt;/code> integration&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Put the script inside the &lt;code>yourrepo.yourprotocol/&lt;/code> directory&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Must have the &lt;code>main(run_ycsb, nodes, ssh)&lt;/code> function.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Add start/stop/benchmark menu options&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Handle local (127.0.0.1) and remote deployment&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Create YCSB client&lt;/p>
&lt;ul>
&lt;li>Make a Java class extending YCSB&amp;rsquo;s DB class&lt;/li>
&lt;li>Put inside &lt;code>src/ycsb/yourprotocol/src/main/java/site/ycsb/yourprotocol&lt;/code>&lt;/li>
&lt;li>Implement &lt;code>read()&lt;/code>, &lt;code>insert()&lt;/code>, &lt;code>update()&lt;/code>, &lt;code>delete()&lt;/code> methods&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Register your client&lt;/p>
&lt;ul>
&lt;li>Register your client to &lt;code>src/pom.xml&lt;/code>, &lt;code>src/ycsb/bin/binding.properties&lt;/code>, and &lt;code>src/ycsb/bin/ycsb&lt;/code>.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Build and test&lt;/p>
&lt;ul>
&lt;li>Run &lt;code>cd src/ycsb &amp;amp;&amp;amp; mvn clean package&lt;/code>&lt;/li>
&lt;li>Run &lt;code>python main.py&lt;/code>&lt;/li>
&lt;li>Select your protocol and test it&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ol>
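&lt;p>To make step 2 concrete, a minimal &lt;code>run.py&lt;/code> skeleton might look like the following. The menu indices mirror the pseudocode shown earlier; the port scheme and field names are assumptions for illustration only:&lt;/p>

```python
# Hypothetical skeleton for sut/yourrepo.yourprotocol/run.py.
# main(run_ycsb, nodes, ssh) is the entry point called by main.py.

def map_ip_port(nodes, base_port=2000):
    """Assign protocol and client ports to each node (protocol-specific)."""
    return [
        {"addr": f"{n['private_ip']}:{base_port + i}",
         "client_addr": f"{n['private_ip']}:{3000 + i}"}
        for i, n in enumerate(nodes)
    ]

def start(node_data, ssh):
    """Generate configs, rsync binaries if remote, and launch the instances."""

def stop(node_data, ssh):
    """Kill the running protocol processes."""

def main(run_ycsb, nodes, ssh, choice=None):
    node_data = map_ip_port(nodes)
    if choice is None:
        choice = int(input("0) start  1) stop  2) benchmark: "))
    if choice == 0:
        start(node_data, ssh)
    elif choice == 1:
        stop(node_data, ssh)
    elif choice == 2:
        run_ycsb([item["client_addr"] for item in node_data])
```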
&lt;h2 id="protocols-which-have-been-tested">Protocols which have been tested&lt;/h2>
&lt;p>Distrobench has tested 20 different distributed replication/coordination protocols across 7 different implementation projects.&lt;/p>
&lt;ol>
&lt;li>
&lt;p>&lt;a href="https://github.com/ailidani/paxi" target="_blank" rel="noopener">ailidani/paxi&lt;/a>&lt;/p>
&lt;ul>
&lt;li>Programming Language : Go&lt;/li>
&lt;li>Persistency : On-Disk&lt;/li>
&lt;li>Consistency Model : Linearizability, Eventual&lt;/li>
&lt;li>Protocol : Paxos, EPaxos, SDpaxos, WPaxos, ABD, chain, VPaxos, WanKeeper, KPaxos, Paxos_groups, Dynamo, Blockchain, M2Paxos, HPaxos.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="https://github.com/apache/zookeeper" target="_blank" rel="noopener">apache/zookeeper&lt;/a>&lt;/p>
&lt;ul>
&lt;li>Programming Language : Java&lt;/li>
&lt;li>Persistency : On-Disk&lt;/li>
&lt;li>Consistency Model : Linearizability + Primary Integrity&lt;/li>
&lt;li>Protocol : ZooKeeper implements ZAB (ZooKeeper Atomic Broadcast)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="https://github.com/etcd-io/etcd" target="_blank" rel="noopener">etcd-io/etcd&lt;/a>&lt;/p>
&lt;ul>
&lt;li>Programming Language : Go&lt;/li>
&lt;li>Persistency : On-Disk&lt;/li>
&lt;li>Consistency Model : Linearizability&lt;/li>
&lt;li>Protocol : Raft&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="https://github.com/fadhilkurnia/xdn" target="_blank" rel="noopener">fadhilkurnia/xdn&lt;/a>&lt;/p>
&lt;ul>
&lt;li>Programming Language : Java, Rust&lt;/li>
&lt;li>Persistency : On-Disk&lt;/li>
&lt;li>Consistency Model : Linearizability, Linearizability + Primary Integrity&lt;/li>
&lt;li>Protocol : Gigapaxos&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="https://github.com/Zhiying12/holipaxos-artifect" target="_blank" rel="noopener">Zhiying12/holipaxos-artifect&lt;/a>&lt;/p>
&lt;ul>
&lt;li>Programming Language : Go, Rust&lt;/li>
&lt;li>Persistency : On-Disk&lt;/li>
&lt;li>Consistency Model : Linearizability&lt;/li>
&lt;li>Protocol : Holipaxos, Omnipaxos, Multipaxos&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="https://github.com/otoolep/hraftd" target="_blank" rel="noopener">otoolep/hraftd&lt;/a>&lt;/p>
&lt;ul>
&lt;li>Programming Language : Go&lt;/li>
&lt;li>Persistency : On-Disk&lt;/li>
&lt;li>Consistency Model : Linearizability&lt;/li>
&lt;li>Protocol : Raft&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="https://github.com/tikv/tikv" target="_blank" rel="noopener">tikv/tikv&lt;/a>&lt;/p>
&lt;ul>
&lt;li>Programming Language : Rust&lt;/li>
&lt;li>Persistency : On-Disk&lt;/li>
&lt;li>Consistency Model : Linearizability&lt;/li>
&lt;li>Protocol : Raft&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ol>
&lt;h2 id="challenges">Challenges&lt;/h2>
&lt;ul>
&lt;li>When benchmarking HoliPaxos, the main challenge was handling versions that rely on persistent storage with RocksDB. Since some implementations are written in Go, it was necessary to find compatible versions of RocksDB and gRocksDB (for example, RocksDB 10.5.1 works with gRocksDB 1.10.2). Another difficulty was that RocksDB is resource-intensive to compile, and we did not have sufficient CPU capacity on the remote machine to build RocksDB and run remote benchmarks.&lt;/li>
&lt;li>Some projects did not compile successfully at first and required minor modifications to run.&lt;/li>
&lt;/ul>
&lt;h2 id="conclusion-and-future-improvements">Conclusion and future improvements&lt;/h2>
&lt;p>The current benchmark results show the performance of all the mentioned protocols in terms of throughput and benchmark runtime. The results are subject to revision because they may not reflect the best performance of each protocol due to an unoptimized deployment script. We are also planning to switch to a more powerful EC2 instance because t2.micro does not have enough resources to support RocksDB or TiKV.&lt;/p>
&lt;p>In the near future, additional features will be added to Distrobench such as:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Multi-Client Support:&lt;/strong> The YCSB client will start multiple clients which will send requests in parallel to different servers in the group.&lt;/li>
&lt;li>&lt;strong>Commit Versioning:&lt;/strong> Allows the labelling of all benchmark results with the commit hash of the protocol&amp;rsquo;s repository version. This allows comparing different versions of the same project.&lt;/li>
&lt;li>&lt;strong>Adding more Primary-Backup, Sequential, Causal, and Eventual consistency protocols:&lt;/strong> Implementations that support a consistency model other than linearizability and provide an existing key-value store application are notoriously difficult to find.&lt;/li>
&lt;li>&lt;strong>Benchmark on node failure&lt;/strong>&lt;/li>
&lt;li>&lt;strong>Benchmark on the addition of a new node&lt;/strong>&lt;/li>
&lt;/ul></description></item><item><title>End-term Blog: StatWrap: Cross-Project Searching and Classification using Local Indexing</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/northwestern/statwrap/20250823-debangi29/</link><pubDate>Sat, 23 Aug 2025 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/northwestern/statwrap/20250823-debangi29/</guid><description>&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Heading" srcset="
/report/osre25/northwestern/statwrap/20250823-debangi29/image0_hu69efae69f006c4366342bdc2ded8b248_187729_f9e5e16b2001b9950ad995b2c786abc9.webp 400w,
/report/osre25/northwestern/statwrap/20250823-debangi29/image0_hu69efae69f006c4366342bdc2ded8b248_187729_27bc4379277ab462935158b3db96d992.webp 760w,
/report/osre25/northwestern/statwrap/20250823-debangi29/image0_hu69efae69f006c4366342bdc2ded8b248_187729_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/northwestern/statwrap/20250823-debangi29/image0_hu69efae69f006c4366342bdc2ded8b248_187729_f9e5e16b2001b9950ad995b2c786abc9.webp"
width="760"
height="392"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;h1 id="introduction">&lt;strong>Introduction&lt;/strong>&lt;/h1>
&lt;p>Hello everyone!&lt;br>
I am Debangi Ghosh from India, an undergraduate student at the Indian Institute of Technology (IIT) BHU, Varanasi. As part of the &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/project/osre25/northwestern/statwrap/">StatWrap: Cross-Project Searching and Classification using Local Indexing&lt;/a> project, my &lt;a href="https://drive.google.com/file/d/1dxyBP2oMJwYDCKyIWzr465zNmm6UWtnI/view?usp=sharing" target="_blank" rel="noopener">proposal&lt;/a>, developed under the mentorship of &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/luke-rasmussen/">Luke Rasmussen&lt;/a>, focuses on developing a full-text search service within the StatWrap user interface. This involves evaluating different search libraries and implementing a classification system to distinguish between active and past projects.&lt;/p>
&lt;h1 id="about-the-project">&lt;strong>About the Project&lt;/strong>&lt;/h1>
&lt;p>As part of the project, I am working on enhancing the usability of StatWrap by enabling efficient cross-project search capabilities. The goal is to make it easier for researchers to discover relevant projects, notes, and assets across both current and archived work, using information that is either user-entered or passively collected by StatWrap.&lt;/p>
&lt;p>Given the sensitivity of the data involved, one of the key requirements is that all indexing and search operations must be performed locally. To address this, my responsibilities include:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Evaluating open-source search libraries&lt;/strong> suitable for local indexing and retrieval&lt;/li>
&lt;li>&lt;strong>Building the full-text search functionality&lt;/strong> directly into the StatWrap UI to allow seamless querying across projects&lt;/li>
&lt;li>&lt;strong>Ensuring reliability&lt;/strong> through the development of unit tests and comprehensive system testing&lt;/li>
&lt;li>&lt;strong>Implementing a classification system&lt;/strong> to label projects as “Active,” “Pinned,” or “Past” within the user interface&lt;/li>
&lt;/ul>
&lt;p>This project offers a great opportunity to work at the intersection of software development, information retrieval, and user-centric design—while contributing to research reproducibility and collaboration within scientific workflows.&lt;/p>
&lt;h1 id="deliverables">&lt;strong>Deliverables&lt;/strong>&lt;/h1>
&lt;p>The project has reached the end of its scope after 12 weeks of work. Here&amp;rsquo;s a breakdown:&lt;/p>
&lt;h2 id="1-descriptive-comparison-of-open-source-libraries">&lt;strong>1. Descriptive Comparison of Open-Source Libraries&lt;/strong>&lt;/h2>
&lt;p>We compared various open-source search libraries based on evaluation criteria such as &lt;strong>indexing speed, search speed, memory usage, typo tolerance, fuzzy searching, partial matching, full-text queries, contextual search, Boolean support, exact word match, installation ease, maintenance, documentation&lt;/strong>, and &lt;strong>developer experience&lt;/strong>. We decided on the weights to assign to each feature and identified the best library to use. According to the weights we assigned,
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Evaluation" srcset="
/report/osre25/northwestern/statwrap/20250823-debangi29/image1_hu63c79919752d2305350a1cb96819590d_110608_4b5e863d88146124b333878508147eff.webp 400w,
/report/osre25/northwestern/statwrap/20250823-debangi29/image1_hu63c79919752d2305350a1cb96819590d_110608_c2220a56c480048842e8b750cc2ca56f.webp 760w,
/report/osre25/northwestern/statwrap/20250823-debangi29/image1_hu63c79919752d2305350a1cb96819590d_110608_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/northwestern/statwrap/20250823-debangi29/image1_hu63c79919752d2305350a1cb96819590d_110608_4b5e863d88146124b333878508147eff.webp"
width="760"
height="603"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>These results were obtained after tuning the hyperparameters to give the best set of results.
For large datasets, FlexSearch has the lowest memory usage, followed by MiniSearch. Because the examples we used were limited, MiniSearch showed the better memory usage results.
Along with this research and evaluation, I also reviewed the Performance Benchmark of Full-Text-Search Libraries (Stress Test), available &lt;a href="https://nextapps-de.github.io/flexsearch/" target="_blank" rel="noopener">here&lt;/a>&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Stress Test" srcset="
/report/osre25/northwestern/statwrap/20250823-debangi29/image2_hu9b739b80416dccda0a7e0361ba4f7e36_163727_407cb964e7e05c64834433b6a84182ff.webp 400w,
/report/osre25/northwestern/statwrap/20250823-debangi29/image2_hu9b739b80416dccda0a7e0361ba4f7e36_163727_167223f62fbaf30991601d7745fad9f5.webp 760w,
/report/osre25/northwestern/statwrap/20250823-debangi29/image2_hu9b739b80416dccda0a7e0361ba4f7e36_163727_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/northwestern/statwrap/20250823-debangi29/image2_hu9b739b80416dccda0a7e0361ba4f7e36_163727_407cb964e7e05c64834433b6a84182ff.webp"
width="760"
height="384"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>The benchmark is measured in operations per second; higher values are better (except for the &amp;ldquo;Memory&amp;rdquo; test). The memory value refers to the amount of memory additionally allocated during search.&lt;/p>
&lt;p>FlexSearch performs queries up to 1,000,000 times faster than other libraries while also providing powerful search capabilities such as multi-field (document) search, phonetic transformations, partial matching, tag search, result highlighting, and suggestions.
Larger workloads scale through workers, which perform updates and queries on the index in parallel across dedicated, balanced threads.&lt;/p>
&lt;h2 id="2-the-search-user-interface">&lt;strong>2. The Search User Interface&lt;/strong>&lt;/h2>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="ui" srcset="
/report/osre25/northwestern/statwrap/20250823-debangi29/image3_hu2c7c529fbdaba5c9b4f85e802acf251e_292973_5c88d9d2587c54c50da97d6c489519dc.webp 400w,
/report/osre25/northwestern/statwrap/20250823-debangi29/image3_hu2c7c529fbdaba5c9b4f85e802acf251e_292973_82065ca30e98bced61362bca45765215.webp 760w,
/report/osre25/northwestern/statwrap/20250823-debangi29/image3_hu2c7c529fbdaba5c9b4f85e802acf251e_292973_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/northwestern/statwrap/20250823-debangi29/image3_hu2c7c529fbdaba5c9b4f85e802acf251e_292973_5c88d9d2587c54c50da97d6c489519dc.webp"
width="760"
height="428"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="ui2" srcset="
/report/osre25/northwestern/statwrap/20250823-debangi29/image5_hu55f5482f96b2f6db562c5a51f9b5f629_220424_7a3499ad0fc3cd06919fcdd17194742a.webp 400w,
/report/osre25/northwestern/statwrap/20250823-debangi29/image5_hu55f5482f96b2f6db562c5a51f9b5f629_220424_5840b85d48a6e608855c8e0d96b4fe49.webp 760w,
/report/osre25/northwestern/statwrap/20250823-debangi29/image5_hu55f5482f96b2f6db562c5a51f9b5f629_220424_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/northwestern/statwrap/20250823-debangi29/image5_hu55f5482f96b2f6db562c5a51f9b5f629_220424_7a3499ad0fc3cd06919fcdd17194742a.webp"
width="760"
height="652"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;h2 id="3-complete-search-execution-pipeline">&lt;strong>3. Complete Search Execution Pipeline&lt;/strong>&lt;/h2>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="ui2" srcset="
/report/osre25/northwestern/statwrap/20250823-debangi29/Flowchart__hu0123533bb7a682ac6b28d9b34fa57bc0_349775_bd4ac2fa5efb17e2b237cf8d78278398.webp 400w,
/report/osre25/northwestern/statwrap/20250823-debangi29/Flowchart__hu0123533bb7a682ac6b28d9b34fa57bc0_349775_a0e8f31fdbdc656a2886def3dca3410b.webp 760w,
/report/osre25/northwestern/statwrap/20250823-debangi29/Flowchart__hu0123533bb7a682ac6b28d9b34fa57bc0_349775_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/northwestern/statwrap/20250823-debangi29/Flowchart__hu0123533bb7a682ac6b28d9b34fa57bc0_349775_bd4ac2fa5efb17e2b237cf8d78278398.webp"
width="513"
height="760"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;h2 id="4-flexsearch-features">&lt;strong>4. FlexSearch Features&lt;/strong>&lt;/h2>
&lt;h4 id="1-persistent-indexing-with-automatic-loading">1. &lt;strong>Persistent Indexing with Automatic Loading&lt;/strong>&lt;/h4>
&lt;ul>
&lt;li>&lt;strong>Index persistence&lt;/strong>: Search index automatically saves to disk and loads on startup&lt;/li>
&lt;li>&lt;strong>Fast restoration&lt;/strong>: Rebuilds FlexSearch indices from saved document store without re-scanning files&lt;/li>
&lt;li>&lt;strong>Incremental updates&lt;/strong>: Detects project changes and updates only modified content&lt;/li>
&lt;li>&lt;strong>Background processing&lt;/strong>: Index updates happen asynchronously without blocking the User Interface.&lt;/li>
&lt;/ul>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="indexing" srcset="
/report/osre25/northwestern/statwrap/20250823-debangi29/image4_hu4893772edaa569a0d2e6454373f66573_78656_23074ee37edbb0f6abbd289ef211f756.webp 400w,
/report/osre25/northwestern/statwrap/20250823-debangi29/image4_hu4893772edaa569a0d2e6454373f66573_78656_993d6a1363d2cddf66632c4102acb8f5.webp 760w,
/report/osre25/northwestern/statwrap/20250823-debangi29/image4_hu4893772edaa569a0d2e6454373f66573_78656_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/northwestern/statwrap/20250823-debangi29/image4_hu4893772edaa569a0d2e6454373f66573_78656_23074ee37edbb0f6abbd289ef211f756.webp"
width="494"
height="760"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;h4 id="2-multi-document-type-support">2. &lt;strong>Multi-Document Type Support&lt;/strong>&lt;/h4>
&lt;ul>
&lt;li>&lt;strong>Unified search&lt;/strong>: Single search interface for projects, files, people, notes, and assets&lt;/li>
&lt;li>&lt;strong>Type-specific indices&lt;/strong>: Separate FlexSearch indices optimized for each document type&lt;/li>
&lt;li>&lt;strong>Cross-reference capabilities&lt;/strong>: Documents can reference and link to each other&lt;/li>
&lt;li>&lt;strong>Flexible schema&lt;/strong>: Each document type has tailored fields for optimal search performance&lt;/li>
&lt;/ul>
&lt;h4 id="3-intelligent-file-content-indexing">3. &lt;strong>Intelligent File Content Indexing&lt;/strong>&lt;/h4>
&lt;ul>
&lt;li>&lt;strong>Configurable file size limits&lt;/strong>: Admin-controlled maximum file size for content indexing&lt;/li>
&lt;li>&lt;strong>Smart file detection&lt;/strong>: Automatically identifies text files by extension and filename patterns&lt;/li>
&lt;li>&lt;strong>Content extraction&lt;/strong>: Full-text indexing with snippet generation for search results&lt;/li>
&lt;li>&lt;strong>Performance optimization&lt;/strong>: Skips binary files and respects size constraints to maintain speed&lt;/li>
&lt;/ul>
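&lt;p>As a rough illustration of the size limit and text-file detection described above (the extension list, filename list, and default limit below are assumptions, not StatWrap&amp;rsquo;s actual configuration):&lt;/p>

```python
# Illustrative sketch only; the extension list, filename list, and default
# size limit are assumptions, not StatWrap's actual configuration.

TEXT_EXTENSIONS = {".txt", ".md", ".py", ".r", ".csv", ".json"}
TEXT_FILENAMES = {"readme", "license", "makefile"}

def should_index(name, size_bytes, max_bytes=1_000_000):
    """Decide whether a file's content should be indexed."""
    if size_bytes > max_bytes:        # respect the configured size limit
        return False
    lower = name.lower()
    if any(lower.endswith(ext) for ext in TEXT_EXTENSIONS):
        return True
    return lower in TEXT_FILENAMES    # extensionless text files like README
```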
&lt;h4 id="4-advanced-query-processing">4. &lt;strong>Advanced Query Processing&lt;/strong>&lt;/h4>
&lt;ul>
&lt;li>&lt;strong>Multi-strategy search&lt;/strong>: Combines exact matches, fuzzy search, partial matches, and contextual search&lt;/li>
&lt;li>&lt;strong>Query preprocessing&lt;/strong>: Removes stop words and applies linguistic filters&lt;/li>
&lt;li>&lt;strong>Relevance scoring&lt;/strong>: Custom scoring algorithm considering multiple factors:
&lt;ul>
&lt;li>Exact phrase matches (highest weight)&lt;/li>
&lt;li>Individual word matches&lt;/li>
&lt;li>Term frequency with logarithmic capping&lt;/li>
&lt;li>Position-based scoring (earlier matches rank higher)&lt;/li>
&lt;li>Proximity bonuses for terms appearing near each other&lt;/li>
&lt;li>Completeness penalties for missing query terms&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
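&lt;p>As a toy illustration of how these scoring factors can combine (the weights and formula below are invented for illustration, not the actual StatWrap scoring code):&lt;/p>

```python
import math

# Invented weights and formula, purely to illustrate the listed factors.

def score(query, text):
    q_terms = query.lower().split()
    words = text.lower().split()
    s = 0.0
    # Exact phrase match carries the highest weight.
    if query.lower() in text.lower():
        s += 10.0
    positions = {}
    for t in q_terms:
        idx = [i for i, w in enumerate(words) if w == t]
        if idx:
            # Term frequency with logarithmic capping.
            s += 1.0 + math.log(len(idx))
            # Earlier matches rank higher.
            s += 1.0 / (1 + idx[0])
            positions[t] = idx[0]
    # Proximity bonus when query terms appear near each other.
    if len(positions) >= 2:
        span = max(positions.values()) - min(positions.values())
        s += 2.0 / (1 + span)
    # Completeness penalty for missing query terms.
    s -= 0.5 * (len(q_terms) - len(positions))
    return s
```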
&lt;h4 id="5-real-time-search-suggestions">5. &lt;strong>Real-Time Search Suggestions&lt;/strong>&lt;/h4>
&lt;ul>
&lt;li>&lt;strong>Autocomplete support&lt;/strong>: Dynamic suggestions based on indexed document titles&lt;/li>
&lt;li>&lt;strong>Search history&lt;/strong>: Maintains recent searches for quick re-execution&lt;/li>
&lt;li>&lt;strong>Debounced input&lt;/strong>: Prevents excessive API calls during typing&lt;/li>
&lt;li>&lt;strong>Contextual suggestions&lt;/strong>: Suggestions adapt based on current filters and context&lt;/li>
&lt;/ul>
&lt;h4 id="6-comprehensive-filtering-system">6. &lt;strong>Comprehensive Filtering System&lt;/strong>&lt;/h4>
&lt;ul>
&lt;li>&lt;strong>Type filtering&lt;/strong>: Filter by document type (projects, files, people, etc.)&lt;/li>
&lt;li>&lt;strong>Project scoping&lt;/strong>: Limit searches to specific projects&lt;/li>
&lt;li>&lt;strong>File type filtering&lt;/strong>: Filter files by extension&lt;/li>
&lt;li>&lt;strong>Advanced search panel&lt;/strong>: Collapsible interface for power users&lt;/li>
&lt;li>&lt;strong>Filter persistence&lt;/strong>: Maintains filter state across searches&lt;/li>
&lt;/ul>
&lt;h4 id="7-performance-monitoring--analytics">7. &lt;strong>Performance Monitoring &amp;amp; Analytics&lt;/strong>&lt;/h4>
&lt;ul>
&lt;li>&lt;strong>Real-time metrics&lt;/strong>: Track search times, cache hit rates, and index statistics&lt;/li>
&lt;li>&lt;strong>Performance dashboard&lt;/strong>: Visual indicators for system health&lt;/li>
&lt;li>&lt;strong>Cache management&lt;/strong>: LRU cache with configurable size and TTL&lt;/li>
&lt;li>&lt;strong>Search analytics&lt;/strong>: Historical data on search patterns and performance&lt;/li>
&lt;/ul>
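&lt;p>The cache described above can be sketched roughly as follows; the class and its defaults are illustrative, not StatWrap&amp;rsquo;s implementation:&lt;/p>

```python
import time
from collections import OrderedDict

# Illustrative LRU cache with configurable size and TTL, sketching the
# caching approach described above (not StatWrap's actual code).

class LRUCache:
    def __init__(self, max_size=100, ttl_seconds=60.0):
        self.max_size = max_size
        self.ttl = ttl_seconds
        self._store = OrderedDict()  # key -> (inserted_at, value)

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        entry = self._store.get(key)
        if entry is None:
            return None
        inserted_at, value = entry
        if now - inserted_at > self.ttl:      # expired: evict and miss
            del self._store[key]
            return None
        self._store.move_to_end(key)          # mark as most recently used
        return value

    def put(self, key, value, now=None):
        now = time.monotonic() if now is None else now
        self._store[key] = (now, value)
        self._store.move_to_end(key)
        while len(self._store) > self.max_size:
            self._store.popitem(last=False)   # evict least recently used
```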
&lt;h4 id="8-index-management-tools">8. &lt;strong>Index Management Tools&lt;/strong>&lt;/h4>
&lt;ul>
&lt;li>&lt;strong>Export/Import functionality&lt;/strong>: Backup and restore search indices&lt;/li>
&lt;li>&lt;strong>Full reindexing&lt;/strong>: Complete index rebuild with progress tracking&lt;/li>
&lt;li>&lt;strong>Index deletion&lt;/strong>: Clean slate functionality for troubleshooting&lt;/li>
&lt;li>&lt;strong>File size adjustment&lt;/strong>: Modify indexing constraints and rebuild affected content&lt;/li>
&lt;li>&lt;strong>Index statistics&lt;/strong>: Detailed breakdown of indexed content by type and project&lt;/li>
&lt;/ul>
&lt;h4 id="9-robust-error-handling--resilience">9. &lt;strong>Robust Error Handling &amp;amp; Resilience&lt;/strong>&lt;/h4>
&lt;ul>
&lt;li>&lt;strong>Graceful degradation&lt;/strong>: System continues operating even with partial index corruption&lt;/li>
&lt;li>&lt;strong>File system error handling&lt;/strong>: Handles missing files, permission issues, and path changes&lt;/li>
&lt;li>&lt;strong>Memory management&lt;/strong>: Prevents memory leaks during large indexing operations&lt;/li>
&lt;li>&lt;strong>Recovery mechanisms&lt;/strong>: Automatic fallback to basic search if advanced features fail&lt;/li>
&lt;/ul>
&lt;h4 id="10-user-experience-enhancements">10. &lt;strong>User Experience Enhancements&lt;/strong>&lt;/h4>
&lt;ul>
&lt;li>&lt;strong>Keyboard shortcuts&lt;/strong>: Ctrl+K to focus search, Escape to clear&lt;/li>
&lt;li>&lt;strong>Result highlighting&lt;/strong>: Visual emphasis on matching terms in results&lt;/li>
&lt;li>&lt;strong>Expandable results&lt;/strong>: Drill down into detailed information for each result&lt;/li>
&lt;li>&lt;strong>Loading states&lt;/strong>: Clear feedback during indexing and search operations&lt;/li>
&lt;li>&lt;strong>Responsive tabs&lt;/strong>: Organized results by type with badge counts&lt;/li>
&lt;/ul>
&lt;h2 id="5-classification-of-active-and-past-projects">&lt;strong>5. Classification of Active and Past Projects&lt;/strong>&lt;/h2>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Active Pinned" srcset="
/report/osre25/northwestern/statwrap/20250823-debangi29/image6_huacf20425d6903f6cfe6149bc5cb1772d_171494_1d3344ebb95180438d54893a9b5683e4.webp 400w,
/report/osre25/northwestern/statwrap/20250823-debangi29/image6_huacf20425d6903f6cfe6149bc5cb1772d_171494_a0f8ee7f62445c2f5f806022268d0821.webp 760w,
/report/osre25/northwestern/statwrap/20250823-debangi29/image6_huacf20425d6903f6cfe6149bc5cb1772d_171494_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/northwestern/statwrap/20250823-debangi29/image6_huacf20425d6903f6cfe6149bc5cb1772d_171494_1d3344ebb95180438d54893a9b5683e4.webp"
width="733"
height="760"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Past" srcset="
/report/osre25/northwestern/statwrap/20250823-debangi29/image7_hu7cccff315a5d098cd440d7277689d606_85529_76660a0dce9ac0ba1fa91c959db2773c.webp 400w,
/report/osre25/northwestern/statwrap/20250823-debangi29/image7_hu7cccff315a5d098cd440d7277689d606_85529_cc2abd1a6a3019f703ca3e656e55f920.webp 760w,
/report/osre25/northwestern/statwrap/20250823-debangi29/image7_hu7cccff315a5d098cd440d7277689d606_85529_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/northwestern/statwrap/20250823-debangi29/image7_hu7cccff315a5d098cd440d7277689d606_85529_76660a0dce9ac0ba1fa91c959db2773c.webp"
width="740"
height="542"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>A classification system was added to the user interface, similar to the &lt;strong>&amp;ldquo;Add to Favorites&amp;rdquo;&lt;/strong> option. A newly added project moves to the &lt;strong>&amp;ldquo;Active&amp;rdquo;&lt;/strong> section by default, unless explicitly marked as &lt;strong>&amp;ldquo;Past&amp;rdquo;&lt;/strong>. Similarly, when a project is unpinned from Favorites, it returns to the &amp;ldquo;Active&amp;rdquo; section.&lt;/p>
&lt;h1 id="conclusion-and-future-scope">&lt;strong>Conclusion and future Scope&lt;/strong>&lt;/h1>
&lt;p>Building a comprehensive search system requires careful attention to performance, user experience, and maintainability. FlexSearch provided the foundation, but the real value came from thoughtful implementation of persistent indexing, advanced scoring, and robust error handling. The result is a search system that feels instant to users while handling complex queries across diverse document types.&lt;/p>
&lt;p>The key to success was treating search not as a single feature, but as a complete subsystem with its own data management, performance monitoring, and user interface considerations. By investing in these supporting systems, the search functionality became a central, reliable part of the application that users can depend on.&lt;/p>
&lt;p>The future scope would include:&lt;/p>
&lt;ol>
&lt;li>Using a database (for example, SQLite) instead of JSON, which would provide more efficient query performance and atomic (CRUD) operations.&lt;/li>
&lt;li>Integrating any suggestions from my mentors, as well as improvements we feel are necessary.&lt;/li>
&lt;li>Developing unit tests for further functionalities and improvements.&lt;/li>
&lt;/ol>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Thank You!" srcset="
/report/osre25/northwestern/statwrap/20250823-debangi29/image_hu81a7405087771991938f164c6a45c6d2_109315_f70985a589ad6b79f8c95b36c5279852.webp 400w,
/report/osre25/northwestern/statwrap/20250823-debangi29/image_hu81a7405087771991938f164c6a45c6d2_109315_b28b9dbb6c70c33ca845fda461a64fcf.webp 760w,
/report/osre25/northwestern/statwrap/20250823-debangi29/image_hu81a7405087771991938f164c6a45c6d2_109315_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/northwestern/statwrap/20250823-debangi29/image_hu81a7405087771991938f164c6a45c6d2_109315_f70985a589ad6b79f8c95b36c5279852.webp"
width="760"
height="235"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p></description></item><item><title>[Final]Reproducibility of Interactive Notebooks in Distributed Environments</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/depaul/notebook-rep/08202025-rahmad/</link><pubDate>Wed, 20 Aug 2025 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/depaul/notebook-rep/08202025-rahmad/</guid><description>&lt;p>I am sharing a overview of my project &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/ucsc/06122025-rahmad">Reproducibility of Interactive Notebooks in Distributed Environments&lt;/a> and the work that I did this summer.&lt;/p>
&lt;h1 id="project-overview">Project Overview&lt;/h1>
&lt;p>This project aims to improve the reproducibility of interactive notebooks executed in a distributed environment. Notebooks such as those in the &lt;a href="https://jupyter.org/" target="_blank" rel="noopener">Jupyter&lt;/a> environment have become increasingly popular and are widely used in the scientific community due to their ease of use and portability. Reproducing these notebooks is a challenging task, especially in a distributed cluster environment.&lt;/p>
&lt;p>In the distributed environments we consider, the notebook code is divided into manager and worker code. The manager code is the main entry point of the program, which divides the task at hand into one or more worker codes that run in a parallel, distributed fashion. We utilize several open source tools to package and containerize the application code so that it can be reproduced across different machines and environments. They include &lt;a href="https://github.com/radiant-systems-lab/sciunit" target="_blank" rel="noopener">Sciunit&lt;/a>, &lt;a href="https://github.com/radiant-systems-lab/Flinc" target="_blank" rel="noopener">FLINC&lt;/a>, and &lt;a href="https://cctools.readthedocs.io/en/stable/taskvine/" target="_blank" rel="noopener">TaskVine&lt;/a>. These are the high-level goals of this project:&lt;/p>
&lt;ol>
&lt;li>Generate execution logs for a notebook program.&lt;/li>
&lt;li>Generate code and data dependencies for notebook programs in an automated manner.&lt;/li>
&lt;li>Utilize the generated dependencies at various granularities to automate the deployment and execution of notebooks in a parallel and distributed environment.&lt;/li>
&lt;li>Audit and package the notebook code running in a distributed environment.&lt;/li>
&lt;li>Overall, support efficient reproducibility of programs in a notebook program.&lt;/li>
&lt;/ol>
&lt;h1 id="progress-highlights">Progress Highlights&lt;/h1>
&lt;p>Here are the details of the work that I did during this summer.&lt;/p>
&lt;h2 id="generation-of-execution-logs">Generation of Execution Logs&lt;/h2>
&lt;p>We generate execution logs for the notebook programs in a distributed environment using the Linux utility &lt;a href="https://man7.org/linux/man-pages/man1/strace.1.html" target="_blank" rel="noopener">strace&lt;/a>, which records every system call made by the notebook, including all files accessed during its execution. We collect separate logs for the manager and worker code since they are executed on different machines and have different dependencies. By recording the entire notebook execution, we capture all libraries, packages, and data files referenced during execution in the form of execution logs. These logs are then utilized for further analyses.&lt;/p>
&lt;h2 id="extracting-software-dependencies">Extracting Software Dependencies&lt;/h2>
&lt;p>When a library such as a Python package like &lt;em>Numpy&lt;/em> is used by the notebook program, an entry is made in the execution log containing the complete path of the accessed library file(s) along with additional information. We analyze the execution logs for both the manager and the workers to find and list all dependencies. So far, we are limited to Python packages, though this methodology is general and can be used to find dependencies for any programming language. For Python packages, version numbers are also obtained by querying package managers like &lt;em>pip&lt;/em> or &lt;em>Conda&lt;/em> on the local system.&lt;/p>
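&lt;p>As a rough sketch, package names can be recovered from such logs by matching &lt;code>openat&lt;/code> entries whose paths pass through &lt;code>site-packages&lt;/code>; the log lines shown and the parsing details are illustrative, not the project&amp;rsquo;s exact implementation:&lt;/p>

```python
import re

# Sketch of extracting Python package names from strace output, assuming
# log lines of the form produced by strace's openat tracing; the exact
# parsing in the project may differ.

_OPEN_RE = re.compile(r'open(?:at)?\(.*?"([^"]+)"')
_PKG_RE = re.compile(r"site-packages/([A-Za-z0-9_]+)")

def packages_from_log(lines):
    pkgs = set()
    for line in lines:
        m = _OPEN_RE.search(line)
        if not m:
            continue
        p = _PKG_RE.search(m.group(1))
        if p:
            pkgs.add(p.group(1))
    return sorted(pkgs)
```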
&lt;h2 id="extracting-data-dependencies">Extracting Data Dependencies&lt;/h2>
&lt;p>We utilize similar execution logs to identify which data files were used by the notebook program. The list of logged files also contains various configuration or settings files used by certain packages and libraries. These files are removed from the list of data dependencies through post-processing that analyzes file paths.&lt;/p>
&lt;h2 id="testing-the-pipeline">Testing the Pipeline&lt;/h2>
&lt;p>We have conducted our experiments on three use cases obtained from different domains using between 5 and 10 workers. They include distributed image convolution, climate trend analysis, and high energy physics experiment analysis. The results so far are promising with good accuracy and with a slight running time overhead.&lt;/p>
&lt;h2 id="processing-at-cell-level">Processing at Cell-level&lt;/h2>
&lt;p>We perform the same steps of log generation and data and software dependency extraction at the level of individual cells in a notebook instead of once for the whole notebook. As a result, we generate software and data dependencies at the level of individual notebook cells. This is achieved by interrupting control flow before and after execution of each cell to write special instructions to the execution log for marking boundaries of cell execution. We then analyze the intervals between these instructions to identify which files and Python packages are accessed by each specific cell. We use this information to generate the list of software dependencies used by that cell only.&lt;/p>
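&lt;p>One simple way to realize such boundary markers, sketched below with an invented sentinel-path convention (the real instrumentation writes its own special instructions to the log), is to open a recognizable file before and after each cell so that strace records it, and then group the logged accesses between a cell&amp;rsquo;s begin and end markers:&lt;/p>

```python
# Invented sentinel-path convention for cell boundary markers.

def mark(cell_id, kind):
    # Opening this path leaves an openat entry in the strace log.
    sentinel = "/tmp/.cell-" + kind + "-" + str(cell_id)
    try:
        open(sentinel, "w").close()
    except OSError:
        pass

def attribute_to_cells(paths):
    """Group logged file paths by the cell markers that surround them."""
    cells, current = {}, None
    for path in paths:
        if path.startswith("/tmp/.cell-begin-"):
            current = path.rsplit("-", 1)[-1]
            cells[current] = []
        elif path.startswith("/tmp/.cell-end-"):
            current = None
        elif current is not None:
            cells[current].append(path)
    return cells
```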
&lt;p>We also capture data dependencies at the cell level by analyzing the execution logs generated by overriding the &lt;em>open&lt;/em> function call used to access various files.&lt;/p>
&lt;h2 id="distributed-notebook-auditing">Distributed Notebook Auditing&lt;/h2>
&lt;p>In order to execute and audit workloads in parallel, we use &lt;a href="https://github.com/radiant-systems-lab/parallel-sciunit" target="_blank" rel="noopener">Sciunit Parallel&lt;/a>, which builds on GNU Parallel for efficient parallel execution of tasks. The user specifies the number of tasks or machines, and the workload is then distributed across them. Once execution completes, the containerized executions are gathered back at the host location.&lt;/p>
&lt;h2 id="efficient-reproducibility-with-checkpointing">Efficient Reproducibility with Checkpointing&lt;/h2>
&lt;p>An important challenge with Jupyter notebooks is that re-executing them is often unnecessarily time-consuming and resource-intensive, especially when most cells remain unchanged. We worked on &lt;a href="https://github.com/talha129/NBRewind/tree/master" target="_blank" rel="noopener">NBRewind&lt;/a>, a lightweight tool that accelerates notebook re-execution by avoiding redundant computation. It integrates checkpointing, application virtualization, and content-based deduplication, and supports two kinds of checkpoints: incremental and full-state. Incremental checkpoints store notebook states and dependencies once and subsequently record only their deltas; full-state checkpoints store the complete state after each cell. During restore, NBRewind recovers the outputs of unchanged cells, enabling efficient re-execution. Our empirical evaluation demonstrates that NBRewind can significantly reduce both notebook audit and repeat times with incremental checkpoints.&lt;/p>
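&lt;p>The content-based deduplication idea behind skipping unchanged cells can be sketched as a content-addressed cache (a minimal illustration, not NBRewind&amp;rsquo;s actual implementation):&lt;/p>

```python
import hashlib

checkpoints = {}  # cell content hash -> cached output (content-addressed)

def run_or_restore(cell_source, execute):
    """Skip re-execution when a cell's content hash is unchanged:
    a minimal sketch of checkpoint-based notebook re-execution."""
    key = hashlib.sha256(cell_source.encode()).hexdigest()
    if key not in checkpoints:
        checkpoints[key] = execute(cell_source)
    return checkpoints[key]

calls = []
def execute(src):
    calls.append(src)
    return eval(src)

print(run_or_restore("2 + 3", execute))  # 5 (executed)
print(run_or_restore("2 + 3", execute))  # 5 (restored, not re-executed)
print(len(calls))  # 1
```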
&lt;p>I am very happy about the experience I have had in this project, and I would encourage other students to join this program in the future.&lt;/p></description></item><item><title>Mid-Term Update: MPI Appliance for HPC Research on Chameleon</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/uchicago/mpi/20250803-rohan-babbar/</link><pubDate>Sun, 03 Aug 2025 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/uchicago/mpi/20250803-rohan-babbar/</guid><description>&lt;p>Hi everyone! This is my mid-term blog update for the project &lt;a href="https://ucsc-ospo.github.io/project/osre25/uchicago/mpi/" target="_blank" rel="noopener">MPI Appliance for HPC Research on Chameleon&lt;/a>, developed in collaboration with Argonne National Laboratory and the Chameleon Cloud community.
This blog follows up on my earlier post, which you can find &lt;a href="https://ucsc-ospo.github.io/report/osre25/uchicago/mpi/20250614-rohan-babbar/" target="_blank" rel="noopener">here&lt;/a>.&lt;/p>
&lt;h3 id="-june-15--june-29-2025">🔧 June 15 – June 29, 2025&lt;/h3>
&lt;p>Worked on creating and configuring images on Chameleon Cloud for the following three sites:
CHI@UC, CHI@TACC, and KVM@TACC.&lt;/p>
&lt;p>Key features of the images:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Spack&lt;/strong>: Pre-installed and configured for easy package management of HPC software.&lt;/li>
&lt;li>&lt;strong>Lua Modules (LMod)&lt;/strong>: Installed and configured for environment module management.&lt;/li>
&lt;li>&lt;strong>MPI Support&lt;/strong>: Both MPICH and Open MPI are pre-installed, enabling users to run distributed applications out-of-the-box.&lt;/li>
&lt;/ul>
&lt;p>These images are now publicly available and can be seen directly on the Chameleon Appliance Catalog, titled &lt;a href="https://chameleoncloud.org/appliances/127/" target="_blank" rel="noopener">MPI and Spack for HPC (Ubuntu 22.04)&lt;/a>.&lt;/p>
&lt;p>I also worked on example Jupyter notebooks showing how to get started with these images.&lt;/p>
&lt;h3 id="-june-30--july-13-2025">🔧 June 30 – July 13, 2025&lt;/h3>
&lt;p>With the MPI Appliance now published on Chameleon Cloud, the next step was to automate the setup of an MPI-Spack cluster.&lt;/p>
&lt;p>To achieve this, I developed a set of Ansible playbooks that:&lt;/p>
&lt;ol>
&lt;li>Configure both master and worker nodes with site-specific settings&lt;/li>
&lt;li>Set up seamless access to Chameleon NFS shares&lt;/li>
&lt;li>Allow users to easily install Spack packages, compilers, and dependencies across all nodes&lt;/li>
&lt;/ol>
&lt;p>These playbooks aim to simplify the deployment of reproducible HPC environments and reduce the time required to get a working cluster up and running.&lt;/p>
&lt;h3 id="-july-14--july-28-2025">🔧 July 14 – July 28, 2025&lt;/h3>
&lt;p>I began this period by fixing some issues in python-chi, the official Python client for the Chameleon testbed.
We also discussed adding support for CUDA-based packages, which would make it easier to work with NVIDIA GPUs.
We successfully published a new image on Chameleon, titled &lt;a href="https://chameleoncloud.org/appliances/130/" target="_blank" rel="noopener">MPI and Spack for HPC (Ubuntu 22.04 - CUDA)&lt;/a>, and added an example to demonstrate its usage.&lt;/p>
&lt;p>We compiled the artifact containing the Jupyter notebooks and Ansible playbooks and published it on Chameleon Trovi.
Feel free to check it out &lt;a href="https://chameleoncloud.org/experiment/share/7424a8dc-0688-4383-9d67-1e40ff37de17" target="_blank" rel="noopener">here&lt;/a>. The documentation still needs some work.&lt;/p>
&lt;p>📌 That’s it for now! I’m currently working on the documentation, a ROCm-based image for AMD GPUs, and some container-based examples.
Stay tuned for more updates in the next blog.&lt;/p></description></item><item><title>Halfway Blog - WildBerryEye: Mechanical Design &amp; Weather-Resistant Enclosure</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/ucsc/wildberryeye/20250725-teolangan/</link><pubDate>Fri, 25 Jul 2025 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/ucsc/wildberryeye/20250725-teolangan/</guid><description>&lt;p>Hi everyone! My name is &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/content/authors/teolangan">Teodor Langan&lt;/a>, and I am an undergraduate studying Robotics Engineering at the University of California, Santa Cruz. I’m happy to share the progress I have made over the last six weeks on my GSoC 2025 project, developing the hardware for the &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/project/osre25/ucsc/wildberryeye/">WildBerryEye&lt;/a> project, mentored by &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/content/authors/caiespin">Carlos Isaac Espinosa&lt;/a>.&lt;/p>
&lt;h2 id="project-overview">Project Overview&lt;/h2>
&lt;p>The WildBerryEye project enables AI-powered ecological monitoring using Raspberry Pi cameras and computer vision models. However, achieving this requires a reliable enclosure that can support long-term deployment in the wild. The goal for my project is to address this need by designing a modular, 3D-printable camera casing that protects WildBerryEye’s electronics from outside factors such as rain, dust, and bugs, while remaining easy to print and assemble. To achieve this, my main responsibilities for this project include:&lt;/p>
&lt;ul>
&lt;li>Implementing a modular design and development-friendly features for ease of assembly and flexible use across hardware setups&lt;/li>
&lt;li>Prototyping and testing enclosures outdoors to assess durability, water resistance, and ventilation—then iterating based on results&lt;/li>
&lt;li>Developing clear documentation, assembly instructions, and designing with open-source tools&lt;/li>
&lt;li>Exploring material options and print techniques to improve outdoor lifespan and environmental resilience&lt;/li>
&lt;/ul>
&lt;p>Designed largely with FreeCAD and tested in real outdoor conditions, the open-source enclosure will ensure WildBerryEye hardware can be deployed in natural environments for continuous, low-maintenance data collection.&lt;/p>
&lt;h2 id="progress-so-far">Progress So Far&lt;/h2>
&lt;p>Over the past six weeks, I have made great progress on the design of the WildBerryEye camera enclosure. Some key accomplishments include:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Full 3D Assembly Model of Electronics:&lt;/strong> Modeled all core components used in the WildBerryEye system to serve as a reference for enclosure design. For parts without existing CAD models, accurate measurements were taken and custom models were created in FreeCAD.&lt;/li>
&lt;li>&lt;strong>Initial Enclosure Prototype:&lt;/strong> Designed and 3D-printed a first full prototype featuring a hinge-latch mechanism to allow tool-free easy access to internal electronics for development and maintenance.&lt;/li>
&lt;li>&lt;strong>Design Iteration Based on Testing:&lt;/strong> Based on the results of the first print, created an improved version with better electronics integration, port alignment, and more functionality.&lt;/li>
&lt;/ul>
&lt;h2 id="challenges--next-steps">Challenges &amp;amp; Next Steps&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Field-Ready Integration:&lt;/strong> Preparing for field testing with upcoming prototypes by making sure that all internal electronics are securely mounted and fully accessible within the enclosure.&lt;/li>
&lt;li>&lt;strong>Latch Mechanism Refinement:&lt;/strong> Finalizing a reliable hinge-latch design that can keep the enclosure sealed during outdoor use while remaining easy to open for maintenance.&lt;/li>
&lt;li>&lt;strong>Balancing Modularity, Size, and Weatherproofing:&lt;/strong> Maintaining a compact form factor without compromising on modularity or weather resistance—especially when routing cables and mounting components.&lt;/li>
&lt;li>&lt;strong>Material Experimentation:&lt;/strong> Beginning test prints with TPU, a flexible filament that may provide improved seals or gaskets for added protection.&lt;/li>
&lt;li>&lt;strong>Ventilation Without Exposure:&lt;/strong> Exploring airflow solutions such as labyrinth-style vents to enable heat dissipation without letting in moisture or debris.&lt;/li>
&lt;/ul>
&lt;h2 id="final-thoughts">Final Thoughts&lt;/h2>
&lt;p>These past six weeks have helped me immensely in growing my skills in mechanical design, CAD modeling, and field-focused prototyping. The WildBerryEye system can help researchers monitor pollinators and other wildlife in their natural habitats without requiring constant in-person observation or high-maintenance setups. By enabling long-term, autonomous data collection in outdoor environments, it opens new possibilities for low-cost, scalable ecological monitoring.&lt;/p>
&lt;p>I’m especially grateful to my mentor &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/content/authors/caiespin">Carlos Isaac Espinosa&lt;/a> and the WildBerryEye team for their ongoing support. Excited for the second half, where the design will face real-world testing and help bring this impactful system one step closer to field deployment!&lt;/p></description></item><item><title>Mid-term Blog: Building a Simulator for Benchmarking Replicated Systems</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/umass/edge-replication/20250725-mchan/</link><pubDate>Fri, 25 Jul 2025 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/umass/edge-replication/20250725-mchan/</guid><description>&lt;h2 id="introduction">Introduction&lt;/h2>
&lt;p>Hello there, I&amp;rsquo;m Michael. In this report, I&amp;rsquo;ll be sharing my progress as part of the &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/project/osre25/umass/edge-replication/">Open Testbed for Reproducible Evaluation of Replicated Systems at the Edges&lt;/a> project under the mentorship of &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/fadhil-kurnia/">Fadhil Kurnia&lt;/a>.&lt;/p>
&lt;h2 id="about-the-project">About the Project&lt;/h2>
&lt;p>The goal of the project is to build a &lt;em>language-agnostic&lt;/em> interface that enables communication between clients and any consensus protocol, such as MultiPaxos, Raft, ZooKeeper Atomic Broadcast (ZAB), and others. Currently, many of these protocols implement their own custom mechanisms for the client to communicate with the group of peers in the network. An implementation of MultiPaxos from the &lt;a href="https://arxiv.org/abs/2405.11183" target="_blank" rel="noopener">MultiPaxos Made Complete&lt;/a> paper, for example, uses a custom Protobuf definition for the packets clients send to the MultiPaxos system. With the support of a generalized interface, different consensus protocols can be tested under the same workload to compare their performance objectively.&lt;/p>
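&lt;p>One way to picture such a generalized interface is as an adapter layer that every protocol implements, so a single workload driver can exercise any of them. The sketch below is hypothetical (class and method names are not from the project):&lt;/p>

```python
from abc import ABC, abstractmethod

class ReplicaClient(ABC):
    """Hypothetical protocol-agnostic client interface: each consensus
    implementation (MultiPaxos, Raft, ZAB, ...) provides an adapter."""
    @abstractmethod
    def submit(self, command: bytes) -> bytes: ...

class InMemoryAdapter(ReplicaClient):
    """Toy stand-in for a real protocol backend."""
    def __init__(self):
        self.log = []
    def submit(self, command):
        self.log.append(command)  # a real adapter would replicate here
        return b"OK:" + command

def benchmark(client, workload):
    """The same workload can now drive any protocol for fair comparison."""
    return [client.submit(cmd) for cmd in workload]

replies = benchmark(InMemoryAdapter(), [b"set x=1", b"get x"])
print(replies)
```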
&lt;h2 id="progress">Progress&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Literature Study:&lt;/strong>
Reviewed papers and implementations of various protocols including GigaPaxos, Raft, Viewstamped Replication (VSR), and ZAB. Analysis focused on their log replication strategies, fault handling, and performance implications.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Development of Custom Protocol:&lt;/strong>
Two custom protocols are currently under development and will serve as initial test subjects for the testbed:&lt;/p>
&lt;ul>
&lt;li>A modified GigaPaxos protocol&lt;/li>
&lt;li>A Primary-Backup Replication protocol with strict log ordering similar to ZAB (logs are ordered based on the sequence proposed by the primary)&lt;/li>
&lt;/ul>
&lt;p>Most of my time has been spent working on the two protocols, particularly on snapshotting and state transfer functionality in the Primary-Backup protocol. Ideally, the testbed should be able to evaluate protocol performance in scenarios involving node failure or a new node being added. In these scenarios, different protocol implementations often vary in their decision of whether to take periodic snapshots or to roll forward whenever possible and generate a snapshot only when necessary.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h2 id="challenges">Challenges&lt;/h2>
&lt;p>Early in the project, the initial goal was to benchmark different consensus protocols using arbitrary full-stack web applications as their workload. Each protocol would replicate a full-stack application running inside Docker containers across multiple nodes, and the testbed would send requests for the nodes to coordinate on. In fact, the two custom protocols being developed are specifically made to fit these constraints.&lt;/p>
&lt;p>Developing a custom protocol that supports the replication of a Docker container is in itself already a difficult task. Abstracting away the functionality for communicating with the Docker containers, as well as handling log entries and snapshotting the state, is an order of magnitude more complicated.&lt;/p>
&lt;p>As mentioned in the &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/umass/edge-replication/20250613-mchan/">first blog&lt;/a>, applications can be categorized into two types: deterministic and non-deterministic. The coordination of these two types of applications is handled in very different ways. Most consensus protocols support only deterministic systems, such as key-value stores, and cannot easily handle the coordination of complex services or external side effects. Supporting non-deterministic applications would require abstracting over protocol-specific log structures, which effectively restricts the interface to protocols that conform to the abstraction, defeating the goal of making the interface broadly usable and protocol-agnostic.&lt;/p>
&lt;p>Furthermore, allowing &lt;strong>any&lt;/strong> existing protocol to run something as complex as a stateful Docker container, without the protocol itself even knowing, adds yet another layer of complexity to the system.&lt;/p>
&lt;h2 id="future-goals">Future Goals&lt;/h2>
&lt;p>Given these challenges, I decided to pivot to using only key-value stores as the benchmark application. This aligns with most existing protocol implementations, which typically use key-value stores. The main focus is now to implement an interface that accepts HTTP requests from clients and forwards them to any arbitrary protocol.&lt;/p></description></item><item><title>Midterm Blog: Open Testbed for Reproducible Evaluation of Replicated Systems at the Edges</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/umass/edge-replication/20250725-panjisri/</link><pubDate>Fri, 25 Jul 2025 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/umass/edge-replication/20250725-panjisri/</guid><description>&lt;p>Hello! I&amp;rsquo;m Panji Sri Kuncara Wisma and I want to share my midterm progress on the &amp;ldquo;Open Testbed for Reproducible Evaluation of Replicated Systems at the Edges&amp;rdquo; project under the mentorship of Fadhil I. Kurnia.&lt;/p>
&lt;h2 id="project-overview">Project Overview&lt;/h2>
&lt;p>The goal of our project is to create an open testbed that enables fair, reproducible evaluation of different consensus protocols (Paxos variants, EPaxos, Raft, etc.) when deployed at network edges. Currently, researchers struggle to compare these systems because they lack standardized evaluation environments and often rely on mock implementations of proprietary systems.&lt;/p>
&lt;p>XDN (eXtensible Distributed Network) is one of the important consensus systems we plan to evaluate in our benchmarking testbed. Built on GigaPaxos, it allows deployment of replicated stateful services across edge locations. As part of preparing our benchmarking framework, we need to ensure that the systems we evaluate, including XDN, are robust for fair comparison.&lt;/p>
&lt;h2 id="progress">Progress&lt;/h2>
&lt;p>As part of preparing our benchmarking tool, I have been working on refactoring XDN&amp;rsquo;s FUSE filesystem from C++ to Rust. This work is essential for creating a stable and reliable XDN platform.&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="System Architecture" srcset="
/report/osre25/umass/edge-replication/20250725-panjisri/fuselog_design_hu4e0250a1afb641f82d064bca3b5b892d_118470_5600401ae6570bf38b96fa89a080f4f7.webp 400w,
/report/osre25/umass/edge-replication/20250725-panjisri/fuselog_design_hu4e0250a1afb641f82d064bca3b5b892d_118470_6d3b555dbec3bdb305839eda9b227acf.webp 760w,
/report/osre25/umass/edge-replication/20250725-panjisri/fuselog_design_hu4e0250a1afb641f82d064bca3b5b892d_118470_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/umass/edge-replication/20250725-panjisri/fuselog_design_hu4e0250a1afb641f82d064bca3b5b892d_118470_5600401ae6570bf38b96fa89a080f4f7.webp"
width="760"
height="439"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>The diagram above illustrates how the FUSE filesystem integrates with XDN&amp;rsquo;s distributed architecture. On the left, we see the standard FUSE setup where applications interact with the filesystem through the kernel&amp;rsquo;s VFS layer. On the right, the distributed replication flow is shown: Node 1 runs &lt;code>fuselog_core&lt;/code> which captures filesystem operations and generates statediffs, while Nodes 2 and 3 run &lt;code>fuselog_apply&lt;/code> to receive and apply these statediffs, maintaining replica consistency across the distributed system.&lt;/p>
&lt;p>This FUSE component is critical to XDN&amp;rsquo;s operation, as it enables transparent state capture and replication across edge nodes. By refactoring this core component from C++ to Rust, we aim to strengthen the foundation for fair benchmarking comparisons in our testbed.&lt;/p>
&lt;h3 id="core-work-c-to-rust-fuse-filesystem-migration">Core Work: C++ to Rust FUSE Filesystem Migration&lt;/h3>
&lt;p>XDN relies on a FUSE (Filesystem in Userspace) component to capture filesystem operations and generate &amp;ldquo;statediffs&amp;rdquo; - records of changes that get replicated across edge nodes. The original C++ implementation worked but had memory safety concerns and limited optimization capabilities.&lt;/p>
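&lt;p>The statediff capture/apply cycle can be sketched in a few lines (Python is used purely for illustration; the actual component is written in Rust, and real statediffs carry far richer operation metadata):&lt;/p>

```python
# Sketch: the primary records filesystem mutations as a statediff,
# and replicas replay the statediff to stay consistent.
def capture(statediff, op, path, data=None):
    """Append one filesystem mutation to the pending statediff."""
    statediff.append((op, path, data))

def apply_statediff(replica_fs, statediff):
    """Replay a statediff against a replica's (dict-modeled) filesystem."""
    for op, path, data in statediff:
        if op == "write":
            replica_fs[path] = data
        elif op == "delete":
            replica_fs.pop(path, None)
    return replica_fs

diff = []
capture(diff, "write", "/db/wal", b"txn-1")
capture(diff, "write", "/db/data", b"row")
replica = apply_statediff({}, diff)
print(replica)  # {'/db/wal': b'txn-1', '/db/data': b'row'}
```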
&lt;p>I worked on refactoring from C++ to Rust, implementing several improvements:&lt;/p>
&lt;p>&lt;strong>New Features Added:&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Zstd Compression&lt;/strong>: Reduces statediff payload sizes&lt;/li>
&lt;li>&lt;strong>Adaptive Compression&lt;/strong>: Intelligently chooses compression strategies&lt;/li>
&lt;li>&lt;strong>Advanced Pruning&lt;/strong>: Removes redundant operations (duplicate chmod/chown, created-then-deleted files)&lt;/li>
&lt;li>&lt;strong>Bincode Serialization&lt;/strong>: Helps avoid manual serialization code and reduces the risk of related bugs&lt;/li>
&lt;li>&lt;strong>Extended Operations&lt;/strong>: Added support for additional filesystem operations (mkdir, symlink, hardlinks, etc.)&lt;/li>
&lt;/ul>
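&lt;p>The pruning feature above can be illustrated with a small sketch (Python for illustration only; operation tuples and rules here are simplified stand-ins for the Rust implementation):&lt;/p>

```python
def prune(ops):
    """Drop redundant operations from a statediff: superseded
    chmod/chown calls, and files created then deleted within the
    same statediff (so they never need to reach the replicas)."""
    deleted = {path for kind, path, *_ in ops if kind == "delete"}
    created = {path for kind, path, *_ in ops if kind == "create"}
    ephemeral = created.intersection(deleted)
    latest_meta = {}
    for i, (kind, path, *_) in enumerate(ops):
        if kind in ("chmod", "chown"):
            latest_meta[(kind, path)] = i  # keep only the last one
    out = []
    for i, op in enumerate(ops):
        kind, path = op[0], op[1]
        if path in ephemeral:
            continue
        if kind in ("chmod", "chown") and latest_meta[(kind, path)] != i:
            continue
        out.append(op)
    return out

ops = [("create", "/tmp/a"), ("chmod", "/tmp/b", 0o644),
       ("chmod", "/tmp/b", 0o600), ("delete", "/tmp/a")]
print(prune(ops))  # keeps only the final chmod on /tmp/b
```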
&lt;p>&lt;strong>Architectural Improvements:&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Memory Safety&lt;/strong>: Rust&amp;rsquo;s ownership system helps prevent common memory management issues&lt;/li>
&lt;li>&lt;strong>Type Safety&lt;/strong>: Using Rust enums instead of integer constants for better type checking&lt;/li>
&lt;/ul>
&lt;h2 id="findings">Findings&lt;/h2>
&lt;p>The optimizations performed as expected:&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Database Performance Comparison" srcset="
/report/osre25/umass/edge-replication/20250725-panjisri/performance_hudc10c2ffc95d775aedb0a1dad587d6fd_55711_cb1ea5caaa82d543dfeabd0c97f7c4fe.webp 400w,
/report/osre25/umass/edge-replication/20250725-panjisri/performance_hudc10c2ffc95d775aedb0a1dad587d6fd_55711_d65f44ef3f769dddda7f0211b94ad6b6.webp 760w,
/report/osre25/umass/edge-replication/20250725-panjisri/performance_hudc10c2ffc95d775aedb0a1dad587d6fd_55711_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/umass/edge-replication/20250725-panjisri/performance_hudc10c2ffc95d775aedb0a1dad587d6fd_55711_cb1ea5caaa82d543dfeabd0c97f7c4fe.webp"
width="760"
height="433"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>&lt;strong>Statediff Size Reductions:&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>&lt;strong>MySQL workload&lt;/strong>: 572MB → 29.6MB (95% reduction)&lt;/li>
&lt;li>&lt;strong>PostgreSQL workload&lt;/strong>: 76MB → 11.9MB (84% reduction)&lt;/li>
&lt;li>&lt;strong>SQLite workload&lt;/strong>: 4MB → 29KB (99% reduction)&lt;/li>
&lt;/ul>
&lt;p>The combination of write coalescing, pruning, and compression proves especially effective for database workloads, where many operations involve small changes to large files.&lt;/p>
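&lt;p>Write coalescing, in particular, explains why database workloads shrink so dramatically: repeated small writes to the same region of a large file collapse into one range. A minimal sketch of the idea (illustrative only; real coalescing also merges the written bytes):&lt;/p>

```python
def coalesce(writes):
    """Merge overlapping or adjacent (start, end) write ranges to the
    same file, so many small updates ship as one statediff entry."""
    merged = []
    for start, end in sorted(writes):
        if merged and merged[-1][1] >= start:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

# 1000 small adjacent writes collapse into a single range:
print(coalesce([(i, i + 4) for i in range(0, 4000, 4)]))  # [(0, 4000)]
```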
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Rust vs C&amp;#43;&amp;#43; Performance Comparison" srcset="
/report/osre25/umass/edge-replication/20250725-panjisri/latency_hu3b080735c91d058ad2f9cf67a54d5f14_21553_2adee964972897a04e60327dcfe9675e.webp 400w,
/report/osre25/umass/edge-replication/20250725-panjisri/latency_hu3b080735c91d058ad2f9cf67a54d5f14_21553_dd86a6fc0dabbac3beb17266f1f49002.webp 760w,
/report/osre25/umass/edge-replication/20250725-panjisri/latency_hu3b080735c91d058ad2f9cf67a54d5f14_21553_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/umass/edge-replication/20250725-panjisri/latency_hu3b080735c91d058ad2f9cf67a54d5f14_21553_2adee964972897a04e60327dcfe9675e.webp"
width="760"
height="470"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>&lt;strong>Performance Comparison:&lt;/strong>
Remarkably, the Rust implementation matches or exceeds C++ performance:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>POST operations&lt;/strong>: 30% faster (10.5ms vs 15ms)&lt;/li>
&lt;li>&lt;strong>DELETE operations&lt;/strong>: 33% faster (10ms vs 15ms)&lt;/li>
&lt;li>&lt;strong>Overall latency&lt;/strong>: Consistently better (9ms vs 11ms)&lt;/li>
&lt;/ul>
&lt;h2 id="current-challenges">Current Challenges&lt;/h2>
&lt;p>While the core implementation is complete and functional, I&amp;rsquo;m currently debugging occasional latency spikes that occur under specific workload patterns. These edge cases need to be resolved before moving on to the benchmarking phase, as inconsistent performance could compromise the reliability of the evaluation.&lt;/p>
&lt;h2 id="next-steps">Next Steps&lt;/h2>
&lt;p>With the FUSE filesystem foundation nearly complete, next steps include:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>Resolve latency spike issues&lt;/strong> and complete XDN stabilization&lt;/li>
&lt;li>&lt;strong>Build benchmarking framework&lt;/strong> - a comparison tool that can systematically evaluate different consensus protocols with standardized metrics.&lt;/li>
&lt;li>&lt;strong>Run systematic evaluation&lt;/strong> across protocols&lt;/li>
&lt;/ol>
&lt;p>The optimized filesystem will hopefully provide a stable base for reproducible performance comparisons between distributed consensus protocols.&lt;/p></description></item><item><title>Mid-term Blog: StatWrap: Cross-Project Searching and Classification using Local Indexing</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/northwestern/statwrap/20250715-debangi29/</link><pubDate>Tue, 15 Jul 2025 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/northwestern/statwrap/20250715-debangi29/</guid><description>&lt;h2 id="introduction">Introduction&lt;/h2>
&lt;p>Hello everyone!&lt;br>
I am Debangi Ghosh from India, an undergraduate student at the Indian Institute of Technology (IIT) BHU, Varanasi. As part of the &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/project/osre25/northwestern/statwrap/">StatWrap: Cross-Project Searching and Classification using Local Indexing&lt;/a> project, my &lt;a href="https://drive.google.com/file/d/1dxyBP2oMJwYDCKyIWzr465zNmm6UWtnI/view?usp=sharing" target="_blank" rel="noopener">proposal&lt;/a>, developed under the mentorship of &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/luke-rasmussen/">Luke Rasmussen&lt;/a>, focuses on building a full-text search service within the StatWrap user interface. This involves evaluating different search libraries and implementing a classification system to distinguish active from past projects.&lt;/p>
&lt;h2 id="about-the-project">&lt;strong>About the Project&lt;/strong>&lt;/h2>
&lt;p>As part of the project, I am working on enhancing the usability of StatWrap by enabling efficient cross-project search capabilities. The goal is to make it easier for investigators to discover relevant projects, notes, and assets—across both current and archived work—using information that is either user-entered or passively collected by StatWrap.&lt;/p>
&lt;p>Given the sensitivity of the data involved, one of the key requirements is that all indexing and search operations must be performed locally. To address this, my responsibilities include:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Evaluating open-source search libraries&lt;/strong> suitable for local indexing and retrieval&lt;/li>
&lt;li>&lt;strong>Building the full-text search functionality&lt;/strong> directly into the StatWrap UI to allow seamless querying across projects&lt;/li>
&lt;li>&lt;strong>Ensuring reliability&lt;/strong> through the development of unit tests and comprehensive system testing&lt;/li>
&lt;li>&lt;strong>Implementing a classification system&lt;/strong> to label projects as “Active,” “Pinned,” or “Past” within the user interface&lt;/li>
&lt;/ul>
&lt;p>This project offers a great opportunity to work at the intersection of software development, information retrieval, and user-centric design—while contributing to research reproducibility and collaboration within scientific workflows.&lt;/p>
&lt;h2 id="progress">Progress&lt;/h2>
&lt;p>It has been more than six weeks since the project began, and significant progress has been made. Here&amp;rsquo;s a breakdown:&lt;/p>
&lt;h3 id="1-descriptive-comparison-of-open-source-libraries">1. &lt;strong>Descriptive Comparison of Open-Source Libraries&lt;/strong>&lt;/h3>
&lt;p>Compared various open-source search libraries based on evaluation criteria such as &lt;strong>indexing speed, search speed, memory usage, typo tolerance, fuzzy searching, partial matching, full-text queries, contextual search, Boolean support, exact word match, installation ease, maintenance, documentation&lt;/strong>, and &lt;strong>developer experience&lt;/strong>.&lt;/p>
&lt;h3 id="2-the-libraries">2. &lt;strong>The Libraries&lt;/strong>&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Lunr.js&lt;/strong>&lt;br>
A small, client-side full-text search engine that mimics Solr capabilities.&lt;/p>
&lt;ul>
&lt;li>Field-based search, boosting&lt;/li>
&lt;li>Supports TF-IDF, inverted index&lt;/li>
&lt;li>No built-in fuzzy search (only basic wildcards)&lt;/li>
&lt;li>Can serialize/deserialize index&lt;/li>
&lt;li>Not designed for large datasets&lt;/li>
&lt;li>Moderate memory usage and indexing speed&lt;/li>
&lt;li>Good documentation&lt;/li>
&lt;li>&lt;strong>Best for&lt;/strong>: Static websites or SPAs needing simple in-browser search&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>ElasticLunr.js&lt;/strong>&lt;br>
A lightweight, more flexible alternative to Lunr.js.&lt;/p>
&lt;ul>
&lt;li>Dynamic index (add/remove docs)&lt;/li>
&lt;li>Field-based and weighted search&lt;/li>
&lt;li>No advanced fuzzy matching&lt;/li>
&lt;li>Faster and more customizable than Lunr&lt;/li>
&lt;li>Smaller footprint&lt;/li>
&lt;li>Easy to use and maintain&lt;/li>
&lt;li>&lt;strong>Best for&lt;/strong>: Developers wanting Lunr-like features with simpler customization&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Fuse.js&lt;/strong>&lt;br>
A fuzzy search library ideal for small to medium datasets.&lt;/p>
&lt;ul>
&lt;li>Fuzzy search with typo tolerance&lt;/li>
&lt;li>Deep key/path searching&lt;/li>
&lt;li>No need to build index&lt;/li>
&lt;li>Highly configurable (threshold, distance, etc.)&lt;/li>
&lt;li>Linear scan = slower on large datasets&lt;/li>
&lt;li>Not full-text search (scoring-based match)&lt;/li>
&lt;li>Extremely easy to set up and use&lt;/li>
&lt;li>&lt;strong>Best for&lt;/strong>: Fuzzy search in small in-memory arrays (e.g., auto-suggest, dropdown filters)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>FlexSearch&lt;/strong>&lt;br>
A blazing-fast, modular search engine with advanced indexing options.&lt;/p>
&lt;ul>
&lt;li>Extremely fast search and indexing&lt;/li>
&lt;li>Supports phonetic, typo-tolerant, and partial matching&lt;/li>
&lt;li>Asynchronous support&lt;/li>
&lt;li>Multi-language + Unicode-friendly&lt;/li>
&lt;li>Low memory footprint&lt;/li>
&lt;li>Configuration can be complex for beginners&lt;/li>
&lt;li>&lt;strong>Best for&lt;/strong>: High-performance search in large/multilingual datasets&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>MiniSearch&lt;/strong>&lt;br>
A small, full-text search engine with balanced performance and simplicity.&lt;/p>
&lt;ul>
&lt;li>Fast indexing and searching&lt;/li>
&lt;li>Fuzzy search, stemming, stop words&lt;/li>
&lt;li>Field boosting and prefix search&lt;/li>
&lt;li>Compact, can serialize index&lt;/li>
&lt;li>Clean and modern API&lt;/li>
&lt;li>Lightweight and easy to maintain&lt;/li>
&lt;li>&lt;strong>Best for&lt;/strong>: Balanced, in-browser full-text search for moderate datasets&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Search-Index&lt;/strong>&lt;br>
A persistent, full-featured search engine for Node.js and browsers.&lt;/p>
&lt;ul>
&lt;li>Persistent storage with LevelDB&lt;/li>
&lt;li>Real-time indexing&lt;/li>
&lt;li>Fielded queries, faceting, filtering&lt;/li>
&lt;li>Advanced queries (Boolean, range, etc.)&lt;/li>
&lt;li>Slightly heavier setup&lt;/li>
&lt;li>Good for offline/local-first apps&lt;/li>
&lt;li>Browser usage more complex than others&lt;/li>
&lt;li>&lt;strong>Best for&lt;/strong>: Node.js apps, &lt;strong>not directly compatible with the Electron + React environment of StatWrap&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
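The practical difference between Fuse.js's linear scan and index-based engines like MiniSearch or FlexSearch can be sketched in plain JavaScript. This is a toy illustration under stated assumptions, not the libraries' actual code: a linear scan re-examines every document per query, while an inverted index pays an indexing cost once and then answers queries by lookup.

```javascript
// Toy comparison of the two search strategies discussed above
// (illustrative only; not how Fuse.js or MiniSearch are implemented).

const docs = [
  { id: 1, title: "Project notes", content: "regression analysis in R" },
  { id: 2, title: "Data cleaning", content: "pandas scripts for csv files" },
  { id: 3, title: "Analysis plan", content: "mixed models and regression" },
];

// Linear scan (Fuse.js-style): O(documents) work on every query.
function linearScan(query) {
  const q = query.toLowerCase();
  return docs
    .filter(d => (d.title + " " + d.content).toLowerCase().includes(q))
    .map(d => d.id);
}

// Inverted index (MiniSearch/FlexSearch-style): token -> document ids,
// built once up front.
const index = new Map();
for (const d of docs) {
  for (const token of (d.title + " " + d.content).toLowerCase().split(/\W+/)) {
    if (!token) continue;
    if (!index.has(token)) index.set(token, new Set());
    index.get(token).add(d.id);
  }
}

// A query is now a single Map lookup instead of a full scan.
function indexedSearch(token) {
  return [...(index.get(token.toLowerCase()) ?? [])];
}

console.log(linearScan("regression"));    // [1, 3]
console.log(indexedSearch("regression")); // [1, 3]
```

Both strategies return the same matches here; the difference only shows at scale, which is why a linear scan is listed above as a drawback of Fuse.js for large datasets.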
&lt;h3 id="3-developer-experience-and-maintenance">3. Developer Experience and Maintenance&lt;/h3>
&lt;p>We analyzed the download trends of the search libraries using npm trends and reviewed their maintenance statistics to assess how frequently each library is updated.&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="DOWNLOADS" srcset="
/report/osre25/northwestern/statwrap/20250715-debangi29/downloads_hu3acc13cb2503d87ec01b259eecff7d9f_205568_2981b0e25cc7e6da71dd1af69f1ab499.webp 400w,
/report/osre25/northwestern/statwrap/20250715-debangi29/downloads_hu3acc13cb2503d87ec01b259eecff7d9f_205568_52b5a1c87803e2c8a2f59ad52703cd75.webp 760w,
/report/osre25/northwestern/statwrap/20250715-debangi29/downloads_hu3acc13cb2503d87ec01b259eecff7d9f_205568_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/northwestern/statwrap/20250715-debangi29/downloads_hu3acc13cb2503d87ec01b259eecff7d9f_205568_2981b0e25cc7e6da71dd1af69f1ab499.webp"
width="760"
height="362"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;br>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Maintenance" srcset="
/report/osre25/northwestern/statwrap/20250715-debangi29/Maintenance_hub392779bb7551900858e36e62009d315_166372_50f35746c2224661759e3d1f68308f5c.webp 400w,
/report/osre25/northwestern/statwrap/20250715-debangi29/Maintenance_hub392779bb7551900858e36e62009d315_166372_1f83a8585ae086eae8ad16a0d18c8fff.webp 760w,
/report/osre25/northwestern/statwrap/20250715-debangi29/Maintenance_hub392779bb7551900858e36e62009d315_166372_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/northwestern/statwrap/20250715-debangi29/Maintenance_hub392779bb7551900858e36e62009d315_166372_50f35746c2224661759e3d1f68308f5c.webp"
width="760"
height="261"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;h3 id="4-comparative-analysis-after-testing">4. Comparative Analysis After Testing&lt;/h3>
&lt;p>Each search library was benchmarked against a predefined set of queries using the same evaluation criteria.&lt;br>
We have yet to finalize the weights for each criterion; this will be done during the end-term evaluation.&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="COMPARATIVE ANALYSIS" srcset="
/report/osre25/northwestern/statwrap/20250715-debangi29/image_huff63b524c7af2307fdfe0ebf7a2c55bc_128809_cf08ab4466e54fc0970dac451ab583d2.webp 400w,
/report/osre25/northwestern/statwrap/20250715-debangi29/image_huff63b524c7af2307fdfe0ebf7a2c55bc_128809_4d08ea843125818ade4b1288b2ed91fd.webp 760w,
/report/osre25/northwestern/statwrap/20250715-debangi29/image_huff63b524c7af2307fdfe0ebf7a2c55bc_128809_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/northwestern/statwrap/20250715-debangi29/image_huff63b524c7af2307fdfe0ebf7a2c55bc_128809_cf08ab4466e54fc0970dac451ab583d2.webp"
width="760"
height="578"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;h3 id="5-the-user-interface">5. The User Interface&lt;/h3>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="User Interface" srcset="
/report/osre25/northwestern/statwrap/20250715-debangi29/UI_hu614745e803a206ba95d1613340cef4da_263973_ad72fdc47d934ea42f989055b49d88aa.webp 400w,
/report/osre25/northwestern/statwrap/20250715-debangi29/UI_hu614745e803a206ba95d1613340cef4da_263973_51decc3c2ce6793ca567153dd67113d0.webp 760w,
/report/osre25/northwestern/statwrap/20250715-debangi29/UI_hu614745e803a206ba95d1613340cef4da_263973_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/northwestern/statwrap/20250715-debangi29/UI_hu614745e803a206ba95d1613340cef4da_263973_ad72fdc47d934ea42f989055b49d88aa.webp"
width="760"
height="475"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;br>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Debug Tools" srcset="
/report/osre25/northwestern/statwrap/20250715-debangi29/image-1_huff1ce04307fd90cec714c35adb969f67_82199_e86edc8fa7aba824f1fd8a90948c619c.webp 400w,
/report/osre25/northwestern/statwrap/20250715-debangi29/image-1_huff1ce04307fd90cec714c35adb969f67_82199_ba6358e5089040847a0e39704677cc12.webp 760w,
/report/osre25/northwestern/statwrap/20250715-debangi29/image-1_huff1ce04307fd90cec714c35adb969f67_82199_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/northwestern/statwrap/20250715-debangi29/image-1_huff1ce04307fd90cec714c35adb969f67_82199_e86edc8fa7aba824f1fd8a90948c619c.webp"
width="760"
height="482"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>The user interface offers three search modes (Basic, Advanced, and Boolean operators) with configurable parameters. Results are sorted by relevance score (highest first) and grouped by category.&lt;/p>
&lt;h3 id="6-overall-functioning">6. Overall Functioning&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Indexing Workflow&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Projects are processed sequentially&lt;/li>
&lt;li>Metadata, files, people, and notes are indexed (larger files are queued for later)&lt;/li>
&lt;li>Uses a &amp;ldquo;brute-force&amp;rdquo; recursive approach to walk through project directories
&lt;ul>
&lt;li>Skips directories like &lt;code>node_modules&lt;/code>, &lt;code>.git&lt;/code>, &lt;code>.statwrap&lt;/code>&lt;/li>
&lt;li>Identifies eligible text files for indexing&lt;/li>
&lt;li>Logs progress every 10 files&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Document Creation Logic&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Reads file content as UTF-8 text&lt;/li>
&lt;li>Builds searchable documents with filename, content, and metadata&lt;/li>
&lt;li>Auto-generates tags based on content and file type&lt;/li>
&lt;li>Adds documents to the search index and document store&lt;/li>
&lt;li>Handles errors gracefully with debug logging&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Search Functionality&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Uses field-weighted search&lt;/li>
&lt;li>Enriches results with document metadata&lt;/li>
&lt;li>Supports filtering by type or project&lt;/li>
&lt;li>Groups results by category (files, projects, people, etc.)&lt;/li>
&lt;li>Implements caching for improved performance&lt;/li>
&lt;li>Search statistics are generated to monitor performance&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="challenges-and-end-term-goals">Challenges and End-Term Goals&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>In-Memory Index and Metadata Storage&lt;/strong>&lt;br>
Most JavaScript search libraries (such as Fuse.js, Lunr, and MiniSearch) store their indexes entirely in memory, which can become problematic for large datasets. A key challenge is designing a scalable solution that allows disk persistence or lazy loading to prevent memory overflows.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Deciding the Weights Accordingly&lt;/strong>&lt;br>
An important challenge is tuning the relevance scoring by assigning appropriate weights to different aspects of the search, such as exact word matches, prefix matches, and typo tolerance. For instance, we prefer exact matches to be ranked higher than fuzzy or partial matches.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Implementing the Selected Library&lt;/strong>&lt;br>
Once a library is selected (based on speed, features, and compatibility with Electron + React), the next challenge is integrating it into StatWrap efficiently—ensuring local indexing, accurate search results, and smooth performance even with large projects.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Classifying Active and Past Projects in the User Interface&lt;/strong>&lt;br>
To improve navigation and search scoping, we plan to introduce three project sections in the interface: &lt;strong>Pinned&lt;/strong>, &lt;strong>Active&lt;/strong>, and &lt;strong>Past&lt;/strong> projects. This classification will help users prioritize relevant content while enabling smarter indexing strategies.&lt;/p>
&lt;/li>
&lt;/ul>
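The weight-tuning challenge above can be made concrete with a small sketch. The weights and the crude fuzzy test here are placeholder assumptions to be tuned during evaluation, not StatWrap's final values; the only requirement taken from the post is that exact matches outrank prefix matches, which outrank fuzzy matches.

```javascript
// Illustrative relevance weighting (placeholder values, not final).
const WEIGHTS = { exact: 3.0, prefix: 1.5, fuzzy: 0.5 };

function scoreToken(queryToken, docToken) {
  if (docToken === queryToken) return WEIGHTS.exact;
  if (docToken.startsWith(queryToken)) return WEIGHTS.prefix;
  // Crude fuzzy test: same first letter, length differs by at most 1.
  if (docToken[0] !== queryToken[0]) return 0;
  if (Math.abs(docToken.length - queryToken.length) > 1) return 0;
  return WEIGHTS.fuzzy;
}

// A document's score is the sum over its tokens.
function score(query, docTokens) {
  let total = 0;
  for (const t of docTokens) total += scoreToken(query, t);
  return total;
}

// exact (3.0) + prefix (1.5) + fuzzy (0.5)
console.log(score("regress", ["regress", "regression", "refresh"])); // 5
```

Adjusting the three weights changes the ranking behavior directly, which is why finalizing them is deferred to the end-term evaluation.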
&lt;p>Stay tuned for the next blog!&lt;/p></description></item><item><title>MPI Appliance for HPC Research on Chameleon</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/uchicago/mpi/20250614-rohan-babbar/</link><pubDate>Sat, 14 Jun 2025 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/uchicago/mpi/20250614-rohan-babbar/</guid><description>&lt;p>Hi Everyone,&lt;/p>
&lt;p>I’m Rohan Babbar from Delhi, India. This summer, I’m excited to be working with the Argonne National Laboratory and the Chameleon Cloud community. My &lt;a href="https://ucsc-ospo.github.io/project/osre25/uchicago/mpi/" target="_blank" rel="noopener">project&lt;/a> focuses on developing an MPI Appliance to support reproducible High-Performance Computing (HPC) research on the Chameleon testbed.&lt;/p>
&lt;p>For more details about the project and the planned work for the summer, you can read my proposal &lt;a href="https://docs.google.com/document/d/1iOx95-IcEOSVxpOkL20-jT5SSDOwBiP78ysSUNpRwXs/edit?usp=sharing" target="_blank" rel="noopener">here&lt;/a>.&lt;/p>
&lt;h3 id="-community-bonding-period">👥 Community Bonding Period&lt;/h3>
&lt;p>Although the project officially started on June 2, 2025, I made good use of the community bonding period beforehand.&lt;/p>
&lt;ul>
&lt;li>I began by getting access to the Chameleon testbed, familiarizing myself with its features and tools.&lt;/li>
&lt;li>I experimented with different configurations to understand the ecosystem.&lt;/li>
&lt;li>My mentor, &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/ken-raffenetti/">Ken Raffenetti&lt;/a>, and I had regular check-ins to align our vision and finalize our milestones, many of which were laid out in my proposal.&lt;/li>
&lt;/ul>
&lt;h3 id="-june-2--june-14-2025">🔧 June 2 – June 14, 2025&lt;/h3>
&lt;p>Our first milestone was to build a base image with MPI pre-installed. For this:&lt;/p>
&lt;ul>
&lt;li>We decided to use &lt;a href="https://spack.io/" target="_blank" rel="noopener">Spack&lt;/a>, a flexible package manager tailored for HPC environments.&lt;/li>
&lt;li>The image includes multiple MPI implementations, allowing users to choose the one that best suits their needs and switch between them using simple &lt;a href="https://lmod.readthedocs.io/en/latest/" target="_blank" rel="noopener">Lua Module&lt;/a> commands.&lt;/li>
&lt;/ul>
&lt;p>📌 That’s all for now! Stay tuned for more updates in the next blog.&lt;/p>
&lt;p>Thanks for reading!&lt;/p></description></item><item><title>ML-Powered Problem Detection in Chameleon</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/uchicago/chameleoncloud/20241018-syed/</link><pubDate>Fri, 18 Oct 2024 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/uchicago/chameleoncloud/20241018-syed/</guid><description>&lt;p>Hello! My name is &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/syed-mohammad-qasim/">Syed Mohammad Qasim&lt;/a>, a PhD candidate at the Department of Electrical and Computer Engineering, Boston University.
This summer I worked on the project &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/project/osre24/uchicago/ml_detect_chameleon/">ML-Powered Problem Detection in Chameleon&lt;/a>
as part of the Summer of Reproducibility (SoR) program with the mentorship of &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/ayse-coskun/">Ayse Coskun&lt;/a> and &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/michael-sherman/">Michael Sherman&lt;/a>.&lt;/p>
&lt;p>Chameleon is an open testbed that has supported over 5,000 users working on more than 500 projects.
It provides access to over 538 bare metal nodes across various sites, offering approximately 15,000 CPU cores and 5 petabytes of storage.
Each site runs independent OpenStack services to deliver its offerings.
Currently, Chameleon Cloud comprehensively monitors the sites at the Texas Advanced Computing Center (TACC) and the University of Chicago.
Metrics are collected using Prometheus at each site and fed into a central Mimir cluster.
All logs are sent to a central Loki, with Grafana used for visualization and alerting.
Chameleon currently collects around 3,000 metrics. Manually reviewing and setting alerts for them is time-consuming and labor-intensive.
This project aims to help Chameleon operators monitor their systems more effectively and improve overall reliability by creating an anomaly detection service to augment the existing alerting framework.
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="High level data flow" srcset="
/report/osre24/uchicago/chameleoncloud/20241018-syed/ad_hubea58d1d850a610a2195fe38eece12fb_58199_deb097bd50da0d94a76fc0dc7719233e.webp 400w,
/report/osre24/uchicago/chameleoncloud/20241018-syed/ad_hubea58d1d850a610a2195fe38eece12fb_58199_deeb0941e942a319e1cc5a8b743b6993.webp 760w,
/report/osre24/uchicago/chameleoncloud/20241018-syed/ad_hubea58d1d850a610a2195fe38eece12fb_58199_1200x1200_fit_q75_h2_lanczos.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/uchicago/chameleoncloud/20241018-syed/ad_hubea58d1d850a610a2195fe38eece12fb_58199_deb097bd50da0d94a76fc0dc7719233e.webp"
width="760"
height="412"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>Over the summer, we focused on analyzing the data and, after discussions with Chameleon operators, identified 33 key metrics from the Prometheus Node Exporter that serve as leading indicators of resource usage on the nodes. For example:&lt;/p>
&lt;ul>
&lt;li>CPU usage: Metrics like &lt;code>node_load1&lt;/code>, &lt;code>node_load5&lt;/code>, and &lt;code>node_load15&lt;/code>.&lt;/li>
&lt;li>Memory usage: Including buffer utilization.&lt;/li>
&lt;li>Disk usage: Metrics for I/O time and read/write byte rates.&lt;/li>
&lt;li>Network activity: Rate of bytes received and transmitted.&lt;/li>
&lt;li>Filesystem metrics: Such as &lt;code>inode_utilization_ratio&lt;/code> and &lt;code>node_procs_blocked&lt;/code>.&lt;/li>
&lt;li>System-level metrics: Including node forks, context switches, and interrupts.&lt;/li>
&lt;/ul>
&lt;p>Collected every 5 minutes, these metrics provide a comprehensive view of node performance and resource consumption.
After finalizing the metrics to monitor, we selected the following four anomaly detection methods, primarily for their popularity in academia and their publication at high-impact venues such as SIGKDD and SC.&lt;/p>
&lt;ul>
&lt;li>OmniAnomaly [KDD 2019] (without POT threshold selection, as it requires labels)&lt;/li>
&lt;li>USAD [KDD 2020]&lt;/li>
&lt;li>TranAD [KDD 2022]&lt;/li>
&lt;li>Prodigy [SC 2023] (only the VAE; not using their feature selection, as it requires labels)&lt;/li>
&lt;/ul>
&lt;p>We collected 75 days of healthy data from Chameleon, and after applying min-max scaling, we trained the models.
We then used these models to run inference on the metrics collected during outages, as marked by Chameleon operators.
The goal was to determine whether the outage data revealed something interesting or anomalous.
We can verify our approach by manually reviewing the results generated by these four anomaly detection methods.
Below are the results from the four methods on different outages, followed by an example of how these methods identified the root cause of an anomaly.&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Resulsts of different approaches" srcset="
/report/osre24/uchicago/chameleoncloud/20241018-syed/comparison_plot_huca4b68c1c2625c3b2e86230c54612ea2_129034_adb242a18524d714dae87d46b29e1612.webp 400w,
/report/osre24/uchicago/chameleoncloud/20241018-syed/comparison_plot_huca4b68c1c2625c3b2e86230c54612ea2_129034_9dcdbbc6bac285c06195f54d49bd5ffe.webp 760w,
/report/osre24/uchicago/chameleoncloud/20241018-syed/comparison_plot_huca4b68c1c2625c3b2e86230c54612ea2_129034_1200x1200_fit_q75_h2_lanczos.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/uchicago/chameleoncloud/20241018-syed/comparison_plot_huca4b68c1c2625c3b2e86230c54612ea2_129034_adb242a18524d714dae87d46b29e1612.webp"
width="760"
height="355"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>The above figure shows the percentage of outage data that was flagged as anomalous by different models.&lt;/p>
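The preprocessing and flagging pipeline described above can be sketched as follows. This is a minimal illustration under stated assumptions: min-max scaling uses the range of the healthy training data, and the anomaly scores and threshold below are placeholders, not outputs of the actual models.

```javascript
// Min-max scale a metric's values into [0, 1], as done before training.
function minMaxScale(values) {
  const lo = Math.min(...values);
  const hi = Math.max(...values);
  if (hi === lo) return values.map(() => 0); // constant metric: no spread
  return values.map(v => (v - lo) / (hi - lo));
}

// Fraction of points flagged anomalous, mirroring the
// "% of outage data flagged" comparison in the figure above.
function anomalyRate(scores, threshold) {
  const flagged = scores.filter(s => s > threshold).length;
  return flagged / scores.length;
}

const scaled = minMaxScale([1, 3, 5]);                 // [0, 0.5, 1]
const rate = anomalyRate([0.1, 0.9, 0.8, 0.2], 0.5);   // 0.5
```

In the real pipeline, the per-timestep scores would come from the reconstruction error of OmniAnomaly, USAD, TranAD, or the Prodigy VAE rather than from a fixed array.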
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="cause of anomaly according to each model" srcset="
/report/osre24/uchicago/chameleoncloud/20241018-syed/partial-authentication-outage_plot_hu92456580e56fddf4f3c592621d13c105_392593_6edd22782678b48ce3a7cebad859b982.webp 400w,
/report/osre24/uchicago/chameleoncloud/20241018-syed/partial-authentication-outage_plot_hu92456580e56fddf4f3c592621d13c105_392593_3da0e020cdc4ddfd508b77b6a0adc3d2.webp 760w,
/report/osre24/uchicago/chameleoncloud/20241018-syed/partial-authentication-outage_plot_hu92456580e56fddf4f3c592621d13c105_392593_1200x1200_fit_q75_h2_lanczos.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/uchicago/chameleoncloud/20241018-syed/partial-authentication-outage_plot_hu92456580e56fddf4f3c592621d13c105_392593_6edd22782678b48ce3a7cebad859b982.webp"
width="760"
height="532"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="cause of anomaly according to each model" srcset="
/report/osre24/uchicago/chameleoncloud/20241018-syed/chiuc-uplink-networking_plot_hu92456580e56fddf4f3c592621d13c105_376789_03e6f344d24d9b37a7d615ee3207586b.webp 400w,
/report/osre24/uchicago/chameleoncloud/20241018-syed/chiuc-uplink-networking_plot_hu92456580e56fddf4f3c592621d13c105_376789_6193a4b7b9107cb2693435514d80d21d.webp 760w,
/report/osre24/uchicago/chameleoncloud/20241018-syed/chiuc-uplink-networking_plot_hu92456580e56fddf4f3c592621d13c105_376789_1200x1200_fit_q75_h2_lanczos.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/uchicago/chameleoncloud/20241018-syed/chiuc-uplink-networking_plot_hu92456580e56fddf4f3c592621d13c105_376789_03e6f344d24d9b37a7d615ee3207586b.webp"
width="760"
height="532"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>The above two plots show two examples of the top 5 metrics that contributed to the anomaly score for each anomaly detection model.&lt;/p>
&lt;p>Although the methods seem to indicate anomalies during outages, they are not able to pinpoint the affected service or the exact cause.
For example, the first partial authentication outage was due to a DNS error, which can manifest in various ways, such as reduced CPU, memory, or network usage.
This work is still in progress, and we are conducting the same analysis on container-level metrics for each service, allowing us to narrow the scope to the affected service and more effectively identify the root cause of anomalies.
We will share the next set of results soon.&lt;/p>
&lt;p>Thanks for your time, please feel free to reach out to me for any details or questions.&lt;/p></description></item><item><title>Data Leakage in Applied ML: model uses features that are not legitimate</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/nyu/data-leakage/20240924-shaivimalik/</link><pubDate>Tue, 24 Sep 2024 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/nyu/data-leakage/20240924-shaivimalik/</guid><description>&lt;p>Hello everyone!&lt;/p>
&lt;p>I have been working on reproducing the results from &lt;strong>Identification of COVID-19 Samples from Chest X-Ray Images Using Deep Learning: A Comparison of Transfer Learning Approaches&lt;/strong>. This study aimed to distinguish COVID-19 cases from normal and pneumonia cases using chest X-ray images. Since my last blog post, we have successfully reproduced the results using the VGG19 model, achieving a 92% accuracy on the test set. However, a significant demographic inconsistency exists: normal and pneumonia chest X-ray images were from pediatric patients, while COVID-19 chest X-ray images were from adults. This allowed the model to achieve high accuracy by learning features that were not clinically relevant.&lt;/p>
&lt;p>In &lt;a href="https://github.com/shaivimalik/covid_illegitimate_features/blob/main/notebooks/Correcting_Original_Result.ipynb" target="_blank" rel="noopener">Reproducing “Identification of COVID-19 samples from chest X-Ray images using deep learning: A comparison of transfer learning approaches” without Data Leakage&lt;/a>, we followed the methodology outlined in the paper, but with a key change: we used datasets containing adult chest X-ray images. This time, the model achieved an accuracy of 51%, a 41-percentage-point drop from the earlier results, confirming that the metrics reported in the paper were overly optimistic due to data leakage, where the model learned illegitimate features.&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="GradCAM from husky vs wolf example " srcset="
/report/osre24/nyu/data-leakage/20240924-shaivimalik/gradcam_hu02772b80d1d95ff5ae817af6261a6059_438521_7bc94e0816aa962665434756bf41e27d.webp 400w,
/report/osre24/nyu/data-leakage/20240924-shaivimalik/gradcam_hu02772b80d1d95ff5ae817af6261a6059_438521_a160058d1708baa257daa63de5fada34.webp 760w,
/report/osre24/nyu/data-leakage/20240924-shaivimalik/gradcam_hu02772b80d1d95ff5ae817af6261a6059_438521_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/nyu/data-leakage/20240924-shaivimalik/gradcam_hu02772b80d1d95ff5ae817af6261a6059_438521_7bc94e0816aa962665434756bf41e27d.webp"
width="760"
height="329"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>To further illustrate this issue, we created a &lt;a href="https://github.com/shaivimalik/covid_illegitimate_features/blob/main/notebooks/Exploring_ConvNet_Activations.ipynb" target="_blank" rel="noopener">toy example&lt;/a> demonstrating how a model can learn illegitimate features. Using a small dataset of wolf and husky images, the model achieved an accuracy of 90%. We then revealed that this performance was due to a data leakage issue: all wolf images had snowy backgrounds, while husky images had grassy backgrounds. When we trained the model on a dataset where both wolf and husky images had white backgrounds, the accuracy dropped to 70%. This shows that the accuracy obtained earlier was an overly optimistic measure due to data leakage.&lt;/p>
&lt;p>You can explore our work on the COVID-19 paper &lt;a href="https://github.com/shaivimalik/covid_illegitimate_features" target="_blank" rel="noopener">here&lt;/a>.&lt;/p>
&lt;p>Lastly, I would like to thank &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/fraida-fund/">Fraida Fund&lt;/a> and &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/mohamed-saeed/">Mohamed Saeed&lt;/a> for their support and guidance throughout my SoR journey.&lt;/p></description></item><item><title>Final Post: Enhancing Reproducibility and Portability in Network Experiments</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/tum/slices/20240905-warmuth/</link><pubDate>Thu, 05 Sep 2024 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/tum/slices/20240905-warmuth/</guid><description>&lt;h2 id="introduction">Introduction&lt;/h2>
&lt;p>As my project with the Summer of Reproducibility (SoR) 2024 comes to a close, I’d like to reflect on the journey and the outcomes achieved. My project focused on &lt;strong>enhancing the reproducibility and portability of network experiments&lt;/strong> by integrating the &lt;strong>RO-Crate standard&lt;/strong> into the &lt;strong>TUM-internal testbed pos (plain orchestrating service)&lt;/strong>, and deploying this testbed on the &lt;strong>Chameleon cloud infrastructure&lt;/strong>. The aim was to ensure that experiments conducted on one platform could be seamlessly reproduced on another, adhering to the &lt;strong>FAIR principles&lt;/strong> (Findable, Accessible, Interoperable, Reusable) for research data.&lt;/p>
&lt;h2 id="project-recap">Project Recap&lt;/h2>
&lt;p>The core goal was to make the experiments reproducible and portable between different testbeds like TUM’s pos and Chameleon. To achieve this, I integrated the &lt;strong>RO-Crate standard&lt;/strong>, which ensures that all experiment data is automatically documented and stored with metadata, making it easier for others and especially for machines to understand, replicate, and build on the results. Additionally, deploying a lightweight version of pos on the &lt;strong>Chameleon testbed&lt;/strong> enabled cross-testbed execution, allowing experiments to be replicated across both environments without significant modifications.&lt;/p>
&lt;h2 id="key-achievements">Key Achievements&lt;/h2>
&lt;p>Over the course of the project, several key milestones were achieved:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>RO-Crate Integration&lt;/strong>: The first step was restructuring the results folder and automating the generation of metadata using RO-Crate. This ensured that all experiment data was comprehensively documented with details like author information, hardware configurations, and experiment scripts, resulting in a comprehensive &lt;code>ro-crate-metadata.json&lt;/code> file as an important part of each result folder.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Improved Data Management&lt;/strong>: The integration of RO-Crate greatly simplified organizing and retrieving experiment data, with metadata describing both the experiment and its result files. All metadata was generated automatically, making the experiments easier to share and document for other researchers to replicate.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Automatic Upload to Zenodo&lt;/strong>: Another crucial achievement was the implementation of automatic uploading of pos experiment result folders to &lt;strong>Zenodo&lt;/strong>, an open-access repository. This step significantly improved the reproducibility and sharing of experiment results, making them easily accessible to the broader scientific community. By utilizing Zenodo, we ensured that experiment results, along with their RO-Crate metadata, could be archived and referenced, fostering greater transparency and collaboration in scientific research.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Chameleon Deployment&lt;/strong>: Deploying the pos testbed within the Chameleon environment required managing various complexities, particularly related to Chameleon’s OpenStack API, networking setup, and hardware configurations. Coordinating the network components and infrastructure to support pos functionality in this testbed environment demanded significant adjustments to ensure smooth integration and operation.&lt;/p>
&lt;/li>
&lt;/ul>
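To make the RO-Crate integration concrete, here is a hand-rolled sketch of a minimal ro-crate-metadata.json for a result folder, following the RO-Crate 1.1 structure (a metadata descriptor plus a root Dataset). In the project this file is generated automatically by the pos integration; the author name, dataset name, and file name below are illustrative placeholders.

```javascript
// Minimal RO-Crate 1.1 metadata document (illustrative values).
const metadata = {
  "@context": "https://w3id.org/ro/crate/1.1/context",
  "@graph": [
    {
      // The metadata descriptor, pointing at the root data entity.
      "@id": "ro-crate-metadata.json",
      "@type": "CreativeWork",
      "conformsTo": { "@id": "https://w3id.org/ro/crate/1.1" },
      "about": { "@id": "./" },
    },
    {
      // The root Dataset: the experiment result folder itself.
      "@id": "./",
      "@type": "Dataset",
      "name": "Example pos experiment results", // illustrative
      "author": { "@id": "#experimenter" },
      "hasPart": [{ "@id": "measurements.csv" }], // illustrative
    },
    { "@id": "#experimenter", "@type": "Person", "name": "Jane Doe" },
    { "@id": "measurements.csv", "@type": "File" },
  ],
};

const json = JSON.stringify(metadata, null, 2);
```

A real crate would add hardware configuration, experiment scripts, and licensing entities to the same `@graph`, which is exactly the information the automated generation captures.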
&lt;h2 id="challenges">Challenges&lt;/h2>
&lt;p>Like any project, this one came with its own set of challenges:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Balancing Automation and Flexibility&lt;/strong>: While automating the generation of RO-Crate metadata, it was crucial to ensure that the flexibility required by researchers for customizing their documentation was not compromised. Finding this balance required in-depth adjustments to the testbed infrastructure.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Complexity of Testbed Systems&lt;/strong>: Integrating RO-Crate into a complex system like pos, and ensuring it works seamlessly with Chameleon, involved understanding and adapting to the complexities of both testbeds.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h2 id="future-directions">Future Directions&lt;/h2>
&lt;p>As I continue working on these challenges in my master&amp;rsquo;s thesis, we plan to expand on this work by:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Extending the Chameleon Deployment&lt;/strong>: We aim to deploy the full version of pos on Chameleon, supporting more complex and larger-scale experiments.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Supporting Complex Experiment Workflows&lt;/strong>: Future work will focus on handling more intricate and larger datasets, ensuring reproducibility for complex workflows. Only by executing more complex experiments will we be able to thoroughly analyze and compare the differences between executions in pos and the pos deployed on Chameleon, helping us better understand the impact of different testbed environments on experiment outcomes.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Automation&lt;/strong>: The ultimate goal is to fully automate the process of experiment execution, result documentation, and sharing across testbeds, reducing manual intervention and further enhancing reproducibility.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h2 id="reflections">Reflections&lt;/h2>
&lt;p>By integrating the RO-Crate standard and deploying pos on the Chameleon testbed, we have made significant steps toward enhancing the reproducibility, accessibility, and portability of network experiments across research platforms. These efforts contribute to more shareable, and replicable research processes in the scientific community.&lt;/p>
&lt;p>I am excited about the future work ahead and am grateful for the mentorship and support I received during this project.&lt;/p>
&lt;h2 id="deliverables-and-availability">Deliverables and Availability&lt;/h2>
&lt;p>Due to the current non-public status of the pos framework, &lt;strong>the code and deliverables are not publicly available&lt;/strong> at the moment.&lt;/p>
&lt;h2 id="previous-blogs">Previous Blogs&lt;/h2>
&lt;p>Make sure to check out my other blogs to see how I started this project and the challenges I faced along the way:&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/tum/slices/20240517-warmuth/">Introduction&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/tum/slices/20240716-warmuth/">Midterm Blog&lt;/a>&lt;/li>
&lt;/ul>
&lt;p>Servus!&lt;/p></description></item><item><title>AutoAppendix: Towards One-Click reproducibility of high-performance computing experiments</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/tuwien/autoappendix/20240904-kkrassni/</link><pubDate>Wed, 04 Sep 2024 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/tuwien/autoappendix/20240904-kkrassni/</guid><description>&lt;p>Hi everyone,&lt;/p>
&lt;p>I&amp;rsquo;m excited to wrap up the AutoAppendix project with our final findings and
insights. Over the course of this initiative, we’ve worked to assess the
reproducibility of artifacts submitted to the SC24 conference and create
guidelines that aim to improve the standard for reproducible experiments in the
future. Here&amp;rsquo;s a summary of the project&amp;rsquo;s final phase and what we’ve learned.&lt;/p>
&lt;h2 id="project-goals-and-progress">Project Goals and Progress&lt;/h2>
&lt;p>The goal of AutoAppendix was to evaluate the computational artifacts provided by
SC24 paper submissions, focusing on reproducibility. These artifacts accompany
papers applying for the &amp;ldquo;Artifact Replicable&amp;rdquo; badge in the conference&amp;rsquo;s
reproducibility initiative. Volunteer members of this initiative assess 1-2 paper appendices each. In this project, we analyzed a larger portion of artifacts to gain a broader perspective on potential improvements to the reproducibility process.&lt;/p>
&lt;p>We selected 18 out of 45 submissions, focusing on experiments that could be
easily replicated on Chameleon Cloud. Our evaluation criteria were based on
simplicity (single-node setups) and availability of resources. The final
analysis expanded on the earlier midterm findings, shedding light on various
challenges and best practices related to artifact reproducibility.&lt;/p>
&lt;h2 id="artifact-evaluation-process">Artifact Evaluation Process&lt;/h2>
&lt;p>During the evaluation process, we focused on examining the completeness and
clarity of the provided artifacts, looking closely at documentation, setup
instructions, and the degree of automation.&lt;/p>
&lt;p>Our first step was to replicate the environments used in the original
experiments as closely as possible using the resources from Chameleon. Many papers included instructions for creating the necessary software environments,
but the clarity of these instructions varied significantly across submissions.
In some cases, we even encountered challenges in reproducing results due to unclear
instructions or missing dependencies, which reinforced the need for
standardized, clear documentation as part of the artifact submission process.&lt;/p>
&lt;p>We observed that &lt;em>containerization&lt;/em> and &lt;em>semi-automated setups&lt;/em> (with scripts
that break down the experiment into smaller steps) were particularly effective
in enhancing the reproducibility of the artifacts. One artifact
particularly caught our attention due to its usage of the Chameleon JupyterHub
platform, making it reproducible with a &lt;em>single click&lt;/em>. This highlighted the
potential for
streamlining the reproducibility process and showcased that, with sufficient
effort and the right tools, experiments can indeed be made replicable by
&lt;em>anyone&lt;/em>.&lt;/p>
&lt;h2 id="results">Results&lt;/h2>
&lt;p>Throughout the evaluation, we observed that reproducibility could vary widely
based on the clarity and completeness of the documentation and the automation of
setup procedures. Artifacts that were structured with clear, detailed steps for
installation and execution tended to perform well in terms of replicability.&lt;/p>
&lt;p>From our evaluation, we derived a set of guidelines (intended as must-haves) and
best practices (recommended) for artifact reproducibility, which can be found below.&lt;/p>
&lt;p>Fascinated by the potential of the Chameleon JupyterHub platform and its adjacent &lt;a href="https://www.chameleoncloud.org/experiment/share/" target="_blank" rel="noopener">Trovi&lt;/a> artifact repository, we decided to create
several templates that can be used as a starting point for authors to make integration
of their artifacts with the platform easier. In the design of these templates,
we made sure that artifacts structured according to our guidelines are
particularly easy to integrate.&lt;/p>
&lt;h3 id="guidelines">Guidelines&lt;/h3>
&lt;ol>
&lt;li>
&lt;p>&lt;strong>Clear Documentation&lt;/strong>: Provide clear and detailed documentation for the artifact in the corresponding appendix, such that the artifact can be replicated without the need for additional information. For third-party software, it is acceptable to refer to the official documentation.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Software Setup&lt;/strong>: Clearly specify the versions of all (necessary) software components used
in the creation of the artifact. This includes the operating system, libraries, and tools.
In particular, state all setup steps required to replicate the software environment.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Hardware Specifications&lt;/strong>: Specify the hardware the experiment was conducted on. Importantly,
state the architecture the experiments are intended to run on, and ensure that
provided software (e.g. Docker images) is compatible with commonly available
architectures.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Expected Results&lt;/strong>: Always provide the expected outputs of the experiment, especially when run on different hardware, to make it easier for reviewers to assess the success of the replication.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Public Data&lt;/strong>: Publish the experiment data to a public repository, and make
sure the data is available for download to reviewers and readers, especially during
the evaluation period. Zenodo is a recommended repository for this purpose.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Automated Reproducibility&lt;/strong>: For long-running experiments, provide
progress output so the reviewer can verify that the experiment is running as expected.
In the documentation, give an idea of:&lt;/p>
&lt;/li>
&lt;/ol>
&lt;ul>
&lt;li>how much time long-running steps in the reproduction will take&lt;/li>
&lt;li>what the progress output looks like or how frequently it is emitted&lt;/li>
&lt;/ul>
&lt;ol start="7">
&lt;li>&lt;strong>Sample Execution&lt;/strong>: Conduct a sample evaluation with hardware and software
as similar as possible to the intended reproduction environment.&lt;/li>
&lt;/ol>
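&lt;p>As a minimal, hypothetical sketch of the progress-output guideline (all names here are illustrative, not taken from any submitted artifact), a long-running step can emit timestamped progress lines at a fixed interval so a reviewer can see it is still alive:&lt;/p>

```python
import time

def run_long_step(n_iters, report_every=2):
    # Stand-in for a long-running experiment step; emits a progress line
    # every `report_every` iterations and at completion.
    start = time.time()
    progress = []
    for i in range(1, n_iters + 1):
        time.sleep(0.01)  # placeholder for real work
        if i % report_every == 0 or i == n_iters:
            elapsed = time.time() - start
            progress.append(f"[{elapsed:6.2f}s] step {i}/{n_iters}")
    return progress

for line in run_long_step(6):
    print(line)
```

&lt;p>Documenting the frequency and shape of these lines (as in the two bullet points of guideline 6) lets the reviewer tell a slow experiment apart from a hung one.&lt;/p>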
&lt;h3 id="best-practices">Best Practices&lt;/h3>
&lt;ol>
&lt;li>&lt;strong>Reproducible Environment&lt;/strong>:
Use a reproducible environment for the artifact. This can come in several forms:&lt;/li>
&lt;/ol>
&lt;ul>
&lt;li>&lt;strong>Containerization&lt;/strong>: Provide instructions for building the environment or,
ideally, provide a ready-to-use image. For example, Docker, Singularity, or VirtualBox images can be used for this purpose.&lt;/li>
&lt;li>&lt;strong>Reproducible Builds&lt;/strong>: Package managers like &lt;a href="https://nixos.org/" target="_blank" rel="noopener">Nix&lt;/a> or &lt;a href="https://guix.gnu.org/" target="_blank" rel="noopener">Guix&lt;/a> have recently spiked in popularity and allow their users to create reproducible environments, matching the exact software versions across different systems.&lt;/li>
&lt;/ul>
&lt;ol start="2">
&lt;li>
&lt;p>&lt;strong>Partial Automation&lt;/strong>: It often makes sense to break an experiment down into
smaller, more manageable steps. For Linux-based systems, bash scripts are particularly viable for this purpose. We recommend prefixing the scripts for each step with
a number, such that the order of execution is clear.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>X11 Availability&lt;/strong>: Usually, reviewers will not have access to a graphical user
interface on the system where the artifact is evaluated. If the artifact requires a
graphical user interface, provide a way to run the artifact without it. For example,
save &lt;code>matplotlib&lt;/code> plots to disk instead of showing them with &lt;code>plt.show()&lt;/code>.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Experiment output&lt;/strong>: Do not provide output files of the experiment in your artifact,
unless explicitly intended. If provided output files are intended for comparison,
they should be marked as such (e.g. in their filename). Similarly, any output logs
or interactive outputs in Jupyter notebooks should not be part of the artifact, but
rather be generated during the artifact evaluation.&lt;/p>
&lt;/li>
&lt;/ol>
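&lt;p>The partial-automation practice can be sketched as follows; the step scripts and their names are purely illustrative (in a real artifact they would ship with the repository), the point is only that a numeric prefix makes the execution order unambiguous:&lt;/p>

```python
import subprocess
from pathlib import Path

# Create three dummy step scripts standing in for a real artifact's
# setup/run/analysis stages (names are hypothetical).
for name, cmd in [("01_setup.sh", "echo setup"),
                  ("02_run.sh", "echo run"),
                  ("03_analyze.sh", "echo analyze")]:
    Path(name).write_text(cmd + "\n")

# Lexicographic sorting on the numeric prefix fixes the execution order.
step_names = []
for script in sorted(Path(".").glob("[0-9][0-9]_*.sh")):
    subprocess.run(["sh", str(script)], check=True)
    step_names.append(script.name)
print(step_names)
```

&lt;p>A reviewer can then either run the steps one by one or drive them all from a single loop like the one above.&lt;/p>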
&lt;h3 id="trovi-templates">Trovi Templates&lt;/h3>
&lt;p>Our templates share a common base that features
a &lt;em>central configuration file&lt;/em> for modifying the
Chameleon experiment parameters (such as node type). Building on this base, we provide three templates with sample experiments that each use different environments:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Docker template&lt;/strong>: This template is designed for containerized experiments and supports NVIDIA GPUs via the &lt;code>nvidia-container-toolkit&lt;/code> integration.&lt;/li>
&lt;li>&lt;strong>Nix template&lt;/strong>: Sets up the Nix package manager with a &lt;code>shell.nix&lt;/code> file that can be used to configure the environment.&lt;/li>
&lt;li>&lt;strong>Guix template&lt;/strong>: Installs the Guix package manager and executes a sample experiment from an existing reproducible paper that hinges on the reproducibility of the software environment.&lt;/li>
&lt;/ul>
&lt;h2 id="conclusion">Conclusion&lt;/h2>
&lt;p>In summary, the AutoAppendix project has been an insightful journey into the
complexities of artifact reproducibility. Our evaluations highlight both the
challenges and potential solutions for future reproducibility initiatives. By
following these essential guidelines and implementing best practices, we aim for the
research community to achieve higher standards of transparency and reliability
in scientific research and help to ensure that the results of experiments can be replicated by others.&lt;/p>
&lt;p>Thanks for following along with our progress! We’re excited to see the positive
impact these findings will have on the research community.&lt;/p>
&lt;p>If you are interested in the full project report, you can find it &lt;a href="https://drive.google.com/drive/folders/113OsxGAlfyvlJnvpH5zL2XD-8gE3CYyu?usp=sharing" target="_blank" rel="noopener">here&lt;/a>, together with the &lt;em>Trovi&lt;/em> templates.&lt;/p></description></item><item><title>Final Blogpost: Reproducibility in Data Visualization</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/niu/repro-vis/20240828-triveni5/</link><pubDate>Wed, 28 Aug 2024 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/niu/repro-vis/20240828-triveni5/</guid><description>&lt;p>Hello everyone!&lt;/p>
&lt;p>I&amp;rsquo;m Triveni, a Master&amp;rsquo;s student in Computer Science at Northern Illinois University (NIU). I&amp;rsquo;m excited to share my progress on the OSRE 2024 project &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/project/osre24/niu/repro-vis/">Categorize Differences in Reproduced Visualizations&lt;/a> focusing on data visualization reproducibility. Working under the mentorship of &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/david-koop/">David Koop&lt;/a>, I&amp;rsquo;ve made some significant strides and faced some interesting challenges.&lt;/p>
&lt;h1 id="reproducibility-in-data-visualization">Reproducibility in data visualization&lt;/h1>
&lt;p>Reproducibility is crucial in data visualization, ensuring that two visualizations accurately convey the same data. This is essential for maintaining transparency and trust in data-driven decision-making. When comparing two visualizations, the challenge is not just spotting differences but determining which differences are meaningful. Tools like OpenCV are often used for image comparison, but they may detect all differences, including those that do not impact the data&amp;rsquo;s interpretation. For example, slight shifts in labels might be flagged as differences even if the underlying data remains unchanged, making it challenging to assess whether the visualizations genuinely differ in terms of the information they convey.&lt;/p>
&lt;h1 id="a-breakthrough-with-chartdetective">A Breakthrough with ChartDetective&lt;/h1>
&lt;p>Among various tools like ChartOCR and ChartReader, ChartDetective proved to be the most effective. This tool enabled me to extract data from a range of visualizations, including bar charts, line charts, box plots, and scatter plots. To enhance its capabilities, I modified the codebase to capture pixel values alongside the extracted data and store both in a CSV file. This enhancement allowed for a direct comparison of data values and their corresponding pixel coordinates between two visualizations, focusing on meaningful differences that truly impact data interpretation.&lt;/p>
&lt;h1 id="example-comparing-two-bar-plots-with-chartdetective">Example: Comparing Two Bar Plots with ChartDetective&lt;/h1>
&lt;p>Consider two bar plots that visually appear similar but have slight differences in their data values. Using ChartDetective, I extracted the data and pixel coordinates from both plots and stored this information in a CSV file. The tool then compared these values to identify any discrepancies.&lt;/p>
&lt;p>For instance, in one bar plot, the height of a specific bar was slightly increased. By comparing the CSV files generated by ChartDetective, I was able to pinpoint these differences precisely. The final step involved highlighting these differences on one of the plots using OpenCV, making it clear where the visualizations diverged. This approach ensures that only meaningful differences—those that reflect changes in the data—are considered when assessing reproducibility.&lt;/p>
&lt;ul>
&lt;li>ChartDetective: SVG or PDF file of the visualization is uploaded to extract data.&lt;/li>
&lt;/ul>
&lt;p align="center">
&lt;img src="./barplot_chartdetective.png" alt="ChartDetective" style="width: 80%; height: auto;">
&lt;/p>
&lt;ul>
&lt;li>Data Extraction: Data values along with pixel details are stored in the CSV files.&lt;/li>
&lt;/ul>
&lt;p align="center">
&lt;img src="./barplots_pixels.png" alt="Data_Extraction" style="width: 80%; height: auto;">
&lt;/p>
&lt;ul>
&lt;li>Highlighting the differences: Differences are highlighted on one of the plots using OpenCV.&lt;/li>
&lt;/ul>
&lt;p align="center">
&lt;img src="./Highlighted_differences.png" alt="Highlighting the differences" style="width: 60%; height: auto;">
&lt;/p>
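&lt;p>A minimal sketch of the comparison step described above (the CSV layout and column names are assumptions for illustration, not ChartDetective&amp;rsquo;s actual export format): only differences in the extracted data values are reported, while pixel-only shifts are deliberately ignored:&lt;/p>

```python
import csv
import io

# Hypothetical CSV exports: one row per bar, with the extracted data
# value and its pixel coordinates.
original = "bar,value,px_x,px_y\nA,10,50,200\nB,20,120,100\n"
reproduced = "bar,value,px_x,px_y\nA,10,50,200\nB,22,120,80\n"

def load_values(text):
    return {row["bar"]: float(row["value"])
            for row in csv.DictReader(io.StringIO(text))}

def meaningful_diffs(a, b, tol=0.0):
    # Report bars whose data values differ by more than `tol`;
    # pixel coordinates are not compared, so a mere label shift
    # or rendering offset does not count as a difference.
    return {k: (a[k], b[k]) for k in a if k in b and abs(a[k] - b[k]) > tol}

diffs = meaningful_diffs(load_values(original), load_values(reproduced))
print(diffs)  # only bar B differs in its underlying data
```

&lt;p>Bar B is flagged because its data value changed, while the pixel shift of bar A&amp;rsquo;s unchanged value would not be reported.&lt;/p>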
&lt;h1 id="understanding-user-perspectives-on-reproducibility">Understanding User Perspectives on Reproducibility&lt;/h1>
&lt;p>To complement the technical analysis, I created a pilot survey to understand how users perceive reproducibility in data visualizations. The survey evaluates user interpretations of two visualizations and explores which visual parameters impact their decision-making. This user-centered approach is crucial because even minor differences in visual representation can significantly affect how data is interpreted and used.&lt;/p>
&lt;p>Pilot Survey Example:&lt;/p>
&lt;p>Pixel Differences: In one scenario, the height of two bars was altered slightly, introducing a noticeable yet subtle change.&lt;/p>
&lt;p>Label Swapping: In another scenario, the labels of two bars were swapped without changing their positions or heights.&lt;/p>
&lt;p align="center">
&lt;img src="./barchart_labels_swap.png" alt="Label Swapping" style="width: 80%; height: auto;">
&lt;/p>
&lt;p>Participants will be asked to evaluate the reproducibility of these visualizations, considering whether the differences impact their interpretation of the data. The goal is to determine which visual parameters—such as bar height or label positioning—users find most critical when assessing the similarity of visualizations.&lt;/p>
&lt;h1 id="future-work-and-conclusion">Future Work and Conclusion&lt;/h1>
&lt;p>Going forward, I plan to develop a proof of concept based on these findings and implement an extensive survey to further explore the impact of visual parameters on users&amp;rsquo; perceptions of reproducibility. Understanding this will help refine tools and methods for comparing visualizations, ensuring they not only look similar but also accurately represent the same underlying data.&lt;/p></description></item><item><title>Final blog: Automatic reproducibility of COMPSs experiments through the integration of RO-Crate in Chameleon</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/intel/20240822-architd/</link><pubDate>Thu, 22 Aug 2024 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/intel/20240822-architd/</guid><description>&lt;h2 id="introduction">Introduction&lt;/h2>
&lt;p>Hello everyone,&lt;/p>
&lt;p>I&amp;rsquo;m Archit from India, an undergraduate student at the Indian Institute of Technology, Banaras Hindu University (IIT BHU), Varanasi. As part of the &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/project/osre24/bsc/ro-crate-compss/">Automatic Reproducibility of COMPSs Experiments through the Integration of RO-Crate in Chameleon&lt;/a> project, my &lt;a href="https://drive.google.com/file/d/1qY-uipQZPox144LD4bs05rn3islfcjky/view" target="_blank" rel="noopener">proposal&lt;/a>, under the mentorship of &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/raul-sirvent/">Raül Sirvent&lt;/a>, aims to develop a service that facilitates the automated replication of COMPSs experiments within the Chameleon infrastructure.&lt;/p>
&lt;h2 id="about-the-project">About the Project&lt;/h2>
&lt;p>The project proposes to create a service that can take a COMPSs crate (an artifact adhering to the RO-Crate specification) and, through analysis of the provided metadata, construct a Chameleon-compatible image for replicating the experiment on the testbed.&lt;/p>
&lt;h2 id="final-product">Final Product&lt;/h2>
&lt;p align="center">
&lt;img src="./logo.png" alt="Logo" style="width: 60%; height: auto;">
&lt;/p>
&lt;p>The basic workflow of the COMPSs Reproducibility Service can be explained as follows:&lt;/p>
&lt;ol>
&lt;li>The service takes the workflow path or link as the first argument from the user.&lt;/li>
&lt;li>The program shifts the execution to a separate sub-directory, &lt;code>reproducibility_service_{timestamp}&lt;/code>, to store the results from the reproducibility process.&lt;/li>
&lt;li>Two main flags are required:
&lt;ul>
&lt;li>&lt;strong>Provenance flag&lt;/strong>: If you want to generate the provenance of the workflow via the runcompss runtime.&lt;/li>
&lt;li>&lt;strong>New Dataset flag&lt;/strong>: If you want to reproduce the experiment with a new dataset instead of the one originally used.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>If there are any remote datasets, they are fetched into the sub-directory.&lt;/li>
&lt;li>The main work begins with parsing the metadata from &lt;code>ro-crate-metadata.json&lt;/code> and verifying the files present inside the dataset, as well as any files downloaded as remote datasets. This step generates a status table for the user to check if any files are missing or have modified sizes.&lt;/li>
&lt;/ol>
&lt;p align="center">
&lt;img src="./status_table.png" alt="Status Table" style="width: 70%; height: auto;">
&lt;/p>
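&lt;p>The verification in step 5 can be sketched roughly as follows; the metadata layout here is heavily simplified (a real &lt;code>ro-crate-metadata.json&lt;/code> contains much more than a file list with sizes), but it shows the idea of checking every listed file for presence and size:&lt;/p>

```python
import json
from pathlib import Path

# Hypothetical, heavily simplified crate: one data file plus metadata.
Path("dataset.csv").write_text("a,b\n1,2\n")
Path("ro-crate-metadata.json").write_text(json.dumps(
    {"@graph": [{"@id": "dataset.csv", "contentSize": 8}]}))

def verify_crate(meta_path="ro-crate-metadata.json"):
    # Build a status table: OK, MISSING, or SIZE MISMATCH per listed file.
    graph = json.loads(Path(meta_path).read_text())["@graph"]
    status = {}
    for entity in graph:
        f = Path(entity["@id"])
        if not f.exists():
            status[entity["@id"]] = "MISSING"
        elif f.stat().st_size != entity.get("contentSize"):
            status[entity["@id"]] = "SIZE MISMATCH"
        else:
            status[entity["@id"]] = "OK"
    return status

print(verify_crate())
```

&lt;p>The resulting dictionary corresponds to the status table shown above, letting the user spot missing or modified files before the experiment is re-run.&lt;/p>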
&lt;ol start="6">
&lt;li>The final step is to transform the &lt;code>compss-command-line.txt&lt;/code> and all the paths specified inside it to match the local environment where the experiment will be reproduced. This includes:
&lt;ul>
&lt;li>Mapping the paths from the old machine to new paths inside the RO-Crate.&lt;/li>
&lt;li>Changing the runtime to &lt;code>runcompss&lt;/code> or &lt;code>enqueue_compss&lt;/code>, depending on whether the environment is a SLURM cluster.&lt;/li>
&lt;li>Detecting if the paths specified in the command line are for results, and redirecting them to new results inside the &lt;code>reproducibility_service_{timestamp}/Results&lt;/code> directory.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>After this, the service prompts the user to add any additional flags to the final command. Upon final verification, the command is executed via Python&amp;rsquo;s subprocess pipe.&lt;/li>
&lt;/ol>
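&lt;p>Step 6 boils down to path rewriting; a minimal sketch (the paths and directory names here are invented for illustration, not taken from the actual service) might look like:&lt;/p>

```python
def remap_command(command, old_root, new_root):
    # Replace absolute paths recorded on the original machine with the
    # corresponding paths inside the local reproduction directory.
    return command.replace(old_root, new_root)

cmd = "runcompss /home/alice/exp/app.py /home/alice/exp/data/in.txt"
local_cmd = remap_command(cmd, "/home/alice/exp",
                          "reproducibility_service_20240822/Workflow")
print(local_cmd)
```

&lt;p>The real service additionally swaps the runtime between &lt;code>runcompss&lt;/code> and &lt;code>enqueue_compss&lt;/code> and redirects result paths, but the core transformation is this kind of old-root to new-root mapping.&lt;/p>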
&lt;p align="center">
&lt;img src="./end.png" alt="End Image" style="width: 50%; height: auto;">
&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Logging System&lt;/strong>: All logs related to the Reproducibility Service are stored inside the &lt;code>reproducibility_service_{timestamp}/log&lt;/code> directory.&lt;/li>
&lt;/ul>
&lt;p>You can view the basic &lt;a href="https://github.com/Minimega12121/COMPSs-Reproducibility-Service/blob/main/pseudocode.txt" target="_blank" rel="noopener">pseudocode&lt;/a> of the service.&lt;/p>
&lt;h2 id="conclusion-and-future-work">Conclusion and Future Work&lt;/h2>
&lt;p>It&amp;rsquo;s been a long journey since I started this project, and now it&amp;rsquo;s finally coming to an end. I have learned a lot from this experience, from weekly meetings with my mentor to working towards long-term goals—it has all been thrilling. I would like to thank the OSRE community and my mentor for providing me with this learning opportunity.&lt;/p>
&lt;p>This is only version 1.0.0 of the Reproducibility Service. If I have time from my coursework, I would like to fix any bugs or improve the service further to meet user needs.&lt;/p>
&lt;p>However, the following issues still exist with the service and can be improved upon:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Third-party software dependencies&lt;/strong>: Automatic detection and loading of these dependencies on a SLURM cluster are not yet implemented. Currently, these must be handled manually by the user.&lt;/li>
&lt;li>&lt;strong>Support for workflows with &lt;code>data_persistence = False&lt;/code>&lt;/strong>: There is no support for workflows where all datasets are remote files.&lt;/li>
&lt;/ul>
&lt;h2 id="deliverables">Deliverables&lt;/h2>
&lt;ul>
&lt;li>&lt;a href="https://github.com/Minimega12121/COMPSs-Reproducibility-Service" target="_blank" rel="noopener">Reproducibility Service Repository&lt;/a>: This repository contains the main service along with guidelines on how to use it. The service will be integrated with the COMPSs official distribution in its next release.&lt;/li>
&lt;li>&lt;a href="https://www.chameleoncloud.org/appliances/121/" target="_blank" rel="noopener">Chameleon Appliance&lt;/a> : This is a single-node appliance with COMPSs 3.3.1 installed, so that anyone with access to Chameleon can reproduce experiments.&lt;/li>
&lt;/ul>
&lt;h2 id="previous-blogs">Previous Blogs&lt;/h2>
&lt;p>Make sure to check out my other blogs to see how I started this project and the challenges I faced along the way:&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/intel/20240612-architd/">First blog&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/intel/20240729-architd/">Mid-term blog&lt;/a>&lt;/li>
&lt;/ul>
&lt;p>Thank you for reading the blog, have a nice day!!&lt;/p></description></item><item><title>Data Leakage in Applied ML</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/nyu/data-leakage/20240813-shaivimalik/</link><pubDate>Tue, 13 Aug 2024 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/nyu/data-leakage/20240813-shaivimalik/</guid><description>&lt;p>Hello everyone!&lt;/p>
&lt;p>I have been working on reproducing the results from &lt;strong>Characterization of Term and Preterm Deliveries using Electrohysterograms Signatures&lt;/strong>. This paper aims to predict preterm birth using a Support Vector Machine with an RBF kernel. However, there is a major flaw in the methodology: &lt;strong>preprocessing performed jointly on the training and test sets&lt;/strong>. This form of data leakage occurs when preprocessing is applied to the entire dataset before it is split into training and test sets.&lt;/p>
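&lt;p>A toy numerical illustration of this kind of leakage (the numbers are made up): standardizing with statistics computed over the entire dataset lets the test point influence the training features, so the model effectively sees test information during training:&lt;/p>

```python
import statistics

data = [1.0, 2.0, 3.0, 100.0]      # the outlier will be the test set
train, test = data[:3], data[3:]

# Leaky: mean/std computed over train + test before splitting
mu_all, sd_all = statistics.mean(data), statistics.pstdev(data)
leaky_train = [(x - mu_all) / sd_all for x in train]

# Correct: statistics fitted on the training set only, then reused
mu_tr, sd_tr = statistics.mean(train), statistics.pstdev(train)
clean_train = [(x - mu_tr) / sd_tr for x in train]
clean_test = [(x - mu_tr) / sd_tr for x in test]

# The training features differ, so any model fit on them differs too.
print(leaky_train != clean_train)
```

&lt;p>The same principle applies to any fitted preprocessing step, such as the oversampling used in the EHG pipeline: fit on the training split only, then apply to the test split.&lt;/p>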
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Sample produced from test and training set samples" srcset="
/report/osre24/nyu/data-leakage/20240813-shaivimalik/leakage_hu7171e85c8455cc3219721a2e3b71a711_62548_687703a1dee465e80fb3dbe262dd5860.webp 400w,
/report/osre24/nyu/data-leakage/20240813-shaivimalik/leakage_hu7171e85c8455cc3219721a2e3b71a711_62548_42051adaf7804083284553c10ca73861.webp 760w,
/report/osre24/nyu/data-leakage/20240813-shaivimalik/leakage_hu7171e85c8455cc3219721a2e3b71a711_62548_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/nyu/data-leakage/20240813-shaivimalik/leakage_hu7171e85c8455cc3219721a2e3b71a711_62548_687703a1dee465e80fb3dbe262dd5860.webp"
width="760"
height="589"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Sample produced from training set samples" srcset="
/report/osre24/nyu/data-leakage/20240813-shaivimalik/no_leakage_hu60ff986c558a17237e53708798334267_66856_47e6397030251c1681ff92260f687641.webp 400w,
/report/osre24/nyu/data-leakage/20240813-shaivimalik/no_leakage_hu60ff986c558a17237e53708798334267_66856_8bad9197813df4344757765d43878a56.webp 760w,
/report/osre24/nyu/data-leakage/20240813-shaivimalik/no_leakage_hu60ff986c558a17237e53708798334267_66856_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/nyu/data-leakage/20240813-shaivimalik/no_leakage_hu60ff986c558a17237e53708798334267_66856_47e6397030251c1681ff92260f687641.webp"
width="760"
height="594"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>Reproducing the published results came with its own challenges, including updating EHG-Oversampling to extract meaningful features from EHG signals and finding optimal hyperparameters for the model. Through this work and the accompanying toy example notebooks, we have been able to demonstrate that data leakage leads to overly optimistic measures of model performance and that models trained with data leakage fail to generalize to real-world data. In such cases, performance on the test set doesn&amp;rsquo;t translate to performance in the real world.&lt;/p>
&lt;p>Next, I&amp;rsquo;ll be reproducing the results published in &lt;strong>Identification of COVID-19 Samples from Chest X-Ray Images Using Deep Learning: A Comparison of Transfer Learning Approaches&lt;/strong>.&lt;/p>
&lt;p>You can follow my work on the EHG paper &lt;a href="https://github.com/shaivimalik/medicine_preprocessing-on-entire-dataset" target="_blank" rel="noopener">here&lt;/a>.&lt;/p>
&lt;p>Stay tuned for more insights on data leakage and updates on our progress!&lt;/p></description></item><item><title>Midterm Check-In: Progress on the AutoAppendix Project</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/tuwien/autoappendix/20240803-kkrassni/</link><pubDate>Sat, 03 Aug 2024 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/tuwien/autoappendix/20240803-kkrassni/</guid><description>&lt;p>Hi all,&lt;/p>
&lt;p>I&amp;rsquo;m happy to share a quick update on the AutoAppendix project as we’re about
halfway through. We’ve made some steady progress on evaluating artifacts from SC24 papers, and we&amp;rsquo;re starting
to think about how we can use what we&amp;rsquo;ve learned to
improve the artifact evaluation process in the future.&lt;/p>
&lt;h2 id="what-weve-been-up-to">What We’ve Been Up To&lt;/h2>
&lt;p>As a quick reminder, the goal of our project is to develop a set of guidelines that
researchers can use to improve the reproducibility of their work. We&amp;rsquo;re focusing
on papers from the Supercomputing Conference 2024 that applied for an &amp;ldquo;Artifact Replicable&amp;rdquo; badge, and we&amp;rsquo;re
evaluating their artifacts to see how well the experiments can be replicated. Since it was difficult to anticipate the exact outcomes of the project beyond detailed experiment recreation, the main goal of this
midterm check-in is to share the insights we have gathered so far and to set the stage for the final outcomes.&lt;/p>
&lt;p>Our main task so far has been selecting submissions with experiments designed
for Chameleon Cloud, or those that could be easily adapted to run on Chameleon. As there were 45 submissions that applied
for an &amp;ldquo;Artifact Replicable&amp;rdquo; badge, it was not easy
to choose which ones to evaluate, but we managed to narrow
it down to 18 papers that we thought would be a good fit for our project.&lt;/p>
&lt;p>We&amp;rsquo;ve chosen to focus on papers that do not require
special hardware (like a specific supercomputer) or
complex network setups, as it would be difficult to
generalize the insights from these kinds of
experiments. Instead, we&amp;rsquo;ve been looking at those
that require only a &lt;em>single computation node&lt;/em>, and
could theoretically be run with the available hardware
on Chameleon.&lt;/p>
&lt;h2 id="observations-and-learning-points">Observations and Learning Points&lt;/h2>
&lt;p>At the moment, we&amp;rsquo;re about halfway through the
evaluation process. So far, we&amp;rsquo;ve noticed a range of
approaches to documenting and setting up computational
experiments. Even without looking at the appendices in
detail, it&amp;rsquo;s clear that there’s a lot of room for
standardization of the documentation format and software setup, which could make life easier for
everyone involved. This particularly applies to
software setups, which are often daunting to replicate,
especially when there are specific version requirements, version
incompatibilities or outright missing dependencies. Since the main goal of this
project is to develop a set of guidelines that
researchers can use to improve the reproducibility of
their work, suggesting a way to deal with software
versions and dependencies will be a key part of our
results.&lt;/p>
&lt;p>We’ve observed that submissions with well-structured and detailed appendices
tend to fare better in reproducibility checks. This includes those that utilized
containerization solutions like Docker, which encapsulate the computing
environment needed to run the experiments and thus
eliminate the need to install specific software
packages. It’s these kinds of practices that we
think could be encouraged more broadly.&lt;/p>
&lt;h2 id="looking-ahead">Looking Ahead&lt;/h2>
&lt;p>The next steps are pretty exciting! We’re planning to use what we’ve learned to draft some
guidelines that could help future SC conference submissions be more consistent.
This might include templates or checklists that ensure all the necessary details
are covered.&lt;/p>
&lt;p>Additionally, we’re thinking about ways to automate some parts of the artifact
evaluation process. The goal here is to make it less labor-intensive and more
objective. A particularly nice way
of reproducible artifact evaluation is
Chameleon&amp;rsquo;s JupyterHub interface, which, in combination with the &lt;em>Trovi&lt;/em>
artifact sharing platform, makes it easy to share artifacts and allows interested
parties to reproduce the experiments with minimal effort. We are thus looking into ways to
utilize and contribute to these tools in a way that could benefit the broader research community.&lt;/p>
&lt;h2 id="wrapping-up">Wrapping Up&lt;/h2>
&lt;p>That’s it for now! We are working towards getting
as many insights as possible from the rest of the
artifact evaluations, and hopefully, by the end of this project, we’ll have some solid
recommendations and tools to show for it. Thanks for keeping up with our
progress, and I’ll be back with more updates as we move into the final stages of
our work.&lt;/p></description></item><item><title>Mid-term Blog: Automatic reproducibility of COMPSs experiments through the integration of RO-Crate in Chameleon</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/intel/20240729-architd/</link><pubDate>Mon, 29 Jul 2024 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/intel/20240729-architd/</guid><description>&lt;h2 id="introduction">Introduction&lt;/h2>
&lt;p>Hello everyone!
I&amp;rsquo;m Archit from India, an undergraduate student at the Indian Institute of Technology, Banaras Hindu University (IIT BHU), Varanasi. As part of the &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/project/osre24/bsc/ro-crate-compss/">Automatic reproducibility of COMPSs experiments through the integration of RO-Crate in Chameleon&lt;/a> project, my &lt;a href="https://drive.google.com/file/d/1qY-uipQZPox144LD4bs05rn3islfcjky/view" target="_blank" rel="noopener">proposal&lt;/a>, developed under the mentorship of &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/raul-sirvent/">Raül Sirvent&lt;/a>, aims to build a service that facilitates the automated replication of COMPSs experiments within the Chameleon infrastructure.&lt;/p>
&lt;h2 id="about-the-project">About the project:&lt;/h2>
&lt;p>The project proposes to create a service that will have the capability to take a COMPSs crate (an artifact adhering to the RO-Crate specification) and, through analysis of the provided metadata construct a Chameleon-compatible image for replicating the experiment on the testbed.&lt;/p>
&lt;h2 id="progress">Progress&lt;/h2>
&lt;p>It has been more than six weeks since the ReproducibilityService project began, and significant progress has been made. You can test the actual service from my GitHub repository: &lt;a href="https://github.com/Minimega12121/COMPSs-Reproducibility-Service" target="_blank" rel="noopener">ReproducibilityService&lt;/a>. Let&amp;rsquo;s break down what the ReproducibilityService is capable of doing now:&lt;/p>
&lt;ol>
&lt;li>
&lt;p>&lt;strong>Support for Reproducing Basic COMPSs Experiments&lt;/strong>: The RS program is now fully capable of reproducing basic COMPSs experiments with no third-party dependencies on any device with the COMPSs Runtime installed. Here&amp;rsquo;s how it works:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Getting the Crate&lt;/strong>: The RS program can accept the COMPSs workflow from the user either as a path to the crate or as a link from WorkflowHub. In either case, it creates a sub-directory for further execution named &lt;code>reproducibility_service_{timestamp}&lt;/code> and stores the workflow as &lt;code>reproducibility_service_{timestamp}/Workflow&lt;/code>.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Address Mapping&lt;/strong>: The ro-crate contains &lt;code>compss_submission_command_line.txt&lt;/code>, which is the command originally used to execute the experiment. This command may include many paths such as &lt;code>runcompss flag1 flag2 ... flagn &amp;lt;main_workflow_file.py&amp;gt; input1 input2 ... inputn output&lt;/code>. The RS program maps all the paths for &lt;code>&amp;lt;main_workflow_file.py&amp;gt; input1 input2 ... inputn output&lt;/code> to paths inside the machine where we want to reproduce the experiment. The flags are dropped as they may be device-specific, and the service asks the user for any new flags they want to add to the COMPSs runtime.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Verifying Files&lt;/strong>: Before reproducing an experiment, it&amp;rsquo;s crucial to check whether the inputs or outputs have been tampered with. The RS program cross-verifies the &lt;code>contentSize&lt;/code> from the &lt;code>ro-crate-metadata.json&lt;/code> and generates warnings in case of any abnormalities.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Error Logging&lt;/strong>: In case of any problems during execution, the &lt;code>std_out&lt;/code> and &lt;code>std_err&lt;/code> are stored inside &lt;code>reproducibility_service_{timestamp}/log&lt;/code>.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Results&lt;/strong>: If the experiment generates any results, the RS program stores them inside &lt;code>reproducibility_service_{timestamp}/Results&lt;/code>. If provenance of the workflow is also requested, the resulting ro-crate is stored there as well.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ol>
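&lt;p>To illustrate the address-mapping step, here is a minimal Python sketch (the &lt;code>remap_command&lt;/code> helper and the token layout are illustrative assumptions, not the actual RS implementation):&lt;/p>

```python
import os

def remap_command(command, crate_dir, extra_flags=None):
    """Rebuild the runcompss invocation: drop the original (device-specific)
    flags and point every file argument into the local crate directory."""
    tokens = command.split()
    # tokens[0] is 'runcompss'; flags start with '-'; the rest are file paths.
    files = [t for t in tokens[1:] if not t.startswith("-")]
    remapped = [os.path.join(crate_dir, os.path.basename(f)) for f in files]
    return ["runcompss"] + (extra_flags or []) + remapped

cmd = "runcompss --tracing=true matmul.py in_a.txt in_b.txt out.txt"
print(remap_command(cmd, "/tmp/Workflow"))
```

&lt;p>The original flags are discarded because they may be device-specific; any flags the user supplies interactively would be passed in as &lt;code>extra_flags&lt;/code>.&lt;/p>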
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="REPRODUCIBILITY SERVICE FLOWCHART" srcset="
/report/osre24/intel/20240729-architd/RS_chart_hu1a952b7a4697c53cd74822153911f260_56808_4df9e9a771513277aaf5c7a4d8182666.webp 400w,
/report/osre24/intel/20240729-architd/RS_chart_hu1a952b7a4697c53cd74822153911f260_56808_0b96071409b70d8356241465bf214510.webp 760w,
/report/osre24/intel/20240729-architd/RS_chart_hu1a952b7a4697c53cd74822153911f260_56808_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/intel/20240729-architd/RS_chart_hu1a952b7a4697c53cd74822153911f260_56808_4df9e9a771513277aaf5c7a4d8182666.webp"
width="760"
height="267"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;ol start="2">
&lt;li>&lt;strong>Support for Reproducing Remote Datasets&lt;/strong>: If a remote dataset is specified inside the metadata file, the RS program fetches the dataset from the specified link using &lt;code>wget&lt;/code>, stores the remote dataset inside the crate, and updates the path in the new command line it generates.&lt;/li>
&lt;/ol>
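&lt;p>The file-verification step can be sketched in a few lines of Python (a simplified sketch; &lt;code>verify_crate_files&lt;/code> is a hypothetical name, and real RO-Crate metadata contains more entity types than shown here):&lt;/p>

```python
import json
import os

def verify_crate_files(crate_dir):
    """Cross-check each file's on-disk size against the contentSize
    recorded in ro-crate-metadata.json; return a list of warnings."""
    with open(os.path.join(crate_dir, "ro-crate-metadata.json")) as f:
        metadata = json.load(f)
    warnings = []
    for entity in metadata.get("@graph", []):
        if "contentSize" not in entity:
            continue  # only file entities carry a recorded size
        path = os.path.join(crate_dir, entity["@id"])
        if not os.path.exists(path):
            warnings.append(f"missing file: {entity['@id']}")
        elif os.path.getsize(path) != int(entity["contentSize"]):
            warnings.append(f"size mismatch: {entity['@id']}")
    return warnings
```

&lt;p>A size mismatch does not prove tampering, but it is a cheap signal that the inputs or outputs no longer match what the original author packaged.&lt;/p>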
&lt;h2 id="challenges-and-end-term-goals">Challenges and End-Term Goals&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Support for DATA_PERSISTENCE_FALSE&lt;/strong>: The RS program still needs to support crates with &lt;code>dataPersistence&lt;/code> set to false. After weeks of brainstorming implementation ideas, we recently concluded that, since the majority of &lt;code>DATA_PERSISTENCE_FALSE&lt;/code> crates are run on SLURM clusters and the dataset to fetch in such cases resides somewhere inside the cluster, the RS program will support this case on such clusters. Currently, I am working with the Nord3v2 cluster to further enhance the functionality of ReproducibilityService.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Chameleon Cluster Setup&lt;/strong>: I have made some progress towards creating a new COMPSs 3.3 Appliance on Chameleon to test the service. However, creating the cluster setup script needed for the service to run on a COMPSs 3.3.1 cluster to execute large experiments has been challenging.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Integrating with COMPSs Repository&lt;/strong>: After completing the support for &lt;code>dataPersistence&lt;/code> false cases, we aim to launch this service as a tool inside the &lt;a href="https://github.com/bsc-wdc/compss" target="_blank" rel="noopener">COMPSs repository&lt;/a>. This will be a significant milestone in my developer journey as it will be the first real-world project I have worked on, and I hope everything goes smoothly.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>Stay tuned for the next blog!!&lt;/p></description></item><item><title>Enabling VAA Execution: Environment and VAA Preparation and/or Reproducibility for Dynamic Bandwidth Allocation (CONCIERGE)</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/uchicago/edgerep/20240720-rafaelsw/</link><pubDate>Sat, 20 Jul 2024 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/uchicago/edgerep/20240720-rafaelsw/</guid><description>&lt;p>Hi there!&lt;/p>
&lt;p>I am Rafael Sinjunatha Wulangsih, a Telecommunication Engineering graduate from the Bandung Institute of Technology (ITB), Bandung, Indonesia. I&amp;rsquo;m currently contributing to the &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/project/osre24/uchicago/edgerep">&amp;ldquo;EdgeRep: Reproducing and benchmarking edge analytic systems&amp;rdquo;&lt;/a> project under the mentorship of &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/yuyang-roy-huang/">Yuyang (Roy) Huang&lt;/a> and Prof. &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/junchen-jiang/">Junchen Jiang&lt;/a>. You can find more details about the project proposal &lt;a href="https://drive.google.com/file/d/1GUMiglFqezOqEeQiMaL4QVgsXZOHYoEK/view?usp=drive_link" target="_blank" rel="noopener">here&lt;/a>.&lt;/p>
&lt;p>This project addresses the challenges posed by the massive deployment of edge devices, such as traffic or security cameras, in smart cities and other environments. In the previous Edgebench project, the team proposed a solution to dynamically allocate bandwidth and compute resources to video analytic applications (VAAs) running on edge devices. However, that project was limited to a single VAA, which may not represent the diverse applications running on edge devices. Therefore, the main goal of this project, &amp;ldquo;EdgeRep,&amp;rdquo; is to diversify the VAAs running on edge devices while utilizing a solution similar to that of the Edgebench project. EdgeRep aims to reproduce state-of-the-art self-adaptive VAAs (with seven candidates) and maintain self-adaptation in these video analytics pipelines. We will implement it ourselves if the video analytics applications do not support self-adaptation.&lt;/p></description></item><item><title>Halfway Through GSOC: Heterogeneous Graph Neural Networks for I/O Performance Bottleneck Diagnosis</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/lbl/aiio/20240720-mahdi/</link><pubDate>Sat, 20 Jul 2024 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/lbl/aiio/20240720-mahdi/</guid><description>&lt;p>Hello, I&amp;rsquo;m &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/mahdi-banisharifdehkordi/">Mahdi Banisharifdehkordi&lt;/a>, a Ph.D. student in Computer Science at Iowa State University. I&amp;rsquo;m currently working on the &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/project/osre24/lbl/aiio/">AIIO / Graph Neural Network&lt;/a> project under the guidance of &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/bin-dong/">Bin Dong&lt;/a> and Suren Byna. 
Our project focuses on enhancing the AIIO framework to automatically diagnose I/O performance bottlenecks in high-performance computing (HPC) systems using Graph Neural Networks (GNNs).&lt;/p>
&lt;h1 id="project-overview">Project Overview&lt;/h1>
&lt;p>Our primary goal is to tackle the persistent issue of I/O bottlenecks in HPC applications. Identifying these bottlenecks manually is often labor-intensive and prone to errors. By integrating GNNs into the AIIO framework, we aim to create an automated solution that can diagnose these bottlenecks with high accuracy, ultimately improving the efficiency and reliability of HPC systems.&lt;/p>
&lt;h1 id="progress-and-challenges">Progress and Challenges&lt;/h1>
&lt;p>Over the past few weeks, my work has been centered on developing a robust data pre-processing pipeline. This pipeline is crucial for converting raw I/O log data into a graph format suitable for GNN analysis. The data pre-processing involves extracting relevant features from Darshan I/O logs, which include job-related information and performance metrics. One of the main challenges has been dealing with the heterogeneity and sparsity of the data, which can affect the accuracy of our models. To address this, we&amp;rsquo;ve focused on using correlation analysis to identify and select the most relevant features, ensuring that the dataset is well-structured and informative for GNN processing.&lt;/p>
&lt;p>We&amp;rsquo;ve also started constructing the GNN model. The model is designed to capture the complex relationships between different I/O operations and their impact on system performance. This involves defining nodes and edges in the graph that represent job IDs, counter types, and their values. We explored different graph structures, including those that focus on counter types and those that incorporate more detailed information. While more detailed graphs offer better accuracy, they also require more computational resources.&lt;/p>
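&lt;p>Conceptually, the log-to-graph conversion looks like the following sketch: job IDs and counter types become the two node sets of a bipartite graph, and edges carry the counter values. The row schema and names here are illustrative, not the actual Darshan fields used in the project.&lt;/p>

```python
# Toy "flattened log" rows standing in for extracted Darshan features.
rows = [
    {"job_id": "job_1", "counter": "POSIX_READS", "value": 120},
    {"job_id": "job_1", "counter": "POSIX_WRITES", "value": 30},
    {"job_id": "job_2", "counter": "POSIX_READS", "value": 5},
]

def build_graph(rows):
    """Collect the two node sets and the weighted edges between them."""
    nodes = {"job": set(), "counter": set()}
    edges = []  # (job node, counter node, edge weight)
    for row in rows:
        nodes["job"].add(row["job_id"])
        nodes["counter"].add(row["counter"])
        edges.append((row["job_id"], row["counter"], row["value"]))
    return nodes, edges

nodes, edges = build_graph(rows)
```

&lt;p>A GNN then propagates information along these edges, so a job node's representation is shaped by the counters it touches and by other jobs that share those counters.&lt;/p>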
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Overview" srcset="
/report/osre24/lbl/aiio/20240720-mahdi/overview_hu3b8a0374313d077c49f26c894c548b00_437453_efa6bf6f7434ca74fff6a35fcb540861.webp 400w,
/report/osre24/lbl/aiio/20240720-mahdi/overview_hu3b8a0374313d077c49f26c894c548b00_437453_de1d11a65f3f46dfd75b1bc00e8e6406.webp 760w,
/report/osre24/lbl/aiio/20240720-mahdi/overview_hu3b8a0374313d077c49f26c894c548b00_437453_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/lbl/aiio/20240720-mahdi/overview_hu3b8a0374313d077c49f26c894c548b00_437453_efa6bf6f7434ca74fff6a35fcb540861.webp"
width="760"
height="566"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;h1 id="current-achievements">Current Achievements&lt;/h1>
&lt;ol>
&lt;li>
&lt;p>&lt;strong>Data Pre-processing Pipeline&lt;/strong>: We have successfully developed and tested the pipeline to transform Darshan I/O logs into graph-structured data. This was a significant milestone, as it sets the foundation for all subsequent GNN modeling efforts.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>GNN Model Construction&lt;/strong>: The initial version of our GNN model has been implemented. This model is now capable of learning from the graph data and making predictions about I/O performance bottlenecks.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Correlation Analysis for Graph Structure Design&lt;/strong>: We have used correlation analysis on the dataset to understand the relationships between I/O counters. This analysis has been instrumental in designing a more effective graph structure, helping to better capture the dependencies and interactions critical for accurate performance diagnosis.&lt;/p>
&lt;/li>
&lt;/ol>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Correlation Analysis1" srcset="
/report/osre24/lbl/aiio/20240720-mahdi/correlation1_huf0ba9e5fcd08c89560bf3e668ac22994_763024_211eb50374f4febd5aee688644797792.webp 400w,
/report/osre24/lbl/aiio/20240720-mahdi/correlation1_huf0ba9e5fcd08c89560bf3e668ac22994_763024_fd5992e42a60d6cb85be9cd136a5d93b.webp 760w,
/report/osre24/lbl/aiio/20240720-mahdi/correlation1_huf0ba9e5fcd08c89560bf3e668ac22994_763024_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/lbl/aiio/20240720-mahdi/correlation1_huf0ba9e5fcd08c89560bf3e668ac22994_763024_211eb50374f4febd5aee688644797792.webp"
width="760"
height="614"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Correlation Analysis2" srcset="
/report/osre24/lbl/aiio/20240720-mahdi/correlation2_hu550eeb7f303ef774f36732146058c5a3_277912_b05324cc90f73bd1b2ff53c9d2d04ecb.webp 400w,
/report/osre24/lbl/aiio/20240720-mahdi/correlation2_hu550eeb7f303ef774f36732146058c5a3_277912_0115179de349c5834c2b3fc2636ecd23.webp 760w,
/report/osre24/lbl/aiio/20240720-mahdi/correlation2_hu550eeb7f303ef774f36732146058c5a3_277912_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/lbl/aiio/20240720-mahdi/correlation2_hu550eeb7f303ef774f36732146058c5a3_277912_b05324cc90f73bd1b2ff53c9d2d04ecb.webp"
width="760"
height="309"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
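&lt;p>The correlation-based feature selection described in item 3 can be sketched as follows (a toy example with hypothetical counter names and values; the project works with full Darshan logs):&lt;/p>

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def select_features(features, target, threshold=0.8):
    """Keep counters whose absolute correlation with the target
    performance metric exceeds the threshold."""
    return [name for name, series in features.items()
            if abs(pearson(series, target)) > threshold]

# Toy counters vs. a bandwidth-like metric; values are illustrative only.
features = {
    "POSIX_READS":  [10, 20, 30, 40],
    "RANDOM_NOISE": [3, 1, 4, 1],
}
bandwidth = [100, 210, 290, 405]
print(select_features(features, bandwidth))
```

&lt;p>Counters that barely correlate with the target metric are dropped, keeping the graph compact and the remaining features informative.&lt;/p>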
&lt;ol start="4">
&lt;li>&lt;strong>Training for Different Graph Structures&lt;/strong>: We are currently training our model using various graph structures to determine the most effective configuration for accurate I/O performance diagnosis. This ongoing process aims to refine our approach and improve the model&amp;rsquo;s predictive accuracy.&lt;/li>
&lt;/ol>
&lt;h1 id="next-steps">Next Steps&lt;/h1>
&lt;p>Looking ahead, we plan to focus on several key areas:&lt;/p>
&lt;ol>
&lt;li>
&lt;p>&lt;strong>Refinement and Testing&lt;/strong>: We&amp;rsquo;ll continue refining the GNN model, focusing on improving its accuracy and efficiency. This includes experimenting with different graph structures and training techniques.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>SHAP Analysis&lt;/strong>: To enhance the interpretability of our model, we&amp;rsquo;ll incorporate SHAP (SHapley Additive exPlanations) values. This will help us understand the contribution of each feature to the model&amp;rsquo;s predictions, making it easier to identify critical factors in I/O performance.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Documentation and Community Engagement&lt;/strong>: As we make progress, we&amp;rsquo;ll document our methods and findings, sharing them with the broader community. This includes contributing to open-source repositories and engaging with other researchers in the field.&lt;/p>
&lt;/li>
&lt;/ol>
&lt;p>This journey has been both challenging and rewarding, and I am grateful for the support and guidance from my mentors and the community. I look forward to sharing more updates as we continue to advance this exciting project.&lt;/p></description></item><item><title>Reproducibility in Data Visualization</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/niu/repro-vis/20240719-triveni5/</link><pubDate>Fri, 19 Jul 2024 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/niu/repro-vis/20240719-triveni5/</guid><description>&lt;p>Hello everyone!&lt;/p>
&lt;p>I&amp;rsquo;m Triveni, a Master&amp;rsquo;s student in Computer Science at Northern Illinois University (NIU). I&amp;rsquo;m excited to share my progress on the OSRE 2024 project &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/project/osre24/niu/repro-vis/">Categorize Differences in Reproduced Visualizations&lt;/a> focusing on data visualization reproducibility. Working under the mentorship of &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/david-koop/">David Koop&lt;/a>, I&amp;rsquo;ve made some significant strides and faced some interesting challenges.&lt;/p>
&lt;h2 id="initial-approach-and-challenges">Initial Approach and Challenges&lt;/h2>
&lt;p>I began my work by comparing original visualizations with reproduced ones using OpenCV for pixel-level comparison. This method helped highlight structural differences but also brought to light some challenges. Different versions of libraries rendered visualizations slightly differently, causing minor positional changes that didn&amp;rsquo;t affect the overall message but were still flagged as discrepancies.&lt;/p>
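&lt;p>As a simplified stand-in for the OpenCV comparison, a pixel-level diff can be sketched in plain Python (images as same-shaped nested lists of grayscale values; the function name and tolerance are illustrative):&lt;/p>

```python
def pixel_diff_ratio(img_a, img_b, tol=0):
    """Fraction of pixel positions whose values differ by more than tol."""
    total = diff = 0
    for row_a, row_b in zip(img_a, img_b):
        for pa, pb in zip(row_a, row_b):
            total += 1
            if abs(pa - pb) > tol:
                diff += 1
    return diff / total

original   = [[0, 0, 255], [0, 255, 0]]
reproduced = [[0, 0, 250], [0, 255, 0]]  # slight rendering difference

print(pixel_diff_ratio(original, reproduced))         # strict: flags the change
print(pixel_diff_ratio(original, reproduced, tol=8))  # tolerant: ignores it
```

&lt;p>A per-pixel value tolerance, however, does not account for small positional shifts caused by library version differences, which is exactly why a purely pixel-based comparison falls short here.&lt;/p>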
&lt;p>To address this, I experimented with machine learning models like VGG16, ResNet, and Detectron2. These models are excellent for general image recognition but fell short for our specific needs with charts and visualizations. The results were not as accurate as I had hoped, primarily because these models aren&amp;rsquo;t tailored to handle the unique characteristics of data visualizations.&lt;/p>
&lt;h2 id="shifting-focus-to-chart-specific-models">Shifting Focus to Chart-Specific Models&lt;/h2>
&lt;p>Recognizing the limitations of general ML models, I shifted my focus to chart-specific models like ChartQA, ChartOCR, and ChartReader. These models are designed to understand and summarize chart data, making them more suitable for our goal of comparing visualizations based on the information they convey.&lt;/p>
&lt;h2 id="generating-visualization-variations-and-understanding-human-perception">Generating Visualization Variations and Understanding Human Perception&lt;/h2>
&lt;p>Another exciting development in my work has been generating different versions of visualizations. This will allow me to create a survey to collect human categorizations of visualizations. By understanding how people perceive differences, whether in outliers, shapes, data points, or colors, we can gain insights into which parameters affect human interpretation of visualizations.&lt;/p>
&lt;h2 id="next-steps">Next Steps&lt;/h2>
&lt;p>Moving forward, I&amp;rsquo;ll continue to delve into chart-specific models to refine our comparison techniques. Additionally, the survey will provide valuable data on human perception, which can be used to improve our automated comparison methods. By combining these approaches, I hope to create a robust framework for reliable and reproducible data visualizations.&lt;/p>
&lt;p>I&amp;rsquo;m thrilled about the progress made so far and eager to share more updates with you all. Stay tuned for more insights and developments on this exciting journey!&lt;/p></description></item><item><title>Data leakage in applied ML: reproducing examples from genomics, medicine and radiology</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/nyu/data-leakage/20240701-shaivimalik/</link><pubDate>Mon, 01 Jul 2024 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/nyu/data-leakage/20240701-shaivimalik/</guid><description>&lt;p>Hello everyone! I&amp;rsquo;m Shaivi Malik, a computer science and engineering student. I am thrilled to announce that I have been selected as a Summer of Reproducibility Fellow. I will be contributing to the &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/project/osre24/nyu/data-leakage/">Data leakage in applied ML: reproducing examples of irreproducibility&lt;/a> project under the mentorship of &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/fraida-fund/">Fraida Fund&lt;/a> and &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/mohamed-saeed/">Mohamed Saeed&lt;/a>. You can find my proposal &lt;a href="https://drive.google.com/file/d/1WAsDif61O2fWgtkl75bQAnIcm2hryt8z/view?usp=sharing" target="_blank" rel="noopener">here&lt;/a>.&lt;/p>
&lt;p>This summer, we will reproduce studies from medicine, radiology and genomics. Through these studies, we&amp;rsquo;ll explore and demonstrate three types of data leakage:&lt;/p>
&lt;ol>
&lt;li>Pre-processing on train and test sets together&lt;/li>
&lt;li>Model uses features that are not legitimate&lt;/li>
&lt;li>Feature selection on training and test sets&lt;/li>
&lt;/ol>
&lt;p>For each paper, we will replicate the published results with and without the data leakage error, and present performance metrics for comparison. We will also provide explanatory materials and example questions to test understanding. All these resources will be bundled together in a dedicated repository for each paper.&lt;/p>
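&lt;p>The first type of leakage can be demonstrated with a toy example in plain Python (the data values are illustrative, not taken from any of the reproduced papers):&lt;/p>

```python
import statistics

# Toy "train" and "test" splits drawn from different ranges, so the
# test values shift the pooled statistics noticeably.
train = [1.0, 2.0, 3.0, 4.0]
test = [10.0, 12.0]

def standardize(values, mean, stdev):
    return [(v - mean) / stdev for v in values]

# Leaky: scaling statistics computed over train + test together.
pooled = train + test
leaky_mean, leaky_sd = statistics.mean(pooled), statistics.pstdev(pooled)
leaky_train = standardize(train, leaky_mean, leaky_sd)

# Correct: statistics from the training split only, reused for test.
mean, sd = statistics.mean(train), statistics.pstdev(train)
clean_train = standardize(train, mean, sd)
clean_test = standardize(test, mean, sd)

print(leaky_train[0] != clean_train[0])  # True
```

&lt;p>Because the pooled statistics absorb information from the test split, the leaky pipeline produces different features for the same training data, and any model trained on them has effectively peeked at the test set.&lt;/p>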
&lt;p>This project aims to address the need for accessible educational material on data leakage. These materials will be designed to be readily adopted by instructors teaching machine learning in a wide variety of contexts. They will be presented in a clear and easy-to-follow manner, catering to a broad range of backgrounds and raising awareness about the consequences of data leakage.&lt;/p>
&lt;p>Stay tuned for updates on my progress! You can follow me on &lt;a href="https://github.com/shaivimalik" target="_blank" rel="noopener">GitHub&lt;/a> and watch out for my upcoming blog posts.&lt;/p></description></item><item><title>Assessing the Computational Reproducibility of Jupyter Notebooks</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/depaul/20240618-nbrewer/</link><pubDate>Tue, 18 Jun 2024 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/depaul/20240618-nbrewer/</guid><description>&lt;p>Like so many authors before me, my first reproducibility study and very first academic publication started with the age-old platitude, &amp;ldquo;Reproducibility is a cornerstone of the scientific method.&amp;rdquo; My team and I participated in a competition to replicate the performance improvements promised by a paper presented at last year&amp;rsquo;s Supercomputing conference. We weren&amp;rsquo;t simply re-executing the same experiment with the same cluster; instead, we were trying to confirm that we got similar results on a different cluster with an entirely different architecture. From the very beginning, I struggled to wrap my mind around the many reasons for reproducing computational experiments, their significance, and how to prioritize them. All I knew was that there seemed to be a consensus that reproducibility is important to science and that the experience left me with more questions than answers.&lt;/p>
&lt;p>Not long after that, I started a job as a research software engineer at Purdue University, where I worked heavily with Jupyter Notebooks. I used notebooks and interactive components called widgets to create a web application, which I turned into a reusable template. Our team was enthusiastic about using Jupyter Notebooks to quickly develop web applications because the tools were accessible to the laboratory researchers who ultimately needed to maintain them. I was fortunate to receive the &lt;a href="https://bssw.io/fellows/nicole-brewer" target="_blank" rel="noopener">Better Scientific Software Fellowship&lt;/a> to develop tutorials to teach others how to use notebooks to turn their scientific workflows into web apps. I collected those and other resources and established the &lt;a href="https://www.jupyter4.science" target="_blank" rel="noopener">Jupyter4Science&lt;/a> website, a knowledgebase and blog about Jupyter Notebooks in scientific contexts. That site aims to improve the accessibility of research data and software.&lt;/p>
&lt;p>There seemed to be an important relationship between improved accessibility and reuse of research code and data and computational reproducibility, but I still had trouble articulating it. In pursuit of answers, I moved to sunny Arizona to pursue a History and Philosophy of Science degree. My research falls at the confluence of my prior experiences; I&amp;rsquo;m studying the reproducibility of scientific Jupyter Notebooks. I have learned that questions about reproducibility aren&amp;rsquo;t very meaningful without considering specific aspects such as who is doing the experiment and replication, the nature of the experimental artifacts, and the context in which the experiment takes place.&lt;/p>
&lt;p>I was fortunate to have found a mentor for the Summer of Reproducibility, &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/tanu-malik/">Tanu Malik&lt;/a>, who shares the philosophy that the burden of reproducibility should not rest solely on domain researchers, who would otherwise have to develop expertise outside their field. She and her lab have developed &lt;a href="https://github.com/depaul-dice/Flinc" target="_blank" rel="noopener">FLINC&lt;/a>, an application virtualization tool that improves the portability of computational notebooks. Her prior work demonstrated that FLINC provides efficient reproducibility of notebooks and requires significantly less time and space to execute and re-execute notebooks than Docker containers do for the same notebooks. My work will expand the scope of this original experiment by adding more notebooks to FLINC&amp;rsquo;s test coverage, demonstrating robustness across even more diverse computational tasks. We expect to show that infrastructural tools like FLINC improve the success rate of automated reproducibility.&lt;/p>
&lt;p>I&amp;rsquo;m grateful to both the Summer of Reproducibility program managers and my research mentor for this incredible opportunity to further my dissertation research in the context of meaningful collaboration.&lt;/p></description></item><item><title>Exploring Reproducibility in High-Performance Computing Publications with the Chameleon Cloud</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/tuwien/autoappendix/20240615-kkrassni/</link><pubDate>Sat, 15 Jun 2024 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/tuwien/autoappendix/20240615-kkrassni/</guid><description>&lt;p>Hello everyone,&lt;/p>
&lt;p>I&amp;rsquo;m &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/klaus-kra%C3%9Fnitzer/">Klaus Kraßnitzer&lt;/a> and am currently finishing up my Master&amp;rsquo;s degree at
the Technical University of Vienna. This summer, under the guidance of &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/sascha-hunold/">Sascha Hunold&lt;/a>,
I&amp;rsquo;m excited to dive into a project that aims to enhance reproducibility in
high-performance computing research.&lt;/p>
&lt;p>Our project, &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/project/osre24/tuwien/autoappendix/">AutoAppendix&lt;/a>, focuses on the rigorous evaluation and potential
automation of Artifact Description (AD) and Artifact Evaluation (AE) appendices
from publications submitted to this year&amp;rsquo;s &lt;a href="https://supercomputing.org/" target="_blank" rel="noopener">Supercomputing Conference (SC)&lt;/a>. Because a sizeable
chunk of SC publications utilize &lt;a href="https://chameleoncloud.org/" target="_blank" rel="noopener">Chameleon Cloud&lt;/a>, a
platform known for its robust and scalable experiment setups, the project will
focus on Chameleon-based artifacts, creating guidelines (and,
potentially, software tools) that users of the Chameleon Cloud can follow to
make their research more easily reproducible. You can learn more about the project
and read the full proposal &lt;a href="https://drive.google.com/file/d/1J9-Z0WSIqyJpnmd_uxtEm_m4ZIO87dBH/view?usp=drive_link" target="_blank" rel="noopener">here&lt;/a>.&lt;/p>
&lt;p>My fascination with open-source development and research reproducibility was sparked during my undergraduate studies and further nurtured by my role as a teaching assistant. Hands-on projects and academic courses, like those in chemistry emphasizing precise experimental protocols, have deeply influenced my approach to computational science.&lt;/p>
&lt;h2 id="project-objectives">Project Objectives&lt;/h2>
&lt;ol>
&lt;li>&lt;strong>Analyze and Automate&lt;/strong>: Assess current AE/AD appendices submitted for SC24, focusing on their potential for automation.&lt;/li>
&lt;li>&lt;strong>Develop Guidelines&lt;/strong>: Create comprehensive guidelines to aid future SC conferences in artifact submission and evaluation.&lt;/li>
&lt;li>&lt;strong>Build Tools (Conditionally)&lt;/strong>: Develop automation tools to streamline the evaluation process.&lt;/li>
&lt;/ol>
&lt;p>The ultimate aim of the project is to work towards a more efficient, transparent, and
reproducible research environment, and I&amp;rsquo;m committed to making it simpler for
researchers to demonstrate and replicate scientific work. I look forward to
sharing insights and progress as we move forward.&lt;/p>
&lt;p>Thanks for reading, and stay tuned for more updates!&lt;/p></description></item><item><title> Reproducibility in Data Visualization</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/niu/repro-vis/20240614-aryas/</link><pubDate>Fri, 14 Jun 2024 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/niu/repro-vis/20240614-aryas/</guid><description>&lt;p>Hello! My name is &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/arya-sarkar/">Arya Sarkar&lt;/a> and I will be contributing to the research project titled &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/project/osre24/niu/repro-vis/">Reproducibility in Data Visualization&lt;/a>, with a focus on investigating and coming up with novel solutions to capture both static and dynamic visualizations from different sources. My project is titled Investigate Solutions for Capturing Visualizations and I am mentored by Prof. &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/david-koop/">David Koop&lt;/a>.&lt;/p>
&lt;p>Open source has always piqued my interest, but as a junior in university I often found it hard to get started. I had spent a lot of time working with data visualizations, yet had never engaged with the problem of reproducibility before this project. When I saw the plethora of unique and interesting projects during the contribution phase of OSRE-2024, I was unsure where to begin. However, the more I dived into this project and understood the significance of research in this domain for ensuring reproducibility, the more I found myself drawn to it. I am glad to have been given this amazing opportunity to work in the open-source space as a researcher in reproducibility.&lt;/p>
&lt;p>This project aims to investigate, augment, and/or develop solutions to capture visualizations that appear in formats including websites and Jupyter notebooks. We have a special interest in capturing the state of interactive visualizations and preserving the user interactions required to reach a certain visualization in an interactive environment, to ensure reproducibility. &lt;a href="https://drive.google.com/file/d/1SGLd37zBjnAU-eYytr7mYzfselHgxvK1/view?usp=sharing" target="_blank" rel="noopener">My proposal can be viewed here!&lt;/a>&lt;/p></description></item><item><title>Heterogeneous Graph Neural Networks for I/O Performance Bottleneck Diagnosis</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/lbl/aiio/20240614-mahdi/</link><pubDate>Fri, 14 Jun 2024 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/lbl/aiio/20240614-mahdi/</guid><description>&lt;p>Hello, I am &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/mahdi-banisharifdehkordi/">Mahdi Banisharifdehkordi&lt;/a>, a Ph.D. student in Computer Science at Iowa State University, specializing in Artificial Intelligence. This summer, I will be working on the project &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/project/osre24/lbl/aiio/">AIIO / Graph Neural Network&lt;/a> under the mentorship of &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/bin-dong/">Bin Dong&lt;/a> and Suren Byna.&lt;/p>
&lt;p>High-Performance Computing (HPC) applications often face performance issues due to I/O bottlenecks. Manually identifying these bottlenecks is time-consuming and error-prone. My project aims to enhance the AIIO framework by integrating a Graph Neural Network (GNN) model to automatically diagnose I/O performance bottlenecks at the job level. This involves developing a comprehensive data pre-processing pipeline, constructing and validating a tailored GNN model, and rigorously testing the model&amp;rsquo;s accuracy using test cases from the AIIO dataset.&lt;/p>
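&lt;p>To illustrate the core idea behind GNN-based diagnosis, here is a minimal, pure-Python sketch of one message-passing step over a toy job graph. The node names, features, and mean-aggregation rule are hypothetical simplifications for illustration only, not the AIIO framework&amp;rsquo;s actual model:&lt;/p>

```python
def message_passing_step(features, edges):
    """One mean-aggregation message-passing step: each node's new
    feature is the average of its own and its neighbors' features."""
    neighbors = {n: [n] for n in features}
    for src, dst in edges:
        neighbors[src].append(dst)
        neighbors[dst].append(src)
    return {n: sum(features[m] for m in nbrs) / len(nbrs)
            for n, nbrs in neighbors.items()}

# Toy job graph: an "app" node linked to two "file" nodes; the slow
# file's high-latency feature propagates into the app node.
feats = {"app": 0.0, "file1": 0.0, "file2": 9.0}
print(message_passing_step(feats, [("app", "file1"), ("app", "file2")]))
# {'app': 3.0, 'file1': 0.0, 'file2': 4.5}
```

&lt;p>In a real GNN the plain averages are replaced by learned, trainable transformations, but a bottleneck signal propagates through the job graph in the same way.&lt;/p>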
&lt;p>Through this project, I seek to provide a sophisticated, AI-driven approach to understanding and improving I/O performance in HPC systems, ultimately contributing to more efficient and reliable HPC applications.&lt;/p></description></item><item><title>Reproducibility in Data Visualization</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/niu/repro-vis/20240613-triveni5/</link><pubDate>Thu, 13 Jun 2024 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/niu/repro-vis/20240613-triveni5/</guid><description>&lt;p>Hello everyone!&lt;/p>
&lt;p>I&amp;rsquo;m Triveni, a Master&amp;rsquo;s student in Computer Science at Northern Illinois University (NIU). When I came across the OSRE 2024 project &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/project/osre24/niu/repro-vis/">Categorize Differences in Reproduced Visualizations&lt;/a>, which focuses on data visualization reproducibility, I was excited because it aligned with my interest in data visualization. While my initial interest was in geospatial data visualization, the project&amp;rsquo;s goal of ensuring reliable visualizations across all contexts really appealed to me. So, I actively worked on understanding the project&amp;rsquo;s key concepts and, under the mentorship of &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/david-koop/">David Koop&lt;/a>, submitted my proposal (&lt;a href="https://drive.google.com/file/d/1R1c23oUC7noZo5NrUzuDbjwo0OqbkrAK/view" target="_blank" rel="noopener">viewable here&lt;/a>) to join the project.&lt;/p>
&lt;h2 id="early-steps-and-challenges">Early Steps and Challenges:&lt;/h2>
&lt;p>I began working on the project three weeks ago, on May 27th. Setting up the local environment initially presented some challenges, but I persevered and completed the setup. The past few weeks have been spent exploring the complexities of reproducibility in visualizations, particularly capturing the discrepancies that arise when different versions of libraries are used to generate visualizations. Working with Dr. David Koop as my mentor has been an incredible experience: our weekly report meetings keep me accountable and focused. While exploring different algorithms and tools to compare visualizations can be challenging at times, it&amp;rsquo;s a fantastic opportunity to learn cutting-edge technologies and refine my problem-solving skills.&lt;/p>
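&lt;p>As a toy illustration of the kind of comparison involved, here is a minimal pixel-difference check in Python. This is only a sketch of the simplest possible metric, not one of the project&amp;rsquo;s actual tools, and the tiny &amp;ldquo;images&amp;rdquo; are hand-made stand-ins for rendered plots:&lt;/p>

```python
def pixel_diff_ratio(img_a, img_b):
    """Fraction of pixels that differ between two same-sized images,
    given as 2-D lists of RGB tuples."""
    total = 0
    differing = 0
    for row_a, row_b in zip(img_a, img_b):
        for px_a, px_b in zip(row_a, row_b):
            total += 1
            if px_a != px_b:
                differing += 1
    return differing / total

# Two 2x2 "images" that differ in one pixel (e.g. a shifted tick mark).
a = [[(255, 255, 255), (0, 0, 0)], [(255, 255, 255), (255, 255, 255)]]
b = [[(255, 255, 255), (0, 0, 0)], [(0, 0, 0), (255, 255, 255)]]
print(pixel_diff_ratio(a, b))  # 0.25
```

&lt;p>Real comparisons need to be far more tolerant than this, since anti-aliasing and font rendering differ across library versions even when the visualization is semantically identical; that gap is exactly what makes categorizing differences interesting.&lt;/p>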
&lt;h2 id="looking-ahead">Looking Ahead:&lt;/h2>
&lt;p>I believe this project can make a valuable contribution to the field of reproducible data visualization. By combining automated comparison tools with a user-centric interface, we can empower researchers and data scientists to make informed decisions about the impact of visualization variations. In future blog posts, I&amp;rsquo;ll share more about the specific tools and techniques being used, and how this framework will contribute to a more reliable and trustworthy approach to data visualization reproducibility.&lt;/p>
&lt;p>Stay tuned!&lt;/p>
&lt;p>I&amp;rsquo;m excited to embark on this journey and share my progress with all of you.&lt;/p></description></item><item><title>Automatic reproducibility of COMPSs experiments through the integration of RO-Crate in Chameleon</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/intel/20240612-architd/</link><pubDate>Wed, 12 Jun 2024 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/intel/20240612-architd/</guid><description>&lt;p>Hello everyone,
I&amp;rsquo;m Archit from India, an undergraduate student at the Indian Institute of Technology, Banaras Hindu University, IIT (BHU), Varanasi. As part of the &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/project/osre24/bsc/ro-crate-compss/">Automatic reproducibility of COMPSs experiments through the integration of RO-Crate in Chameleon&lt;/a> project, my &lt;a href="https://drive.google.com/file/d/1qY-uipQZPox144LD4bs05rn3islfcjky/view" target="_blank" rel="noopener">proposal&lt;/a>, under the mentorship of &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/raul-sirvent/">Raül Sirvent&lt;/a>, aims to develop a service that facilitates the automated replication of COMPSs experiments within the Chameleon infrastructure.&lt;/p>
&lt;h2 id="about-the-project">About the project:&lt;/h2>
&lt;p>The project proposes a service that can take a COMPSs crate (an artifact adhering to the RO-Crate specification) and, by analyzing the provided metadata, construct a Chameleon-compatible image for replicating the experiment on the testbed.&lt;/p>
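&lt;p>As a rough illustration of the metadata-analysis step, the sketch below parses a hand-written snippet in the RO-Crate &lt;code>@graph&lt;/code> style and looks up a software version. The specific entity fields shown are simplified assumptions for illustration, not the exact layout of a real COMPSs crate:&lt;/p>

```python
import json

# A minimal, hand-written crate snippet; the entity list follows the
# RO-Crate "@graph" convention, but the fields here are hypothetical.
crate = json.loads("""
{
  "@graph": [
    {"@id": "./", "@type": "Dataset", "name": "COMPSs matmul experiment"},
    {"@id": "#compss", "@type": "SoftwareApplication",
     "name": "COMPSs", "softwareVersion": "3.3"}
  ]
}
""")

def find_software(graph, name):
    """Return the version of the named software entity, if present."""
    for entity in graph:
        if entity.get("name") == name and "softwareVersion" in entity:
            return entity["softwareVersion"]
    return None

print(find_software(crate["@graph"], "COMPSs"))  # 3.3
```

&lt;p>A replication service could use lookups like this to decide which software to install in the Chameleon image before re-running the experiment.&lt;/p>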
&lt;h2 id="how-it-all-started">How it all started&lt;/h2>
&lt;p>This journey began amidst our college&amp;rsquo;s cultural fest, in which I was participating, just 15 days before the proposal submission deadline. Many of my friends had been working for months to get selected for GSoC. I didn’t think I could participate this year because I was late, so I thought, &amp;ldquo;Better luck next year.&amp;rdquo; But during the fest, I kept hearing about UC OSPO and that a senior had been selected within a month. So, I was in my room when my friend told me, &amp;ldquo;What&amp;rsquo;s the worst that can happen? Just apply,&amp;rdquo; and so I did. I chose this project and wrote my introduction in Slack without knowing much. After that, it&amp;rsquo;s history. I worked really hard for the next 10 days learning about the project, making the proposal, and got selected.&lt;/p>
&lt;h2 id="first-few-weeks">First few weeks:&lt;/h2>
&lt;p>I started the project a week earlier than the official June 24 start, and it has been two weeks since. The start was a bit challenging, since it required setting up a lot of things on my local machine. For the past few weeks, most of my time has been dedicated to learning about COMPSs, RO-Crate, and Chameleon, the three technologies this project revolves around. The interaction with my mentor has also been great: from the weekly report meetings to my daily bombardment of doubts, he has been really helpful.
It is my first time working with Chameleon or any cloud computing software, so it can be a bit overwhelming at times, but it is getting better with practice.&lt;/p>
&lt;p>Stay tuned for progress in the next blog!&lt;/p></description></item><item><title>FSA: Benchmarking Fail-Slow Algorithms</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/uchicago/fsa-benchmarking/20240612-xikangsong/</link><pubDate>Wed, 12 Jun 2024 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/uchicago/fsa-benchmarking/20240612-xikangsong/</guid><description>&lt;p>Hi everyone! I&amp;rsquo;m Xikang, a master&amp;rsquo;s CS student at UChicago. As part of the &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/project/osre24/uchicago/failslowalgorithms/">FSA Benchmarking Project&lt;/a>, I&amp;rsquo;m thrilled to be a contributor to OSRE 2024, collaborating with Kexin Pei, an assistant professor of Computer Science at UChicago, and Ruidan, a talented PhD student at UChicago.&lt;/p>
&lt;p>This summer, I will focus on integrating some advanced ML into our RAID slowdown analysis. Our aim is to assess whether LLMs can effectively identify RAID slowdown issues and to benchmark their performance against our current machine learning algorithms. We will test the algorithms on Chameleon Cloud and benchmark them.&lt;/p>
&lt;p>Additionally, we will explore optimization techniques to enhance our pipeline and improve response quality. We hope this research will be a starting point for future work, utilizing LLMs to overcome the limitations of existing algorithms and providing a comprehensive analysis that enhances the performance of RAID and other storage systems.&lt;/p>
&lt;p>I&amp;rsquo;m excited to work with all of you and look forward to your suggestions.
If you are interested, here is my &lt;a href="https://docs.google.com/document/d/1KpodnahgQDNf1-05TF2BdYXiV0lT_oYEnC0oaatHRoc/edit?usp=sharing" target="_blank" rel="noopener">proposal&lt;/a>.&lt;/p></description></item><item><title>ML-Powered Problem Detection in Chameleon</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/uchicago/chameleoncloud/20240612-syed/</link><pubDate>Wed, 12 Jun 2024 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/uchicago/chameleoncloud/20240612-syed/</guid><description>&lt;p>Hello, I am &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/syed-mohammad-qasim/">Syed Mohammad Qasim&lt;/a>, a PhD candidate in Electrical and Computer Engineering at Boston University. I will be spending my
summer working on the project &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/project/osre24/uchicago/ml_detect_chameleon/">ML-Powered Problem Detection in Chameleon&lt;/a> under the mentorship of &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/ayse-coskun/">Ayse Coskun&lt;/a>
and &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/michael-sherman/">Michael Sherman&lt;/a>.&lt;/p>
&lt;p>Currently, Chameleon Cloud monitors sites at the Texas Advanced Computing Center (TACC), University of Chicago,
Northwestern University, and Argonne National Lab. They collect metrics using Prometheus at each site and feed them
all to a central Mimir cluster. All the logs go to a central Loki, and Grafana is used to visualize and set alerts.
Chameleon currently collects around 3000 metrics. Manually reviewing and setting alerts on them is time-consuming
and labor-intensive. This project aims to help Chameleon operators monitor their systems more effectively and improve overall
reliability by creating an anomaly detection service that can augment the existing alerting framework.&lt;/p></description></item><item><title>Improving Video Applications' Accuracy by Enabling The Use of Concierge</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre23/uchicago/edgebench/20230731-zharfanf/</link><pubDate>Mon, 31 Jul 2023 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre23/uchicago/edgebench/20230731-zharfanf/</guid><description>&lt;style>
p {
text-align: justify;
}
img {
display: block;
margin-left: auto;
margin-right: auto;
}
&lt;/style>
&lt;h2 id="introduction">Introduction&lt;/h2>
&lt;p>Hello, it&amp;rsquo;s me again, Faishal, a SoR contributor on the edgebench project. For the past two months, my mentors and I have been working on improving the performance of our system. In this report, I would like to share what we have been working on.&lt;/p>
&lt;h2 id="motivation">Motivation&lt;/h2>
&lt;p>Edgebench is a project that focuses on how to efficiently distribute resources (bandwidth and CPU) across several video applications. Today&amp;rsquo;s video applications process their video data on an edge server; in a WAN setting, bandwidth and compute are the greatest concerns because both are strictly limited.&lt;/p>
&lt;p>Consider the following case: suppose we have 3 video applications located in several areas across a city, and the total bandwidth allocated to them is fixed. Naively, we might divide the bandwidth evenly among the cameras, giving the following graph of allocated bandwidth over time.&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="./images/baseline_alloc.png" alt="Baseline Allocation" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>These allocations are fixed and won&amp;rsquo;t change. However, every video application has its own characteristics that determine how much bandwidth it needs to deliver a good result (f1-score). Our task is to maintain a high average f1-score, so we need an accuracy-oriented solution. This is where the accuracy gradient&lt;a href="#acc">[1]&lt;/a> comes in.&lt;/p>
&lt;h2 id="system-design">System Design&lt;/h2>
&lt;p>In our current design, we need a resource allocator, which we call the concierge. The concierge determines how much bandwidth each video application (vap) gets, reallocating at a predetermined time interval. Reallocation relies on a process called profiling: the concierge first asks every vap to compute its f1-score on a video segment when its bandwidth is increased by &lt;code>profile_delta&lt;/code>; the difference from the default f1-score is called &lt;code>f1_diff_high&lt;/code>. The concierge then asks each vap to repeat the measurement with its bandwidth reduced by &lt;code>profile_delta&lt;/code>, yielding &lt;code>f1_diff_low&lt;/code>. Both results are sent back to the concierge, which computes a sensitivity for each vap, where sensitivity is&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://latex.codecogs.com/svg.image?&amp;amp;space;sensitivity[i]=f1%5c_diff%5c_high[i]-%5cSigma_%7bk=1%7d%5enf1%5c_diff%5c_low[k];k%5cneq&amp;amp;space;i&amp;amp;space;" alt="sensitivity[i] = f1_diff_high[i] - \Sigma_{k=1}^nf1_diff_low[k]; k \neq i" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>This equation tells us which video application would give the best f1-score improvement if we added bandwidth to it while reducing the others&amp;rsquo;. Based on it, the concierge gives bandwidth to the vap with the highest sensitivity and takes bandwidth from the vap with the lowest sensitivity.&lt;/p>
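&lt;p>The sensitivity computation and reallocation step above can be sketched as follows. This is a minimal illustration with made-up numbers and variable names, not the project&amp;rsquo;s actual code:&lt;/p>

```python
def sensitivities(f1_diff_high, f1_diff_low):
    """sensitivity[i] = f1_diff_high[i] minus the sum of
    f1_diff_low[k] over all k other than i."""
    total_low = sum(f1_diff_low)
    return [high - (total_low - low)
            for high, low in zip(f1_diff_high, f1_diff_low)]

def reallocate(alloc, f1_diff_high, f1_diff_low, profile_delta):
    """Move profile_delta kbps from the least to the most sensitive vap."""
    s = sensitivities(f1_diff_high, f1_diff_low)
    winner = s.index(max(s))
    loser = s.index(min(s))
    new_alloc = list(alloc)
    new_alloc[winner] += profile_delta
    new_alloc[loser] -= profile_delta
    return new_alloc

# Three vaps starting at an even 400 kbps split of 1200 kbps total.
print(reallocate([400, 400, 400],
                 f1_diff_high=[0.20, 0.05, 0.01],
                 f1_diff_low=[-0.02, -0.01, -0.01],
                 profile_delta=80))
# [480, 400, 320]
```

&lt;p>Here the first vap gains the most from extra bandwidth and the third loses the least from giving some up, so the concierge shifts 80 kbps from the third to the first.&lt;/p>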
&lt;h2 id="results">Results&lt;/h2>
&lt;p>As mentioned, our main objective is to improve accuracy. However, we account for two parameters: the improvement itself and the overhead of achieving it. We first choose 3 dds apps&lt;a href="#dds">[2]&lt;/a> that we consider our ideal case. The following graphs show the profile of this ideal case.&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="./images/ideal_case.png" alt="Ideal Case" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>We can see that two of them have high sensitivity, especially at lower bandwidth, while one has low sensitivity. This is a perfect scenario, since we can sacrifice the latter&amp;rsquo;s bandwidth and give it to the app with the highest sensitivity at each iteration. We run the experiment under the following setup.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-shell" data-lang="shell">&lt;span class="line">&lt;span class="cl">&lt;span class="nv">DATASETS&lt;/span>&lt;span class="o">=(&lt;/span>&lt;span class="s2">&amp;#34;&amp;#34;&lt;/span> &lt;span class="s2">&amp;#34;uav-1&amp;#34;&lt;/span> &lt;span class="s2">&amp;#34;coldwater&amp;#34;&lt;/span> &lt;span class="s2">&amp;#34;roppongi&amp;#34;&lt;/span>&lt;span class="o">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nv">MAX_BW&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="m">1200&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nv">PROFILING_DELTA&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="m">80&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nv">MI&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="m">5&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This setup uses a total bandwidth of 1200 kbps, which is at first distributed evenly (400 kbps per app). The profiling delta is 80 kbps and the profiling interval (&lt;code>MI&lt;/code>) is 5 seconds.&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="./images/merged_ideal.png" alt="Merged Ideal" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th style="text-align:center">&lt;strong>Mode&lt;/strong>&lt;/th>
&lt;th style="text-align:center">&lt;em>DDS&lt;/em> &lt;br> (&lt;span style="color:blue">&lt;em>uav-1&lt;/em>&lt;/span>)&lt;/th>
&lt;th style="text-align:center">&lt;em>DDS&lt;/em> &lt;br> (&lt;span style="color:orange">&lt;em>coldwater&lt;/em>&lt;/span>)&lt;/th>
&lt;th style="text-align:center">&lt;em>DDS&lt;/em> &lt;br> (&lt;span style="color:green">&lt;em>roppongi&lt;/em>&lt;/span>)&lt;/th>
&lt;th style="text-align:center">Average&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td style="text-align:center">Baseline&lt;/td>
&lt;td style="text-align:center">0.042&lt;/td>
&lt;td style="text-align:center">0.913&lt;/td>
&lt;td style="text-align:center">0.551&lt;/td>
&lt;td style="text-align:center">0.502&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td style="text-align:center">&lt;strong>Concierge&lt;/strong>&lt;/td>
&lt;td style="text-align:center">0.542&lt;/td>
&lt;td style="text-align:center">0.854&lt;/td>
&lt;td style="text-align:center">0.495&lt;/td>
&lt;td style="text-align:center">&lt;strong>0.63&lt;/strong> (&lt;span style="color:green">&lt;em>+25.5%&lt;/em>&lt;/span>)&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>From the result, we managed to improve the average f1-score by &lt;strong>0.1&lt;/strong>, or &lt;strong>25.5%&lt;/strong>, which is a very good result. There are a total of 10 videos in our dataset; for the next experiment, we generate 6 combinations of dds apps. Note that in each combination one video is uav-1, since we know it has the highest sensitivity. We run the experiment with 4 total-bandwidth scenarios &lt;strong>(1200, 1500, 1800, 2100)&lt;/strong> in kbps.&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="./images/only_uav_merged.png" alt="Only Uav-1" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>The left figure depicts the average improvement from the concierge. We can see that the improvement decreases as the total bandwidth increases: at higher bandwidth, the sensitivities tend toward 0 and the concierge stops reallocating. Overall, this confirms our previous result that, with the help of uav-1, the concierge can improve the f1-score by up to 0.1. The next experiment randomly picks 3 dds videos out of the 10, repeated 10 times, so that we can see how the concierge performs without any help from uav-1.&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="./images/random_merged.png" alt="Random Merged" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>From the result, we still obtain an improvement. However, the average improvement is smaller than in the previous experiment; the reason for this phenomenon is discussed later.&lt;/p>
&lt;h3 id="overhead-measurement">Overhead Measurement&lt;/h3>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="./images/overhead_1.png" alt="Overhead Measurement" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>In the figure above, each graph represents a different total bandwidth. The experiment shows that a lower MI leads to higher overhead, since more profiling rounds take place than with a higher MI. The 4 graphs also show a significant trade-off in lowering the MI, since the improvement itself is not large; the highest improvement is at &lt;strong>1200 kbps&lt;/strong>. Hence, at higher bandwidth there is no need to profile too often.&lt;/p>
&lt;h2 id="discussion">Discussion&lt;/h2>
&lt;p>There are some limitations in our current design. Looking at the box plot in figure 5 above, we can see that for some combinations the improvement is negative.&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="./images/recovery_failed.png" alt="Failed Recovery" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>The figure above depicts the profiling process for segment 6, used to determine the bandwidth for segment 7. Here we can see that the f1-score at that bandwidth for (&lt;span style="color:blue">&lt;em>jakarta&lt;/em>&lt;/span>) drops significantly. Our current design cannot address this issue yet, since it only considers the current video segment; the previous and future segments should be taken into account as well.&lt;/p>
&lt;p>Regarding overhead, we are aware that 50% overhead is still considered bad. We might try a dynamic &lt;code>MI&lt;/code>, or skip profiling for certain videos when it is not necessary.&lt;/p>
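&lt;p>One possible shape for such a dynamic &lt;code>MI&lt;/code>, purely as an illustration of the idea rather than anything we have implemented: lengthen the profiling interval when the last round yielded little gain, and reset it when profiling pays off. All names and thresholds below are made up.&lt;/p>

```python
def next_interval(current_mi, recent_gain, min_mi=5, max_mi=40,
                  gain_threshold=0.02):
    """Double the profiling interval when the last profiling round
    improved average f1 by less than gain_threshold; otherwise reset
    to the minimum so promising apps are re-profiled quickly."""
    if recent_gain >= gain_threshold:
        return min_mi
    return min(current_mi * 2, max_mi)

print(next_interval(5, recent_gain=0.001))   # 10: little gain, back off
print(next_interval(20, recent_gain=0.05))   # 5: big gain, profile often
```

&lt;p>Such a policy would cut profiling overhead exactly in the high-bandwidth regime where, as shown above, frequent profiling buys little improvement.&lt;/p>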
&lt;h2 id="conclusion">Conclusion&lt;/h2>
&lt;p>Regardless of the aforementioned limitations, this report shows that the concierge is generally capable of improving the f1-score. Further updates will appear in the final report.&lt;/p>
&lt;h2 id="references">References&lt;/h2>
&lt;p>&lt;a id="acc">[1]&lt;/a> &lt;a href="https://drive.google.com/file/d/1U_o0IwYcBNF98cb5K_h56Nl-bQJSAtMj/view?usp=sharing" target="_blank" rel="noopener">https://drive.google.com/file/d/1U_o0IwYcBNF98cb5K_h56Nl-bQJSAtMj/view?usp=sharing&lt;/a> &lt;br>
&lt;a id="dds">[2]&lt;/a> Kuntai Du, Ahsan Pervaiz, Xin Yuan, Aakanksha Chowdhery, Qizheng Zhang, Henry Hoffmann, and Junchen Jiang. 2020. Server-driven video streaming for deep learning inference. In Proceedings of the Annual conference of the ACM Special Interest Group on Data Communication on the applications, technologies, architectures, and protocols for computer communication. 557–570.&lt;/p></description></item><item><title>Public Artifact Data and Visualization</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre23/intel/artifactviz/20230617-zjyhhhhh/</link><pubDate>Sat, 17 Jun 2023 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre23/intel/artifactviz/20230617-zjyhhhhh/</guid><description>&lt;p>Hello! As part of the &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/project/osre23/intel/artifactviz">Public Artifact Data and Visualization&lt;/a> our proposals (&lt;a href="https://drive.google.com/file/d/1egIQDLMQ5eV7Uc-S55-GTiSXdmrC3_Pj/view?usp=sharing" target="_blank" rel="noopener">proposal&lt;/a> from &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/jiayuan-zhu/">Jiayuan Zhu&lt;/a> and &lt;a href="https://drive.google.com/file/d/1Gf68Pz8v3YjcQ1sWkS9n2hnl7_lsme2l/view?usp=sharing" target="_blank" rel="noopener">proposal&lt;/a> from &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/krishna-madhwani/">Krishna Madhwani&lt;/a>) under the mentorship of &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/anjo-vahldiek-oberwagner/">Anjo Vahldiek-Oberwagner&lt;/a> aims to design a system that allows researchers to conveniently record and compare the environmental information, such as CPU utilization, of different iterations and versions of code during an experiment.&lt;/p>
&lt;p>In academic experiments, there is often a need to compare results and performance between different iterations and versions. This comparative analysis helps researchers evaluate the impact of different experimental parameters and algorithms on the results and enables them to optimize experimental design and algorithm selection. However, to conduct effective comparative analysis, it is essential to record and compare environmental information, alongside the experimental data. This information provides valuable insights into the factors that may influence the observed outcomes.&lt;/p>
&lt;p>Through this summer, we aim to develop a system that offers a streamlined interface, enabling users to effortlessly monitor their running programs using simple command-line commands. Moreover, our system will feature a user-friendly dashboard where researchers can access historical runtime information and visualize comparisons between different iterations. The dashboard will present comprehensive graphs and charts, facilitating the analysis of trends and patterns in the environmental data.&lt;/p></description></item><item><title>Reproduce and benchmark self-adaptive edge applications under dynamic resource management</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre23/uchicago/edgebench/20230530-zharfanf/</link><pubDate>Tue, 30 May 2023 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre23/uchicago/edgebench/20230530-zharfanf/</guid><description>&lt;p>Hello there!&lt;/p>
&lt;p>I am Faishal Zharfan, a senior-year student studying Telecommunication Engineering at Bandung Institute of Technology (ITB) in Bandung, Indonesia (here is my &lt;a href="https://drive.google.com/file/d/1u3UsCQZ40erpPmyoyn8DEVqH5Txmvvkz/view?usp=drive_link" target="_blank" rel="noopener">proposal&lt;/a>). I&amp;rsquo;m currently part of the &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/project/osre23/uchicago/edgebench/">Edgebench&lt;/a> project under the mentorship of Yuyang Huang. The main goal of this project is to reproduce and benchmark self-adaptive video applications using the proposed solution.&lt;/p>
&lt;p>The topic I&amp;rsquo;m currently working on, &amp;ldquo;Reproduce and benchmark self-adaptive edge applications under dynamic resource management&amp;rdquo; (edgebench), is led by Prof. Junchen Jiang and Yuyang Huang. Edgebench focuses on how to efficiently distribute resources (bandwidth and CPU) across several video applications. Today&amp;rsquo;s video applications process their video data on an edge server; in a WAN setting, bandwidth and compute are strictly limited and therefore the greatest concerns. We could distribute the bandwidth evenly across the cameras, but each camera&amp;rsquo;s bandwidth and compute needs differ, so another solution is required. The recently proposed solution, the &amp;ldquo;accuracy gradient&amp;rdquo;, tells us how much bandwidth an application needs at a given time to achieve higher accuracy. The goal is to allocate more bandwidth to the apps with the largest f1-score improvement and take it from those whose f1-score would not diminish significantly, yielding a higher total f1-score in the end.&lt;/p>
&lt;p>Throughout this summer, we plan to implement the &amp;ldquo;accuracy gradient&amp;rdquo; and test several baselines to compare against it. As for the implementation, we are currently implementing latency measurement: we are aware that this solution carries overhead, so latency should be taken into account.&lt;/p>