<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Panji Sri Kuncara Wisma | UCSC OSPO</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/author/panji-sri-kuncara-wisma/</link><atom:link href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/panji-sri-kuncara-wisma/index.xml" rel="self" type="application/rss+xml"/><description>Panji Sri Kuncara Wisma</description><generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><image><url>https://deploy-preview-1007--ucsc-ospo.netlify.app/author/panji-sri-kuncara-wisma/avatar_hu57901e993dcfcc591e00dc174e12aaca_277742_270x270_fill_q75_lanczos_center.jpg</url><title>Panji Sri Kuncara Wisma</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/author/panji-sri-kuncara-wisma/</link></image><item><title>[Final Blog] Distrobench: Distributed Protocol Benchmark</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/umass/edge-replication/20250830-panjisri/</link><pubDate>Sat, 30 Aug 2025 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/umass/edge-replication/20250830-panjisri/</guid><description>&lt;h2 id="introduction">Introduction&lt;/h2>
&lt;p>This is the final blog for our contribution to the &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/project/osre25/umass/edge-replication/">Open Testbed for Reproducible Evaluation of Replicated Systems at the Edges&lt;/a> project under the mentorship of &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/fadhil-kurnia/">Fadhil Kurnia&lt;/a> for the OSRE program.&lt;/p>
&lt;p>&lt;a href="https://github.com/fadhilkurnia/distro" target="_blank" rel="noopener">Distrobench&lt;/a> is a framework for evaluating the performance of replication and coordination protocols for distributed systems. It standardizes benchmarking by running different protocols under an identical workload, and it supports both local and remote deployment of the protocols. Every system under test exposes a key-value store application, and the systems are categorized by &lt;a href="https://jepsen.io/consistency/models" target="_blank" rel="noopener">consistency model&lt;/a>, programming language, and persistency (whether the system stores its data in memory or on disk).&lt;/p>
&lt;p>All benchmark results are stored in a &lt;code>data.json&lt;/code> file, which can be viewed through a web page included in the repository. A user can clone the git repository, benchmark different protocols on their own machine or on a cluster of remote machines, and then view the results locally. We also provide a &lt;a href="https://distrobench.org" target="_blank" rel="noopener">webpage&lt;/a> showing our own benchmark results, which were collected on 3 Amazon EC2 t2.micro instances.&lt;br>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="" srcset="
/report/osre25/umass/edge-replication/20250830-panjisri/image_hu785d614b38f6808c04fc85bf3c31eb36_153748_2eb41220c4287bdc730b38c76a5643f8.webp 400w,
/report/osre25/umass/edge-replication/20250830-panjisri/image_hu785d614b38f6808c04fc85bf3c31eb36_153748_789a9a55850eed73f3a681f8423873cf.webp 760w,
/report/osre25/umass/edge-replication/20250830-panjisri/image_hu785d614b38f6808c04fc85bf3c31eb36_153748_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/umass/edge-replication/20250830-panjisri/image_hu785d614b38f6808c04fc85bf3c31eb36_153748_2eb41220c4287bdc730b38c76a5643f8.webp"
width="760"
height="381"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;h2 id="how-to-run-a-benchmark-on-distrobench">How to run a benchmark on Distrobench&lt;/h2>
&lt;p>Before running a benchmark with Distrobench, the protocol to be benchmarked must first be built. This lets the script initialize the protocol instances for a local benchmark or send the binaries to the remote machines. A remote machine running the protocol does not need the source code of the protocol implementation, but it does need the dependencies for running that specific protocol, such as Java, Docker, or rsync. The following commands build the &lt;a href="https://github.com/ailidani/paxi" target="_blank" rel="noopener">ailidani/paxi&lt;/a> project, which needs no additional dependencies on the remote machine:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sh" data-lang="sh">&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># Clone the Distrobench repository &lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">git clone git@github.com:fadhilkurnia/distro.git
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># Clone the Paxi repository and build the binary &lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nb">cd&lt;/span> distro/sut/ailidani.paxi
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">git clone git@github.com:ailidani/paxi.git
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nb">cd&lt;/span> paxi/bin/
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">./build.sh
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># Go back to the Distrobench root directory &amp;amp; run python script &lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nb">cd&lt;/span> ../../../..
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">python main.py
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>By default, the script starts 3 local instances of the Paxi protocol implementation that the user chooses through the CLI. The user can change the number of running instances, and whether they are deployed locally or on remote machines, by editing the &lt;code>.env&lt;/code> file in the root directory. The following is the content of the default &lt;code>.env&lt;/code> file:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">NUM_OF_NODES=3
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">SSH_KEY=ssh-key.pem
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">REMOTE_USERNAME=ubuntu
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">PUBLIC_IP1=127.0.0.1
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">PUBLIC_IP2=127.0.0.1
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">PUBLIC_IP3=127.0.0.1
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">PRIVATE_IP1=127.0.0.1
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">PRIVATE_IP2=127.0.0.1
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">PRIVATE_IP3=127.0.0.1
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">CLIENT_IP=127.0.0.1
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">OUTPUT=data.json
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>When running a remote benchmark, an SSH key must also be placed in the root directory so that the Python script can use ssh and rsync. All machines must also allow TCP connections on ports 2000-2300 and 3000-3300, because those port ranges are used for communication between the running instances and for the YCSB benchmark. A benchmark requires at least 3 nodes, which is the minimum needed to support most protocols (5 nodes are recommended).&lt;/p>
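&lt;p>As a rough illustration of the settings above, the following sketch parses &lt;code>.env&lt;/code>-style text and checks the node count and IP entries. The helper names &lt;code>parse_env&lt;/code>, &lt;code>validate_config&lt;/code>, and &lt;code>is_local&lt;/code> are hypothetical and are not taken from the Distrobench code base, and treating an all-loopback configuration as a local run is an assumption; the actual &lt;code>main.py&lt;/code> may load and interpret its configuration differently.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">MIN_NODES = 3  # minimum node count stated in the text above


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    config = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip()
    return config


def validate_config(config):
    """Return a list of problems found in the parsed configuration."""
    problems = []
    num_nodes = int(config.get("NUM_OF_NODES", "0"))
    if MIN_NODES > num_nodes:
        problems.append(f"NUM_OF_NODES must be at least {MIN_NODES}")
    for i in range(1, num_nodes + 1):
        for prefix in ("PUBLIC_IP", "PRIVATE_IP"):
            if f"{prefix}{i}" not in config:
                problems.append(f"missing {prefix}{i}")
    return problems


def is_local(config):
    """Assumption: a run is local when every public IP is the loopback address."""
    num_nodes = int(config["NUM_OF_NODES"])
    return all(config[f"PUBLIC_IP{i}"] == "127.0.0.1" for i in range(1, num_nodes + 1))
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Calling &lt;code>validate_config(parse_env(text))&lt;/code> on the default file shown earlier should yield an empty list, while a file with &lt;code>NUM_OF_NODES=2&lt;/code> would be reported as too small.&lt;/p>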
&lt;p>To view the benchmark result in the web page locally, move &lt;code>data.json&lt;/code> into the &lt;code>docs/&lt;/code> directory and run &lt;code>python -m http.server 8000&lt;/code>. The page is then accessible through &lt;code>http://localhost:8000&lt;/code>.&lt;/p>
&lt;h2 id="deep-dive-on-how-distrobench-works">Deep dive on how Distrobench works&lt;/h2>
&lt;p>The following is the project structure of the Distrobench repository:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">distro/
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">├── main.py // Main python script for running benchmark
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">├── data.json // Output file for main.py
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">├── README.md
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">├── .env // Config for running the benchmark
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">├── docs/
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">│ ├── index.html // Web page to show benchmark results
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">│ ├── data.json // Output file displayed by web page
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">│ ├── README.md
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">├── src/
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">│ ├── utils/
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">│ └── ycsb/ // Submodule for YCSB
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">└── sut/ // Systems under test
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> ├── ailidani.paxi/
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> │ └── run.py // Protocol-specific benchmark script called by main.py
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> ├── apache.zookeeper/
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> ├── etcd-io.etcd/
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> ├── fadhilkurnia.xdn/
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> ├── holipaxos-artifect.holipaxos/
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> ├── otoolep.hraftd/
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> └── tikv.tikv/
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;code>main.py&lt;/code> will automatically detect directories inside &lt;code>sut/&lt;/code> and will call the main function inside &lt;code>run.py&lt;/code>. The following is the structure of &lt;code>run.py&lt;/code> written in pseudocode style:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">FUNCTION main(run_ycsb: Function, nodes: List of Nodes, ssh: Dictionary)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> node_data = map_ip_port(nodes)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> SWITCH user_input
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> CASE 0:
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> start()
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> RETURN
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> CASE 1:
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> stop()
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> RETURN
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> CASE 2:
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> client_data = []
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> FOR EACH item IN node_data
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> ADD item.client_addr TO client_data
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> END FOR
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> run_ycsb(client_data)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> RETURN
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> END SWITCH
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">END FUNCTION
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">FUNCTION start()
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> // Start the protocol instance (local or remote)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">END FUNCTION
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">FUNCTION stop()
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> // Stop the protocol instance (local or remote)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">END FUNCTION
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">FUNCTION map_ip_port(nodes: List of Nodes) -&amp;gt; List of Dictionary
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> // Generate port numbers based on the protocol requirements
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">END FUNCTION
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The &lt;code>.env&lt;/code> file takes both public and private IP addresses to add flexibility when running a remote benchmark. Private IPs are used for communication between remote machines when they are in the same network group. In our own benchmark, four t2.micro EC2 instances are deployed in the same network group: three run the protocol and the fourth acts as the YCSB client. You can use your local machine as the YCSB client instead of another remote machine by setting &lt;code>CLIENT_IP&lt;/code> to &lt;code>127.0.0.1&lt;/code> in the &lt;code>.env&lt;/code> file. We chose a remote machine as the YCSB client to minimize the impact of network latency between the client and the protocol servers.&lt;/p>
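&lt;p>The pseudocode structure of &lt;code>run.py&lt;/code> could translate into Python roughly as follows. This is only a sketch under several assumptions: nodes are represented as dictionaries with an &lt;code>ip&lt;/code> key, the port bases 2000 and 3000 are picked from the port ranges mentioned earlier without knowing which range serves which purpose, and the extra &lt;code>choice&lt;/code> parameter is a hypothetical addition that makes the function callable without interactive input. The real scripts in &lt;code>sut/&lt;/code> may differ in all of these details.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">def map_ip_port(nodes, peer_base=2000, client_base=3000):
    """Assign one peer port and one client port per node (bases are assumed)."""
    node_data = []
    for index, node in enumerate(nodes):
        node_data.append({
            "ip": node["ip"],
            "peer_addr": f"{node['ip']}:{peer_base + index}",
            "client_addr": f"{node['ip']}:{client_base + index}",
        })
    return node_data


def start(node_data, ssh):
    """Placeholder: start the protocol instances (local or remote)."""
    print(f"starting {len(node_data)} instance(s)")


def stop(node_data, ssh):
    """Placeholder: stop the protocol instances (local or remote)."""
    print(f"stopping {len(node_data)} instance(s)")


def main(run_ycsb, nodes, ssh, choice=None):
    """Entry point called by main.py; choice mirrors the 0/1/2 menu."""
    node_data = map_ip_port(nodes)
    if choice is None:
        choice = int(input("0) start  1) stop  2) benchmark: "))
    if choice == 0:
        start(node_data, ssh)
    elif choice == 1:
        stop(node_data, ssh)
    elif choice == 2:
        # Hand only the client-facing addresses to the YCSB runner.
        client_data = [item["client_addr"] for item in node_data]
        return run_ycsb(client_data)
    return None
&lt;/code>&lt;/pre>&lt;/div>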
&lt;p>The main tasks of the &lt;code>start()&lt;/code> function can be broken down into the following:&lt;/p>
&lt;ol>
&lt;li>Generate custom configuration files for each remote machine instance. This step may differ between implementations: some do not require a config file because they accept command-line flags out of the box, while others require multiple configuration files per instance.&lt;/li>
&lt;li>rsync the binaries to the remote machine (if running a remote benchmark)&lt;/li>
&lt;li>Start the instances&lt;/li>
&lt;/ol>
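&lt;p>The three steps above can be sketched as a plan that is built but not executed. The function names, the &lt;code>node1.conf&lt;/code>-style file names, and the remote directory are hypothetical; only the rsync options used here (&lt;code>-a&lt;/code>, &lt;code>-z&lt;/code>, and &lt;code>-e&lt;/code> to select the SSH command) are standard rsync flags. The actual &lt;code>start()&lt;/code> implementations in Distrobench may order or perform these steps differently.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">import shlex


def build_rsync_command(local_dir, remote_dir, host, username, ssh_key):
    """Build (but do not run) an rsync command that copies binaries over SSH."""
    return [
        "rsync", "-az",
        "-e", f"ssh -i {shlex.quote(ssh_key)}",
        f"{local_dir.rstrip('/')}/",
        f"{username}@{host}:{remote_dir}",
    ]


def plan_start(node_ips, local_dir, remote_dir, username, ssh_key):
    """Return the ordered steps a start() function might perform per node."""
    steps = []
    for index, ip in enumerate(node_ips, start=1):
        # Step 1: one (hypothetical) config file per instance.
        steps.append(("write_config", f"node{index}.conf"))
        # Step 2: only remote nodes need the binaries copied over.
        if ip != "127.0.0.1":
            command = build_rsync_command(local_dir, remote_dir, ip, username, ssh_key)
            steps.append(("rsync", command))
        # Step 3: launch the instance.
        steps.append(("start_instance", ip))
    return steps
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Returning the commands as argument lists keeps the sketch side-effect free; a real script could pass each list to &lt;code>subprocess.run&lt;/code>.&lt;/p>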
&lt;p>The &lt;code>stop()&lt;/code> function is much simpler: it kills the process running the protocol and optionally removes the copied binaries from the remote machine. The &lt;code>run_ycsb()&lt;/code> function passed to &lt;code>run.py&lt;/code> is defined in &lt;code>main.py&lt;/code> and currently supports two types of workload:&lt;/p>
&lt;ol>
&lt;li>Read-heavy: A single-client workload with 95% read and 5% update (write) operations&lt;/li>
&lt;li>Update-heavy: A single-client workload with 50% read and 50% update (write) operations&lt;/li>
&lt;/ol>
&lt;p>New workloads can be added to the &lt;code>src/ycsb/workloads&lt;/code> directory. Both workloads above run only 1000 operations, which may not be enough to properly evaluate the performance of the protocols. Note also that although YCSB supports a &lt;code>scan&lt;/code> operation, our benchmark never uses it because none of the tested protocols implement it.&lt;/p>
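&lt;p>To make the two workloads concrete, the sketch below renders YCSB core workload property files with the read and update proportions and the operation count described above. The property names are the standard YCSB core workload properties; the helper &lt;code>render_workload&lt;/code> and the record count of 1000 are assumptions for illustration, and the workload files shipped in the repository may set additional properties.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">def render_workload(read_proportion, update_proportion,
                    operation_count=1000, record_count=1000):
    """Render a YCSB core workload properties file as text."""
    if abs(read_proportion + update_proportion - 1.0) > 1e-9:
        raise ValueError("read and update proportions must sum to 1")
    properties = {
        "workload": "site.ycsb.workloads.CoreWorkload",
        "recordcount": record_count,      # assumed value, not from the post
        "operationcount": operation_count,
        "readproportion": read_proportion,
        "updateproportion": update_proportion,
        "scanproportion": 0,              # scan is never used in this benchmark
        "insertproportion": 0,
    }
    return "\n".join(f"{key}={value}" for key, value in properties.items()) + "\n"


READ_HEAVY = render_workload(0.95, 0.05)
UPDATE_HEAVY = render_workload(0.5, 0.5)
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Raising &lt;code>operation_count&lt;/code> is the simplest way to address the short-run limitation noted above.&lt;/p>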
&lt;h3 id="how-to-implement-a-new-protocol-in-distrobench">How to implement a new protocol in Distrobench&lt;/h3>
&lt;p>Adding a new protocol to Distrobench requires implementing two main components: a Python integration script (&lt;code>run.py&lt;/code>) and a YCSB database binding for benchmarking.&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Create the protocol directory structure&lt;/p>
&lt;ul>
&lt;li>Create a new directory under &lt;code>sut/&lt;/code> using the format &lt;code>yourrepo.yourprotocol/&lt;/code>.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Write &lt;code>run.py&lt;/code> integration&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Put the script inside the &lt;code>yourrepo.yourprotocol/&lt;/code> directory.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Must have the &lt;code>main(run_ycsb, nodes, ssh)&lt;/code> function.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Add start/stop/benchmark menu options&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Handle local (127.0.0.1) and remote deployment&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Create YCSB client&lt;/p>
&lt;ul>
&lt;li>Write a Java class that extends YCSB&amp;rsquo;s &lt;code>DB&lt;/code> class&lt;/li>
&lt;li>Put inside &lt;code>src/ycsb/yourprotocol/src/main/java/site/ycsb/yourprotocol&lt;/code>&lt;/li>
&lt;li>Implement &lt;code>read()&lt;/code>, &lt;code>insert()&lt;/code>, &lt;code>update()&lt;/code>, &lt;code>delete()&lt;/code> methods&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Register your client&lt;/p>
&lt;ul>
&lt;li>Register your client in &lt;code>src/pom.xml&lt;/code>, &lt;code>src/ycsb/bin/binding.properties&lt;/code>, and &lt;code>src/ycsb/bin/ycsb&lt;/code>.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Build and test&lt;/p>
&lt;ul>
&lt;li>Run &lt;code>cd src/ycsb &amp;amp;&amp;amp; mvn clean package&lt;/code>&lt;/li>
&lt;li>Run &lt;code>python main.py&lt;/code>&lt;/li>
&lt;li>Select your protocol and test it&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ol>
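&lt;p>For orientation, here is one way a driver script could discover protocol directories under &lt;code>sut/&lt;/code> and load each &lt;code>main(run_ycsb, nodes, ssh)&lt;/code> function. This is a sketch, not the code in &lt;code>main.py&lt;/code>: the function names &lt;code>discover_protocols&lt;/code> and &lt;code>load_main&lt;/code> are hypothetical, and the actual detection logic may work differently.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">import importlib.util
from pathlib import Path


def discover_protocols(sut_dir):
    """Return {directory name: path to run.py} for every SUT that has a run.py."""
    found = {}
    for child in sorted(Path(sut_dir).iterdir()):
        script = child / "run.py"
        if child.is_dir() and script.is_file():
            found[child.name] = script
    return found


def load_main(script_path):
    """Import a run.py by path and return its main(run_ycsb, nodes, ssh) function."""
    spec = importlib.util.spec_from_file_location("sut_run", script_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    if not callable(getattr(module, "main", None)):
        raise AttributeError(f"{script_path} does not define main()")
    return module.main
&lt;/code>&lt;/pre>&lt;/div>&lt;p>With this approach, a directory without a &lt;code>run.py&lt;/code> is simply skipped, so a new protocol becomes selectable as soon as its script exists.&lt;/p>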
&lt;h2 id="protocols-which-have-been-tested">Protocols which have been tested&lt;/h2>
&lt;p>Distrobench has tested 20 different distributed consensus protocols across 7 different implementation projects.&lt;/p>
&lt;ol>
&lt;li>
&lt;p>&lt;a href="https://github.com/ailidani/paxi" target="_blank" rel="noopener">ailidani/paxi&lt;/a>&lt;/p>
&lt;ul>
&lt;li>Programming Language : Go&lt;/li>
&lt;li>Persistency : On-Disk&lt;/li>
&lt;li>Consistency Model : Linearizability, Eventual&lt;/li>
&lt;li>Protocol : Paxos, EPaxos, SDpaxos, WPaxos, ABD, chain, VPaxos, WanKeeper, KPaxos, Paxos_groups, Dynamo, Blockchain, M2Paxos, HPaxos.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="https://github.com/apache/zookeeper" target="_blank" rel="noopener">apache/zookeeper&lt;/a>&lt;/p>
&lt;ul>
&lt;li>Programming Language : Java&lt;/li>
&lt;li>Persistency : On-Disk&lt;/li>
&lt;li>Consistency Model : Linearizability + Primary Integrity&lt;/li>
&lt;li>Protocol : ZAB (ZooKeeper Atomic Broadcast)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="https://github.com/etcd-io/etcd" target="_blank" rel="noopener">etcd-io/etcd&lt;/a>&lt;/p>
&lt;ul>
&lt;li>Programming Language : Go&lt;/li>
&lt;li>Persistency : On-Disk&lt;/li>
&lt;li>Consistency Model : Linearizability&lt;/li>
&lt;li>Protocol : Raft&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="https://github.com/fadhilkurnia/xdn" target="_blank" rel="noopener">fadhilkurnia/xdn&lt;/a>&lt;/p>
&lt;ul>
&lt;li>Programming Language : Java, Rust&lt;/li>
&lt;li>Persistency : On-Disk&lt;/li>
&lt;li>Consistency Model : Linearizability, Linearizability + Primary Integrity&lt;/li>
&lt;li>Protocol : Gigapaxos&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="https://github.com/Zhiying12/holipaxos-artifect" target="_blank" rel="noopener">Zhiying12/holipaxos-artifect&lt;/a>&lt;/p>
&lt;ul>
&lt;li>Programming Language : Go, Rust&lt;/li>
&lt;li>Persistency : On-Disk&lt;/li>
&lt;li>Consistency Model : Linearizability&lt;/li>
&lt;li>Protocol : Holipaxos, Omnipaxos, Multipaxos&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="https://github.com/otoolep/hraftd" target="_blank" rel="noopener">otoolep/hraftd&lt;/a>&lt;/p>
&lt;ul>
&lt;li>Programming Language : Go&lt;/li>
&lt;li>Persistency : On-Disk&lt;/li>
&lt;li>Consistency Model : Linearizability&lt;/li>
&lt;li>Protocol : Raft&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="https://github.com/tikv/tikv" target="_blank" rel="noopener">tikv/tikv&lt;/a>&lt;/p>
&lt;ul>
&lt;li>Programming Language : Rust&lt;/li>
&lt;li>Persistency : On-Disk&lt;/li>
&lt;li>Consistency Model : Linearizability&lt;/li>
&lt;li>Protocol : Raft&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ol>
&lt;h2 id="challenges">Challenges&lt;/h2>
&lt;ul>
&lt;li>When attempting to benchmark HoliPaxos, the main challenge was handling versions that rely on persistent storage with RocksDB. Since some implementations are written in Go, it was necessary to find compatible versions of RocksDB and gRocksDB (for example, RocksDB 10.5.1 works with gRocksDB 1.10.2). Another difficulty was that RocksDB is resource-intensive to compile, and in our project we did not have sufficient CPU capacity on the remote machine to build RocksDB and run remote benchmarks.&lt;/li>
&lt;li>Some projects did not compile successfully at first and required minor modifications to run.&lt;/li>
&lt;/ul>
&lt;h2 id="conclusion-and-future-improvements">Conclusion and future improvements&lt;/h2>
&lt;p>The current benchmark results report the performance of all the protocols listed above in terms of throughput and benchmark runtime. The results are subject to revision because they may not reflect the best achievable performance of each protocol, given that our deployment scripts are not yet optimized. We also plan to switch to a more powerful EC2 instance type, because t2.micro does not have enough resources to support RocksDB or TiKV.&lt;/p>
&lt;p>In the near future, additional features will be added to Distrobench such as:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Multi-Client Support:&lt;/strong> The YCSB client will start multiple clients which will send requests in parallel to different servers in the group.&lt;/li>
&lt;li>&lt;strong>Commit Versioning:&lt;/strong> Allows the labelling of all benchmark results with the commit hash of the protocol&amp;rsquo;s repository version. This allows comparing different versions of the same project.&lt;/li>
&lt;li>&lt;strong>Adding more Primary-Backup, Sequential, Causal, and Eventual consistency protocols:&lt;/strong> Implementations that support a consistency model other than linearizability and also ship with a key-value store application are notoriously difficult to find.&lt;/li>
&lt;li>&lt;strong>Benchmark on node failure&lt;/strong>&lt;/li>
&lt;li>&lt;strong>Benchmark on the addition of a new node&lt;/strong>&lt;/li>
&lt;/ul></description></item><item><title>Midterm Blog: Open Testbed for Reproducible Evaluation of Replicated Systems at the Edges</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/umass/edge-replication/20250725-panjisri/</link><pubDate>Fri, 25 Jul 2025 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/umass/edge-replication/20250725-panjisri/</guid><description>&lt;p>Hello! I&amp;rsquo;m Panji Sri Kuncara Wisma and I want to share my midterm progress on the &amp;ldquo;Open Testbed for Reproducible Evaluation of Replicated Systems at the Edges&amp;rdquo; project under the mentorship of Fadhil I. Kurnia.&lt;/p>
&lt;h2 id="project-overview">Project Overview&lt;/h2>
&lt;p>The goal of our project is to create an open testbed that enables fair, reproducible evaluation of different consensus protocols (Paxos variants, EPaxos, Raft, etc.) when deployed at network edges. Currently, researchers struggle to compare these systems because they lack standardized evaluation environments and often rely on mock implementations of proprietary systems.&lt;/p>
&lt;p>XDN (eXtensible Distributed Network) is one of the important consensus systems we plan to evaluate in our benchmarking testbed. Built on GigaPaxos, it allows deployment of replicated stateful services across edge locations. As part of preparing our benchmarking framework, we need to ensure that the systems we evaluate, including XDN, are robust for fair comparison.&lt;/p>
&lt;h2 id="progress">Progress&lt;/h2>
&lt;p>As part of preparing our benchmarking tool, I have been working on refactoring XDN&amp;rsquo;s FUSE filesystem from C++ to Rust. This work is essential for creating a stable and reliable XDN platform.&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="System Architecture" srcset="
/report/osre25/umass/edge-replication/20250725-panjisri/fuselog_design_hu4e0250a1afb641f82d064bca3b5b892d_118470_5600401ae6570bf38b96fa89a080f4f7.webp 400w,
/report/osre25/umass/edge-replication/20250725-panjisri/fuselog_design_hu4e0250a1afb641f82d064bca3b5b892d_118470_6d3b555dbec3bdb305839eda9b227acf.webp 760w,
/report/osre25/umass/edge-replication/20250725-panjisri/fuselog_design_hu4e0250a1afb641f82d064bca3b5b892d_118470_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/umass/edge-replication/20250725-panjisri/fuselog_design_hu4e0250a1afb641f82d064bca3b5b892d_118470_5600401ae6570bf38b96fa89a080f4f7.webp"
width="760"
height="439"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>The diagram above illustrates how the FUSE filesystem integrates with XDN&amp;rsquo;s distributed architecture. On the left, we see the standard FUSE setup where applications interact with the filesystem through the kernel&amp;rsquo;s VFS layer. On the right, the distributed replication flow is shown: Node 1 runs &lt;code>fuselog_core&lt;/code> which captures filesystem operations and generates statediffs, while Nodes 2 and 3 run &lt;code>fuselog_apply&lt;/code> to receive and apply these statediffs, maintaining replica consistency across the distributed system.&lt;/p>
&lt;p>This FUSE component is critical for XDN&amp;rsquo;s operation as it enables transparent state capture and replication across edge nodes. By refactoring this core component from C++ to Rust, we&amp;rsquo;re hopefully strengthening the foundation for fair benchmarking comparisons in our testbed.&lt;/p>
&lt;h3 id="core-work-c-to-rust-fuse-filesystem-migration">Core Work: C++ to Rust FUSE Filesystem Migration&lt;/h3>
&lt;p>XDN relies on a FUSE (Filesystem in Userspace) component to capture filesystem operations and generate &amp;ldquo;statediffs&amp;rdquo; - records of changes that get replicated across edge nodes. The original C++ implementation worked but had memory safety concerns and limited optimization capabilities.&lt;/p>
&lt;p>I worked on refactoring from C++ to Rust, implementing several improvements:&lt;/p>
&lt;p>&lt;strong>New Features Added:&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Zstd Compression&lt;/strong>: Reduces statediff payload sizes&lt;/li>
&lt;li>&lt;strong>Adaptive Compression&lt;/strong>: Intelligently chooses compression strategies&lt;/li>
&lt;li>&lt;strong>Advanced Pruning&lt;/strong>: Removes redundant operations (duplicate chmod/chown, created-then-deleted files)&lt;/li>
&lt;li>&lt;strong>Bincode Serialization&lt;/strong>: Helps avoid manual serialization code and reduces the risk of related bugs&lt;/li>
&lt;li>&lt;strong>Extended Operations&lt;/strong>: Added support for additional filesystem operations (mkdir, symlink, hardlinks, etc.)&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Architectural Improvements:&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Memory Safety&lt;/strong>: Rust&amp;rsquo;s ownership system helps prevent common memory management issues&lt;/li>
&lt;li>&lt;strong>Type Safety&lt;/strong>: Using Rust enums instead of integer constants for better type checking&lt;/li>
&lt;/ul>
&lt;h2 id="findings">Findings&lt;/h2>
&lt;p>The optimizations performed as expected:&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Database Performance Comparison" srcset="
/report/osre25/umass/edge-replication/20250725-panjisri/performance_hudc10c2ffc95d775aedb0a1dad587d6fd_55711_cb1ea5caaa82d543dfeabd0c97f7c4fe.webp 400w,
/report/osre25/umass/edge-replication/20250725-panjisri/performance_hudc10c2ffc95d775aedb0a1dad587d6fd_55711_d65f44ef3f769dddda7f0211b94ad6b6.webp 760w,
/report/osre25/umass/edge-replication/20250725-panjisri/performance_hudc10c2ffc95d775aedb0a1dad587d6fd_55711_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/umass/edge-replication/20250725-panjisri/performance_hudc10c2ffc95d775aedb0a1dad587d6fd_55711_cb1ea5caaa82d543dfeabd0c97f7c4fe.webp"
width="760"
height="433"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>&lt;strong>Statediff Size Reductions:&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>&lt;strong>MySQL workload&lt;/strong>: 572MB → 29.6MB (95% reduction)&lt;/li>
&lt;li>&lt;strong>PostgreSQL workload&lt;/strong>: 76MB → 11.9MB (84% reduction)&lt;/li>
&lt;li>&lt;strong>SQLite workload&lt;/strong>: 4MB → 29KB (99% reduction)&lt;/li>
&lt;/ul>
&lt;p>The combination of write coalescing, pruning, and compression proves especially effective for database workloads, where many operations involve small changes to large files.&lt;/p>
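&lt;p>The pruning and coalescing ideas can be illustrated with a toy model. The sketch below is not the fuselog implementation: the tuple format for operations is hypothetical, compression is left out, and it assumes a path is not recreated after being unlinked within one batch. It drops files that are created and then deleted in the same batch, keeps only the last &lt;code>chmod&lt;/code> per path, and keeps only the last write for each identical (path, offset, length) range.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">def prune(ops):
    """Drop operations that cannot affect the final replica state.

    Each op is a tuple: ("create", path), ("unlink", path),
    ("chmod", path, mode) or ("write", path, offset, data).
    """
    # 1. Files created and later unlinked within the same batch vanish entirely.
    created, transient = set(), set()
    for op in ops:
        kind, path = op[0], op[1]
        if kind == "create":
            created.add(path)
        elif kind == "unlink" and path in created:
            transient.add(path)
    kept = [op for op in ops if op[1] not in transient]

    # 2. Record the last chmod per path and the last write per identical range.
    def key_of(op):
        if op[0] == "chmod":
            return ("chmod", op[1])
        if op[0] == "write":
            return ("write", op[1], op[2], len(op[3]))
        return None

    last_index = {}
    for index, op in enumerate(kept):
        key = key_of(op)
        if key is not None:
            last_index[key] = index

    # 3. Keep an op only if it is not superseded by a later equivalent one.
    result = []
    for index, op in enumerate(kept):
        key = key_of(op)
        if key is None or last_index[key] == index:
            result.append(op)
    return result
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Writes are matched on length as well as offset because a shorter later write at the same offset does not fully overwrite an earlier, longer one.&lt;/p>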
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Rust vs C&amp;#43;&amp;#43; Performance Comparison" srcset="
/report/osre25/umass/edge-replication/20250725-panjisri/latency_hu3b080735c91d058ad2f9cf67a54d5f14_21553_2adee964972897a04e60327dcfe9675e.webp 400w,
/report/osre25/umass/edge-replication/20250725-panjisri/latency_hu3b080735c91d058ad2f9cf67a54d5f14_21553_dd86a6fc0dabbac3beb17266f1f49002.webp 760w,
/report/osre25/umass/edge-replication/20250725-panjisri/latency_hu3b080735c91d058ad2f9cf67a54d5f14_21553_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/umass/edge-replication/20250725-panjisri/latency_hu3b080735c91d058ad2f9cf67a54d5f14_21553_2adee964972897a04e60327dcfe9675e.webp"
width="760"
height="470"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>&lt;strong>Performance Comparison:&lt;/strong>
Remarkably, the Rust implementation matches or exceeds C++ performance:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>POST operations&lt;/strong>: 30% faster (10.5ms vs 15ms)&lt;/li>
&lt;li>&lt;strong>DELETE operations&lt;/strong>: 33% faster (10ms vs 15ms)&lt;/li>
&lt;li>&lt;strong>Overall latency&lt;/strong>: Consistently better (9ms vs 11ms)&lt;/li>
&lt;/ul>
&lt;h2 id="current-challenges">Current Challenges&lt;/h2>
&lt;p>While the core implementation is complete and functional, I&amp;rsquo;m currently debugging occasional latency spikes that occur under specific workload patterns. These edge cases need to be resolved before moving on to the benchmarking phase, as inconsistent performance could compromise the reliability of the evaluation.&lt;/p>
&lt;h2 id="next-steps">Next Steps&lt;/h2>
&lt;p>With the FUSE filesystem foundation nearly complete, next steps include:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>Resolve latency spike issues&lt;/strong> and complete XDN stabilization&lt;/li>
&lt;li>&lt;strong>Build benchmarking framework&lt;/strong> - a comparison tool that can systematically evaluate different consensus protocols with standardized metrics.&lt;/li>
&lt;li>&lt;strong>Run systematic evaluation&lt;/strong> across protocols&lt;/li>
&lt;/ol>
&lt;p>The optimized filesystem will hopefully provide a stable base for reproducible performance comparisons between distributed consensus protocols.&lt;/p></description></item><item><title>Developing an Open Testbed for Edge Replication System Evaluation</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/umass/edge-replication/20250615-panjisri/</link><pubDate>Sun, 15 Jun 2025 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/umass/edge-replication/20250615-panjisri/</guid><description>&lt;p>Hi, I&amp;rsquo;m Panji. I&amp;rsquo;m currently contributing to the &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/project/osre25/umass/edge-replication/">Open Testbed for Reproducible Evaluation of Replicated Systems at the Edges&lt;/a> under the mentorship of Fadhil I. Kurnia. You can find more details on the project proposal &lt;a href="https://drive.google.com/file/d/1CFT5CJJXbQlVPz8_A9Dxkjl7oRjESdli/view?usp=sharing" target="_blank" rel="noopener">here&lt;/a>.&lt;/p>
&lt;p>The primary challenge we&amp;rsquo;re addressing is the current difficulty in fairly comparing different edge replication systems. To fix this, we&amp;rsquo;re trying to build a testing platform with four key parts. We&amp;rsquo;re collecting real data about how people actually use edge services, creating a tool that can simulate realistic user traffic across many locations, building a system that mimics network delays between hundreds of edge servers, and packaging everything into an open-source toolkit.&lt;/p>
&lt;p>This will let researchers test different coordination methods like EPaxos, Raft, and others using the same data and conditions. We hope this will help provide researchers with a more standardized way to evaluate their systems. We&amp;rsquo;re working with multiple programming languages and focusing on making complex edge computing scenarios accessible to everyone in the research community.&lt;/p>
&lt;p>One of the most interesting aspects of this project is tackling the challenge of creating realistic simulations that accurately reflect the performance characteristics different coordination protocols would exhibit in actual edge deployments. The end goal is to provide the research community with a standardized, reproducible environment for edge replication.&lt;/p></description></item></channel></rss>