<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Qianru Zhang | UCSC OSPO</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/author/qianru-zhang/</link><atom:link href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/qianru-zhang/index.xml" rel="self" type="application/rss+xml"/><description>Qianru Zhang</description><generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><image><url>https://deploy-preview-1007--ucsc-ospo.netlify.app/author/qianru-zhang/avatar_hu7d5c153b57c3a8409a98edda0e9d8a10_1464188_270x270_fill_q75_lanczos_center.jpg</url><title>Qianru Zhang</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/author/qianru-zhang/</link></image><item><title>Final Blog: BenchmarkST: Cross-Platform, Multi-Species Spatial Transcriptomics Gene Imputation Benchmarking</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/uci/benchmarkst/20240829-qianru/</link><pubDate>Thu, 29 Aug 2024 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/uci/benchmarkst/20240829-qianru/</guid><description>&lt;p>Hello! I&amp;rsquo;m Qianru! I have been contributing to the BenchmarkST: Cross-Platform, Multi-Species Spatial Transcriptomics Gene Imputation Benchmarking project under the mentorship of Ziheng Duan. My project aims to provide a standardized, easily accessible evaluation framework for gene imputation in spatial transcriptomics.&lt;/p>
&lt;h1 id="motivation-and-overview">Motivation and Overview&lt;/h1>
&lt;p>The &amp;ldquo;BenchmarkST&amp;rdquo; project addresses a critical challenge in spatial transcriptomics: sparse data degrades downstream tasks such as spatial domain identification. For example, on a 10X Visium dataset of the human Dorsolateral Prefrontal Cortex (DLPFC), clustering the complete dataset with GraphST (a state-of-the-art clustering method) gave an ARI (Adjusted Rand Index) of 0.6347. With only 20% of the data, a common scenario, the ARI dropped to 0.1880. This gap shows why effective gene imputation matters: it can help restore the lost information and improve the accuracy of downstream analyses.
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="fig1" srcset="
/report/osre24/uci/benchmarkst/20240829-qianru/fig1_hu72c585df7604f28a748aa64a85602fac_159578_1bdac9436ddd84b83023a2cd20d76fb3.webp 400w,
/report/osre24/uci/benchmarkst/20240829-qianru/fig1_hu72c585df7604f28a748aa64a85602fac_159578_8a97a3a52a0fad3fb5d2dbf596e883a9.webp 760w,
/report/osre24/uci/benchmarkst/20240829-qianru/fig1_hu72c585df7604f28a748aa64a85602fac_159578_1200x1200_fit_q75_h2_lanczos.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/uci/benchmarkst/20240829-qianru/fig1_hu72c585df7604f28a748aa64a85602fac_159578_1bdac9436ddd84b83023a2cd20d76fb3.webp"
width="760"
height="496"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
To tackle this issue, the BenchmarkST project produced the Impeller package. It provides a standardized, easily accessible evaluation framework for gene imputation in spatial transcriptomics, with preprocessed datasets, reproducible evaluation methods, and flexible inference interfaces. It covers multiple platforms, species, and organs, with the aim of improving the integrity and usability of spatial transcriptomics data.&lt;/p>
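&lt;p>For readers unfamiliar with the ARI scores quoted above, here is a minimal sketch of how an Adjusted Rand Index can be computed from two cluster labelings of the same spots. It uses the standard pair-counting formula and only the Python standard library. The function name and the toy labels are illustrative assumptions, not part of GraphST or Impeller.&lt;/p>

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two labelings of the same items.

    Illustrative helper (hypothetical name), based on the pair-counting
    formula: (index - expected) / (max_index - expected).
    """
    n = len(labels_true)
    if n != len(labels_pred):
        raise ValueError("labelings must have the same length")

    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # zero or one item: labelings agree trivially

    # Pairs of items placed together in both labelings.
    together = Counter(zip(labels_true, labels_pred))
    index = sum(comb(c, 2) for c in together.values())

    # Pairs placed together within each labeling separately.
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = sum_true * sum_pred / total_pairs
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        return 1.0  # both labelings put everything in one cluster
    return (index - expected) / (max_index - expected)


# Toy example: four spots, two spatial domains.
truth = [0, 0, 1, 1]
print(adjusted_rand_index(truth, [1, 1, 0, 0]))  # same partition, labels renamed
print(adjusted_rand_index(truth, [0, 1, 0, 1]))  # partition unrelated to truth
```

&lt;p>The score depends only on which spots are grouped together, not on the label names, which is why it suits comparing predicted spatial domains against annotated ones.&lt;/p>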
&lt;h1 id="what-was-accomplished">What Was Accomplished&lt;/h1>
&lt;h2 id="development-of-the-impeller-package">Development of the Impeller Package&lt;/h2>
&lt;h4 id="data-aggregation-and-preprocessing">Data Aggregation and Preprocessing:&lt;/h4>
&lt;p>We aggregated and preprocessed spatial transcriptomic datasets from multiple platforms (10X Visium, StereoSeq, SlideSeqV2), species (human, mouse), and organs (Dorsolateral Prefrontal Cortex, olfactory bulb). These datasets are readily available for download within the package.&lt;/p>
&lt;h4 id="unified-evaluation-framework">Unified Evaluation Framework:&lt;/h4>
&lt;p>A reproducible framework was developed, integrating methods such as K-Nearest Neighbors (KNN) and the deep learning-based Impeller method, enabling users to easily evaluate the performance of different gene imputation techniques.&lt;/p>
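&lt;p>To give a sense of what a KNN baseline does, here is a minimal sketch, assuming that neighbors are chosen by spatial distance between spots and that a missing value is filled with the mean of the nearest measured values. The function name, arguments, and toy data are hypothetical; the KNN method in the package may differ, for example in how neighbors are defined.&lt;/p>

```python
import math


def knn_impute(coords, expr, observed, k=3):
    """Fill unobserved entries with the mean over the k nearest donor spots.

    coords:   list of (x, y) spot positions
    expr:     list of per-spot expression rows (lists of floats)
    observed: same shape as expr; True where the value was measured
    Illustrative sketch only; not the Impeller implementation.
    """
    imputed = [row[:] for row in expr]
    for i, row in enumerate(expr):
        for g in range(len(row)):
            if observed[i][g]:
                continue  # keep measured values untouched
            # Donors: other spots where gene g was measured, nearest first.
            donors = sorted(
                (math.dist(coords[i], coords[j]), expr[j][g])
                for j in range(len(expr))
                if j != i and observed[j][g]
            )[:k]
            if donors:
                imputed[i][g] = sum(value for _, value in donors) / len(donors)
    return imputed


# Toy example: one gene, four spots on a line; the first spot is missing.
coords = [(0, 0), (1, 0), (2, 0), (10, 0)]
expr = [[0.0], [2.0], [4.0], [100.0]]
observed = [[False], [True], [True], [True]]
print(knn_impute(coords, expr, observed, k=2))
```

&lt;p>With k=2 the missing value is averaged from the two closest spots only, so the distant outlier spot does not influence it.&lt;/p>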
&lt;h4 id="inference-interfaces">Inference Interfaces:&lt;/h4>
&lt;p>We provided interfaces that allow users to apply gene imputation on custom datasets, offering the flexibility to predict any gene in any cell, maximizing the utility for diverse research needs.&lt;/p>
&lt;h2 id="code-contributions-and-documentation">Code Contributions and Documentation&lt;/h2>
&lt;h4 id="repository">Repository:&lt;/h4>
&lt;p>All code related to the Impeller package is published in the &lt;a href="https://pypi.org/project/impeller/0.1.2/#files" target="_blank" rel="noopener">Impeller&lt;/a> project on PyPI.&lt;/p>
&lt;h4 id="link-to-versions">Link to Versions:&lt;/h4>
&lt;p>&lt;a href="https://pypi.org/project/impeller/0.1.2/#history" target="_blank" rel="noopener">Here&lt;/a> you can find all the versions made during the project, with detailed descriptions of each change.&lt;/p>
&lt;h4 id="readmemdhttpspypiorgprojectimpeller012description">&lt;a href="https://pypi.org/project/impeller/0.1.2/#description" target="_blank" rel="noopener">README.md&lt;/a>:&lt;/h4>
&lt;p>Detailed documentation on how to use the Impeller package, including installation instructions, usage examples, and explanations of the key components.&lt;/p></description></item><item><title>Halfway Through GSOC: My Experience and Learnings</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/uci/benchmarkst/20240718-qianru/</link><pubDate>Thu, 18 Jul 2024 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/uci/benchmarkst/20240718-qianru/</guid><description>&lt;p>Hello there! I&amp;rsquo;m Qianru, and this is my mid-term blog post for the 2024 Google Summer of Code. I am working on the BenchmarkST project, focusing on benchmarking gene imputation methods in spatial transcriptomics. My goal is to create a comprehensive, reproducible platform for evaluating these methods across various datasets and conditions.&lt;/p>
&lt;p>In this post, I will share some of the progress I have made so far, the challenges I have faced, and how I overcame them. I will also highlight some specific accomplishments and what I plan to do next.&lt;/p>
&lt;hr>
&lt;h3 id="achievements">Achievements:&lt;/h3>
&lt;ol>
&lt;li>&lt;strong>Developed the Python Package:&lt;/strong> I created the &amp;ldquo;Impeller&amp;rdquo; Python package, which includes tools for downloading example data, processing it, and training models. This package aims to standardize gene imputation tasks in spatial transcriptomics.&lt;/li>
&lt;li>&lt;strong>Example Data Integration:&lt;/strong> Successfully integrated various spatial transcriptomics datasets into the package for benchmarking purposes.&lt;/li>
&lt;li>&lt;strong>Benchmarking Framework:&lt;/strong> Established a framework for objective comparison of different gene imputation methodologies.&lt;/li>
&lt;/ol>
&lt;p>&lt;strong>Python Package: Installation and Usage&lt;/strong>&lt;/p>
&lt;p>You can install the package using pip:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">pip install Impeller
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Download Example Data&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">from Impeller import download_example_data
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">download_example_data&lt;span class="o">()&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Load and Process Data&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">from Impeller import load_and_process_example_data, val_mask, test_mask, x, &lt;span class="nv">original_x&lt;/span> &lt;span class="o">=&lt;/span> load_and_process_example_data&lt;span class="o">()&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Train Model&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">from Impeller import create_args, train &lt;span class="nv">args&lt;/span> &lt;span class="o">=&lt;/span> create_args&lt;span class="o">()&lt;/span>,test_l1_distance, test_cosine_sim, &lt;span class="nv">test_rmse&lt;/span> &lt;span class="o">=&lt;/span> train&lt;span class="o">(&lt;/span>args, data, val_mask, test_mask, x, original_x&lt;span class="o">)&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;hr>
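&lt;p>The training call in the usage example returns three test metrics: an L1 distance, a cosine similarity, and an RMSE. How the package computes them internally is not shown in this post, so the following is only a sketch under stated assumptions: metrics are taken over the held-out (masked) entries, and the L1 distance is the mean absolute error. The function name and toy values are hypothetical.&lt;/p>

```python
import math


def masked_metrics(pred, truth, mask):
    """Return (l1, cosine, rmse) over entries where mask is True.

    Assumptions (not confirmed by the package): L1 is the mean absolute
    error, and all three metrics use only the masked test entries.
    """
    p = [v for row, mrow in zip(pred, mask) for v, m in zip(row, mrow) if m]
    t = [v for row, mrow in zip(truth, mask) for v, m in zip(row, mrow) if m]
    if not p:
        raise ValueError("mask selects no entries")

    n = len(p)
    l1 = sum(abs(a - b) for a, b in zip(p, t)) / n
    rmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(p, t)) / n)

    # Cosine similarity between the flattened predicted and true vectors.
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in t))
    cosine = sum(a * b for a, b in zip(p, t)) / norm if norm else 0.0
    return l1, cosine, rmse


# Toy example: two spots, two genes; three entries are held out for testing.
pred = [[1.0, 2.0], [3.0, 0.0]]
truth = [[1.0, 4.0], [3.0, 9.0]]
test_mask = [[True, True], [True, False]]
print(masked_metrics(pred, truth, test_mask))
```

&lt;p>Lower L1 and RMSE and higher cosine similarity indicate imputed values closer to the original measurements.&lt;/p>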
&lt;h3 id="challenges">Challenges:&lt;/h3>
&lt;p>Reproducing the results of various gene imputation methods was not an easy task. I faced several challenges along the way:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>Lack of Standardized Data:&lt;/strong> Datasets from different platforms, species, and organs were not available in a common, preprocessed form, which made it hard to compare methods on equal footing.&lt;/li>
&lt;li>&lt;strong>Reproducibility Issues:&lt;/strong> Some methods had incomplete or missing code, making it difficult to reproduce their results accurately.&lt;/li>
&lt;li>&lt;strong>Resource Limitations:&lt;/strong> Running large-scale experiments required significant computational resources, which posed constraints on the project timeline.&lt;/li>
&lt;/ol>
&lt;hr>
&lt;h3 id="future-work">Future Work:&lt;/h3>
&lt;p>Moving forward, I plan to:&lt;/p>
&lt;ol>
&lt;li>Extend the package&amp;rsquo;s functionalities to include more datasets and imputation methods.&lt;/li>
&lt;li>Enhance the benchmarking framework for more comprehensive evaluations.&lt;/li>
&lt;li>Collaborate with other researchers to validate and improve the package&amp;rsquo;s utility in the bioinformatics community.&lt;/li>
&lt;/ol>
&lt;hr>
&lt;p>I hope you found this update informative and interesting. If you have any questions or feedback, please feel free to contact me. Thank you for your attention and support!&lt;/p></description></item><item><title>BenchmarkST: Cross-Platform, Multi-Species Spatial Transcriptomics Gene Imputation Benchmarking</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/uci/benchmarkst/20240609-qianru/</link><pubDate>Sun, 09 Jun 2024 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre24/uci/benchmarkst/20240609-qianru/</guid><description>&lt;p>Hello! My name is Qianru, and I will be working on a project to improve spatial transcriptomics during Google Summer of Code 2024. My project, &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/project/osre24/uci/benchmarkst/">Benchmarking Gene Imputation Methods for Spatial Transcriptomics&lt;/a>, is mentored by &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/ziheng-duan/">Ziheng Duan&lt;/a> and &lt;a href="https://users.soe.ucsc.edu/~cormac/" target="_blank" rel="noopener">Cormac Flanagan&lt;/a>. The goal is to create a standard platform to evaluate methods for filling in missing gene data, which is a big challenge in spatial transcriptomics. &lt;a href="https://drive.google.com/file/d/1ydqGuuzpNgPpVUBvTiFvF1q7qV9gA_wm/view?usp=sharing" target="_blank" rel="noopener">My proposal can be viewed here!&lt;/a>&lt;/p>
&lt;p>Spatial transcriptomics lets us see where genes are active in tissues, giving us insight into how cells interact in their natural environment. However, current methods often miss some gene data, making it hard to get a complete picture. Gene imputation can help fill in these gaps.&lt;/p>
&lt;p>My project will:&lt;/p>
&lt;ol>
&lt;li>Create a benchmark dataset to standardize gene imputation tasks across different platforms, species, and organs.&lt;/li>
&lt;li>Compare various gene imputation methods to see how well they work in different scenarios.&lt;/li>
&lt;li>Develop a user-friendly Python package with tools for gene imputation to help researchers improve their data.&lt;/li>
&lt;/ol>
&lt;p>I&amp;rsquo;m excited to contribute to this project and help advance the field of spatial transcriptomics by making data analysis more accurate and comprehensive.&lt;/p></description></item></channel></rss>