<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Kangrui Wang | UCSC OSPO</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/author/kangrui-wang/</link><atom:link href="https://deploy-preview-1007--ucsc-ospo.netlify.app/author/kangrui-wang/index.xml" rel="self" type="application/rss+xml"/><description>Kangrui Wang</description><generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><image><url>https://deploy-preview-1007--ucsc-ospo.netlify.app/author/kangrui-wang/avatar_hudfad0ce13dfbda9c14cb3b4d9da3f56e_78154_270x270_fill_q75_lanczos_center.jpg</url><title>Kangrui Wang</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/author/kangrui-wang/</link></image><item><title>Enhancing Drift Detection through Fine-Tuning Llama2</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre23/anl/perfdrift/20230730-kangrui/</link><pubDate>Sun, 30 Jul 2023 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre23/anl/perfdrift/20230730-kangrui/</guid><description>&lt;p>Greetings everyone, I&amp;rsquo;m Kangrui. Over the past few weeks, we have made significant progress on our drift detection methods. I&amp;rsquo;m excited to present a detailed account of how we prompted and fine-tuned Llama2 to carry out the drift detection task efficiently.&lt;/p>
&lt;h2 id="motivation">Motivation&lt;/h2>
&lt;h3 id="why-llm-in-drift-detection-method">Why use LLMs for drift detection?&lt;/h3>
&lt;p>Using large language models (LLMs) in drift detection offers several benefits that make them a compelling choice in this domain.&lt;/p>
&lt;ol>
&lt;li>
&lt;p>&lt;strong>Rapid Development:&lt;/strong> LLMs are in the vanguard of technological advancement. This field is evolving rapidly with continuous enhancements in model architecture, training techniques, and data handling. With every new version, these models are showing an increasing capacity to understand and generate human-like text, pushing the limits of what is achievable in Natural Language Processing (NLP) and Artificial Intelligence (AI) as a whole.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Superior Performance:&lt;/strong> Traditional drift detection methodologies such as Page-Hinkley, EDDM, and HDDM have their merits and have found success in numerous scenarios. Even Deep Learning (DL) techniques, like training a predictive model based on error rates, have made significant strides in the field. However, when handling complex, high-dimensional, and real-time data, LLMs have demonstrated exceptional results. They are not only able to effectively predict and respond to drifts but also adapt to new trends more swiftly. Our experiments using LLMs like GPT-3.5-turbo have yielded impressive results, notably outperforming other methods.&lt;/p>
&lt;/li>
&lt;/ol>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="GPT-3.5-turbo Performance" srcset="
/report/osre23/anl/perfdrift/20230730-kangrui/gpt-3.5-performance_hudb1929583c62f83e6182026371c0950a_147441_986c57531b096aac2ea5604c7942efed.webp 400w,
/report/osre23/anl/perfdrift/20230730-kangrui/gpt-3.5-performance_hudb1929583c62f83e6182026371c0950a_147441_534b4ca0b9e767d820ed9b45d754db9f.webp 760w,
/report/osre23/anl/perfdrift/20230730-kangrui/gpt-3.5-performance_hudb1929583c62f83e6182026371c0950a_147441_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre23/anl/perfdrift/20230730-kangrui/gpt-3.5-performance_hudb1929583c62f83e6182026371c0950a_147441_986c57531b096aac2ea5604c7942efed.webp"
width="760"
height="303"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>&lt;em>Fig. 1: Concept drifts detected by GPT-3.5-turbo in the Cori dataset&lt;/em>&lt;/p>
&lt;ol start="3">
&lt;li>&lt;strong>Flexibility:&lt;/strong> One of the major advantages of using LLMs is their flexibility in dealing with different types of input and output. In contrast to traditional methods, which are confined to single feature concept drift detection and can only process numerical values, LLMs can handle a range of input types including text, numbers, and more complex data structures. This capability allows them to detect multi-feature concept drifts, thereby broadening the scope and complexity of problems they can tackle. Moreover, the generation capability of LLMs can provide rich and detailed output, facilitating more comprehensive insights into the detected drifts.&lt;/li>
&lt;/ol>
&lt;h2 id="why-llama2-in-drift-detection-method">Why Llama2 for drift detection?&lt;/h2>
&lt;p>Llama2 offers a series of advantages that make it an excellent choice for applying LLMs to drift detection. Here&amp;rsquo;s a breakdown of the key reasons:&lt;/p>
&lt;ol>
&lt;li>
&lt;p>&lt;strong>Performance Guarantee:&lt;/strong> As a newly released model, Llama2 has undergone extensive development and testing, providing a reliable guarantee of performance. It represents the cutting edge in AI technology, having benefited from the latest research and advancements in language model design.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Accessibility Guarantee:&lt;/strong> One significant advantage of Llama2 is that it is open-source. It is readily accessible on HuggingFace, which also provides a range of mature tools to fine-tune and deploy the model.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Flexibility for Fine-Tuning:&lt;/strong> Llama2 comes in different sizes, such as 7B, 13B, and 70B parameters, which allows for flexibility in model selection based on the task&amp;rsquo;s requirements and available computational resources.&lt;/p>
&lt;/li>
&lt;/ol>
&lt;h2 id="data">Data&lt;/h2>
&lt;h3 id="dataset">Dataset&lt;/h3>
&lt;p>In our study, we employed &lt;a href="https://github.com/alipsgh/data-streams" target="_blank" rel="noopener">Synthetic data streams&lt;/a> for the fine-tuning of Llama2. Synthetic data streams serve as an invaluable resource for controlled experiments in the domain of drift detection. These curated datasets encompass varied types of drifts, providing us with the capability to assess the efficacy of our detection algorithms under diverse scenarios.&lt;/p>
&lt;p>Here is a brief introduction to the synthetic datasets we used:&lt;/p>
&lt;ol>
&lt;li>
&lt;p>&lt;strong>Sine1 &amp;amp; Sine2:&lt;/strong> These datasets induce abrupt concept drift within a two-dimensional feature space. The classification rule, a sine function, dictates the instance labels, which are flipped at every drift point.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Mixed:&lt;/strong> This dataset, characterized by its combination of numeric and boolean features, uses a composite classification rule. The abrupt concept drift is simulated via a periodic reversal of class labels.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Stagger:&lt;/strong> This categorical dataset incorporates abrupt concept drift by periodically altering the classification rules tied to the features.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Circles &amp;amp; LED:&lt;/strong> These datasets are designed to simulate gradual concept drift. In Circles, the classification of instances is determined by their spatial relation to specific circles. LED imitates a seven-segment digit display, introducing drift by interchanging the pertinent attributes.&lt;/p>
&lt;/li>
&lt;/ol>
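&lt;p>As a concrete illustration of the Sine1 rule described above, a minimal generator might look like the following sketch. The function name, parameters, and value ranges are hypothetical; only the classification rule (label &amp;#39;p&amp;#39; when y lies below sin(x), flipped after the drift point) follows the dataset description.&lt;/p>

```python
import math
import random

def make_sine1_stream(n=1000, drift_at=500, seed=0):
    """Illustrative Sine1-style stream: label 'p' if y < sin(x), 'n' otherwise;
    the labels are flipped after the drift point (abrupt concept drift).
    Function name and parameters are hypothetical sketches."""
    rng = random.Random(seed)
    stream = []
    for i in range(n):
        x, y = rng.random(), rng.random()
        label = 'p' if y < math.sin(x) else 'n'
        if i >= drift_at:  # flip the classification rule after the drift
            label = 'n' if label == 'p' else 'p'
        stream.append([[round(x, 2), round(y, 2)], label])
    return stream
```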
&lt;p>Typically, the synthetic datasets contain 100,000 or 1,000,000 instances. A concept drift occurs every 25,000 or 33,333 instances and takes either an abrupt form (with a drifting period of 50 instances) or a gradual form (with a drifting period of 500 instances).&lt;/p>
&lt;h3 id="data-preprocessing-and-metrics">Data Preprocessing and Metrics&lt;/h3>
&lt;p>Given the token limit of Llama2 and the specific requirements of our project, we needed to transform the data into an appropriate format.&lt;/p>
&lt;p>Accordingly, we processed each data stream into three sections: the &amp;lsquo;undrifted&amp;rsquo; period, the &amp;lsquo;drifting&amp;rsquo; period, and the &amp;lsquo;drifted&amp;rsquo; period. The instances in each section were drawn randomly and independently from the original data stream, totaling at most 100 instances. The undrifted and drifted periods each contain 20 to 50 instances, and the drifting period contains 10 to 20.&lt;/p>
&lt;p>For instance, consider a dataset of 100,000 instances where an abrupt concept drift occurs every 25,000 instances. To format a data point, we could draw 20 to 50 instances from the first 25,000 as the undrifted period. Then, we could draw 10 to 20 instances from the 25,001st to 25,050th instance as the drifting period. Finally, we would draw 10 to min(100 - num(undrifted period) - num(drifting period), 50) instances from the 25,051st to 50,050th instance as the drifted period. This newly formatted data stream would then be fed into Llama2.&lt;/p>
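&lt;p>The sampling procedure above can be sketched as follows. The function name and exact index bounds are illustrative assumptions; the returned fields mirror the processed data point format described in this section (the &amp;ldquo;meta&amp;rdquo; description string is omitted for brevity).&lt;/p>

```python
import random

def format_data_point(stream, drift_start, drift_len=50, seed=0):
    """Sample an 'undrifted' / 'drifting' / 'drifted' window of at most
    100 instances around one drift point. Bounds follow the worked
    example in the text and are illustrative, not the exact pipeline."""
    rng = random.Random(seed)
    n_before = rng.randint(20, 50)
    n_during = rng.randint(10, 20)
    n_after = rng.randint(10, min(100 - n_before - n_during, 50))

    # Draw indices independently from each region of the original stream.
    before = sorted(rng.sample(range(0, drift_start), n_before))
    during = sorted(rng.sample(range(drift_start, drift_start + drift_len), n_during))
    after = sorted(rng.sample(range(drift_start + drift_len,
                                    min(drift_start * 2 + drift_len, len(stream))),
                   n_after))

    data = [stream[i] for i in before + during + after]
    return {
        # positions inside the newly formatted stream
        "before_period": [0, n_before - 1],
        "transition_period": [n_before, n_before + n_during - 1],
        "after_period": [n_before + n_during, n_before + n_during + n_after - 1],
        # positions inside the original stream
        "before_index": [before[0], before[-1]],
        "transition_index": [during[0], during[-1]],
        "after_index": [after[0], after[-1]],
        "data_stream": data,
    }
```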
&lt;p>We also included some additional information to assist Llama2&amp;rsquo;s inference process. A typical data point in our processed dataset includes:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="s2">&amp;#34;before_period&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="mi">0&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="mi">31&lt;/span>&lt;span class="p">],&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="s2">&amp;#34;transition_period&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="mi">32&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="mi">38&lt;/span>&lt;span class="p">],&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="s2">&amp;#34;after_period&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="mi">39&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="mi">59&lt;/span>&lt;span class="p">],&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="s2">&amp;#34;before_index&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="mi">196&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="mi">19963&lt;/span>&lt;span class="p">],&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="s2">&amp;#34;transition_index&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="mi">20002&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="mi">20030&lt;/span>&lt;span class="p">],&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="s2">&amp;#34;after_index&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="mi">20310&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="mi">39984&lt;/span>&lt;span class="p">],&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="s2">&amp;#34;meta&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="s2">&amp;#34;Dataset: MIXED&lt;/span>&lt;span class="se">\n\t&lt;/span>&lt;span class="s2">v&amp;#39;s type is nominal, range is (&amp;#39;False&amp;#39;, &amp;#39;True&amp;#39;)&lt;/span>&lt;span class="se">\n\t&lt;/span>&lt;span class="s2">w&amp;#39;s type is nominal, range is (&amp;#39;False&amp;#39;, &amp;#39;True&amp;#39;)&lt;/span>&lt;span class="se">\n\t&lt;/span>&lt;span class="s2">x&amp;#39;s type is numeric&lt;/span>&lt;span class="se">\n\t&lt;/span>&lt;span class="s2">y&amp;#39;s type is numeric&lt;/span>&lt;span class="se">\n\t&lt;/span>&lt;span class="s2">class&amp;#39;s type is nominal, range is (&amp;#39;p&amp;#39;, &amp;#39;n&amp;#39;)&lt;/span>&lt;span class="se">\n&lt;/span>&lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="s2">&amp;#34;data_stream&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="o">...&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="p">}&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>From this dictionary, the &amp;ldquo;meta&amp;rdquo; and &amp;ldquo;data_stream&amp;rdquo; entries are fed into Llama2. The &amp;ldquo;transition_period&amp;rdquo; serves as the criterion: if Llama2&amp;rsquo;s answer lies within the &amp;ldquo;transition_period&amp;rdquo;, we deem it correct.&lt;/p>
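&lt;p>The evaluation criterion above amounts to a simple inclusive range check; a minimal sketch (function name is ours, not from the pipeline):&lt;/p>

```python
def is_correct(predicted_index, transition_period):
    """Count a predicted drift index as correct when it falls inside
    the labelled transition period (inclusive on both ends)."""
    lo, hi = transition_period
    return lo <= predicted_index <= hi
```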
&lt;h2 id="llama2">Llama2&lt;/h2>
&lt;h3 id="inference">Inference&lt;/h3>
&lt;p>We experimented with three variations of prompts during the inference phase.&lt;/p>
&lt;p>&lt;strong>Prompt Version 1:&lt;/strong>&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="cl">[INST] &amp;lt;&amp;lt;SYS&amp;gt;&amp;gt;
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> You are a helpful, respectful, and honest assistant. Always provide the most helpful responses possible while ensuring safety. Ensure that your responses are socially unbiased, positive, and free from harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. If a question lacks coherence or sense, explain why instead of providing incorrect information. If you are uncertain about an answer, refrain from sharing false information.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &amp;lt;&amp;lt;/SYS&amp;gt;&amp;gt;
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> Your task is to identify the index in a given data stream where the relationship between the features and labels begins to change. The data stream is formatted as a list, with each element being a two-element list: the first represents the features (also a list), and the second is the label. If your answer is &amp;#39;x&amp;#39;, it indicates that the data pattern starts shifting at the xth data point in the stream.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> Here&amp;#39;s an example of the data&amp;#39;s metadata: Dataset: SINE1
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> x&amp;#39;s type is numeric
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> y&amp;#39;s type is numeric
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> class&amp;#39;s type is nominal, range is (&amp;#39;p&amp;#39;, &amp;#39;n&amp;#39;)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> The given data stream is: [[[0.7, 0.07], &amp;#39;p&amp;#39;], [[0.45, 0.78], &amp;#39;n&amp;#39;], ..., [[0.64, 0.45], &amp;#39;n&amp;#39;]]
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> Your task is to respond with a single index. No additional information is required.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">[/INST]
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;strong>Prompt Version 2:&lt;/strong>&lt;/p>
&lt;p>The same as Prompt 1, but with a specific range for the index response:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="cl">Please provide an index ranging from 0 to 96. No additional information is required.
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;strong>Prompt Version 3:&lt;/strong>&lt;/p>
&lt;p>This prompt uses an instruction-input-output design, which we adopted for fine-tuning:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="cl">Below is an instruction paired with an input that provides further context. Write a response that appropriately completes the request.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">### Instruction:
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">Identify the index in a given data stream where the relationship between features and labels begins to change. The data stream is formatted as a list, each element being a two-element list: the first represents the features (also a list), and the second is the label. For instance, if the response is &amp;#39;x&amp;#39;, it means that the data pattern starts shifting at the xth data point in the stream. Only respond with an index, no further information is necessary.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">### Input:
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">Meta Data:
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">Dataset: SINE1
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> x&amp;#39;s type is numeric
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> y&amp;#39;s type is numeric
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> class&amp;#39;s type is nominal, range is (&amp;#39;p&amp;#39;, &amp;#39;n&amp;#39;)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">Data stream:
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">[[[0.7, 0.07], &amp;#39;p&amp;#39;], [[0.45, 0.78], &amp;#39;n&amp;#39;], .., [[0.64, 0.45], &amp;#39;n&amp;#39;]]
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">### Response:
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Although Prompt Versions 1 and 2, both suggested by Meta, differ only slightly, their results varied significantly; we delve into this in the Results section. Prompt Version 3, which employs the instruction-input-output structure, was used during our fine-tuning process.&lt;/p>
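&lt;p>Assembling Prompt Version 3 from a processed data point is straightforward string templating. A minimal sketch, assuming the &amp;ldquo;meta&amp;rdquo; and &amp;ldquo;data_stream&amp;rdquo; fields from the preprocessing section (the instruction text is abridged here):&lt;/p>

```python
def build_prompt_v3(meta, data_stream):
    """Assemble the instruction-input-output prompt (Version 3) from the
    two fields fed to Llama2. Function name is illustrative."""
    instruction = (
        "Identify the index in a given data stream where the relationship "
        "between features and labels begins to change. Only respond with an "
        "index, no further information is necessary."
    )
    return (
        "Below is an instruction paired with an input that provides further "
        "context. Write a response that appropriately completes the request.\n"
        "### Instruction:\n" + instruction + "\n\n"
        "### Input:\nMeta Data:\n" + meta + "\nData stream:\n"
        + str(data_stream) + "\n\n### Response:\n"
    )
```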
&lt;h3 id="fine-tuning">Fine-Tuning&lt;/h3>
&lt;p>We utilized the tools provided by &lt;a href="https://github.com/facebookresearch/llama-recipes" target="_blank" rel="noopener">llama-recipes&lt;/a> to fine-tune Llama2. The key command used to initiate the fine-tuning process is illustrated below:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-shell" data-lang="shell">&lt;span class="line">&lt;span class="cl">python llama_finetuning.py --use_peft &lt;span class="se">\
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="se">&lt;/span> --peft_method lora &lt;span class="se">\
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="se">&lt;/span> --quantization &lt;span class="se">\
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="se">&lt;/span> --model_name meta-llama/Llama-2-13b-chat-hf &lt;span class="se">\
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="se">&lt;/span> --output_dir ./fine_tuned_model/Llama-2-13b-chat-hf-test_finetune &lt;span class="se">\
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="se">&lt;/span> --dataset alpaca_dataset &lt;span class="se">\
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="se">&lt;/span> --batch_size_training &lt;span class="m">40&lt;/span> &lt;span class="se">\
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="se">&lt;/span> --num_epochs &lt;span class="m">1&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>An explanation of the key parameters:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="cl">--use_peft: This flag indicates the use of the Parameter-Efficient Fine-Tuning (PEFT) method. PEFT allows us to fine-tune the model more efficiently.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">--peft_method lora: Here, we specify that LoRA (Low-Rank Adaptation) should be used as the PEFT method.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">--quantization: This flag reduces the memory footprint of the model by loading its weights at reduced precision.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">--dataset alpaca_dataset: Specifies the dataset setting used for fine-tuning, in this case, the &amp;#39;alpaca_dataset&amp;#39; indicates the instruction-input-output structure for fine-tuning.
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="results">Results&lt;/h2>
&lt;p>The performance of various models and prompt versions is depicted in Fig. 2.&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="All Performance" srcset="
/report/osre23/anl/perfdrift/20230730-kangrui/performance_plot_hu026976f577cb17db71cb82cd3675225d_101027_f4b54b1d163428a3bbdd2373c5e7d6c6.webp 400w,
/report/osre23/anl/perfdrift/20230730-kangrui/performance_plot_hu026976f577cb17db71cb82cd3675225d_101027_ba09d14d8674a9735bf9bb60ce301dae.webp 760w,
/report/osre23/anl/perfdrift/20230730-kangrui/performance_plot_hu026976f577cb17db71cb82cd3675225d_101027_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre23/anl/perfdrift/20230730-kangrui/performance_plot_hu026976f577cb17db71cb82cd3675225d_101027_f4b54b1d163428a3bbdd2373c5e7d6c6.webp"
width="760"
height="608"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>&lt;em>Fig. 2: Performance comparison of different models and prompt versions.&lt;/em>&lt;/p>
&lt;p>It is evident from the results that the design of the prompt has a significant impact on Llama2&amp;rsquo;s performance. Furthermore, due to computational resource constraints, we have only managed to fine-tune Llama2 on a portion of our dataset (approximately 1,000 instances). The entire training set consists of 19,000 instances, and the test set includes 5,000 instances. Despite these limitations, a performance increase is noticeable after fine-tuning.&lt;/p></description></item><item><title>Automatic Cluster Performance Shifts Detection Toolkit</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre23/anl/perfdrift/20230527-kangrui/</link><pubDate>Sat, 27 May 2023 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre23/anl/perfdrift/20230527-kangrui/</guid><description>&lt;p>Hi! I am Kangrui, a Pre-doc student at the University of Chicago. As part of the &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/project/osre23/anl/perfdrift">Automatic Cluster Performance Shifts Detection Toolkit&lt;/a> my &lt;a href="https://drive.google.com/file/d/1AxpgWLzF3oKTFlD8q6JYS35CxxJ6c76X/view?usp=share_link" target="_blank" rel="noopener">proposal&lt;/a> under the mentorship of &lt;strong>Sandeep Madireddy&lt;/strong> and &lt;strong>Ray Andrew&lt;/strong> aims to design a real-time performance shift detection algorithm for high-performance computing clusters, ensuring minimal overheads.&lt;/p>
&lt;p>This project focuses on developing a real-time performance shift detection algorithm tailored to heterogeneous workloads, aiming to promptly inform administrators about performance changes. The primary goal is to design an algorithm that efficiently detects shifts in real-time, with minimal system overheads.&lt;/p>
&lt;p>In addition to algorithm development, we plan to enhance the Darshan toolkit&amp;rsquo;s functionality by integrating our algorithm, offering users early performance shift detection. This integration will aid administrators in making informed system utilization and scheduling decisions.&lt;/p>
&lt;p>To promote transparency and reproducibility, we&amp;rsquo;ll encapsulate our findings, scripts, and profiling data within a Jupyter notebook on Chameleon Trovi, enabling other researchers to reproduce our experiments easily.&lt;/p>
&lt;p>Looking ahead, we plan to expand the algorithm&amp;rsquo;s applicability to cater to diverse HPC workloads and infrastructures. Other areas of interest include its use in detecting shifts in financial markets or monitoring IoT data streams. Further refinement of our algorithm, to reduce overheads and improve real-time detection capabilities, is also a part of our future endeavours. This task may involve evaluating various shift detection methods and noise filtering techniques.&lt;/p></description></item></channel></rss>