<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Machine Learning | UCSC OSPO</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/category/machine-learning/</link><atom:link href="https://deploy-preview-1007--ucsc-ospo.netlify.app/category/machine-learning/index.xml" rel="self" type="application/rss+xml"/><description>Machine Learning</description><generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><lastBuildDate>Mon, 15 Sep 2025 00:00:00 +0000</lastBuildDate><image><url>https://deploy-preview-1007--ucsc-ospo.netlify.app/media/logo_hub6795c39d7c5d58c9535d13299c9651f_74810_300x300_fit_lanczos_3.png</url><title>Machine Learning</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/category/machine-learning/</link></image><item><title>Final Report: CarbonCast — An end-to-end consumption-based Carbon Intensity Forecasting service</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/ucsc/carboncast/20250915-tanushsavadi/</link><pubDate>Mon, 15 Sep 2025 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/ucsc/carboncast/20250915-tanushsavadi/</guid><description>&lt;p>Hi everyone—this is my final report for &lt;strong>CarbonCast&lt;/strong>, mentored by &lt;strong>Professor Abel Souza&lt;/strong>. Back in June, my goal was simple to say and harder to pull off: help people &lt;strong>see&lt;/strong> when the grid is cleaner and make it easy to act on that information. Over the summer I turned CarbonCast from a research prototype into something you can open, click, and rely on: a containerized backend, a clean API, and a fast, friendly map UI.&lt;/p>
&lt;h2 id="background">Background&lt;/h2>
&lt;p>CarbonCast forecasts the &lt;strong>carbon intensity&lt;/strong> of electricity (gCO₂e/kWh) using grid data and weather. Earlier versions were accurate but difficult to run and even harder to use outside a research context. My OSRE focus was to make CarbonCast usable for real people: provide a standard API, build a web UI that feels responsive, and package everything so it starts quickly and keeps itself healthy.&lt;/p>
&lt;h2 id="goals">Goals&lt;/h2>
&lt;p>I centered the work around four goals. First, I wanted to &lt;strong>ship an end-to-end containerized stack&lt;/strong>—data collection, validation, storage, API, and UI—that someone else could run without digging through my notes. Second, I aimed to &lt;strong>expand coverage&lt;/strong> beyond a handful of regions so the map would be genuinely useful. Third, I needed to &lt;strong>make it reliable&lt;/strong>, with retries, monitoring, and graceful fallbacks so the system could run for weeks without babysitting. Finally, I wanted to &lt;strong>lay the groundwork for a consumption-based signal&lt;/strong>, because imports from neighboring regions also shape a region’s true emissions picture.&lt;/p>
&lt;h2 id="what-i-built">What I built&lt;/h2>
&lt;p>By the end of the program, CarbonCast runs as a &lt;strong>containerized backend + API + web app&lt;/strong> that you can bring up with Docker. The pipelines now reach &lt;strong>85+ regions&lt;/strong>, and the UI currently exposes &lt;strong>58+&lt;/strong> while we finish integrating the rest. The API offers straightforward endpoints for current conditions and multi-day views, plus region metadata so clients can discover what’s available. The UI presents an &lt;strong>interactive choropleth map&lt;/strong> with a side panel for the &lt;strong>energy mix&lt;/strong> and a simple &lt;strong>timeline&lt;/strong> to move between past, now, and the next few days. To keep things feeling snappy, I tuned caching so “now” data updates quickly while historical and forecast views load instantly from cache. I also added a small &lt;strong>“mission control” dashboard&lt;/strong> that shows what updated, what failed, and how the system recovered, which makes maintenance far less mysterious.&lt;/p>
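&lt;p>To make the shape of that API concrete, here is a minimal Python client sketch. The base URL, endpoint path, and query parameter are illustrative assumptions, not CarbonCast&amp;rsquo;s documented interface.&lt;/p>

```python
import requests

# NOTE: the host, port, endpoint path, and "region" parameter below are
# placeholders for illustration, not the project's actual API contract.
BASE_URL = "http://localhost:8000/api"

def intensity_url(endpoint, region):
    """Build a request URL for one region (pure helper, easy to test)."""
    return f"{BASE_URL}/{endpoint}?region={region}"

def get_current_intensity(region):
    """Fetch the current carbon-intensity reading (gCO2e/kWh) for one region."""
    resp = requests.get(intensity_url("carbon-intensity/current", region), timeout=10)
    resp.raise_for_status()
    return resp.json()
```

&lt;p>A client would call &lt;code>get_current_intensity("CISO")&lt;/code> for California, then use the region-metadata endpoint to discover what else is available.&lt;/p>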
&lt;h2 id="how-it-works">How it works&lt;/h2>
&lt;p>Fresh weather and grid data arrive on a regular schedule. The system checks each file for sanity, stores it, and serves it through a clean API. The React app calls that API and paints the map. Hovering reveals regional details; clicking opens a richer panel with the energy mix and trends; the timeline lets you scrub through hours naturally. In short, the path is &lt;strong>fresh data → API → map&lt;/strong>, and each step is designed to be obvious and quick.&lt;/p>
&lt;p>Behind the scenes, I extended the existing Django backend with a &lt;strong>SQLite path&lt;/strong> so the UI works out of the box on a laptop. For production, you can point the same code at Postgres or MySQL without changing the UI. This choice made local testing easy while leaving room for scale later.&lt;/p>
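&lt;p>As a rough illustration of that switch (the setting names and environment variable here are my own placeholders, not the actual CarbonCast configuration), the Django &lt;code>DATABASES&lt;/code> entry can be chosen at startup:&lt;/p>

```python
import os

# Hypothetical sketch: SQLite by default so the UI works out of the box
# on a laptop, Postgres when an environment variable selects it.
def database_config(engine=None):
    engine = engine or os.environ.get("CARBONCAST_DB", "sqlite")
    if engine == "sqlite":
        return {"ENGINE": "django.db.backends.sqlite3", "NAME": "db.sqlite3"}
    if engine == "postgres":
        return {
            "ENGINE": "django.db.backends.postgresql",
            "NAME": os.environ.get("DB_NAME", "carboncast"),
            "HOST": os.environ.get("DB_HOST", "localhost"),
            "PORT": os.environ.get("DB_PORT", "5432"),
        }
    raise ValueError(f"unknown engine: {engine}")
```

&lt;p>Because only the settings module changes, the API and UI code are identical in both environments.&lt;/p>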
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;p>A few moments stand out. The first time the dashboard flipped from red to green on its own—after the system retried through a wave of timeouts—was a turning point. Clicking across the map and getting instant responses because the right data was cached felt great too. And packaging everything so another person can run it without asking me for help might be the biggest quality-of-life win for future contributors.&lt;/p>
&lt;h2 id="challenges">Challenges&lt;/h2>
&lt;p>The first big hurdle was &lt;strong>refactoring the old vanilla-JS interface&lt;/strong>. The original UI worked, but it was dated and hard to extend. I rebuilt it as a modern React + TypeScript app with a cleaner component structure and a fresh look—think &lt;strong>glassmorphic panels&lt;/strong>, readable color scales, and a layout that feels consistent on both laptops and smaller screens. Moving to this design system made the codebase far easier to maintain, theme, and iterate on.&lt;/p>
&lt;p>The next challenge was &lt;strong>performance under real-time load&lt;/strong>. With dozens of regions updating, it was easy to hit API limits and make the UI feel jittery. I solved this by adding a smart &lt;strong>caching layer&lt;/strong> with short, volatility-aware timeouts, request de-duplication, and background prefetching. That combination dramatically reduced round-trips, essentially &lt;strong>eliminated rate-limit hits&lt;/strong>, and made the map feel responsive even as you scrub through time. The result is a UI that can handle many simultaneous updates &lt;strong>without hiccups&lt;/strong>.&lt;/p>
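&lt;p>The TTL part of that caching layer can be sketched in a few lines of Python. The class name and the TTL values are made up for illustration; the de-duplication and prefetching pieces are omitted.&lt;/p>

```python
import time

class VolatilityAwareCache:
    """Minimal sketch of volatility-aware caching: "now" data expires
    quickly, while historical and forecast views are served from cache
    for much longer. TTL values below are illustrative."""

    TTLS = {"current": 60, "forecast": 3600, "history": 86400}  # seconds

    def __init__(self, clock=time.monotonic):
        self._store = {}
        self._clock = clock  # injectable clock makes the cache testable

    def get_or_fetch(self, kind, key, fetch):
        now = self._clock()
        entry = self._store.get((kind, key))
        if entry is not None and self.TTLS[kind] > now - entry[0]:
            return entry[1]              # fresh enough: no round-trip
        value = fetch()                  # stale or missing: fetch once, cache
        self._store[(kind, key)] = (now, value)
        return value
```

&lt;p>Scrubbing the timeline then hits the long-lived &lt;code>history&lt;/code> and &lt;code>forecast&lt;/code> entries, which is what makes the map feel instant.&lt;/p>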
&lt;p>Finally, there were plenty of &lt;strong>stubborn UI bugs&lt;/strong>. Some regions wouldn’t color even when data was available, certain charts refused to render, and a few elements flickered or never showed up. Most of this came down to learning &lt;strong>React state management&lt;/strong> in a real project: taming race conditions, canceling in-flight requests when users navigate, and making sure state only updates when fresh data actually arrives. Fixing those issues taught me a lot about how maps re-paint, how charts expect their data, and how to keep components simple enough that they behave the way users expect.&lt;/p>
&lt;h2 id="what-didnt-make-the-cut-yet">What didn’t make the cut (yet)&lt;/h2>
&lt;p>I designed—but did not finish—&lt;strong>per-region plug-in models&lt;/strong> so each grid can use the approach that fits it best. We decided to ship a stable, deployable service first and reserve that flexibility work for the next phase. The design is written down and ready to build.&lt;/p>
&lt;h2 id="links-and-resources">Links and resources:&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Project page:&lt;/strong> &lt;a href="project/osre25/ucsc/carboncast/">CarbonCast&lt;/a>&lt;/li>
&lt;li>&lt;strong>Proposal:&lt;/strong> &lt;a href="https://ucsc-ospo.github.io/report/osre25/ucsc/carboncast/20250710-tanushsavadi/" target="_blank" rel="noopener">https://ucsc-ospo.github.io/report/osre25/ucsc/carboncast/20250710-tanushsavadi/&lt;/a>&lt;/li>
&lt;li>&lt;strong>Midterm blog:&lt;/strong> &lt;a href="https://ucsc-ospo.github.io/report/osre25/ucsc/carboncast/20250803-tanushsavadi/" target="_blank" rel="noopener">https://ucsc-ospo.github.io/report/osre25/ucsc/carboncast/20250803-tanushsavadi/&lt;/a>&lt;/li>
&lt;li>&lt;strong>Backend/API (branch):&lt;/strong> &lt;a href="https://github.com/carbonfirst/CarbonCast/tree/django_apis_sqlite" target="_blank" rel="noopener">https://github.com/carbonfirst/CarbonCast/tree/django_apis_sqlite&lt;/a>&lt;/li>
&lt;li>&lt;strong>Frontend/UI:&lt;/strong> &lt;a href="https://github.com/carbonfirst/CarbonCastUI/tree/main" target="_blank" rel="noopener">https://github.com/carbonfirst/CarbonCastUI/tree/main&lt;/a>&lt;/li>
&lt;/ul>
&lt;h2 id="whats-next">What’s next&lt;/h2>
&lt;p>My next steps are clear. I want to finish the &lt;strong>per-region model plug-ins&lt;/strong> so grids can bring their own best forecasting logic. I also plan to carry the &lt;strong>consumption-based&lt;/strong> signal end-to-end, including imports and interconnects surfaced directly in the UI. Finally, I’ll harden the system for production by enabling auth and throttling and by moving to a production-grade database where appropriate.&lt;/p>
&lt;h2 id="thank-you">Thank you&lt;/h2>
&lt;p>Huge thanks to &lt;strong>Professor Abel Souza&lt;/strong> for steady mentorship and to the &lt;strong>OSRE&lt;/strong> community for thoughtful feedback. The most rewarding part of this summer was watching a research idea become something people can &lt;strong>click on—and use&lt;/strong> to make cleaner choices.&lt;/p></description></item><item><title>Wrapping Up KALLM</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/ucsc/kallm/20250830-dentonjc/</link><pubDate>Wed, 03 Sep 2025 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/ucsc/kallm/20250830-dentonjc/</guid><description>&lt;p>Large language models today look complicated, but if you peel back the layers, most of what you see is old technology: stacks of linear transformations. The Transformer architecture, the engine behind GPTs and their cousins, is often described as revolutionary. Yet the majority of its parameters are standard linear layers, the same kind of matrix multiplications you would find in a simple multilayer perceptron from the 1980s. For years these layers have gone unchallenged. They are fast, they scale, and they work. But maybe the time has come to ask: &lt;em>can we do better than linear?&lt;/em>&lt;/p>
&lt;p>This project explored exactly that. Instead of leaving those layers untouched, we tried replacing them with a more mathematically structured alternative: &lt;strong>Kolmogorov–Arnold Networks (KANs)&lt;/strong>. The result is a working language model—SmolLM2, a 135-million-parameter Transformer—where the final feedforward blocks no longer consist of brute-force linear weights, but of compact polynomial-based functions. And the striking fact is that performance remained within the baseline range. Smaller KANs managed to match larger linear layers, showing that smarter mathematics can stand shoulder to shoulder with the workhorse of deep learning.&lt;/p>
&lt;h2 id="transformers">Transformers&lt;/h2>
&lt;p>To understand the significance, let’s revisit what a Transformer actually is.&lt;br>
A Transformer block has two main components: attention and feedforward. The attention mechanism computes how each word in a sentence relates to every other word. That is the clever part, and it is what made Transformers famous. But once attention finishes its work, the output is passed into a feedforward network. And this feedforward network is essentially two large linear layers, stacked with a nonlinearity between them.&lt;/p>
&lt;p>Now stacking thirty such blocks yields a complete model like SmolLM2. Look at the parameter counts and you see a pattern: attention is not the main consumer. It’s the feedforward layers. They dominate memory and computation, making them the primary target for efficiency gains.&lt;/p>
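&lt;p>A back-of-the-envelope count shows why. With the common convention that the feedforward width is four times the model width (an assumption for illustration, not SmolLM2&amp;rsquo;s exact configuration), the feedforward layers hold twice as many parameters as attention:&lt;/p>

```python
def block_params(d_model, d_ff):
    """Rough per-block parameter count (biases ignored).
    Attention: four d_model x d_model projections (Q, K, V, output).
    Feedforward: two linear layers of d_model x d_ff each."""
    attention = 4 * d_model * d_model
    feedforward = 2 * d_model * d_ff
    return attention, feedforward

# With d_ff = 4 * d_model, the FFN is exactly 2x the attention parameters.
attn, ffn = block_params(768, 4 * 768)
```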
&lt;h2 id="what-are-kolmogorovarnold-networks">What Are Kolmogorov–Arnold Networks?&lt;/h2>
&lt;p>So what happens if, instead of a giant matrix multiplication, we try something more structured? Enter Kolmogorov–Arnold Networks.&lt;/p>
&lt;p>KANs are built on a mid-20th-century mathematical theorem, due to Kolmogorov and Arnold, which proved that any continuous multivariate function can be represented using only sums and compositions of continuous univariate functions. Instead of mixing all inputs together at once, you treat each input dimension separately, applying a small nonlinear function, and then recombine. The beauty is that these univariate functions can be simple but expressive, like splines or polynomials, and yet, when summed, they approximate very complex mappings.&lt;/p>
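&lt;p>Written out, the Kolmogorov&amp;ndash;Arnold representation theorem states that any continuous function on the unit cube decomposes as&lt;/p>

```latex
f(x_1, \dots, x_n) = \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \varphi_{q,p}(x_p) \right)
```

&lt;p>where every &lt;code>Φ_q&lt;/code> and &lt;code>φ_{q,p}&lt;/code> is a continuous function of a &lt;em>single&lt;/em> variable. KAN layers take this structure as inspiration, parameterizing the univariate functions with learnable bases.&lt;/p>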
&lt;p>Think of a KAN layer as a set of individual univariate modules. Each one takes a single variable, bends it according to a chosen basis (polynomials, splines, etc.), and then all those bent versions are added up to produce the output. The richness of the final function depends on two factors:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Choice of basis&lt;/strong>: You can bend with Chebyshev polynomials, with Legendre polynomials, with B-splines, or with other families.&lt;/li>
&lt;li>&lt;strong>Degree&lt;/strong>: This is how many bends you allow. A degree-1 polynomial is just a line. Degree-2 can capture curves. Higher degrees capture higher-order oscillatory components.&lt;/li>
&lt;/ul>
&lt;p>A Chebyshev polynomial of the second kind, degree 2, is one such basis. Unlike a simple quadratic, it has roots and oscillations that make it particularly good at spanning function space efficiently. This efficiency explains its favorable performance in our experiments: low degree means fewer parameters, but Chebyshev’s properties let it approximate more than you might expect from so few numbers.&lt;/p>
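&lt;p>Both ingredients fit in a short NumPy sketch: the Chebyshev-U basis built from its three-term recurrence, and a toy KAN-style layer that bends each input through that basis and sums the results. This is a simplified illustration of the idea, not the project&amp;rsquo;s actual layer implementation.&lt;/p>

```python
import numpy as np

def chebyshev_u(degree, x):
    """Chebyshev polynomials of the second kind via the recurrence
    U0 = 1, U1 = 2x, U_k = 2x * U_{k-1} - U_{k-2}."""
    u_prev, u = np.ones_like(x), 2 * x
    if degree == 0:
        return u_prev
    for _ in range(degree - 1):
        u_prev, u = u, 2 * x * u - u_prev
    return u

def kan_layer(x, coeffs):
    """Toy KAN-style layer: each input dimension passes through the
    degree 0..2 Chebyshev-U basis, and the bent inputs are summed per
    output unit. coeffs has shape (d_in, d_out, 3): one learnable
    weight per (input, output, basis function) triple."""
    basis = np.stack([chebyshev_u(k, x) for k in range(3)], axis=-1)  # (n, d_in, 3)
    return np.einsum("nik,iok->no", basis, coeffs)
```

&lt;p>Note the degree-2 polynomial is &lt;code>U_2(x) = 4x² − 1&lt;/code>: already curved and sign-changing, which is the expressivity the paragraph above refers to.&lt;/p>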
&lt;h2 id="why-small-can-beat-big">Why Small Can Beat Big&lt;/h2>
&lt;p>Linear layers require many parameters because they treat every input–output mapping as arbitrary. KANs assume smoothness: each input passes through a compact polynomial basis before recombination. This structure captures useful patterns with fewer parameters.&lt;/p>
&lt;p>A degree-2 Chebyshev basis, for example, encodes curvature and oscillation efficiently. While a linear layer of the same size must spend parameters to approximate these effects, the polynomial basis includes them inherently. The result is comparable expressivity with fewer parameters. In language tasks where patterns are often smooth or compositional, this structured efficiency translates into competitive accuracy at lower cost.&lt;/p>
&lt;h2 id="baselines-modifications-and-comparisons">Baselines, Modifications, and Comparisons&lt;/h2>
&lt;p>Here’s what we actually tested, in plain language:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>The untouched baseline&lt;/strong>: a pretrained SmolLM2, with all thirty blocks intact.&lt;/li>
&lt;li>&lt;strong>Linear restart&lt;/strong>: the same pretrained model, but the last five feedforward modules were thrown away and replaced with freshly initialized linear ones. These then had to be trained again.&lt;/li>
&lt;li>&lt;strong>KAN replacement&lt;/strong>: again, take the pretrained model, cut off the last five feedforward modules, and put in new KAN modules instead—specifically, Chebyshev of the second kind, degree 2.&lt;/li>
&lt;/ol>
&lt;p>In all three cases, the backbone of the model—the embeddings, the attention layers, and the first twenty-five blocks—was left untouched. Only the tail was modified. This design allowed us to test transfer learning: would the pretrained parts of the model still play nicely with the new pieces? The answer is yes. The attention layers and other linear projections adapted seamlessly, proving that KANs can be swapped in without destabilizing the whole system.&lt;/p>
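&lt;p>The surgery itself is mechanically simple. A hypothetical sketch (the class and attribute names are placeholders, not the real codebase) of keeping the first twenty-five blocks frozen and swapping the feedforward modules in the tail:&lt;/p>

```python
# Placeholder block object standing in for a real Transformer block.
class Block:
    def __init__(self, idx):
        self.idx = idx
        self.attention = "pretrained-attention"
        self.feedforward = "pretrained-ffn"
        self.trainable = False  # frozen by default

def retrofit(blocks, make_ffn, n_tail=5):
    """Replace the feedforward module in the last n_tail blocks with a
    freshly initialized one and mark only those blocks trainable."""
    for block in blocks[-n_tail:]:
        block.feedforward = make_ffn()  # fresh KAN (or linear) module
        block.trainable = True          # only the tail gets gradients
    return blocks

model = retrofit([Block(i) for i in range(30)],
                 make_ffn=lambda: "kan-cheby2-deg2")
```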
&lt;p>Training was done on the &lt;strong>smol-smoltalk&lt;/strong> dataset, a small-scale dialogue corpus used for both pretraining and fine-tuning. After training, all models were evaluated on the same subset of &lt;strong>BIG-Bench Hard&lt;/strong> tasks.&lt;/p>
&lt;!--
## How Evaluation Works
The evaluation was not a casual check; it mirrored the structure of the Open LLM Leaderboard, but focused only on the core tasks relevant to dialogue and reasoning (since our training data did not include STEM problems, for example). Each task consists of prompts with clear expected answers: multiple choice, classification, or reasoning questions.
To evaluate, each model is placed in a simulated conversation. The prompt is fed in, the model generates a response, and that response is parsed. If the task is multiple choice, the evaluation extracts which option the model chose, checking against the gold label. If the task is free-form, the answer is normalized—lowercased, punctuation stripped, articles removed—so that small differences in formatting do not count as mistakes. The comparison is then exact. Each task contributes an accuracy score, and the final performance is the average over all selected tasks.
This method ensures consistency. Models cannot bluff with verbose text. They must produce the right short answer. That makes the results directly comparable, and it stresses the reasoning capabilities rather than the stylistic flourishes of the model.
-->
&lt;h2 id="results">Results&lt;/h2>
&lt;p>The baseline was the pretrained SmolLM2 without modification. It achieved an average accuracy of 22.5%, using 134M parameters. This experiment has a single measurement because no training was applied. The remaining experiments were each run with 3 random seeds.&lt;/p>
&lt;p>When retrained with linear replacements, the model reached an average accuracy of 43.8%, with 46M trainable parameters (only last 5 blocks are active) and 5.87 GB VRAM total usage.&lt;/p>
&lt;p>Replacing the last five feedforward blocks with Kolmogorov–Arnold Networks produced an average accuracy of 44.1%, with 39M parameters and 5.86 GB VRAM usage. The memory consumption of KAN layers remains a target for further optimization.&lt;/p>
&lt;p>In short, KANs matched or slightly exceeded the reinitialized linear baseline in accuracy, while using fewer parameters and slightly less memory. This demonstrates that structured polynomial layers can substitute for large linear layers without degrading reasoning performance.&lt;/p>
&lt;h2 id="why-transfer-learning-works-so-well">Why Transfer Learning Works So Well&lt;/h2>
&lt;p>One of the surprising outcomes is how cleanly the pretrained Transformer integrates with KANs. Remember: only the feedforward modules in the last five blocks were replaced. All the other linear layers—embedding projections, attention queries, keys, and values, output heads—remained untouched. They continue to function as before. The new KAN blocks slot right in, adapt during training, and the system as a whole behaves coherently.&lt;/p>
&lt;p>That tells us something important. The standard Transformer does not depend on linearity per se in those positions. What it depends on is a nonlinear transformation with enough expressive power. KANs provide that power, just in a different mathematical form. Which means: any pretrained Transformer can, in principle, be retrofitted with KANs in the feedforward slots, with no need to start from scratch.&lt;/p>
&lt;h2 id="looking-ahead-mixing-polynomial-bases">Looking Ahead: Mixing Polynomial Bases&lt;/h2>
&lt;p>So far we only tested one family, Chebyshev-2. But the architecture is more general. Each KAN block can in fact host multiple polynomial families in &lt;strong>parallel&lt;/strong>, or stack them in &lt;strong>sequence&lt;/strong>.&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Parallel&lt;/strong>: imagine splitting the input across several channels, each processed by a different basis. The outputs are then recombined. This way, one basis covers the smooth global structure, while another captures edge effects or oscillations.&lt;/li>
&lt;li>&lt;strong>Sequential&lt;/strong>: here, the output of one polynomial transformation becomes the input of another. You can think of it as layering function approximations, where the second basis corrects the limitations of the first. For example, a spline might give you piecewise smoothness, then a Chebyshev layer on top could adjust the global shape.&lt;/li>
&lt;/ul>
&lt;p>Both strategies were implemented and promise to extract more expressivity per parameter. Instead of simply making the networks bigger, we can make them &lt;em>smarter&lt;/em>, combining the strengths of different mathematical families. That will be the focus of future work.&lt;/p>
&lt;h2 id="conclusion">Conclusion&lt;/h2>
&lt;p>The main lesson is this: language models do not need to be built entirely from massive linear matrices. By replacing just a handful of those matrices with compact Kolmogorov–Arnold modules, we achieved the same reasoning accuracy with fewer parameters and less memory. Transfer learning works cleanly. The architecture adapts. And the door is now open to rethink what belongs inside a Transformer block.&lt;/p>
&lt;p>KANs are not just a theoretical curiosity. They are practical, efficient, and compatible with modern large language models. This project showed that replacing linear with polynomial is not only possible, it is competitive. The next step is to push combinations, explore scaling, and see just how far this mathematical alternative can take us.&lt;/p>
&lt;!--
## Bonus 1: Testing KANs as LoRA Replacement (Negative result)
We wanted to test if a degree-2 Chebyshev KAN could replace LoRA’s A and B or act as a nonlinear adapter and still match or beat LoRA at the same or smaller rank. In LoRA, A and B are the two small matrices that make the low-rank update: A maps input features down to a rank-r space, B maps that rank-r space back to output features, and their product BA is a constant ΔW that you add to the frozen weight and can merge at inference. We compared five variants at rank 2 and tracked best accuracy: standard LoRA on the final layer (lora_fc 32.03%), a mergeable KAN that generates A and B from Chebyshev bases over indices (11.69%), a linear low-rank conv adapter (81.86%), the same conv adapter with a SiLU in between (84.80%), and a KAN conv adapter (10.76%). The mergeable KAN lost to standard LoRA, and the KAN conv adapter lagged well behind both the linear and the SiLU controls. Conclusion: in this setup KAN did not deliver LoRA-level accuracy at smaller rank.
## Bonus 2 KAN Layers with LoRA for Domain Transfer (Negative result)
We tested if KANs (Cheby2, degree-2) adapt with LoRA as well as a plain linear MLP when moving from CIFAR-10 to CIFAR-100. We first trained both models on CIFAR-10, then froze them, added LoRA (rank 8) on the hidden layers and a new classifier, and matched the number of trainable parameters (~96k) for a fair test. Both models improved on CIFAR-100, but the linear MLP learned faster and reached higher accuracy (23.10% vs 18.02% for the KAN). So, in this setup LoRA is less effective for KAN layers than for linear layers.
--></description></item><item><title>Midterm blog: CarbonCast Midpoint Update: From Vision to Reality</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/ucsc/carboncast/20250803-tanushsavadi/</link><pubDate>Sun, 03 Aug 2025 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/ucsc/carboncast/20250803-tanushsavadi/</guid><description>&lt;p>A few months ago, I shared my vision for making carbon intensity forecasts more accessible through the &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/project/osre25/ucsc/carboncast">CarbonCast project&lt;/a>. My &lt;a href="https://summerofcode.withgoogle.com/programs/2025/projects/7yvAix3k" target="_blank" rel="noopener">proposal&lt;/a> under the mentorship of Professor Abel Souza aims to build an API that makes carbon intensity forecasts more accessible and actionable. I had two main goals: expand CarbonCast to work with more regional electricity grids, and transform it from a research project into something that could actually run and be interacted with in the real world.&lt;/p>
&lt;p>Today, I&amp;rsquo;m excited to share that we&amp;rsquo;ve not only hit those goals – we&amp;rsquo;ve exceeded them in ways I didn&amp;rsquo;t expect.&lt;/p>
&lt;h2 id="what-weve-built-so-far">What We&amp;rsquo;ve Built So Far&lt;/h2>
&lt;p>Remember how I mentioned that CarbonCast needed to support more regional grids? Well, we&amp;rsquo;ve gone big. The system now covers 85+ regions across two continents. We&amp;rsquo;re talking about major US grid operators like ERCOT (Texas), CISO (California), PJM (Mid-Atlantic), MISO (Midwest), and NYISO (New York), plus we&amp;rsquo;ve expanded into European countries like Germany, France, Spain, and the UK.&lt;/p>
&lt;p>But here&amp;rsquo;s the thing – collecting weather data for carbon intensity forecasting isn&amp;rsquo;t as simple as just downloading a few files. Each region needs four different types of weather data: solar radiation (for solar power predictions), wind patterns (for wind power), temperature and humidity (for energy demand), and precipitation (which affects both supply and demand). That means we&amp;rsquo;re managing data collection for over 340 different combinations of regions and weather variables.&lt;/p>
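&lt;p>The bookkeeping behind that number is a cross product: every region needs every weather variable. In Python (region names here are placeholders for the real grid identifiers):&lt;/p>

```python
# Enumerate the full region x variable cross product the collector manages.
regions = [f"region-{i:02d}" for i in range(85)]  # placeholder names; real list has 85+ grids
variables = ["solar_radiation", "wind", "temperature_humidity", "precipitation"]
tasks = [(region, variable) for region in regions for variable in variables]
print(len(tasks))  # 340 region/variable combinations
```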
&lt;h2 id="the-automation-challenge">The Automation Challenge&lt;/h2>
&lt;p>When I started this project, I quickly realized that manually managing data collection for this many regions would be impossible. We&amp;rsquo;re talking about thousands of data requests, each taking time to process, with various things that can go wrong along the way.&lt;/p>
&lt;p>So we built something I&amp;rsquo;m really proud of: an intelligent automation system that handles 95% of the work without human intervention. That means 19 out of every 20 data collection tasks happen automatically, even when things go wrong.&lt;/p>
&lt;p>The system is smart about it too. It knows when to speed up data collection, when to slow down to avoid overwhelming the servers, and how to recover when errors happen. We&amp;rsquo;ve achieved 99% data completeness, which means almost every piece of weather data we need actually makes it into our system successfully.&lt;/p>
&lt;h2 id="making-it-production-ready">Making It Production-Ready&lt;/h2>
&lt;p>The biggest challenge was taking CarbonCast from a research project that worked on my laptop to something that could run reliably for weeks without me babysitting it. This meant building in all the boring but crucial stuff that makes software actually work in the real world.&lt;/p>
&lt;p>We created a comprehensive error handling system that can automatically recover from 95% of the problems it encounters. Network hiccups, server timeouts, data format changes – the system handles these gracefully and keeps running.&lt;/p>
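&lt;p>The core of that recovery behavior is a retry loop with exponential backoff. A generic sketch of the pattern (attempt counts, delays, and exception types are illustrative, not CarbonCast&amp;rsquo;s actual tuning):&lt;/p>

```python
import time

def fetch_with_retry(fetch, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Retry transient failures with exponential backoff instead of
    crashing the pipeline. Parameters here are illustrative."""
    for attempt in range(attempts):
        try:
            return fetch()
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                raise                         # out of retries: surface the error
            sleep(base_delay * 2 ** attempt)  # back off: 1s, 2s, 4s, ...
```

&lt;p>Network hiccups and server timeouts are absorbed by the loop; only persistent failures bubble up to the monitoring dashboard for human attention.&lt;/p>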
&lt;p>There&amp;rsquo;s also a real-time monitoring dashboard that shows exactly what&amp;rsquo;s happening across all regions. I can see which areas are collecting data successfully, which ones might be having issues, and get alerts if anything needs attention. It&amp;rsquo;s like having a mission control center for carbon data.&lt;/p>
&lt;h2 id="the-dashboard-mission-control-for-carbon-data">The Dashboard: Mission Control for Carbon Data&lt;/h2>
&lt;p>Let me show you what this monitoring system actually looks like. We built a comprehensive web dashboard that gives us real-time visibility into everything that&amp;rsquo;s happening:&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Dashboard Overview" srcset="
/report/osre25/ucsc/carboncast/20250803-tanushsavadi/dashboard-overview_hucf78c7d7b58d9515d431a2744915c5c5_523170_def2a560c75da61de5422b7a6a6dbc38.webp 400w,
/report/osre25/ucsc/carboncast/20250803-tanushsavadi/dashboard-overview_hucf78c7d7b58d9515d431a2744915c5c5_523170_5fcfb689e6283d1720e50da81cfb540f.webp 760w,
/report/osre25/ucsc/carboncast/20250803-tanushsavadi/dashboard-overview_hucf78c7d7b58d9515d431a2744915c5c5_523170_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/ucsc/carboncast/20250803-tanushsavadi/dashboard-overview_hucf78c7d7b58d9515d431a2744915c5c5_523170_def2a560c75da61de5422b7a6a6dbc38.webp"
width="760"
height="456"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;em>The main dashboard showing real-time system metrics and status across all regions&lt;/em>&lt;/p>
&lt;p>The dashboard shows key metrics at a glance – total requests, completion rates, and active regions. But it goes much deeper than that. You can drill down into individual requests to see their complete lifecycle:&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Request Details" srcset="
/report/osre25/ucsc/carboncast/20250803-tanushsavadi/dashboard-requests_hu4620d0cbd193aecbbe0c5858e2ba9128_195009_876f419901e0b51127b81f1f37bf33f6.webp 400w,
/report/osre25/ucsc/carboncast/20250803-tanushsavadi/dashboard-requests_hu4620d0cbd193aecbbe0c5858e2ba9128_195009_3ae7b2ac3a29478b49913635f43aac19.webp 760w,
/report/osre25/ucsc/carboncast/20250803-tanushsavadi/dashboard-requests_hu4620d0cbd193aecbbe0c5858e2ba9128_195009_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/ucsc/carboncast/20250803-tanushsavadi/dashboard-requests_hu4620d0cbd193aecbbe0c5858e2ba9128_195009_876f419901e0b51127b81f1f37bf33f6.webp"
width="760"
height="458"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;em>Detailed view of individual data requests showing processing timelines and status&lt;/em>&lt;/p>
&lt;p>Each request card shows everything from the initial request time to when the data becomes available for download. This level of visibility is crucial when you&amp;rsquo;re managing hundreds of data requests across different regions and weather variables.&lt;/p>
&lt;p>The regional analytics view shows how well we&amp;rsquo;re doing across different grid operators:&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Regional Analytics" srcset="
/report/osre25/ucsc/carboncast/20250803-tanushsavadi/dashboard-regions_hubaa80dcd4d7309dd18fca00b148c0f0f_628115_913b55e9f6633983aaaaf25607ac13bf.webp 400w,
/report/osre25/ucsc/carboncast/20250803-tanushsavadi/dashboard-regions_hubaa80dcd4d7309dd18fca00b148c0f0f_628115_950373fdefaf9bd595da010d29c37849.webp 760w,
/report/osre25/ucsc/carboncast/20250803-tanushsavadi/dashboard-regions_hubaa80dcd4d7309dd18fca00b148c0f0f_628115_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/ucsc/carboncast/20250803-tanushsavadi/dashboard-regions_hubaa80dcd4d7309dd18fca00b148c0f0f_628115_913b55e9f6633983aaaaf25607ac13bf.webp"
width="760"
height="445"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;em>Regional breakdown showing completion status across different electricity grid operators&lt;/em>&lt;/p>
&lt;p>What I&amp;rsquo;m particularly proud of is the error handling dashboard. When things do go wrong (which they inevitably do with any large-scale data system), we can see exactly what happened and how the system recovered:&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Error Management" srcset="
/report/osre25/ucsc/carboncast/20250803-tanushsavadi/dashboard-errors_hua5a5c30a5cd8b72a26622f5af77b2406_480389_2864a9c4d56dcc6220d2fe406daddc17.webp 400w,
/report/osre25/ucsc/carboncast/20250803-tanushsavadi/dashboard-errors_hua5a5c30a5cd8b72a26622f5af77b2406_480389_ca1e5cbfdb24da4f1e6531c7be2eed54.webp 760w,
/report/osre25/ucsc/carboncast/20250803-tanushsavadi/dashboard-errors_hua5a5c30a5cd8b72a26622f5af77b2406_480389_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/ucsc/carboncast/20250803-tanushsavadi/dashboard-errors_hua5a5c30a5cd8b72a26622f5af77b2406_480389_2864a9c4d56dcc6220d2fe406daddc17.webp"
width="760"
height="254"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;em>Error tracking and resolution system showing 100% success rate in region mapping&lt;/em>&lt;/p>
&lt;p>The fact that we&amp;rsquo;re showing &amp;ldquo;No unknown regions found&amp;rdquo; means our coordinate-based region detection is working as intended: every weather data request gets mapped to the correct electricity grid.&lt;/p>
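&lt;p>To make that concrete, here is a minimal sketch of the coordinate-based mapping idea. The region codes, bounding boxes, and function name are illustrative assumptions, not CarbonCast’s actual region definitions, which are more precise than simple boxes:&lt;/p>

```python
# Hypothetical sketch: map a weather request's (lat, lon) to a grid region.
# The bounding boxes below are rough illustrations, not real service areas.
REGION_BOXES = {
    "CISO": (32.0, 42.0, -125.0, -114.0),   # California ISO (approx.)
    "ERCOT": (25.8, 36.5, -106.7, -93.5),   # Texas grid (approx.)
}

def detect_region(lat: float, lon: float) -> str:
    """Return the first region whose bounding box contains the point."""
    for region, (south, north, west, east) in REGION_BOXES.items():
        if north >= lat >= south and east >= lon >= west:
            return region
    return "UNKNOWN"  # surfaced on the error dashboard rather than dropped

print(detect_region(37.0, -122.0))  # CISO
print(detect_region(51.5, -0.1))    # UNKNOWN
```

&lt;p>A production version would test against the balancing authorities’ actual territory polygons; the point is that every request resolves to a region or is explicitly flagged.&lt;/p>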
&lt;h2 id="the-technical-foundation">The Technical Foundation&lt;/h2>
&lt;p>Under the hood, we&amp;rsquo;ve built what I&amp;rsquo;d call enterprise-grade infrastructure. The system can run autonomously for weeks, automatically organizing data by region and weather type, managing storage efficiently, and even optimizing its own performance based on what it learns.&lt;/p>
&lt;p>We&amp;rsquo;ve also created comprehensive testing systems to make sure everything works reliably. When you&amp;rsquo;re dealing with data that people might use to make real decisions about when to charge their electric vehicles or run their data centers, reliability isn&amp;rsquo;t optional.&lt;/p>
&lt;p>The architecture follows a modular, service-oriented design with clear separation between data collection, processing, monitoring, and user interfaces. This makes it much easier to maintain and extend as we add new features.&lt;/p>
&lt;h2 id="why-this-matters">Why This Matters&lt;/h2>
&lt;p>All of this infrastructure work might sound technical, but it&amp;rsquo;s directly connected to the original vision: making carbon intensity forecasts accessible to everyone.&lt;/p>
&lt;p>With this foundation in place, we can now provide reliable, up-to-date weather data for carbon intensity forecasting across major electricity grids in North America and Europe. That means developers building carbon-aware applications, companies trying to reduce their emissions, and individuals wanting to time their energy use for lower environmental impact all have access to the data they need.&lt;/p>
&lt;h2 id="whats-next-breaking-down-carboncast">What&amp;rsquo;s Next: Breaking Down CarbonCast&lt;/h2>
&lt;p>The next phase is where things get really exciting. Now that we have this solid data collection foundation, we&amp;rsquo;re going to break down CarbonCast itself into modular components. This will make it easier for developers to integrate carbon intensity forecasting into their own applications, whether that&amp;rsquo;s a smart home system, a cloud computing platform, or a mobile app that helps people make greener energy choices.&lt;/p>
&lt;h2 id="looking-back">Looking Back&lt;/h2>
&lt;p>When I started this project, I knew we needed better infrastructure for carbon data. What I didn&amp;rsquo;t expect was how much we&amp;rsquo;d end up building – or how well it would work. We&amp;rsquo;ve created something that can reliably collect and organize weather data across two continents, handle errors gracefully, and run without constant supervision.&lt;/p>
&lt;p>More importantly, we&amp;rsquo;ve built the foundation that will make it possible for anyone to access accurate carbon intensity forecasts. Whether you&amp;rsquo;re a developer building the next generation of carbon-aware applications or someone who just wants to know the best time to do laundry to minimize your environmental impact, the infrastructure is now there to support those decisions.&lt;/p>
&lt;p>The vision of making carbon data accessible and actionable is becoming reality, one automated data collection at a time.&lt;/p>
&lt;h2 id="impact-beyond-research">Impact Beyond Research&lt;/h2>
&lt;p>This work builds directly on the foundation of &lt;em>Multi-day Forecasting of Electric Grid Carbon Intensity using Machine Learning&lt;/em>, transforming research into practical, real-world infrastructure. We&amp;rsquo;re not just making carbon intensity forecasts more accurate – we&amp;rsquo;re making them accessible to everyone who wants to reduce their environmental impact.&lt;/p>
&lt;p>The open-source nature of CarbonCast means that anyone can run, contribute to, and benefit from this work. Whether you&amp;rsquo;re a developer building carbon-aware applications, a policymaker working on grid decarbonization strategies, or a sustainability-conscious individual looking to reduce your carbon footprint, the tools are now there to make informed, impactful choices.&lt;/p>
&lt;p>Looking ahead, I&amp;rsquo;m excited to see how this infrastructure will enable the next generation of carbon-aware computing and smart energy decisions.&lt;/p></description></item><item><title>Midterm Report: KAN Integration into LLMs</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/ucsc/kallm/20250718-dentonjc/</link><pubDate>Fri, 18 Jul 2025 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/ucsc/kallm/20250718-dentonjc/</guid><description>&lt;p>Imagine if we could make neural networks that are not just more efficient, but smarter in how they learn. That’s the promise behind &lt;strong>Kolmogorov–Arnold Networks&lt;/strong> (KANs)—a fascinating new architecture that replaces the usual &amp;ldquo;weighted sums and activation functions&amp;rdquo; with more mathematical finesse. Instead of processing all inputs in one big lump, KANs treat each input dimension individually, transforming them with elegant functions like B-splines or simpler polynomials. The idea is simple but powerful: do more with less.&lt;/p>
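&lt;p>The per-dimension idea can be sketched in a few lines. This toy layer (an illustration, not the project’s code) squashes each input into [-1, 1], expands it in a small Chebyshev basis, and mixes the results with learned coefficients:&lt;/p>

```python
import numpy as np

def chebyshev_basis(x, degree=2):
    """Stack T_0..T_degree of the Chebyshev recurrence, element-wise on x."""
    T = [np.ones_like(x), x]
    for _ in range(2, degree + 1):
        T.append(2 * x * T[-1] - T[-2])  # T_n = 2x*T_{n-1} - T_{n-2}
    return np.stack(T, axis=-1)          # shape (..., in_dim, degree + 1)

class KANLayer:
    """One learned coefficient per (input dim, basis function, output dim)."""
    def __init__(self, in_dim, out_dim, degree=2, seed=0):
        rng = np.random.default_rng(seed)
        self.degree = degree
        self.coef = rng.normal(0.0, 0.1, (in_dim, degree + 1, out_dim))

    def __call__(self, x):
        # Univariate transforms per input dim, summed over dims -- the
        # Kolmogorov-Arnold flavor of "each input handled individually".
        basis = chebyshev_basis(np.tanh(x), self.degree)  # tanh keeps x in [-1, 1]
        return np.einsum("bik,iko->bo", basis, self.coef)

layer = KANLayer(in_dim=4, out_dim=3)
out = layer(np.random.default_rng(1).normal(size=(2, 4)))
print(out.shape)  # (2, 3)
```

&lt;p>Real KAN variants differ in the basis (B-splines, RBFs, various orthogonal polynomials) and in initialization, but the shape of the computation is the same.&lt;/p>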
&lt;p>For my project, I set out to explore what happens when we integrate these KAN layers into a lightweight language model called &lt;strong>SmolLM2&lt;/strong>, training and testing it on the &lt;strong>smol-smoltalk&lt;/strong> dataset.&lt;/p>
&lt;h2 id="setting-the-stage-smollm2-meets-kan">Setting the Stage: SmolLM2 Meets KAN&lt;/h2>
&lt;p>The original SmolLM2 has 135 million parameters and 30 transformer blocks—plenty of moving parts. To keep things manageable during the initial phase, I created a mini version of the model with just 3 blocks and a trimmed-down vocabulary. This setup let me test dozens of KAN variations quickly, using a simple text classification task (AGNews) as a playground before moving on to full-scale language modeling.&lt;/p>
&lt;p>After prototyping on the simplified model, I went on to successfully train a full 30-block KAN-based SmolLM2. That model held up on challenging language benchmarks, matching the performance of the original, linear-layer version. That&amp;rsquo;s a big win.&lt;/p>
&lt;h2 id="what-worked-and-what-didnt">What Worked (and What Didn&amp;rsquo;t)&lt;/h2>
&lt;p>Along the way, I tried out a variety of KAN flavors: spline-based, radial basis functions (RBF), rational functions, and no fewer than eight types of orthogonal polynomials—like Chebyshev, Legendre, and Hermite. Each one brings its own quirks, strengths, and training times.&lt;/p>
&lt;p>Some key takeaways:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Chebyshev (second kind)&lt;/strong> with a low polynomial degree (just 2!) delivered the best speed/accuracy trade-off.&lt;/li>
&lt;li>&lt;strong>Jacobi&lt;/strong> and &lt;strong>Gegenbauer&lt;/strong> polynomials edged slightly ahead in raw accuracy but required much longer training times.&lt;/li>
&lt;li>Replacing each linear layer with a KAN version (keeping parameter count similar) worked fine—but layering them in parallel or sequence didn’t add much.&lt;/li>
&lt;li>A baseline with regular linear layers still performed slightly better (60.8% vs. 60.3%), but KANs showed they can come close with room for optimization.&lt;/li>
&lt;/ul>
&lt;h2 id="why-this-matters">Why This Matters&lt;/h2>
&lt;p>What’s compelling is not just that KANs &lt;em>can&lt;/em> work, but that they bring some appealing properties:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Parameter efficiency&lt;/strong>: Good performance with fewer or similarly-sized layers.&lt;/li>
&lt;li>&lt;strong>Flexibility&lt;/strong>: They adapt well to existing hyperparameters—less fine-tuning needed.&lt;/li>
&lt;li>&lt;strong>Stability&lt;/strong>: They run smoothly in fp16 (a lower-precision format), which is critical for efficient training.&lt;/li>
&lt;li>&lt;strong>Potential for richer activations&lt;/strong>: Some existing projects still rely on activations like ReLU or SiLU alongside KANs. But I found KANs alone could learn well without them, opening up more dynamic architectures in the future.&lt;/li>
&lt;/ul>
&lt;h2 id="whats-next">What&amp;rsquo;s Next&lt;/h2>
&lt;p>With the heavy lifting done (code written, models trained, ideas tested), the remainder of the project is focused on refinement: more training on generative tasks, better tuning of polynomial degrees, smarter initialization strategies, and potentially making KAN-based layers more plug-and-play.&lt;/p>
&lt;p>The fact that a fully KAN-powered SmolLM2 can hold its own on tough language benchmarks is more than just a proof of concept. It’s a hint that we might not have to keep scaling models indefinitely to get better performance. Instead, we can get more from each parameter, by changing how the model &lt;em>thinks&lt;/em>.&lt;/p></description></item><item><title>CarbonCast</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/ucsc/carboncast/20250710-tanushsavadi/</link><pubDate>Thu, 10 Jul 2025 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/ucsc/carboncast/20250710-tanushsavadi/</guid><description>&lt;p>As part of the &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/project/osre25/ucsc/carboncast">CarbonCast project&lt;/a>, my &lt;a href="https://summerofcode.withgoogle.com/programs/2025/projects/7yvAix3k" target="_blank" rel="noopener">proposal&lt;/a> under the mentorship of Professor Abel Souza aims to build an API that makes carbon intensity forecasts more accessible and actionable.&lt;/p>
&lt;p>The proposal centers on building upon CarbonCast to create an API that lets users access and use energy data to optimize their electricity consumption. Before diving into the details of the project, I’d like to share a bit about my background.&lt;/p>
&lt;h2 id="about-me">About Me&lt;/h2>
&lt;p>Hi, I’m Tanush—a rising senior at the University of Massachusetts Amherst, majoring in Computer Science and Mathematics and graduating in Spring 2026. Currently, I’m an AI Intern for the Commonwealth of Massachusetts Department of Unemployment Assistance, where I’m developing an end-to-end retrieval-augmented generation (RAG) chatbot on AWS.&lt;/p>
&lt;p>In the past, I’ve contributed to CarbonCast in a different capacity, designing a user interface to help visualize carbon intensity forecasts. I also worked at MathWorks as a Machine Learning Intern, where I collaborated in an Agile environment to design and deploy predictive models that improved precision torque control and dynamic responsiveness in motor-driven robotic and industrial systems.&lt;/p>
&lt;p>I’m excited to bring these experiences to this year’s GSoC project, where I’ll be building tools to make carbon data more accessible and actionable for everyone.&lt;/p>
&lt;h2 id="what-is-carboncast">What is CarbonCast?&lt;/h2>
&lt;p>CarbonCast is a Python-based machine-learning library designed to forecast the carbon intensity of electrical grids. Carbon intensity refers to the amount of carbon emitted per kilowatt-hour (kWh) of electricity consumed. The current version of CarbonCast delivers accurate forecasts in numerous regions by using a region’s historical energy production data, the time of day and year, and weather forecasts as features.&lt;/p>
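&lt;p>As a back-of-the-envelope illustration of the quantity being forecast, average carbon intensity is a generation-weighted mean of per-source emission rates. The factors below are rough lifecycle estimates chosen for illustration; they are not the values CarbonCast itself uses:&lt;/p>

```python
# Rough lifecycle emission factors in gCO2eq/kWh (illustrative only).
EMISSION_FACTORS = {"coal": 820, "gas": 490, "solar": 41, "wind": 11, "hydro": 24}

def carbon_intensity(mix_mwh):
    """Generation-weighted average intensity of a mix, in gCO2eq/kWh."""
    total = sum(mix_mwh.values())
    return sum(EMISSION_FACTORS[src] * mwh for src, mwh in mix_mwh.items()) / total

# A gas-heavy evening mix versus a solar-heavy midday mix:
print(round(carbon_intensity({"gas": 600, "coal": 200, "wind": 200}), 1))   # 460.2
print(round(carbon_intensity({"solar": 500, "gas": 300, "wind": 200}), 1))  # 169.7
```

&lt;p>The gap between the two mixes is exactly what a forecast lets users exploit: shift flexible load toward the cleaner hours.&lt;/p>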
&lt;p>However, there is no easy way to access, visualize, and use the data through a standard interface, and important information is missing. For instance, grids often import electricity from neighboring regions, so the carbon content of what a region actually consumes depends on both local generation and imports. Each energy source also benefits from a tailored predictive model. Consequently, any carbon optimization solution trying to cut the emissions of its electricity use will benefit more from following a consumption-based carbon intensity signal.&lt;/p>
&lt;p>Unlike other third-party carbon services, CarbonCast’s model is open-sourced, allowing users to study, understand, and improve its behavior. This transparency invites public collaboration and innovation. It also contrasts sharply with proprietary services that often withhold both the logic behind their models and the data they are trained on.&lt;/p>
&lt;h2 id="why-this-matters">Why This Matters&lt;/h2>
&lt;p>Electricity usage is one of the largest contributors to carbon emissions globally. Carbon intensity—the amount of carbon emitted per kilowatt-hour of electricity consumed—varies with how electricity is generated (for example, coal versus solar) and with demand. With better visibility into when the grid is cleaner, individuals and organizations can shift their energy consumption to lower-carbon, and often cheaper, periods. This enables everyday energy optimizations without compromising comfort or productivity.&lt;/p>
&lt;p>By improving CarbonCast’s accessibility and functionality, we are helping people and institutions answer questions like:&lt;/p>
&lt;ul>
&lt;li>When is the best time to charge my EV to reduce environmental impact?&lt;/li>
&lt;li>Can I run my energy-hungry server jobs when the electricity is cheaper?&lt;/li>
&lt;li>How do I actually reduce my emissions without guessing?&lt;/li>
&lt;/ul>
&lt;p>By providing clear, accurate forecasts of carbon intensity, CarbonCast can help users make informed decisions to optimize their energy footprint and reduce emissions without sacrificing convenience or productivity.&lt;/p>
&lt;h2 id="what-im-building">What I’m Building&lt;/h2>
&lt;p>The plan for this summer is to develop the backend API services for CarbonCast, focusing on two major goals:&lt;/p>
&lt;h3 id="geographical-expansion">Geographical Expansion&lt;/h3>
&lt;p>I am extending CarbonCast’s compatibility to support more regional electricity grids. Each model will be customized for local grid behavior and renewable energy characteristics. This involves tuning the model pipeline to adapt to each region’s energy mix, weather patterns, and reporting granularity.&lt;/p>
&lt;h3 id="system-refactoring-and-modularity">System Refactoring and Modularity&lt;/h3>
&lt;p>The original CarbonCast system was built as a research artifact. To refine it into production-grade infrastructure, I am refactoring the codebase to improve modularity. This makes it easier to plug in new regions, update forecasting algorithms, and integrate new data sources.&lt;/p>
&lt;h2 id="impact-beyond-research">Impact Beyond Research&lt;/h2>
&lt;p>The paper that inspired this project, &lt;em>Multi-day Forecasting of Electric Grid Carbon Intensity using Machine Learning&lt;/em>, pioneered the idea of forecasting carbon intensity over multiple days using a hierarchical machine learning model. This goes beyond the typical 24-hour day-ahead models that are common in the industry and allows for better planning and longer-term decision-making.&lt;/p>
&lt;p>CarbonCast builds directly on that foundation by transforming research into practical, real-world infrastructure. It is an open-source library that anyone can run, contribute to, and benefit from. Whether you&amp;rsquo;re a developer building carbon-aware applications, a policymaker working on grid decarbonization strategies, or a sustainability-conscious individual looking to reduce your carbon footprint, CarbonCast provides the tools to make informed, impactful choices.&lt;/p>
&lt;h2 id="looking-ahead">Looking Ahead&lt;/h2>
&lt;p>I am excited to contribute to a project that blends machine learning, systems engineering, sustainability, and public impact. My goal is to help make it easier for everyone to see, understand, and act on their carbon footprint while also providing the &amp;ldquo;visibility&amp;rdquo; people need to take meaningful, informed actions.&lt;/p></description></item><item><title>Kolmogorov-Arnold-based Transformer for LLMs</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/ucsc/kallm/20250615-dentonjc/</link><pubDate>Sun, 15 Jun 2025 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/ucsc/kallm/20250615-dentonjc/</guid><description>&lt;p>Project: &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/project/osre25/UNL/KALLM">KALLM&lt;/a>&lt;/p>
&lt;p>Proposal: &lt;a href="https://krutsylo.neocities.org/share/pdf/KALLM_Public.pdf" target="_blank" rel="noopener">proposal&lt;/a>&lt;/p>
&lt;p>Mentors:&lt;/p>
&lt;ul>
&lt;li>Sai Suman Lamba Karanam&lt;/li>
&lt;li>Prof. Zahmeeth Sakkaff&lt;/li>
&lt;/ul>
&lt;p>I am modifying existing large language models to make them more efficient by replacing some of their layers with Kolmogorov-Arnold Network (KAN) modules. These KAN layers use compact univariate polynomial approximations, which can reduce parameter count and improve interpretability. The project explores how to integrate these layers into Transformers, and how far we can push this idea by combining or stacking KAN modules with different polynomial bases. The goal is to keep performance competitive while lowering computational costs.&lt;/p>
&lt;p>Beyond just speeding up training, I am exploring several other promising directions. One is testing whether transfer learning remains effective when replacing the linear layers of a pretrained LLM with KAN modules, or when swapping between different KAN configurations. I am also considering curriculum learning strategies that gradually increase KAN complexity during training. I have studied all major KAN implementations and early experiments with a custom Transformer architecture show encouraging results. However, I have found that most LLMs rely on functional-style activation definitions in PyTorch, which makes it difficult to build a universal wrapper. Because of this, KAN-based models will likely need to be integrated manually on a case-by-case basis.&lt;/p></description></item><item><title>LINQS: Autograder (LLM Detection)</title><link>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/ucsc/autograder/</link><pubDate>Sat, 14 Jun 2025 00:00:00 +0000</pubDate><guid>https://deploy-preview-1007--ucsc-ospo.netlify.app/report/osre25/ucsc/autograder/</guid><description>&lt;h1 id="linqs-autograder-gsoc-25">LINQS: Autograder (GSoC &amp;lsquo;25)&lt;/h1>
&lt;p>As part of the &lt;a href="https://deploy-preview-1007--ucsc-ospo.netlify.app/content/project/osre25/ucsc/autograder/index.md">LINQS: Autograder (LLM Detection)&lt;/a> project, my &lt;a href="https://summerofcode.withgoogle.com/programs/2025/projects/jxBUpvoM" target="_blank" rel="noopener">proposal&lt;/a> under the mentorship of Eriq Augustine, Lucas Ellenberger, and Lise Getoor aims to build a tool for AI plagiarism detection in code.&lt;/p>
&lt;h2 id="problem-statement">Problem Statement&lt;/h2>
&lt;p>Academic institutions are facing new sets of challenges in &lt;strong>maintaining academic integrity&lt;/strong> with the rise of Large Language Models and tools like ChatGPT and GitHub Copilot, and their easier accessibility to students. Students are increasingly using these tools for assistance with their coursework, especially in &lt;strong>programming assignments&lt;/strong>.&lt;/p>
&lt;p>While these tools are useful for purposes such as brainstorming, research, and drafting, their use in completing assignments often crosses ethical boundaries. This makes it difficult to &lt;strong>uphold fairness in grading and ensure students are truly learning&lt;/strong>.&lt;/p>
&lt;p>AI-generated code often lacks the distinctive fingerprints that &lt;strong>traditional plagiarism detectors&lt;/strong> like MOSS rely on, rendering them &lt;strong>ineffective&lt;/strong> against it. That’s why there is a &lt;strong>need for better systems&lt;/strong> that can assess whether code was AI-generated by spotting underlying patterns.&lt;/p>
&lt;h2 id="project-overview">Project Overview:&lt;/h2>
&lt;p>This is the problem that I am working to address with my project &lt;strong>‘LLM Detection’&lt;/strong>.&lt;/p>
&lt;p>I aim to build a system that helps academic institutions ensure fairness and integrity in students&amp;rsquo; work. To accomplish this goal, I will be working on two tasks:&lt;/p>
&lt;ul>
&lt;li>Building a tool which determines whether a given piece of code was written by AI or not.&lt;/li>
&lt;li>Designing and implementing a mechanism to compute a confidence score that indicates the likelihood of AI involvement in the code.&lt;/li>
&lt;/ul>
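&lt;p>For the second task, one simple way to combine weak signals into a confidence score is a logistic model over stylometric features. Every feature name and weight below is invented for illustration; the actual detector would learn such parameters from labeled human-written and AI-generated code:&lt;/p>

```python
import math

# Hypothetical feature weights; positive weights push toward "AI-generated".
WEIGHTS = {"comment_ratio": 2.0, "identifier_entropy": -1.5, "boilerplate_score": 1.8}
BIAS = -1.0

def ai_confidence(features):
    """Return a score in (0, 1); higher means more likely AI involvement."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))  # logistic squashing

score = ai_confidence(
    {"comment_ratio": 0.9, "identifier_entropy": 0.2, "boilerplate_score": 0.8}
)
print(round(score, 3))  # roughly 0.87: worth flagging for review
```

&lt;p>A score like this should support, not replace, human judgment: thresholds that flag submissions for instructor review are far safer than automated accusations.&lt;/p>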
&lt;p>This tool can discourage students from copying or completing entire assignments using AI tools, &lt;strong>encouraging honest and independent work&lt;/strong>.&lt;/p>
&lt;p>(Read my full GSoC proposal here: &lt;a href="https://drive.google.com/file/d/1skTVhcrEMAAwc6XzYQ0w3_uVRLxz0IB9/view?usp=sharing" target="_blank" rel="noopener">Proposal&lt;/a>)&lt;/p>
&lt;h2 id="about-me">About me:&lt;/h2>
&lt;p>Hey there!&lt;/p>
&lt;p>My name is Anvi Kohli, I am a senior majoring in Computer Science and AI from India. This summer I will be contributing to the Autograder project by the LINQS Lab, under the guidance of Eriq Augustine, Lucas Ellenberger, and Lise Getoor.&lt;/p>
&lt;p>A problem-solver at heart, I love to brainstorm, solve, and optimize complex issues. One instance was reaching the grand finals of the Smart India Hackathon, where our team placed third nationwide with our app, “PM Poshan”, built to digitize the monitoring and functioning of the mid-day meal scheme in India. It improved my versatility and exposed me to all stages of the product development cycle.&lt;/p>
&lt;p>I have hands-on experience in a multitude of domains such as AI/Data Science, cloud, full-stack development, and DevOps. Within AI, I have worked in GenAI, Computer Vision, Deep Learning and Classical Machine Learning. Apart from this, I have a strong interest in entrepreneurship, travelling, and cooking.&lt;/p></description></item></channel></rss>