Artificial Intelligence Explainability Accountability

Fri, 14 Jun 2024 00:00:00 +0000

Hey! I’m Sarthak Chowdhary (Shaburu), and I am thrilled to share my journey with the Open Source Program Office of UC Santa Cruz as part of Google Summer of Code (GSoC) 2024. This experience marks a pivotal milestone in my career, offering me the chance to delve into an intriguing project while learning from the brightest minds in the open-source community. Allow me to guide you through my adventure so far, from the nerve-wracking wait for results to the exhilarating start of the coding period.

Before we start, here’s my proposal.

Pre-GSoC Application

I had shortlisted three organizations that I was working with:

OSPO UC Santa Cruz - Amplifying Research Impact Through Open Source
CVAT.AI - Computer Vision Data Annotation for AI
Emory University - Biomedical Research to Advance Medical Care

On the 1st of May, like many students eagerly anticipating the results of Google Summer of Code (GSoC) 2024, I found myself glued to my screen, anxiously waiting for the clock to strike 11:30 PM IST. After what felt like an eternity, I finally received the email that changed everything: I had been selected for GSoC 2024 with the Open Source Program Office of UC Santa Cruz!

The first month of GSoC, known as the community bonding period, is for establishing rapport with the people working on the project. I read up on my mentor, Dr. Leilani H. Gilpin, and built a good rapport with her. She is an Assistant Professor in Computer Science and Engineering and an affiliate of the Science & Justice Research Center at UC Santa Cruz. She is also part of the AI group @ UCSC and leads the AI Explainability and Accountability (AIEA) Lab. Her research focuses on the design and analysis of methods for autonomous systems to explain themselves, with applications to robust decision-making, system debugging, and accountability. Her current work examines how generative models can be used in iterative XAI stress testing. She guided me through the necessary documentation and explained the project’s demands and requirements in detail, which was invaluable.

Project

The project aims to build a system that takes a student’s code as input and explains their mistakes to them, from low-level syntax and compilation errors to high-level issues such as overloaded variables.

My proposal aims to create novel, basic coding questions and take them up a notch by writing custom drivers for each problem, along with common drivers that detect low-level errors and give baseline explanations for various error cases. These drivers will be combined into a robust system, using third-party open-source software (such as the Monaco code editor, “the editor of the web”) where necessary. I will write uniform and consistent feedback and explanations for each coding problem, covering all the possible edge cases, and build a pipeline that iterates over the test cases and their feedback. This benchmark suite will be used for testing the system.
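To make the driver idea more concrete, here is a minimal sketch in Python of how a common low-level driver and a per-problem test pipeline could fit together. This is only an illustration under my own assumptions, not the project's actual design: the names `Feedback`, `check_syntax`, and `run_test_cases` are hypothetical, and the sketch runs code with `exec`, which is unsafe for untrusted submissions (a real system would run it in a sandbox).

```python
import ast
from dataclasses import dataclass
from typing import Any, List, Tuple


@dataclass
class Feedback:
    """One piece of feedback for the student (hypothetical structure)."""
    passed: bool
    message: str


def check_syntax(source: str) -> Feedback:
    """Common driver: catch syntax errors and give a baseline explanation."""
    try:
        ast.parse(source)
    except SyntaxError as err:
        return Feedback(False, f"Syntax error on line {err.lineno}: {err.msg}")
    return Feedback(True, "No syntax errors found.")


def run_test_cases(
    source: str, func_name: str, cases: List[Tuple[tuple, Any]]
) -> List[Feedback]:
    """Per-problem driver: run each (args, expected) case and explain the result."""
    syntax = check_syntax(source)
    if not syntax.passed:
        return [syntax]  # no point running tests on code that does not parse

    namespace: dict = {}
    try:
        # WARNING: exec is unsafe for untrusted code; illustration only.
        exec(source, namespace)
    except Exception as err:
        return [Feedback(False, f"Code failed while loading: {type(err).__name__}: {err}")]

    func = namespace.get(func_name)
    if not callable(func):
        return [Feedback(False, f"Expected a function named '{func_name}'.")]

    results = []
    for args, expected in cases:
        try:
            actual = func(*args)
        except Exception as err:
            results.append(
                Feedback(False, f"{func_name}{args} raised {type(err).__name__}: {err}")
            )
            continue
        if actual == expected:
            results.append(Feedback(True, f"{func_name}{args} returned {actual!r} as expected."))
        else:
            results.append(
                Feedback(False, f"{func_name}{args} returned {actual!r}, expected {expected!r}.")
            )
    return results


if __name__ == "__main__":
    student_code = "def add(a, b):\n    return a - b\n"
    for fb in run_test_cases(student_code, "add", [((1, 2), 3), ((0, 0), 0)]):
        print("PASS" if fb.passed else "FAIL", fb.message)
```

The syntax check runs first so that higher-level test feedback is only produced for code that parses, which mirrors the low-level to high-level ordering of explanations described above.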

Additionally, I plan to build an interface with a roadmap that runs from basics such as arrays and hashmaps to advanced topics such as trees, heaps, and backtracking, along with progress bars and confetti on successful unit tests (important). It will use the same benchmark suite under the hood. I will use Judge0 (an open-source online code execution system) for code execution and Monaco (the open-source “Editor of the Web”) as the code editor.

Project goals:

Project objective: By the end of summer, the software should be a novel and robust tool that helps beginner and advanced programmers alike learn programming by hyper-focusing on the mistakes they make and using AI to explain the how, what, and why of their code. It should provide clear and concise explanations accompanied by actionable suggestions for debugging and improvement.
Expected deliverables: A robust eXplainable AI benchmark suite that will be used extensively in the undergraduate AI courses and possibly the graduate courses as well, and by anyone interested in learning programming with the help of personalized AI.
Future work based on project: A beautiful gamified interface that gets people excited to learn programming and uses the above benchmark suite would be awesome to build!

When I started my programming journey (before ChatGPT😨), I often ran into problems that were way above my skill level without realizing it, and I would spend countless hours without proper feedback on where I was going wrong. This project has a real, innovative impact on people, and it is something I wish I had had access to at the start of my programming journey, so working on it comes from a place of passion. It will also test my own understanding of programming, and spending the summer solidifying it under the guidance of Leilani H. Gilpin is a dream come true for me.

Developing Trustworthy Large Language Models

Fri, 14 Jun 2024 00:00:00 +0000

Hi! Thanks for stopping by.

In this first blog post of a series of three, I’d like to introduce myself, my mentor, and my project.

My name is Nikhil. I am an ML researcher who works at the intersection of NLP, ML, and HCI. I previously worked as a Machine Learning Engineer II at VMware and spent some wonderful summers interning with ML teams at NVIDIA and IIT Bombay. I also recently graduated from the University of Southern California (USC) with honors in Computer Science and a master’s thesis.

This year at Google Summer of Code (GSoC 24), I will be working on developing trustworthy large language models. I’m very grateful to be mentored by Leilani H. Gilpin at the AIEA lab, UC Santa Cruz. I truly admire the flexibility and ownership she allows me in pursuing my ideas independently within this project. Please feel free to peruse my accepted GSoC proposal here.

Project: My project has a tangible outcome: An open-source, end-to-end, full-stack web app with a hybrid trustworthy LLM in the backend.

This open-source web app will be a lightweight tool that can take diverse textual prompts, connect with several LLMs and a database, and gather qualitative and quantitative user feedback. Users will be able to see how this feedback affects the LLMs’ responses and impacts their reasoning and explanations (xAI). The tool will be thoroughly tested to ensure that the unit tests pass and that there is complete code coverage.
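As a rough illustration of the feedback-gathering part, here is a minimal Python sketch that stores a quantitative rating and a qualitative comment for each LLM response in a database. Everything in it is an assumption made for the example, not the app's actual design: the `UserFeedback` fields, the 1 to 5 rating scale, the table layout, and the choice of SQLite are all hypothetical.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserFeedback:
    """One user's feedback on one LLM response (hypothetical schema)."""
    prompt: str
    model_name: str
    response: str
    rating: int   # quantitative feedback, assumed to be on a 1-5 scale
    comment: str  # qualitative feedback as free text


def create_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the feedback table if it is missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS feedback (
            id INTEGER PRIMARY KEY,
            prompt TEXT NOT NULL,
            model_name TEXT NOT NULL,
            response TEXT NOT NULL,
            rating INTEGER NOT NULL CHECK (rating BETWEEN 1 AND 5),
            comment TEXT NOT NULL
        )
        """
    )
    return conn


def save_feedback(conn: sqlite3.Connection, fb: UserFeedback) -> int:
    """Insert one feedback row and return its row id."""
    cur = conn.execute(
        "INSERT INTO feedback (prompt, model_name, response, rating, comment) "
        "VALUES (?, ?, ?, ?, ?)",
        (fb.prompt, fb.model_name, fb.response, fb.rating, fb.comment),
    )
    conn.commit()
    return cur.lastrowid


def average_rating(conn: sqlite3.Connection, model_name: str) -> Optional[float]:
    """Average rating for a model, or None if it has no feedback yet."""
    row = conn.execute(
        "SELECT AVG(rating) FROM feedback WHERE model_name = ?", (model_name,)
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = create_store()
    save_feedback(conn, UserFeedback("Is 7 prime?", "model-a", "Yes.", 5, "Correct and concise."))
    save_feedback(conn, UserFeedback("Is 9 prime?", "model-a", "Yes.", 1, "Wrong: 9 = 3 * 3."))
    print(average_rating(conn, "model-a"))
```

Keeping the rating and the comment in the same row makes it straightforward to compare numeric scores across models while still being able to read why a user scored a response the way they did.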

At the moment, we are investigating LLMs and making them more trustworthy in constraint satisfaction tasks, such as logical reasoning, and in misinformation detection tasks. However, our work has applicability in other areas of Responsible AI, such as Social Norms (toxicity detection and cultural insensitivity), Reliability (misinformation, hallucination, and inconsistency), Explainability & Reasoning (lack of interpretability, limited logical and causal reasoning), Safety (privacy violation and violence), and Robustness (prompt attacks and distribution shifts).

Impact:

Responsible AI research teams across industry and academia can use this as a boilerplate for their user study projects.
PhD students and academic researchers studying interaction between LLMs and users will find this useful.
LLM alignment researchers and practitioners may find this a useful resource, as user feedback affects the inherent rewards model of the internal LLMs.
Explainable AI (xAI) researchers can find value in the explanations that this tool generates, which reveal interpretable insights into how modern LLMs think and use their memory.

These are just a few use cases; there are several others that we look forward to describing in the upcoming posts.

This was my first blog post in the series of three for the UC OSPO. Stay tuned for the upcoming posts: one detailing my progress at the halfway mark and a final one concluding my work.

If you find this work interesting and would love to share your thoughts, I am happy to chat! :) Feel free to connect on LinkedIn and mention that you are reaching out from this blog post.

It is great to meet the UC OSPO community, and thanks for reading. Bye for now.

Responsible AI | UCSC OSPO
