Hari Varsha

Thanks for visiting my about_me.md! I'm a kernel-grade computer systems engineer and researcher with a strong focus on distributed systems and machine learning systems. I'm currently pursuing my master's in computer science at New York University and did my undergrad in electrical and computer engineering, where I spent my time hacking on everything fun, from large language models to embedded microcontrollers. I like programming in Rust, OCaml, C++, and Python.

I've worked as a Summer Research Intern @ UCSC (Go Slugs!) and as a lab member at Tech4Good for over a year. I was also fortunate enough to work with Prof. KC Sivaramakrishnan at the Indian Institute of Technology Madras on Hardcaml, Jane Street's embedded DSL for designing and testing hardware. Over the years, I've become deeply immersed in the world of performance engineering, low-level systems programming, AI research, and deep-learning infrastructure.

I lead the Arcane Systems Reading Group, a collective of systems nerds who explore niche topics like compilers, databases, operating system kernels, formal methods, distributed systems, large-scale infrastructure, compute orchestration, and performance optimization. I also write essays on deep tech on Substack and stream development on Discord, Twitch, and YouTube. I hope to have the same level of influence as Kanye West, but for computer science as a whole.

Outside of my career, you can find me spending my time with my beautiful partner and cats. I like playing video games, reading literature, and watching films & anime. I've been trying to get good at fine dining, Muay Thai, and math/linguistics/informatics olympiads. I try to live as simple a life as I can, with my biggest inspiration being Hank Hill from King of the Hill.

Current

🚨 I'm looking for summer 2026 internships / software engineering roles 🚨

If you'd like a dedicated, hardworking, & passionate hacker on your team, please reach out at hv2241 [at] nyu [dot] edu or any of my contacts at the bottom of this website. I am open to dropping out of my master's program if our interests coincide and we can work on cool stuff together. Thank you so much for your consideration!

Here's a quick TL;DR of me:

• I have an extreme bias for rapid prototyping and building quality software fast. I'm also hyper-optimistic about solving hard challenges and passionate about my work.

• I enjoy cultivating deep technical competence, seek feedback constantly, and am always on the lookout to expand my skills. I also know what I don't know, and am not afraid to reach out for help.

• I learn relatively quickly, can hit the ground running, and love debugging complex codebases. I have a genuine passion for delivering outstanding results.

• I'm a T-shaped developer with a niche specialization, and my operating principles were heavily inspired by Valve's Employee Handbook and the blogs of Paul Graham and Joel Spolsky.

I also want to be honest here, so you have the full picture:

• I’m an international student on an F-1 visa and will require future sponsorship (H-1B). I take time to deeply understand problems, learn the right tools, explore different solutions, and develop a strong grasp of the codebase.

• Due to my ADHD, I thrive best in environments where I’m given autonomy rather than being micromanaged, and I also need a healthy level of mentorship to stay on track and grow meaningfully.

• I tend to ask tons of questions and am a slow learner by nature. While it may take me a little longer than average to fully settle in, this allows me to maximize my long-term growth and impact. I like to build what’s often called “tribal knowledge”, so I can contribute effectively and help team members quickly.

• In the past, I pushed myself too hard, and working extremely long hours is no longer sustainable for me. That said, I'm more than committed and resilient enough to put in the hours needed when things get tough, which they always do.

Now that you know pretty much everything about me as your future teammate and friend, we can build a meaningful connection around our shared love for computers and building products people love :)

Projects: I've been deep-diving into distributed serverless engines in Python, writing my own Linux kernel modules in C, and experimenting with hash tables in Rust to understand them at a fundamental level.
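
To make the hash-table part concrete, here's a minimal sketch of the kind of thing I've been playing with: a toy open-addressing map with linear probing in Rust. This is illustrative only, with invented names, and is not code from any of my actual projects.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// A toy hash map using open addressing with linear probing.
/// Each slot is either empty (None) or holds one key-value pair.
struct ToyMap<K, V> {
    slots: Vec<Option<(K, V)>>,
    len: usize,
}

impl<K: Hash + Eq + Clone, V: Clone> ToyMap<K, V> {
    fn new() -> Self {
        ToyMap { slots: vec![None; 16], len: 0 }
    }

    /// Hash the key and map it onto a slot index.
    fn bucket(&self, key: &K) -> usize {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        (hasher.finish() as usize) % self.slots.len()
    }

    fn insert(&mut self, key: K, value: V) {
        // Grow once the table is half full so probe chains stay short.
        if self.len * 2 >= self.slots.len() {
            self.grow();
        }
        let mut i = self.bucket(&key);
        // Probe linearly until we hit the same key (update) or an empty slot (insert).
        while let Some((k, _)) = &self.slots[i] {
            if *k == key {
                break;
            }
            i = (i + 1) % self.slots.len();
        }
        if self.slots[i].is_none() {
            self.len += 1;
        }
        self.slots[i] = Some((key, value));
    }

    fn get(&self, key: &K) -> Option<&V> {
        let mut i = self.bucket(key);
        // Follow the same probe sequence; an empty slot means the key is absent.
        while let Some((k, v)) = &self.slots[i] {
            if k == key {
                return Some(v);
            }
            i = (i + 1) % self.slots.len();
        }
        None
    }

    /// Double the table and re-insert every live entry under the new capacity.
    fn grow(&mut self) {
        let new_cap = self.slots.len() * 2;
        let old = std::mem::replace(&mut self.slots, vec![None; new_cap]);
        self.len = 0;
        for (k, v) in old.into_iter().flatten() {
            self.insert(k, v);
        }
    }
}

fn main() {
    let mut map = ToyMap::new();
    map.insert("latency_us", 42);
    map.insert("throughput_rps", 7);
    assert_eq!(map.get(&"latency_us"), Some(&42));
    println!("latency_us = {:?}", map.get(&"latency_us"));
}
```

The appeal of linear probing is that collisions walk adjacent slots, which keeps lookups cache-friendly; the downside is clustering, which is why this sketch resizes at 50% load.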

Books: I've been enjoying reading Database Internals by Alex Petrov, Operating Systems: Three Easy Pieces by Andrea and Remzi Arpaci-Dusseau, and Introduction to Algorithms by Cormen, Leiserson, Rivest, and Stein.

Ongoing Projects

I'm currently focusing on hacking on the entire machine learning stack (models, kernels, compilers, and hardware) and the database stack (data models, storage engines, query engines, and distributed systems) from scratch. Unfortunately, my other projects have been put on hold, including my open-source project Memspect.

  • Memspect [C, Rust, LLVM] [Talk] : A static analysis framework for real-world C codebases that focuses on fast and accurate memory debugging. I gained arcane knowledge of compiler internals in the process. It started off as a final-year project and was presented at India's first compiler workshop.
  • Tachyon [Rust] : Building a mini LSM-tree storage engine from scratch. Core components include mutable/immutable memtables, SST (sorted string table) files, write-ahead logging, and the configurable compaction algorithms used in production systems (see the write-path sketch after this list).
  • TinyTorch [Python, CUDA] : Developing a PyTorch-inspired deep learning framework from scratch. Currently implementing automatic differentiation, GPU-accelerated tensor operations, and neural network modules, with custom CUDA kernels for matrix multiplication, convolution, and attention mechanisms.
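
For a concrete picture of Tachyon's write path, here's a stripped-down sketch of the memtable-plus-WAL idea in Rust. It's a simplified illustration under toy assumptions (tab-delimited plain-text log records, a BTreeMap standing in for the memtable, and an invented file name), not the actual Tachyon code.

```rust
use std::collections::BTreeMap;
use std::fs::{File, OpenOptions};
use std::io::{BufRead, BufReader, Result, Write};

/// A toy LSM-style write path: every put is appended to a write-ahead log
/// for durability, then applied to an in-memory sorted memtable. Replaying
/// the log on startup rebuilds the memtable. (A real engine adds memtable
/// flushes to SSTs, compaction, checksums, and binary framing on top.)
struct MiniLsm {
    memtable: BTreeMap<String, String>,
    wal: File,
}

impl MiniLsm {
    fn open(wal_path: &str) -> Result<Self> {
        let mut memtable = BTreeMap::new();

        // Recovery: replay any existing log records into the memtable.
        if let Ok(existing) = File::open(wal_path) {
            for line in BufReader::new(existing).lines() {
                let line = line?;
                if let Some((key, value)) = line.split_once('\t') {
                    memtable.insert(key.to_string(), value.to_string());
                }
            }
        }

        // Keep the log open in append mode for all future writes.
        let wal = OpenOptions::new().create(true).append(true).open(wal_path)?;
        Ok(MiniLsm { memtable, wal })
    }

    fn put(&mut self, key: &str, value: &str) -> Result<()> {
        // Durability first: persist the record and sync, then update the memtable.
        writeln!(self.wal, "{key}\t{value}")?;
        self.wal.sync_all()?;
        self.memtable.insert(key.to_string(), value.to_string());
        Ok(())
    }

    fn get(&self, key: &str) -> Option<&str> {
        self.memtable.get(key).map(String::as_str)
    }
}

fn main() -> Result<()> {
    let mut db = MiniLsm::open("tachyon_toy.wal")?;
    db.put("user:1", "hari")?;
    assert_eq!(db.get("user:1"), Some("hari"));
    println!("user:1 = {:?}", db.get("user:1"));
    Ok(())
}
```

The ordering is the whole point: the record is appended and synced to the log before the memtable is touched, so a process that crashes and restarts can replay the log and recover every acknowledged write.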

Future projects include an experimental file system to compete with ZFS & BTRFS, a GPU-aware scheduler for serverless platforms, MapReduce from scratch, a context-sensitive search engine for metadata & logs, a tiny open-source machine learning compiler, a library of state-of-the-art algorithms for distributed deep learning, an inference engine from scratch, my own network stack for peer-to-peer file sharing, and a custom GPU orchestrator for managing H100 clusters.

Career Interests

I enjoy building novel systems and reliable tools that work seamlessly and stand the test of time. This was inspired by using and exploring the design of beautiful software such as Vim/Emacs, Ripgrep, Fish shell, Linear, SQLite, Signal, Graphviz, Blender, Linux Mint, and many more.

My experiences have taught me to prioritize ownership and autonomy, and to align my work with a strong sense of mission. I build cutting-edge systems through rigorous engineering combined with creative exploration. I strive to write simple and clean code that is incredibly well-tested, approachable, and thoughtful.

I deeply care about my craft and often discuss various programming topics with senior engineers and domain experts. My engineering philosophy is heavily inspired by TIGER_STYLE, Andrew Kelley's Practical Data Oriented Design, Hard-Mode Rust, and some elements of functional programming. Here are some of my professional interests in greater detail:

✦ GPU kernel engineering and distributed machine learning systems ✦

I've had a deep fascination with GPUs ever since I played Crysis 3 as a kid. Today, I explore ways of writing my own high-performance kernels for TPUs, GPUs, & AWS Trainium. I'd like to study the performance characteristics of various GPU architectures and optimize compilers to leverage their hardware features like tensor cores. My interests are mainly deep kernel work and CUDA spelunking, particularly experimenting with CUTLASS, Triton, and the CuTe DSL. I hope to read more cool papers, reverse-engineer kernel implementations, and implement FlashAttention variants in the future.

Beyond kernel-level interests, I'm drawn to work that sets the direction for the next generation of machine learning systems, whether it's rewriting PyTorch's core collectives for fault tolerance using RDMA & GPUDirect, developing decision tree compilers, building an entirely new distributed file system and efficient parallel expert-communication libraries, or crafting custom Python bytecode interpreters to capture graphs. I'm particularly fascinated by hardware-aware algorithms, sequence models with long-range memory, and distributed training at scale.

✦ Supercomputing-scale compute & databases for fun and profit ✦

One of my primary interests lies in designing and building novel, high-performance systems for machine learning, particularly at supercomputing scale. As a humble systems engineer, I'd kill to work on interesting projects like optimizing container runtimes and building query engines like Apache DataFusion.

I also love exploring operating system design and HPC network architectures, hacking on high-performance storage systems like Weka and Ceph, designing load-balancing algorithms to optimize serving efficiency, breaking the CUDA compiler, and enhancing the performance of virtual machines.

✦ Building scalable, fault-tolerant infrastructure for LLM research ✦

I take great pleasure in deploying on bare-metal machines and building tooling for infrastructure engineers. As a proud supporter of the self-hosting movement, I would like to use any opportunity to learn Linux virtualization, private networking, and cloud-native observability tools, especially by pursuing experimental ventures like integrating WebAssembly to avoid long cold starts and over-provisioning.

I'm especially interested in delving one level deeper: using tools like eBPF to monitor and mitigate excessive CPU usage, instrumenting the Linux scheduler with ftrace and Perfetto, analyzing request latency with sampling profilers like gprof, trying (and failing) to configure Kubernetes for optimal workload performance, using pprof to optimize Go code, & adding more spells up my sleeve.

✦ Hacking hardware architectures & high-performance, low-latency applications ✦

Building robust, low-latency hardware and applications to serve millions of users has always been on my career bucket list. I would actually love to put my EE degree to use, particularly in FPGA engineering with OCaml, chip floorplanning with deep reinforcement learning, and microprocessor architecture design. I'd like to build machine learning accelerators such as Google’s TPU and Groq's LPU, and consumer hardware like Apple's AirTag.

Above the hardware stack, I am interested in debugging kernel-level network latency spikes in containers, developing task schedulers, and tuning garbage collectors. Other strong interests include orchestration engines, block storage systems, and compute services.

Connect

“It is not the critic who counts; not the man who points out how the strong man stumbles, or where the doer of deeds could have done them better.

The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood; who strives valiantly; who errs, who comes short again and again, because there is no effort without error and shortcoming;

but who does actually strive to do the deeds; who knows the great enthusiasms, the great devotions; who spends himself in a worthy cause; who at the best knows in the end the triumph of high achievement, and who at the worst, if he fails, at least fails while daring greatly, so that his place shall never be with those cold and timid souls who neither know victory nor defeat.”

Theodore Roosevelt, 26th president of the United States of America

Kindness is an important value I try to practice at every opportunity I get. I'm always open to discussing career plans, startup ideas, or even research interests. Feel free to get in touch for a coffee chat through the contacts below. I particularly encourage students from underrepresented groups or disadvantaged backgrounds who aspire to break into systems to connect with me. I'll be happy to chat and help in any way I can ^-^