# **Shreeyash Pandey**

Systems Software Developer

## shreeyash335@gmail.com — +91 8108539856 https://fp32.org

#### EXPERIENCE

• Vicharak Surat, India

Systems Software Developer, Full-time

August 2023 – August 2025

## **Gati Project - Edge ML Accelerator**

- Architected an FPGA-based edge-ML accelerator and designed its hardware building blocks (e.g., im2col, systolic arrays).
- Developed cycle-accurate simulators for architectural verification, performance profiling, and design tradeoff analysis.
- Designed and implemented a dataflow compiler that integrates graph-level (layer fusion, tiling) and ISA-specific hardware optimizations to achieve real-time (20-30 fps) inference for image recognition and object detection workloads on a low-power edge device.
- Developed a high-throughput runtime engine for edge deployment, including optimized NEON kernels for ARM processors.

## **Binary Neural Networks on FPGA**

- Investigated hyper-quantized (binary, ternary) neural networks and their suitability for deployment on FPGAs
- Ported existing binary networks to validate performance and understand bottlenecks
- Modelled novel binarized neural network architectures in PyTorch

## **Porting TianoCore EDK2 to a Proprietary ARM Cortex-A Series Chip**

• Vicharak Surat, India

Linux Kernel Developer, Intern

January 2023 – August 2023

- Configured and compiled the Linux kernel for various target architectures such as x86 and ARM.
- Studied and implemented the UEFI/PI specifications to bring up incompatible boards, allowing a wider range of kernels to boot.
- Inspected relevant firmware with tools like Ghidra to find and debug problems.

#### TALKS

• No-ISA is the Best ISA - Shreeyash Pandey, Rishik Ram

IICT 2024, Bangalore

https://youtu.be/G4fxdHozm5I?si=WGncpPAsuKaeeJuc

#### **EDUCATION**

• G.H. Raisoni Institute Of Engineering And Technology Nagpur, India

Bachelor of Engineering in Computer Science Engineering; CGPA: 8.5

August 2019 – May 2023

#### **PROJECTS**

#### • Clogwave - debug complex C code in a waveform viewer

https://github.com/bojle/clogwave December 2023 – Present

LLVM, C++

- Implemented an LLVM pass that instruments C code with VCD-dumping callbacks; running the instrumented program generates a VCD dump that can be viewed in tools like GTKWave.
- The waveform view of sequential code lets interdependent variables be analysed with respect to one another.

#### • Open Source Contributions

- Fixing missed optimization cases for AVX-512
- Adding support for using x86 vector intrinsics in constexpr
- Fixing issues in RISC-V SelectionDAG
- Adding fixed-point division support to LLVM libc

#### • Atari 2600 Emulator

C, libSDL, GDB, 6502 Assembly

February 2022 – May 2022

- Engineered a functional Atari 2600 emulator from scratch in C, faithfully replicating the console's behavior.
- Modeled the system's core components, including the 6507 CPU, the Television Interface Adaptor (TIA), and the RAM, I/O, and Timer (RIOT) chip.
- Developed a 6502 assembler and disassembler to support custom code and debugging.

#### TECHNICAL SKILLS

- Programming Languages: C, C++, Bash (Shell Scripting), Python, UV, x86 Assembly, ARM Assembly, RISC-V Assembly, Verilog
- Frameworks and Libraries: PyTorch, ONNX Runtime, TensorFlow, Tinygrad, CUDA, TensorRT, LLVM, ONNX, Protobuf
- Documentation: Technical writing with Markdown, reStructuredText (RST), and Sphinx
- Tools: Git, Make, CMake, GDB, Valgrind, Compiler Explorer
- Operating Systems and ISAs: Linux, Windows, ARM, x86, 6502