Work Experience

Campus Partner

Perplexity AI Aug 2025 - Present

Led campus-wide integration of Perplexity's AI solutions, utilizing REST APIs, analytics dashboards, and cross-platform software for event management.

AI Solutions REST APIs Analytics

Software Developer Intern

Investopedia Inc. | Dotdash Meredith Sep 2023 - May 2025

Designed and implemented key web application features in Nuxt.js and Vue.js to improve user experience and functionality, and used AWS, Jenkins, and Docker for deployment and scaling in a cloud-based environment.

  • Architected and containerized a FastAPI microservice using Docker, orchestrated with Kubernetes (EKS), and deployed on AWS for web widgets
  • Built robust Playwright test frameworks in TypeScript to streamline the team's testing workflow
  • Increased test coverage by ~40% by adding Cypress and Selenium test scenarios

Nuxt.js Vue.js AWS Docker

Data Scientist Intern

Ascenta Management Consulting May 2022 - Sep 2022

Applied data science techniques to solve business problems, developed predictive models, and created data visualizations for stakeholder presentations.

Data Science Predictive Modeling Visualization

Phoneme Pipeline

Engineered a production-style NLP pipeline that transforms raw CHILDES speech transcripts into normalized utterances and ARPAbet phoneme corpora ready for downstream acoustic modeling research.

Overview

Refactored an academic assignment into a modular CLI application that orchestrates transcript ingestion, linguistic cleaning, and phoneme projection. The system mirrors original directory hierarchies to maintain experiment traceability while keeping the codebase lean for portfolio presentation.
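
The directory mirroring described above can be sketched with pathlib. This is a minimal illustration, not the project's actual code: the function names, the `.txt` suffix, and the example paths are assumptions.

```python
from pathlib import Path


def mirrored_path(source_file: Path, source_root: Path, output_root: Path, suffix: str) -> Path:
    """Map a source transcript to the same relative location under output_root."""
    relative = source_file.relative_to(source_root)
    return (output_root / relative).with_suffix(suffix)


def plan_outputs(source_root: Path, output_root: Path, suffix: str = ".txt") -> dict:
    """Pair every .cha transcript under source_root with its mirrored output path."""
    return {
        src: mirrored_path(src, source_root, output_root, suffix)
        for src in sorted(source_root.rglob("*.cha"))
    }


# Hypothetical layout: the cleaned file keeps the corpus-relative folders.
example = mirrored_path(
    Path("corpus/Eng/Brown/adam01.cha"), Path("corpus"), Path("cleaned"), ".txt"
)
```

Because the relative path is preserved, any cleaned or phoneme file can be traced back to its source transcript by swapping the root directory.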

Pipeline Innovations

  • Transcript mirroring: Recursively synchronizes cleaned and phoneme outputs with source corpus layout for reproducible experimentation.
  • Regex-driven normalization: Precompiled pattern suite strips metadata, diarization artifacts, and noise while preserving contractions and speaker intent.
  • Pronunciation intelligence: CMUdict mapping resolves 94.7% of tokens to deterministic ARPAbet sequences with graceful handling of out-of-vocabulary words.
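
The normalization and phoneme-projection steps above might look roughly like the sketch below. The regex patterns, the `<UNK>` placeholder for out-of-vocabulary words, and the four-entry lexicon (written in CMUdict style as a stand-in for the full dictionary) are all assumptions made for illustration.

```python
import re
from typing import Optional

# Hypothetical pattern suite; the pipeline's real patterns are not shown here.
SPEAKER_TIER = re.compile(r"^\*[A-Z0-9]+:\s*")   # speaker prefix such as "*CHI:"
BRACKETED = re.compile(r"\[[^\]]*\]")            # bracketed annotations such as [/]
PAUSE = re.compile(r"\(\.+\)")                   # pause markers such as (.)
NOISE = re.compile(r"&\S+|\b(?:xxx|yyy|www)\b")  # fillers and unintelligible markers
NON_WORD = re.compile(r"[^a-z' ]+")              # keep letters, apostrophes, spaces

# Tiny illustrative lexicon standing in for the CMU Pronouncing Dictionary.
TOY_LEXICON = {
    "i": ["AY1"],
    "don't": ["D", "OW1", "N", "T"],
    "want": ["W", "AA1", "N", "T"],
    "it": ["IH1", "T"],
}


def normalize(line: str) -> Optional[str]:
    """Return a cleaned utterance, or None for header and dependent-tier lines."""
    if not line.startswith("*"):
        return None
    text = SPEAKER_TIER.sub("", line).lower()
    for pattern in (BRACKETED, PAUSE, NOISE, NON_WORD):
        text = pattern.sub(" ", text)
    return " ".join(text.split()) or None


def to_phonemes(utterance: str, lexicon: dict, oov: str = "<UNK>") -> list:
    """Project each token to its phoneme sequence; unknown tokens become one oov marker."""
    phones = []
    for token in utterance.split():
        phones.extend(lexicon.get(token, [oov]))
    return phones


cleaned = normalize("*CHI:\tI don't [/] want it (.) &-um xxx .")
phones = to_phonemes(cleaned, TOY_LEXICON)
```

Compiling the patterns once at import time keeps the per-line cost low, and the apostrophe is kept in the allowed character set so contractions such as "don't" survive cleaning.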

Technical Architecture

Languages & Tools: Python, argparse, pathlib

Structure: Modular pipeline package with reusable I/O helpers, phoneme transformers, and automated directory bootstrapping.

Data Sources: CHILDES `.cha` corpora, CMU Pronouncing Dictionary

Impact

Delivers normalized corpora with 98.8% utterance retention and phoneme projections covering 94.7% of lexical tokens, accelerating downstream language-modeling experiments while documenting rationale in portfolio-ready narratives.

Code
AI Research Python NLP

Syntaxo

Grammar Intelligence: Created an interpretable grammar-analysis service that pairs handcrafted CFG rules with deterministic chart parsing to flag defective drafts for enterprise content teams.

Overview

Syntaxo is an AI parser that pairs handcrafted CFG rules with deterministic chart parsing to flag defective drafts for enterprise content teams.

Core Capabilities

  • Adaptive CFG engine: Modular productions generalize to unseen phrasing while keeping interpretability front-and-center.
  • Deterministic inference: NLTK ChartParser transforms POS-tag sequences into transparent accept/reject decisions with confidence analytics.
  • Evaluation telemetry: Automated precision, recall, and coverage reporting accelerates grammar iteration cycles.
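
To illustrate the accept/reject idea and the precision/recall reporting without depending on NLTK, the sketch below swaps in a small pure-Python CYK recognizer in place of NLTK's ChartParser. The toy grammar, the tag sequences, and the gold labels are hypothetical and are not taken from the project.

```python
from itertools import product

# Hypothetical toy grammar over POS tags, in Chomsky normal form.
BINARY_RULES = {
    ("NP", "VP"): {"S"},
    ("DT", "NN"): {"NP"},
    ("VB", "NP"): {"VP"},
}
# A single tag can also stand for a whole phrase.
UNARY_RULES = {"PRP": {"NP"}, "VB": {"VP"}}


def accepts(tags: list) -> bool:
    """Return True if the tag sequence can be derived from S."""
    n = len(tags)
    if n == 0:
        return False
    # chart[i][j] holds every symbol that spans tags[i : j + 1]
    chart = [[set() for _ in range(n)] for _ in range(n)]
    for i, tag in enumerate(tags):
        chart[i][i] = {tag} | UNARY_RULES.get(tag, set())
    for width in range(2, n + 1):
        for start in range(n - width + 1):
            end = start + width - 1
            for split in range(start, end):
                pairs = product(chart[start][split], chart[split + 1][end])
                for left, right in pairs:
                    chart[start][end] |= BINARY_RULES.get((left, right), set())
    return "S" in chart[0][n - 1]


def precision_recall(predicted: list, gold: list) -> tuple:
    """Score flags, treating 'flagged as defective' (True) as the positive class."""
    tp = sum(p and g for p, g in zip(predicted, gold))
    fp = sum(p and not g for p, g in zip(predicted, gold))
    fn = sum(g and not p for p, g in zip(predicted, gold))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


drafts = [["PRP", "VB", "DT", "NN"], ["DT", "VB"], ["PRP", "VB"], ["NN", "DT"]]
flagged = [not accepts(tags) for tags in drafts]   # rejected drafts are flagged
gold_defective = [False, True, False, False]       # hypothetical annotations
precision, recall = precision_recall(flagged, gold_defective)
```

Every decision traces back to explicit rules in the chart, which is what makes this style of checker auditable compared with a learned classifier.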

Technical Stack

Languages & Tools: Python 3.11, NLTK ChartParser, standard library CSV utilities

Data: POS-tagged correspondence corpus curated for AutoML compatibility

Delivery: Dual CLI + Python module packaging for batch jobs and integrations

Impact

Provides audit-ready grammar enforcement that de-risks editorial pipelines.

Code
AI Research Python NLP

Streaming Deep Reinforcement Learning

Developing a novel integration of Real-Time Recurrent Learning (RTRL) with streaming DRL algorithms to address partial observability in continuous data streams. Tuning the ObGD optimizer for online credit assignment in POMDP environments, improving sample efficiency by 2.1× on MuJoCo benchmarks.

Overview

This research project addresses one of the fundamental challenges in reinforcement learning: handling partial observability in continuous data streams. By integrating Real-Time Recurrent Learning (RTRL) with streaming Deep Reinforcement Learning algorithms, we enable agents to maintain and update memory of past observations in real-time without requiring full episode replay.

Key Innovations

  • RTRL Integration: Implemented online gradient computation that eliminates the need for backpropagation through time, enabling true streaming learning
  • ObGD Optimization: Developed adaptive optimization techniques specifically designed for online credit assignment in Partially Observable Markov Decision Process (POMDP) environments
  • Sample Efficiency: Achieved 2.1× improvement in sample efficiency on MuJoCo continuous control benchmarks compared to baseline streaming algorithms
  • Memory Management: Designed efficient memory architectures that balance computational cost with information retention
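
The RTRL idea in the first bullet can be shown on the smallest possible case: a scalar recurrent unit whose gradient is carried forward step by step instead of being unrolled backward through time. This is a toy sketch under stated assumptions (one hidden unit, tanh activation, squared-error loss); it is not the project's network or training loop.

```python
import math


def rtrl_gradients(xs, targets, w, u):
    """Run h_t = tanh(w * h_{t-1} + u * x_t) and accumulate dL/dw and dL/du online.

    The loss is L = sum_t 0.5 * (h_t - target_t) ** 2.
    """
    h = 0.0
    dh_dw = 0.0  # sensitivity of the current state to w
    dh_du = 0.0  # sensitivity of the current state to u
    loss = grad_w = grad_u = 0.0
    for x, target in zip(xs, targets):
        h_prev = h
        h = math.tanh(w * h_prev + u * x)
        local = 1.0 - h * h  # derivative of tanh at the pre-activation
        # RTRL recursion: update the sensitivities forward in time.
        dh_dw = local * (h_prev + w * dh_dw)
        dh_du = local * (x + w * dh_du)
        error = h - target
        loss += 0.5 * error * error
        grad_w += error * dh_dw
        grad_u += error * dh_du
    return loss, grad_w, grad_u


xs = [0.5, -1.0, 0.25, 0.8]
targets = [0.1, -0.2, 0.3, 0.0]
loss, grad_w, grad_u = rtrl_gradients(xs, targets, w=0.7, u=-0.4)
```

Only the current state and its two sensitivities are stored, so memory use does not grow with the length of the stream. In a full network the sensitivity becomes a matrix per parameter block, which is where the computational cost of RTRL comes from.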

Technical Stack

Languages & Frameworks: Python, PyTorch, JAX, NumPy

RL Libraries: Stable-Baselines3, RLlib, Gymnasium

Benchmarks: MuJoCo, Atari, DeepMind Control Suite

Impact & Results

The research demonstrates significant improvements in both computational efficiency and learning performance. The streaming approach reduces memory requirements by 60% while maintaining comparable or superior performance to traditional experience replay methods. This work has implications for real-world applications where agents must learn continuously from streaming data, such as robotics, autonomous systems, and adaptive control.

Code
AI Research Python

SonicFlux

Sequence Intelligence Toolkit: Built a research-grade phonetic language modeling toolkit that curates CHILDES corpora, trains multi-order n-gram models, and reports deployment-ready perplexity analytics from one CLI.

Overview

Refactored coursework into an end-to-end toolkit that packages corpus normalization, deterministic splits, and model training into reproducible commands with JSON artifact tracking.
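
A deterministic split with a JSON record might be sketched as follows. The function names, the default seed, and the manifest fields are assumptions, and this simple version shuffles uniformly rather than stratifying.

```python
import json
import random


def split_corpus(utterances, dev_fraction=0.1, seed=13):
    """Deduplicate, then split into (train, dev) the same way on every run."""
    # Sorting first means the input order cannot change the result.
    unique = sorted(set(utterances))
    random.Random(seed).shuffle(unique)
    dev_size = max(1, int(len(unique) * dev_fraction))
    return unique[dev_size:], unique[:dev_size]


def split_manifest(train, dev, seed):
    """Serialize the split settings so a run can be reproduced later."""
    record = {"seed": seed, "train_size": len(train), "dev_size": len(dev)}
    return json.dumps(record, sort_keys=True)


utterances = [f"utt {i}" for i in range(20)]
train, dev = split_corpus(utterances)
manifest = split_manifest(train, dev, seed=13)
```

Using a private `random.Random(seed)` instance instead of the module-level generator keeps the split stable even if other code draws random numbers in the same process.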

Core Capabilities

  • Corpus pipeline: Automated validation, deduplication, and stratified train/dev splits seeded for reproducibility.
  • N-gram factory: Extensible NGramLanguageModel abstraction with Laplace smoothing and cross-order persistence.
  • Perplexity console: Evaluation harness that surfaces OOV-aware metrics before production deployment.
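
The n-gram and perplexity bullets above can be illustrated with a compact add-one (Laplace) smoothed model. The class name follows the bullet, but the methods, padding symbols, and `<unk>` handling below are assumptions for this sketch rather than the toolkit's actual interface.

```python
import math
from collections import Counter


class NGramLanguageModel:
    """Add-one smoothed n-gram model over token sequences (illustrative only)."""

    def __init__(self, order: int):
        self.order = order
        self.ngram_counts = Counter()
        self.context_counts = Counter()
        self.vocab = {"</s>", "<unk>"}

    def _ngrams(self, tokens):
        padded = ["<s>"] * (self.order - 1) + list(tokens) + ["</s>"]
        for i in range(self.order - 1, len(padded)):
            yield tuple(padded[i - self.order + 1 : i]), padded[i]

    def train(self, sentences):
        for sentence in sentences:
            self.vocab.update(sentence)
            for context, token in self._ngrams(sentence):
                self.ngram_counts[context + (token,)] += 1
                self.context_counts[context] += 1

    def prob(self, context, token):
        # Laplace smoothing: add one to every count, add |V| to the denominator.
        numerator = self.ngram_counts[context + (token,)] + 1
        return numerator / (self.context_counts[context] + len(self.vocab))

    def perplexity(self, sentences):
        log_prob, n_tokens = 0.0, 0
        for sentence in sentences:
            # Out-of-vocabulary tokens are scored as <unk> instead of crashing.
            mapped = [t if t in self.vocab else "<unk>" for t in sentence]
            for context, token in self._ngrams(mapped):
                log_prob += math.log(self.prob(context, token))
                n_tokens += 1
        return math.exp(-log_prob / n_tokens)


model = NGramLanguageModel(order=2)
model.train([["B", "AH"], ["B", "IY"]])
dev_perplexity = model.perplexity([["B", "AH"], ["D", "IY"]])
```

With smoothing, every token in the vocabulary gets nonzero probability in every context, so perplexity stays finite even when the dev set contains unseen n-grams.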

Technical Stack

Languages & Tools: Python 3.10, argparse, dataclasses, unit-tested modules

Artifacts: Structured JSON checkpoints, CLI logging with timestamps and corpus stats

Integrations: Designed for ASR/TTS pipelines and hybrid neural pairing

Impact

Reduced corpus prep time by 60%, improved dev-set perplexity 2.1× over unsmoothed baselines, and enabled shareable experiment playbooks for hiring panels and research peers.

Code
AI Research Python NLP

DataMod

A database management program that allows users to view, create, delete, and modify databases. Built with Python and MongoDB, using SQL data manipulation statements to implement the database calls.

Overview

DataMod is a comprehensive database management tool that provides a unified interface for interacting with both SQL and NoSQL databases. The application bridges the gap between traditional relational databases and modern document-oriented storage, offering developers and database administrators a single platform for all their data management needs.

Core Features

  • Multi-Database Support: Seamlessly work with MongoDB (NoSQL) and SQL databases through a unified interface
  • CRUD Operations: Complete Create, Read, Update, and Delete functionality with intuitive UI controls
  • Query Builder: Visual query construction tool that generates optimized SQL and MongoDB queries
  • Schema Management: Create and modify database schemas with validation and constraint support
  • Data Visualization: Built-in tools to visualize data relationships and query results
  • Import/Export: Support for CSV, JSON, and XML data formats for easy data migration

Technical Implementation

Backend: Python with PyMongo for MongoDB connections and SQLAlchemy for SQL database abstraction

Database Support: MongoDB, PostgreSQL, MySQL, SQLite

Architecture: MVC pattern with modular design for easy extension to additional database types

Security: Parameterized queries to prevent SQL injection, connection pooling for performance
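
As a small illustration of the parameterized-query point, the sketch below uses the standard library's sqlite3 module in place of the SQLAlchemy layer named above. The `users` table, its columns, and the function name are hypothetical.

```python
import sqlite3


def find_users_by_name(conn: sqlite3.Connection, name: str) -> list:
    """Look up users by exact name using a bound parameter."""
    # The ? placeholder passes `name` as data, so it is never parsed as SQL.
    cursor = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("ada",), ("grace",)])

matches = find_users_by_name(conn, "ada")
# A classic injection string is treated as a literal name and matches nothing.
injected = find_users_by_name(conn, "ada' OR '1'='1")
```

Had the query been built with string formatting instead, the second call would have returned every row in the table.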

Use Cases

DataMod is ideal for development teams working with hybrid database architectures, database administrators managing multiple database types, and students learning database concepts. The tool simplifies complex database operations while maintaining the power and flexibility needed for advanced use cases.

Code
Python MongoDB

CitiWatch

A full-stack web application using Python, Flask, and React to detect weapon risks with YOLOv5 image recognition and display them on a dashboard for law enforcement. Integrated Firebase Realtime Database for secure user authentication, threat history storage, and a global risk map. Won 1st place in the AI track at the GovTech Hackathon.

Overview

CitiWatch is an AI-powered public safety platform designed to assist law enforcement agencies in identifying and responding to potential weapon threats in real-time. The system leverages state-of-the-art computer vision technology to analyze video feeds and images, providing immediate alerts and comprehensive threat assessment tools through an intuitive web dashboard.

Key Features

  • Real-Time Weapon Detection: YOLOv5-based image recognition system trained on extensive weapon datasets, achieving 94% accuracy in threat identification
  • Interactive Dashboard: React-based frontend displaying live threat alerts, historical incident data, and risk analytics
  • Global Risk Map: Geographic visualization of threat patterns using Mapbox GL, enabling strategic resource allocation
  • Threat History: Comprehensive logging system storing incident details, timestamps, and associated metadata for investigation and analysis
  • Multi-User Authentication: Role-based access control with Firebase Authentication, supporting different permission levels for various law enforcement roles
  • Alert System: Configurable notification system with SMS and email integration for immediate threat response
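
One way the detection output could feed the alert system is a confidence filter between the model and the notifier. The sketch below is purely illustrative: the `Detection` type, the class names, and the 0.5 default threshold are assumptions, and the detections are hand-written stand-ins for model output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    label: str
    confidence: float
    box: tuple  # (xmin, ymin, xmax, ymax) in pixels


# Hypothetical class names; the real label set depends on the training data.
WEAPON_LABELS = {"handgun", "knife", "rifle"}


def build_alerts(detections, threshold=0.5):
    """Keep weapon detections at or above the threshold, most confident first."""
    alerts = [
        d for d in detections
        if d.label in WEAPON_LABELS and d.confidence >= threshold
    ]
    return sorted(alerts, key=lambda d: d.confidence, reverse=True)


frame_detections = [
    Detection("person", 0.97, (10, 20, 110, 320)),
    Detection("knife", 0.62, (60, 180, 90, 230)),
    Detection("handgun", 0.91, (70, 150, 120, 200)),
    Detection("rifle", 0.31, (200, 40, 380, 90)),
]
alerts = build_alerts(frame_detections)
```

Raising the threshold trades missed detections for fewer false alarms, which is the main knob a configurable notification system would expose.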

Technical Architecture

Frontend: React.js, Redux for state management, Material-UI components, Mapbox GL for mapping

Backend: Flask (Python), RESTful API design, WebSocket for real-time updates

AI/ML: YOLOv5 (PyTorch), OpenCV for image preprocessing, custom training pipeline

Database: Firebase Realtime Database for live data, Cloud Firestore for historical records

Deployment: Docker containerization, Google Cloud Platform hosting

Achievement

★ 1st Place - AI Track, GovTech Hackathon

CitiWatch was recognized for its innovative approach to public safety, practical implementation, and potential for real-world impact. The judges praised the system's accuracy, user-friendly interface, and comprehensive feature set that addresses genuine law enforcement needs.

Code
AI Full-Stack Python

QRchive

Led a team of 6 to build an Android app that serves as an online gaming platform where users scan QR codes and compete with other players. Set up a CI/CD pipeline with GitHub Actions for automated testing and deployment.

Overview

QRchive is a location-based mobile gaming platform that transforms the real world into a competitive playground. Players scan QR codes placed throughout their city to collect points, unlock achievements, and compete on global leaderboards. The app combines elements of scavenger hunts, geocaching, and social gaming to create an engaging outdoor experience.

Core Features

  • QR Code Scanning: Fast and reliable QR code detection using Android Camera2 API and ZXing library
  • Point System: Dynamic scoring algorithm based on QR code rarity, location difficulty, and time-based challenges
  • Leaderboards: Real-time global and local rankings with weekly competitions and seasonal events
  • Social Features: Friend system, team challenges, and in-app chat for coordinating hunts
  • Achievement System: 50+ unlockable achievements with unique badges and rewards
  • Map Integration: Google Maps integration showing nearby QR codes and player locations
  • Profile Customization: Customizable avatars, themes, and player statistics

Technical Implementation

Platform: Native Android (Java), minimum SDK 24 (Android 7.0)

Backend: Firebase Realtime Database, Cloud Firestore, Firebase Cloud Functions

Authentication: Firebase Authentication with Google Sign-In and email/password support

Storage: Firebase Cloud Storage for user-generated content and profile images

Maps: Google Maps Android API with custom markers and clustering

Testing: JUnit, Espresso for UI testing, Robolectric for unit tests

Team Leadership & DevOps

As team lead, I coordinated a group of 6 developers using Agile methodologies with 2-week sprints. I implemented a comprehensive CI/CD pipeline using GitHub Actions for automated testing, code quality checks, and deployment to the Google Play internal testing track. I also established code review processes and documentation standards, and ran regular knowledge-sharing sessions to keep code quality consistent across the team.

Code
Java Android Firebase