
About

I am currently a Staff Data Scientist at Meta working on Privacy, with past experience at the company on Integrity, Community Products, and Growth. Through personal projects and summer work experiences, I've applied my knowledge in areas like academia, healthcare, non-profits, and social media. I've worked at startups and large corporations, both as an individual contributor and as a technical lead.

I studied Statistics and Machine Learning at Carnegie Mellon University, with an additional major in Economics and a minor in Computer Science. My interdisciplinary coursework gives me a solid theoretical foundation and technical skills.

Outside of work my interests include photography, basketball, standup comedy, hiking, and scuba diving.

I'm always interested in new opportunities. Feel free to reach out via email to yhavanur[at]gmail.com to connect.

Work

Meta

Staff Data Scientist, Privacy Dec 2020 - present

I'm currently the DS Tech Lead within the Privacy Products organization. My work spans various user-facing features such as the Meta Privacy Center, Access & Download Your Information tools, and the Meta Privacy Policy. In this role, I've done the following:

  • I've been responsible for key pieces of our measurement strategy, which includes developing ways to measure the prevalence of user privacy concerns, creating proxy metrics for user engagement and product findability, and conducting user analysis, all of which drive the overall product strategy of our 350+ person team.
  • I created our organization's experimentation process, building a comprehensive data infrastructure and spearheading a standardized leadership review process that has supported 200+ experiments yearly across 50+ engineers.
  • I lead and coordinate the work of a data team of 4 data scientists and 3 data engineers, reviewing their work to provide technical feedback and fostering collaboration with cross-functional partners. I also mentor junior ICs on career growth, conflict resolution, and communication, both within my organization and as part of the broader data science community.

Data Scientist, Integrity Sept 2018 - Dec 2020

I worked on the Community Products Integrity team, which was responsible for protecting people who used Facebook's Groups & Events products. My work encompassed various facets of integrity, including some of the following:

  • I developed a novel detection methodology for identifying Groups & Events tied to politically motivated movements, such as Stop the Steal. During the election period, we proactively took action on 1,900+ entities before they could be used to organize effectively and potentially create violence.
  • Partnering with engineering, I helped improve the quality of our Groups recommendations engine by implementing a model calibration process that reduced impressions of harmful groups by 10% and saved ~5 months of engineering effort each half. I also helped close gaps in our filtering process, which reduced civic misinformation impressions by 65%.
  • I worked on developing Moderation Alerts, a moderation tool that identifies potential conflicts within communities through data analysis and that increased retention by 10%. It's consistently ranked as the #1 moderation feature amongst community partners.

Data Science Intern, Growth Infrastructure May 2017 - Sept 2017

I spent the summer of 2017 working at Meta (then Facebook) as a Data Science Intern on the Contacts Team within Growth Infrastructure. My project involved leveraging terabytes of identity information collected from users across the various Meta-owned entities, such as Instagram, WhatsApp, and Messenger. I performed the following while I was there:

  • I identified and engineered features for a model to classify business phone numbers, utilizing terabytes of address book data and contact point relationships. The model achieved a 98% F1 score (the harmonic mean of precision and recall), and we were issued a patent on "Verifying users of an electronic messaging system."
  • I uncovered significant data quality issues in essential data pipelines and worked with engineering teams to resolve client-side issues with collection that affected more than 46% of records.
  • I created pipelines and dashboards to monitor and better understand the various stages of the experiment lifecycle captured by an internal experimentation tool, identifying areas of improvement for internal tools development.
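
As a quick illustration of the F1 metric mentioned above, here is a minimal Python sketch. The function name and the counts are hypothetical and chosen only to show the arithmetic; they are not the actual model, data, or results from this project.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return F1, the harmonic mean of precision and recall."""
    predicted_positive = true_positives + false_positives
    actual_positive = true_positives + false_negatives
    # With no predicted or no actual positives, precision or recall is
    # undefined; this sketch simply returns 0.0 in that case.
    if predicted_positive == 0 or actual_positive == 0:
        return 0.0
    precision = true_positives / predicted_positive
    recall = true_positives / actual_positive
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts (assumption): 98 correct positives, 2 false alarms, 2 misses.
example_f1 = f1_score(true_positives=98, false_positives=2, false_negatives=2)
print(round(example_f1, 2))
```

Because F1 is a harmonic mean, it stays high only when precision and recall are both high, which is why it is a common summary metric for imbalanced classification problems.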

IBM Watson Health

Data Science Intern, Innovations Team May 2016 – Aug 2016

I worked as a data science intern at Explorys, a healthcare startup that had recently been acquired by IBM Watson Health. This was a unique experience in that it felt like working at a startup with the resources of a large company. I learned a lot that summer about being a data scientist, from writing production-quality code, to working in a team of data scientists and engineers on common problems, to applying what I learned in school to real problems. During my time there, I worked on the following tasks:

  • I implemented a model to calculate heart disease risk using Java and MapReduce, with healthcare records stored in HBase. This model informs healthcare providers about the risks facing more than 55 million U.S. patients.
  • To improve our medical record standardization process, I designed and tested a new fuzzy string matching algorithm, which increased recall by 600% and precision by 50% when compared to existing methods.
  • I added new procedure revenue metrics to internal data quality reports in order to better analyze revenue distributions from provider records.
  • I received commendation for clarity and impact when presenting my work to IBM executives during the summer finale event in Austin, Texas, the location of Watson headquarters.

Carnegie Mellon University

Research Assistant, Various Departments May 2015 – Jan 2016

Over my time at CMU, I've been lucky enough to be involved with several research projects, helping me enhance my programming and analytical skills while contributing to interesting areas of study, even as an underclassman.

  1. Optimizing Route Behavior of Taxi Drivers
    • Sophomore fall, I worked under Professor Joachim Groeger to develop a model for taxi drivers to optimize route and rider selection, using data collected from the New York City MTA.
    • I cleaned the data and performed EDA on fare amounts, pickup/dropoff locations, and duration using R.
    • To better understand anomalies, I also collected additional hourly weather data and linked it to appropriate dates and times.
  2. Scientific Collaboration Networks
    • During my freshman summer I conducted independent research on scientific collaboration networks across a 20-year span under the supervision of Professor Katherine Anderson.
    • I wrote a Python script to efficiently generate an authorship network of more than half a million papers using NetworkX, Pandas, and Matplotlib.
    • After generating those networks, I analyzed their properties, such as degree distribution, betweenness centrality, and clustering coefficient, using R, and reported my findings.
  3. Electricity Consumption Trends
    • Freshman spring I aided in ongoing research on the history of U.S. electricity consumption and power grid implementation to discover its relationship with real-estate epidemics.
    • After manually entering the data from historical records, I created multiple regression models using Stata and R.

Education

Carnegie Mellon University

B.S. in Statistics and Machine Learning, additional major in Economics, minor in Computer Science. Graduated May 2018 with University and Tepper College Honors.

Honors

  • Senior Honors Thesis in Economics
    • Researched and wrote an independent thesis entitled "Impact of Social Networks on Buying Behavior: Predicting the Success of Pittsburgh Businesses Through Analysis of Yelp Social Networks." In doing so, I won Best Thesis Defense at Meeting of the Minds, CMU's annual research symposium.
  • Andrew Carnegie Society Scholar
    • "ACS Scholars are undergraduate seniors who embody Carnegie Mellon's high standards of academic excellence, volunteerism, leadership and involvement in student organizations, athletics or the arts. They are selected each year by their deans and department heads to represent their class in service and leadership."
  • Quantitative Science Scholars Program (QSSS)
    • "QSSS is designed to help outstanding undergraduates acquire advanced quantitative technical skills they can use to impact society as entrepreneurs, policymakers, or social scientists."
  • Omicron Delta Epsilon
    • ODE is the largest international honor society for Economics. I was invited to join in my junior year.
  • Dean's List with High Honors, multiple semesters
    • Given in recognition of taking at least 45 factorable units and achieving a GPA of higher than 3.75 for the semester.

Relevant Coursework

  • 15-388 Practical Data Science
  • 36-462 Data Mining
  • 36-402 Advanced Methods in Data Analysis
  • 10-601 Introduction to Machine Learning
  • 11-441 Natural Language Processing
  • 15-251 Great Theoretical Ideas in Computer Science
  • 15-210 Data Structures and Algorithms
  • 15-150 Functional Programming
  • 15-122 Principles of Imperative Programming
  • 73-374 Econometrics II
  • 73-359 Benefit-Cost Analysis

GPA: 3.7/4.0

Leadership

Palau Financial Intelligence Unit

Technical Consultant June 2018 - August 2018

In the summer of 2018 I worked as a consultant for the Financial Intelligence Unit of the Republic of Palau, an agency committed to combatting money laundering within the country.

  • I wrote Python scripts and VBA macros with simple user interfaces that drastically improved data cleaning and validation efforts in complex bank documentation.
  • I developed new ETL procedures to integrate immigration and customs records seamlessly using SQL and Visual Basic.
  • I worked with the members of the FIU and generated the first ever comprehensive strategic criminal analysis, the findings of which were presented by Palau to the United Nations' Financial Action Task Force on Money Laundering.
  • I created a new analytics dashboard using R Shiny and Microsoft Access to provide a comprehensive view of suspected criminal activity through interactive time series graphs, maps, and word clouds.

Moneythink CMU

President May 2016 - May 2017

Moneythink is a nationally recognized non-profit committed to working to restore the economic health of the United States through financial education. The Carnegie Mellon chapter of Moneythink works with several Pittsburgh high school classrooms in low-income areas through weekly mentoring sessions on topics like budgeting, banking, and financial products. I've been in Moneythink since my freshman year of college, and I was lucky to spend the last year implementing several new initiatives and expanding the reach of our mission.

  • During the school year I led a board of 10 members in charge of maintaining the day-to-day operations of the organization, as well as implementing several new initiatives.
  • I oversaw the creation and successful launch of our Financial Innovation Challenge, a themed hackathon that brought together students, faculty, and financial institutions to come up with new ideas that could combat issues related to financial literacy (http://tepper.cmu.edu/news/2017-05-15-finnovation-challenge).
  • I co-developed our official website (www.moneythinkcmu.org) as an informational and recruitment tool.

Carnegie Mellon Student Life

Resident Advisor August 2015 – May 2017

For two years in college I was an RA for both underclassmen and upperclassmen.

  • I coordinated with other members of my staff while being responsible for the health and well-being of more than 30 residents at any given time.
  • I received training in conflict resolution, identifying problematic behavior, and first response.
  • I organized and ran dorm-wide events such as group outings, socials, and discussion groups. I was particularly interested in current events and politics, so I took the initiative to organize events around presidential debates and major news stories, and to educate residents on local Pittsburgh politics.

Carnegie Mellon Computer Science Department

Teaching Assistant Jan 2018 – Present

In Spring 2018, I was a TA for 15-112, Carnegie Mellon's famed introductory computer science class.

  • Each week I held a recitation for a class of 30 students, reviewing that week's material and/or introducing next week's material, clarifying confusing concepts and going over practice problems.
  • I also held weekly office hours to answer questions on course materials and the weekly homework, and helped lead review sessions for large groups of students around exam time.
  • What I'm most proud of is taking the initiative to conduct a quantitative analysis of the course's structure and practices. After collecting student-level data on assignment scores, quiz performance, attendance, and self-reported time spent, I generated a series of recommendations for the professors of the course, helping to pinpoint relationships between time spent and particular concepts, as well as the effectiveness of review materials in terms of student results. This was the first time a rigorous review of teaching practices had occurred, and it led to many common assumptions being revisited.

Carnegie Mellon Economics Department

Teaching Assistant Jan 2017 – May 2017

In Spring 2017, I TAed a new course in the Economics Department, 73-160 Foundations of Microeconomics: Applications and Theory.

  • Each week I held a recitation for a class of 20 students, reviewing that week's material and/or introducing next week's material, clarifying confusing concepts and going over practice problems.
  • I also held weekly office hours to answer questions on course materials and the weekly homework.
  • Because this was the first time the course was being offered, I constantly took feedback on the course content and relayed it to the department to improve future iterations of the course.

Skills

Programming
  • Python
    • Pandas, Numpy, Sklearn, Jupyter Notebooks, Networkx
  • R
  • SQL
  • Java
  • C
  • Git
Other
  • Microsoft Office
  • Public Speaking
  • Conflict Resolution
  • Basketball
  • Standup Comedy

Get In Touch.

I'm always interested in learning new things and meeting new people. If you think you have something that I might be a good fit for, feel free to email me at yhavanur[at]gmail.com.