About Me

I am currently a first-year (G1) M.S. student in Data Science at Harvard University. Before coming to Harvard, I completed my bachelor's degree in the Department of Automation, School of Information Science and Technology, Tsinghua University. I also minored in Statistics at the Center for Statistical Science, Tsinghua University.

My research interests lie in using statistical and machine learning methods to solve real-world problems. Rather than only applying models for testing or prediction, I am also interested in the mechanisms behind them. My ultimate career goal is to become a data scientist who can analyze data for actionable insights.

In 2016, I worked at the Institute of System Engineering at Tsinghua University, handling large volumes of traffic data under the supervision of Prof. Xin Pei. Since Fall 2017, I have been working with Prof. Sheng Yu at the Center for Statistical Science in the area of medical informatics. In Summer 2018, I was selected for Stanford's UGVR program and worked in Chuck Eesley's group in MS&E, advised by Prof. Chuck Eesley. In Spring 2019, I began an internship at Didi Chuxing, where I worked in the AI Labs on real traffic data.

Education

  • Harvard University

    Aug. 2019 - Dec. 2020 (Expected)
    Graduate School of Arts and Sciences
    M.S. in Data Science
    GPA: NULL/NULL

  • Tsinghua University

    Aug. 2015 - Jul. 2019
    Department of Automation / Center for Statistical Science
    Bachelor of Engineering / Minor in Statistics
    GPA: 3.95/4.0,   Ranking: 1st/138

  • Stanford University

    Jun. 2018 - Sept. 2018
    Management Science and Engineering
    Chuck Eesley's Lab
    Visiting Researcher (UGVR Program)

Research

Selected research projects are presented here.

China’s Mass Entrepreneurship and Innovation Policy Evaluation

Qiuyang Yin, Chuck E. Eesley, Song Wang. [Poster]

Mass Entrepreneurship and Innovation is a policy adopted by the Chinese government in 2015 to encourage more startups. In our research, we used several statistical models, including a piecewise hazard model, the Cox proportional hazards (PH) model, and ordered logit, to test the policy's effect on different groups of people and in different regions. We also extended the difference-in-differences (DiD) model to a triple-difference (DiDiD) model to account for three dimensions of difference (time, people, and region).
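For readers unfamiliar with triple differences, a generic textbook form of the regression is sketched below. The notation is an assumption made for illustration and is not taken from the project: Y is an outcome for individual i, T indicates the post-policy period, G indicates the targeted group of people, and R indicates the targeted region. The actual specification used in the research may differ.

```latex
\begin{aligned}
Y_{i} ={}& \beta_0 + \beta_1 T_i + \beta_2 G_i + \beta_3 R_i \\
         & + \beta_4 (T_i \cdot G_i) + \beta_5 (T_i \cdot R_i) + \beta_6 (G_i \cdot R_i) \\
         & + \beta_7 (T_i \cdot G_i \cdot R_i) + \varepsilon_i
\end{aligned}
```

In this form, the coefficient on the three-way interaction, \beta_7, is the triple-difference estimate: the change over time for the targeted group in the targeted region, net of the changes seen in the other group and region combinations.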

Word Segmentation as Graph Partition

Yuanhao Liu, Sheng Yu, Zheng Yuan, Qiuyang Yin

We propose a novel approach to Chinese word segmentation that treats a sentence as an undirected graph whose nodes are its characters. Spectral graph partitioning algorithms are used to group the characters into words. In tests on electronic medical records (EMR), our unsupervised, dictionary-independent algorithm outperformed the classic maximum probability (MP) model and a supervised hidden Markov model (HMM). The results are ready to be submitted.
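The general idea of spectral partitioning on a sentence graph can be illustrated with a minimal sketch. This is not the project's algorithm: it assumes edges only between adjacent characters, uses made-up affinity weights, and performs a single two-way split by the sign of the Fiedler vector (the eigenvector for the second-smallest eigenvalue of the graph Laplacian). The function names `fiedler_vector` and `segment` are hypothetical.

```python
import numpy as np


def fiedler_vector(weights):
    """Eigenvector for the second-smallest eigenvalue of L = D - W."""
    W = np.asarray(weights, dtype=float)
    laplacian = np.diag(W.sum(axis=1)) - W
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, vecs = np.linalg.eigh(laplacian)
    return vecs[:, 1]


def segment(sentence, affinities):
    """Split a sentence in two using the sign of the Fiedler vector.

    affinities[i] is an assumed weight for the edge between
    character i and character i + 1 (len(sentence) - 1 values).
    """
    n = len(sentence)
    W = np.zeros((n, n))
    for i, a in enumerate(affinities):
        W[i, i + 1] = W[i + 1, i] = a
    side = fiedler_vector(W) >= 0
    # Cut wherever two neighboring characters fall on different sides.
    pieces, start = [], 0
    for i in range(1, n):
        if side[i] != side[i - 1]:
            pieces.append(sentence[start:i])
            start = i
    pieces.append(sentence[start:])
    return pieces


# Toy input: strong links inside "ABC" and "DEF", a weak link between them.
segments = segment("ABCDEF", [5.0, 5.0, 0.1, 5.0, 5.0])
print(segments)
```

In this toy case the cut falls on the weak link between the third and fourth characters. A full segmenter would need a principled way to estimate affinities from a corpus and to split recursively or into more than two groups.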

Automated Diagnosis in Traditional Chinese Medicine

Qiuyang Yin, Sheng Yu

Traditional Chinese medicine (TCM) is a style of traditional medicine built on a foundation of more than 2,500 years of Chinese medical practice. In our research, we use statistical and machine learning methods to automatically prescribe Chinese medicine based on a patient’s background and symptoms. Since most of the data are natural language, we rely on several NLP techniques: for example, SVD is used to construct word vectors, and a seq2seq RNN model is used for prediction. Finally, we compare our results with naive dictionary-search methods.
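As a rough illustration of how SVD can produce word vectors, the sketch below factorizes a small word co-occurrence matrix and keeps the leading components. The counts are invented toy data, the log scaling is an assumed preprocessing choice, and `svd_word_vectors` is a hypothetical helper; none of these details come from the project itself.

```python
import numpy as np


def svd_word_vectors(cooc, dim):
    """Return one dim-dimensional vector per word (one row per word)."""
    # Dampen large counts; an assumed, commonly used preprocessing step.
    M = np.log1p(np.asarray(cooc, dtype=float))
    U, S, _ = np.linalg.svd(M, full_matrices=False)
    # Keep the leading `dim` directions, scaled by their singular values.
    return U[:, :dim] * S[:dim]


# Toy symmetric co-occurrence counts for four hypothetical words.
# Words 0 and 1 appear in identical contexts, so their rows match.
cooc = [
    [0, 0, 3, 1],
    [0, 0, 3, 1],
    [3, 3, 0, 0],
    [1, 1, 0, 0],
]
vectors = svd_word_vectors(cooc, dim=2)
print(vectors.shape)
```

Words with identical co-occurrence rows receive identical vectors, which is the property that lets such embeddings group terms used in similar contexts.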

Safety Evaluation for Truck Drivers Based on Traffic Big Data

Qiuyang Yin, Xin Pei

This project aims to identify "dangerous" truck drivers from traffic big data, including surveillance video, a driver information database, and accident records. After extracting common features for drivers, we built data visualizations to facilitate future research. We also applied several common machine learning methods to the dataset, including MLR, SVM, and Random Forest. Finally, we built an online platform for both traffic regulators and truck drivers, from which overall road conditions can be inferred.

Honors

Fellowships

Scholarships

  • 2019   |   Outstanding Graduate
    Awarded at the Beijing, Tsinghua (60 out of 3,300 students), and Dept. of Automation levels, respectively. This is similar to summa cum laude in the U.S.
  • 2019   |   Outstanding Bachelor Thesis
    Awarded for an excellent bachelor's thesis. Title: Comprehensive evaluation of road safety based on shared traffic data
  • 2018   |   Chang Tong Scholarship
    Highest honor in the Department of Automation (1/560). Alumni: Hui Qiao (2012), Fei Xia (2015)
  • 2017   |   Jiang Nanxiang Scholarship
    One of the highest scholarships at Tsinghua, named after Tsinghua’s honored president; only 50 undergraduates at Tsinghua are selected annually.
  • 2017, 2016   |   Qualcomm Scholarship
    Awarded to students with excellent scientific potential; only 40 undergraduates at Tsinghua University are selected.
  • 2016   |   China National Scholarship
    Highest level of scholarship set by the government of China

Awards

  • 2018, 2017   |   3rd Prize in Tsinghua Challenge Cup
    Top scientific research contest at Tsinghua University
  • 2017   |   Finalist Winner in Mathematical Contest in Modeling (MCM)
    Among the best papers in the final round (top 46 of 8,843).
  • 2016   |   2nd Prize in C Programming Contest for Freshmen
    Freshmen AI contest in Tsinghua.
  • 2014   |   2nd Prize in National Mathematics Olympiad (Shanghai division)

Programming Languages

      R (>25k lines, Expert Level)   C/C++ (>10k lines)   Python (>10k)   MATLAB (>10k)   SQL   C# (>5k)   Java (>5k)   SAS   HTML/CSS
      LaTeX   Git/GitHub   Regular Expressions   pandoc

Hobbies & Interests

  • Tennis

    Tennis is my favorite sport, and I love playing it in my leisure time. I played nearly every day during my senior year.

    I believe my NTRP rating is between 2.5 and 3.0, which is better than a beginner but far below semi-pro players.

  • Texas Hold'em

    :) I have played Texas Hold'em for more than 4 years. On apps like Pokerist, I have won many championships in online tournaments. Game theory and probability can be put to full use in this game! I love it very much.

  • Fencing

    I have been learning fencing since my junior year. (The video shows the fencing class; the person on the left is me.)

    Looking forward to meeting fencing lovers all over the world!