Education I am a graduate student at the University of California, Davis. I double-majored in Statistics and Horticulture & Agronomy. My research focuses mainly on agroecology. Recently, I have been focusing on finishing my thesis, which is about modelling the reproduction time and biomass of California grassland. The model is developed based on field obse... Read more 23 Mar 2017 - 1 minute read
Motivation Given the NBA's large audience and the growing pursuit of superstars, using past years' data to identify the next superstar seems an interesting and meaningful question. Questions Are there any statistical patterns in players' performance? Can we use those patterns to predict potential stars? What feat... Read more 22 Mar 2017 - 4 minute read
These two interesting questions come from an STA141B assignment. The assignment was done in Python using sqlite3, basemap, matplotlib and pandas. You can find the other questions in this assignment at this link. Question 1: Which part of San Francisco is Most Dangerous (and what time)? Data Description and Processing The dataset crime was collected ... Read more 22 Mar 2017 - 5 minute read
import requests import requests_cache import pandas as pd import lxml.html as lx import numpy as np import matplotlib.pyplot as plt from sklearn.cluster import KMeans from sklearn import metrics from sklearn.preprocessing import scale from sklearn.decomposition import PCA requests_cache.install_cache('project_cache') Data Scraping def al... Read more 21 Mar 2017 - 10 minute read