Python Data Analysis Projects
Used IDE: PyCharm
Team size: solo projects
Half-year project


To learn data analysis and practice Python, I started a series of Python data projects.

Given an Excel file containing one year's worth of housing data for Queens, New York, I made a time plot, a bar graph, and a pie chart to visualize the data.


Given a data set of emails labeled as spam or not spam, I built a spam filter.

When I listed the 5 words with the highest probabilities for each class, Your, For, The, You, and A were the 5 "spammiest" words, and Re:, The, For, To, and Of were the 5 most "normal" words.
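The sketch below shows one way such per-class word probabilities could be computed. It is illustrative only, not the project's original code: it assumes a naive Bayes-style filter with add-one smoothing, and the names word_probabilities, top_words, and is_spam, as well as the (text, is_spam) input format, are hypothetical.

```python
import math
from collections import Counter


def word_probabilities(messages, alpha=1.0):
    """Estimate P(word | class) with add-alpha smoothing.

    messages: list of (text, is_spam) pairs (hypothetical input format).
    Returns (spam_probs, ham_probs), two dicts keyed by lowercase word.
    """
    counts = {True: Counter(), False: Counter()}
    for text, spam in messages:
        counts[spam].update(text.lower().split())

    vocab = set(counts[True]) | set(counts[False])
    probs = {}
    for label in (True, False):
        total = sum(counts[label].values()) + alpha * len(vocab)
        probs[label] = {w: (counts[label][w] + alpha) / total for w in vocab}
    return probs[True], probs[False]


def top_words(probs, k=5):
    """Return the k words with the highest probability in one class."""
    return sorted(probs, key=probs.get, reverse=True)[:k]


def is_spam(text, spam_probs, ham_probs, prior_spam=0.5):
    """Compare log-likelihoods of the two classes; unknown words are skipped."""
    log_spam = math.log(prior_spam)
    log_ham = math.log(1.0 - prior_spam)
    for word in text.lower().split():
        if word in spam_probs:
            log_spam += math.log(spam_probs[word])
            log_ham += math.log(ham_probs[word])
    return log_spam > log_ham


# Tiny made-up data set, only to show the call pattern.
sample = [
    ("win money now", True),
    ("free money offer", True),
    ("meeting at noon", False),
    ("lunch meeting today", False),
]
spam_probs, ham_probs = word_probabilities(sample)
```

Very common words such as "the" or "for" can rank high in both classes, as in the lists above, because this ranking looks at raw per-class probability rather than at how strongly a word separates the classes.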


Using students' math scores, I trained a program that predicts whether a student with given scores will pass or fail.

I used the logistic regression algorithm with gradient descent.
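As a rough illustration of the technique, here is a minimal logistic regression trained with batch gradient descent in plain Python. It is a sketch under assumptions, not the project's code: the function names, the learning rate, the epoch count, and the made-up scores are all hypothetical, and scores are assumed to be scaled to the range 0 to 1 before training.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, lr=1.0, epochs=5000):
    """Fit weights and bias by batch gradient descent on the log loss.

    xs: list of feature lists (scores scaled to [0, 1], an assumption).
    ys: list of labels, 1 for pass and 0 for fail.
    """
    n = len(xs)
    n_features = len(xs[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(xs, ys):
            # Prediction error for this example: p - y.
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        # Step against the average gradient.
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_pass(x, w, b):
    """Return True when the predicted pass probability is at least 0.5."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5


# Made-up single-score data: low scores fail, high scores pass.
scores = [[0.2], [0.3], [0.4], [0.6], [0.7], [0.9]]
passed = [0, 0, 0, 1, 1, 1]
w, b = train_logistic(scores, passed)
```

Scaling the scores keeps the sigmoid input small, which avoids overflow in math.exp and lets a single learning rate work reasonably well.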


With game concept images, I applied the k-means clustering algorithm to re-color each image using only k colors.
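The idea can be sketched as follows: treat every pixel as a point in RGB space, cluster the points with k-means, and replace each pixel with the color of its cluster center. This is an illustrative sketch, not the project's code. It assumes the image has already been loaded into a plain list of (R, G, B) tuples (image loading is omitted), and the names nearest and quantize, the iteration count, and the seed are hypothetical.

```python
import random


def nearest(color, centers):
    """Index of the center closest to color (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(color, centers[i])),
    )


def quantize(pixels, k, iters=10, seed=0):
    """Re-color pixels using at most k colors found by k-means.

    pixels: list of (R, G, B) tuples; k must not exceed the number of
    distinct colors in pixels.
    """
    rng = random.Random(seed)
    # Start from k distinct colors that actually occur in the image.
    centers = rng.sample(sorted(set(pixels)), k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in pixels:
            groups[nearest(p, centers)].append(p)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [
            tuple(sum(channel) / len(g) for channel in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    palette = [tuple(round(v) for v in c) for c in centers]
    return [palette[nearest(p, centers)] for p in pixels]


# Four made-up pixels: two reddish, two bluish.
demo = [(250, 0, 0), (255, 10, 5), (0, 0, 250), (5, 5, 255)]
recolored = quantize(demo, k=2)
```

K-means depends on its starting centers and can settle on a poor palette, so in practice it is common to try several seeds and keep the result with the lowest total distance.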

The left pictures are the original images and the right pictures are the processed images.

Image of Zelda (Original, Processed)


Image of Lost Ark (Original, Processed)


Image of Blue Archive (Original, Processed)