Sunday, 31 August 2014

Simple visualization using Python and R (part 1/2)

Today I am going to make a simple graph with the data we generated in the last post.
That data set has 4 columns. I would like to just visualize each individual's age using Python and R.

We will compare the two visualization methods at the end of this two-part series.

Let's look at the Python case first.
To visualize the data, I am going to use the matplotlib library, which is very popular. Please check http://matplotlib.org/ for detailed information; you can review the sample code and detailed instructions there.

This time I am just going to modify the code I wrote in the previous post.
The matplotlib library is imported at the beginning, and some code is added to display the graph.
As I said, my interest is the players' ages. As you can see in the source code, each individual's age is stored in a list.




==================

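import urllib
import re
import matplotlib.pyplot as plt

def main():
    data2 = []  # ages of all players, in page order
    # Python 2 API; on Python 3 this call lives in urllib.request.urlopen
    htmlfile = urllib.urlopen("http://www.nbastuffer.com/2013-2014_NBA_Regular_Season_Player_Stats.html")
    htmltext = htmlfile.read()

    # Capture name, team, position and age from each table row
    pattern = re.compile(r'<td></td><td>([\w]+[\s][\w]+)</td><td>(\w\w\w)</td><td>(\w+)</td><td>(\d+)</td><td>')

    nba_contents = re.findall(pattern, htmltext)
    for filerows in nba_contents:
        (name, team, position, age) = filerows
        data2.append(int(age))  # convert the matched text to a number for plotting

    # 'ro' draws red circles, one per player
    plt.plot(data2, 'ro')
    plt.ylabel("Age")
    plt.xlabel("Player")
    plt.show()

# Without this call the script only defines main() and exits
if __name__ == "__main__":
    main()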

====================
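
To see what the regular expression in the listing above captures without fetching the web page, here is a minimal sketch that runs the same pattern on two made-up table rows. The rows in SAMPLE_HTML and the helper name extract_ages are my own assumptions for illustration; the real page layout may differ.

```python
import re

# Same pattern as in the script, split over two lines for readability.
PATTERN = re.compile(
    r'<td></td><td>([\w]+[\s][\w]+)</td><td>(\w\w\w)</td>'
    r'<td>(\w+)</td><td>(\d+)</td><td>'
)

# Hypothetical rows for illustration only; not copied from the real page.
SAMPLE_HTML = (
    '<tr><td></td><td>Sample Player</td><td>AAA</td><td>PG</td>'
    '<td>27</td><td>82</td></tr>'
    '<tr><td></td><td>Other Person</td><td>BBB</td><td>C</td>'
    '<td>31</td><td>70</td></tr>'
)


def extract_ages(html):
    """Return the age column of every matching row as a list of ints."""
    rows = PATTERN.findall(html)  # each row is (name, team, position, age)
    return [int(age) for (_name, _team, _position, age) in rows]


ages = extract_ages(SAMPLE_HTML)
print(ages)
```

Each match is a tuple of four strings, so the age has to be converted with int() before it can be plotted as a number.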

Once you run this Python program, a new window pops up showing the graph.
As you can see, the x-axis and y-axis are labeled "Player" and "Age", respectively.

[hadoop08:39:03@MATPLOT]$python WEB_NBA.py
[hadoop08:41:13@MATPLOT]$
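
One detail worth spelling out: when plot() is given a single list, it uses the position of each item (0, 1, 2, ...) as the x value, which is why the x-axis is labeled "Player". The sketch below makes that pairing explicit. It also includes a variant that writes the graph to a file instead of opening a window, which may help on a machine without a display. The function names, the sample ages, and the output path are hypothetical, and the file-saving function assumes matplotlib is installed.

```python
def index_points(ages):
    """Pair each age with its position in the list, as plot() does for x."""
    return list(enumerate(ages))


def save_age_plot(ages, path):
    """Draw the same red-dot graph as the script, but write it to a file."""
    import matplotlib
    matplotlib.use("Agg")  # non-interactive backend, no window needed
    import matplotlib.pyplot as plt

    xs = [x for x, _ in index_points(ages)]
    plt.plot(xs, ages, "ro")
    plt.ylabel("Age")
    plt.xlabel("Player")
    plt.savefig(path)
    plt.close()


# Hypothetical ages for illustration only, not taken from the real data.
sample_ages = [27, 31, 22]
print(index_points(sample_ages))
```

Calling save_age_plot(sample_ages, "ages.png") would write the graph to ages.png in the current directory.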


That's it. In the next post I am going to do the same thing using R.