Using NumPy from pandas DataFrame

After we create DataFrame using pandas library, we might want to access data from the data frame directly. Here is how to do it.

import pandas as pd

def test_run():
    df = pd.read_csv("data/AAPL.csv")
    print df.values

if __name__ == "__main__":
    test_run()

Here is the output. Notice that it does not contain name of the columns. This table is really the content of data frame.

[['2017-01-20' 120.449997 120.449997 ..., 120.0 29479900 120.0]
 ['2017-01-19' 119.400002 120.089996 ..., 119.779999 25295700 119.779999]
 ['2017-01-18' 120.0 120.5 ..., 119.989998 23644700 119.989998]
 ..., 
 ['1980-12-16' 25.375 25.375 ..., 25.25 26432000 0.374879]
 ['1980-12-15' 27.375001 27.375001 ..., 27.25 43971200 0.404572]
 ['1980-12-12' 28.75 28.875 ..., 28.75 117258400 0.426842]]

If we want to navigate to specific coordinates in this 2 dimensional array, we use brackets with indexes, like nd[row, column].

import pandas as pd

def test_run():
    df = pd.read_csv("data/AAPL.csv")
    nd = df.values
    print nd[0, 0]

if __name__ == "__main__":
    test_run()

Here is the output.

2017-01-20

Sub array

If we would like to extract sub array from the nd array, we can provide ranges coordinates.

import pandas as pd

def test_run():
    df = pd.read_csv("data/AAPL.csv")
    nd = df.values
    print nd[0:5, 0:2]

if __name__ == "__main__":
    test_run()

Here is the sub array.

[['2017-01-20' 120.449997]
 ['2017-01-19' 119.400002]
 ['2017-01-18' 120.0]
 ['2017-01-17' 118.339996]
 ['2017-01-13' 119.110001]]

Select all rows and some columns

Use just ":" as row parametr to select all rows.

import pandas as pd

def test_run():
    df = pd.read_csv("data/AAPL.csv")
    nd = df.values
    print nd[:, 0]

if __name__ == "__main__":
    test_run()

Here is the output. It contains all rows and just one row.

[['2017-01-20']
 ['2017-01-19']
 ['2017-01-18']
 ['2017-01-17']
 ['2017-01-13']]

Select last or second last row

We can navigate to rows from the bottom of the table using negative numbers.

import pandas as pd

def test_run():
    df = pd.read_csv("data/AAPL.csv")
    nd = df.values
    print nd[-1, 0]

if __name__ == "__main__":
    test_run()

The output is the last value in column on 0th position.

2017-01-13

If we would use -2 as row parameter value, we would get this output.

2017-01-17

Last updated