After we create a DataFrame with the pandas library, we may want to access its underlying data directly. Here is how to do it.
import pandas as pd
def test_run():
    df = pd.read_csv("data/AAPL.csv")
    # .values returns the underlying NumPy array, without column names
    print(df.values)
if __name__ == "__main__":
    test_run()
Here is the output. Notice that it does not contain the column names. This array holds only the contents of the data frame.
[['2017-01-20' 120.449997 120.449997 ..., 120.0 29479900 120.0]
['2017-01-19' 119.400002 120.089996 ..., 119.779999 25295700 119.779999]
['2017-01-18' 120.0 120.5 ..., 119.989998 23644700 119.989998]
...,
['1980-12-16' 25.375 25.375 ..., 25.25 26432000 0.374879]
['1980-12-15' 27.375001 27.375001 ..., 27.25 43971200 0.404572]
['1980-12-12' 28.75 28.875 ..., 28.75 117258400 0.426842]]
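To make the difference between the data frame and its array concrete, here is a minimal sketch that does not need the CSV file. The small DataFrame is a hypothetical stand-in for data/AAPL.csv: the column names "Date", "Open" and "High" are assumptions, and the values are copied from the first three rows of the output above.

```python
import pandas as pd

# Hypothetical stand-in for data/AAPL.csv (column names are assumptions).
df = pd.DataFrame({
    "Date": ["2017-01-20", "2017-01-19", "2017-01-18"],
    "Open": [120.449997, 119.400002, 120.0],
    "High": [120.449997, 120.089996, 120.5],
})

nd = df.values  # df.to_numpy() is the newer spelling of the same conversion

print(type(nd))          # a numpy.ndarray, not a DataFrame
print(nd.shape)          # (number of rows, number of columns)
print(nd.dtype)          # object, because strings and floats share one array
print(list(df.columns))  # the column names stay behind on the DataFrame
```

Because the date column holds strings and the other columns hold floats, the array falls back to the generic object dtype.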
Navigating in an N-dimensional array
To navigate to specific coordinates in this two-dimensional array, we use brackets with indexes, like nd[row, column]. Both indexes start at 0.
import pandas as pd
def test_run():
    df = pd.read_csv("data/AAPL.csv")
    nd = df.values
    # value in the first row and first column
    print(nd[0, 0])
if __name__ == "__main__":
    test_run()
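Here is the output. It is the value in the first row and first column, which is 2017-01-20 in the array printed earlier.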
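The nd[row, column] lookup can be tried without the CSV file. The sketch below uses a hypothetical two-column frame (the column names are assumptions; the values come from the array printed earlier).

```python
import pandas as pd

# Hypothetical stand-in for data/AAPL.csv (column names are assumptions).
df = pd.DataFrame({
    "Date": ["2017-01-20", "2017-01-19", "2017-01-18"],
    "Open": [120.449997, 119.400002, 120.0],
})
nd = df.values

first_date = nd[0, 0]   # row 0, column 0
second_open = nd[1, 1]  # row 1, column 1

print(first_date)
print(second_open)
```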
Sub array
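To extract a sub array from the nd array, we can provide a range for each coordinate. A range start:end includes start and excludes end, so nd[0:5, 0:2] selects rows 0 through 4 and columns 0 and 1.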
import pandas as pd
def test_run():
    df = pd.read_csv("data/AAPL.csv")
    nd = df.values
    # first five rows, first two columns
    print(nd[0:5, 0:2])
if __name__ == "__main__":
    test_run()
Here is the sub array.
[['2017-01-20' 120.449997]
['2017-01-19' 119.400002]
['2017-01-18' 120.0]
['2017-01-17' 118.339996]
['2017-01-13' 119.110001]]
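The exclusive end of a range is easy to get wrong, so here is a small sketch on a hypothetical array. The numbers are made up so that each entry encodes its own position (row * 10 + column); they are not stock data.

```python
import numpy as np

# Hypothetical 4x3 array standing in for df.values.
nd = np.array([[0, 1, 2],
               [10, 11, 12],
               [20, 21, 22],
               [30, 31, 32]])

# Rows 0 and 1, columns 1 and 2: the end index of each range is excluded.
sub = nd[0:2, 1:3]

print(sub)
print(sub.shape)
```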
Select all rows and some columns
Use just ":" as the row parameter to select all rows.
import pandas as pd
def test_run():
    df = pd.read_csv("data/AAPL.csv")
    nd = df.values
    # all rows, first column only
    print(nd[:, 0])
if __name__ == "__main__":
    test_run()
Here is the output (only the first five values are shown). It contains all rows and just one column.
[['2017-01-20']
['2017-01-19']
['2017-01-18']
['2017-01-17']
['2017-01-13']]
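One detail worth checking when selecting a single column is the shape of the result. In NumPy, an integer column index drops that dimension, while a one-element range keeps it. The sketch below illustrates this on a hypothetical array built from values in the earlier listings.

```python
import numpy as np

# Hypothetical array standing in for df.values.
nd = np.array([["2017-01-20", 120.449997],
               ["2017-01-19", 119.400002],
               ["2017-01-18", 120.0]], dtype=object)

flat = nd[:, 0]      # integer column index: one-dimensional result
column = nd[:, 0:1]  # range as column index: still two-dimensional

print(flat.shape)
print(column.shape)
```

If later code expects a table with one column rather than a flat list of values, the range form is the one to use.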
Select the last or second-to-last row
We can address rows counting from the bottom of the table by using negative indexes: -1 is the last row and -2 is the second-to-last row.
import pandas as pd
def test_run():
    df = pd.read_csv("data/AAPL.csv")
    nd = df.values
    # last row, first column
    print(nd[-1, 0])
if __name__ == "__main__":
    test_run()
The output is the value in the last row of column 0, which is 1980-12-12 in the array printed earlier.
If we used -2 as the row index instead, we would get the value from the second-to-last row, 1980-12-15.
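Negative indexes can be checked on a small array as well. The sketch below uses a hypothetical three-row array built from the last three rows of the array printed earlier, and confirms that -1 addresses the same row as len(nd) - 1.

```python
import numpy as np

# Hypothetical array standing in for the tail of df.values.
nd = np.array([["1980-12-16", 25.375],
               ["1980-12-15", 27.375001],
               ["1980-12-12", 28.75]], dtype=object)

last = nd[-1, 0]                   # last row, column 0
second_last = nd[-2, 0]            # second-to-last row, column 0
same_as_last = nd[len(nd) - 1, 0]  # the equivalent non-negative index

print(last)
print(second_last)
```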