Using apply_along_axis (NumPy) or apply (Pandas) is usually a more idiomatic way of working through data in NumPy and Pandas (see related tutorial here). But there may be occasions when you simply wish to work your way through rows or columns one at a time. Here is how it is done.
NumPy
Looping directly over a two-dimensional NumPy array iterates through its rows (the first axis).
import numpy as np
# Create an array of random integers from 0 to 99 (3 rows, 5 columns)
array = np.random.randint(0, 100, size=(3, 5))
print('Array:')
print(array)
print('\nAverage of rows:')
# Iterate through rows:
for row in array:
    print(row.mean())
OUT:
Array:
[[12 40 30 93 99]
 [62 85 89 26 17]
 [93 34 67 59 56]]
Average of rows:
54.8
55.8
61.8
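As a rough cross-check, the loop can be compared with the approaches mentioned in the introduction. This is a minimal sketch that hard-codes the values printed above (an assumption, since the original array is random) so that the result is reproducible:

```python
import numpy as np

# Same values as the example output above, fixed so the result is reproducible
array = np.array([[12, 40, 30, 93, 99],
                  [62, 85, 89, 26, 17],
                  [93, 34, 67, 59, 56]])

# Explicit loop: iterating over a 2D array yields one row at a time
loop_means = [row.mean() for row in array]

# apply_along_axis applies the function to each 1D slice along axis 1 (each row)
applied_means = np.apply_along_axis(np.mean, 1, array)

# Built-in reduction along axis 1 gives one mean per row
axis_means = array.mean(axis=1)

print(loop_means)
print(applied_means)
print(axis_means)
```

All three should give the same row averages; the loop is mainly useful when each row needs handling that does not fit into a single function call.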
To iterate through columns we transpose the array with .T so that rows become columns (and vice versa):
print('\nTransposed array:')
print(array.T)
print('\nAverage of original columns:')
for row_t in array.T:
    print(row_t.mean())
OUT:
Transposed array:
[[12 62 93]
 [40 85 34]
 [30 89 67]
 [93 26 59]
 [99 17 56]]
Average of original columns:
55.666666666666664
53.0
62.0
59.333333333333336
57.333333333333336
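Transposing is not the only option. The sketch below, again using the printed values as fixed data (an assumption), indexes each column directly and also shows that .T shares memory with the original array rather than copying it:

```python
import numpy as np

# Same values as the example output above, fixed so the result is reproducible
array = np.array([[12, 40, 30, 93, 99],
                  [62, 85, 89, 26, 17],
                  [93, 34, 67, 59, 56]])

# array.T is a view on the same data, so transposing does not copy the array
print(np.shares_memory(array, array.T))

# Alternative to transposing: slice out each column by its position
for j in range(array.shape[1]):
    column = array[:, j]
    print(column.mean())

# Built-in reduction along axis 0 gives one mean per column
column_means = array.mean(axis=0)
print(column_means)
```

Looping over array.T is usually the tidier of the two loops; the index-based version may be handy when the column number itself is needed inside the loop.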
Pandas
Let's first create our data:
import pandas as pd
df = pd.DataFrame()
names = ['Gandolf',
'Gimli',
'Frodo',
'Legolas',
'Bilbo',
'Sam',
'Pippin',
'Boromir',
'Aragorn',
'Galadriel',
'Meriadoc']
types = ['Wizard',
'Dwarf',
'Hobbit',
'Elf',
'Hobbit',
'Hobbit',
'Hobbit',
'Man',
'Man',
'Elf',
'Hobbit']
magic = [10, 1, 4, 6, 4, 2, 0, 0, 2, 9, 0]
aggression = [7, 10, 2, 5, 1, 6, 3, 8, 7, 2, 4]
stealth = [8, 2, 5, 10, 5, 4, 5, 3, 9, 10, 6]
df['names'] = names
df['type'] = types
df['magic_power'] = magic
df['aggression'] = aggression
df['stealth'] = stealth
To iterate through rows in a Pandas dataframe we use .iterrows():
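for index, row in df.iterrows():
    # Access values by column label; positional access such as row[0] is deprecated on a labelled Series
    print(row['names'], 'is a', row['type'])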
OUT:
Gandolf is a Wizard
Gimli is a Dwarf
Frodo is a Hobbit
Legolas is a Elf
Bilbo is a Hobbit
Sam is a Hobbit
Pippin is a Hobbit
Boromir is a Man
Aragorn is a Man
Galadriel is a Elf
Meriadoc is a Hobbit
To iterate through columns we need to do just a bit more manual work: we create a list of the dataframe's column names and then iterate through that list, using each name to pull out the corresponding column:
columns = list(df)  # list of column names
for column in columns:
    print(df[column].iloc[2])  # print the third element of the column (by position)
OUT:
Frodo
Hobbit
4
2
5
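Pandas also provides DataFrame.items(), which yields (column name, Series) pairs and so avoids building the list of names by hand. A minimal sketch, using a three-row stand-in for the dataframe above (the shortened data is an assumption made to keep the example small):

```python
import pandas as pd

# A small stand-in for the dataframe above (first three rows only)
df = pd.DataFrame({
    'names': ['Gandolf', 'Gimli', 'Frodo'],
    'type': ['Wizard', 'Dwarf', 'Hobbit'],
    'magic_power': [10, 1, 4],
    'aggression': [7, 10, 2],
    'stealth': [8, 2, 5],
})

third_elements = []
# items() yields each column as a (name, Series) pair
for column_name, column in df.items():
    third_elements.append(column.iloc[2])  # third element by position
    print(column_name, column.iloc[2])
```

Either approach should visit the columns in the same order; items() is mostly a convenience when both the name and the column are needed.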
Note that .iterrows() yields an iterator that can be used to work through all the rows of a dataframe. For each row it returns a tuple containing the index label and the row contents as a Series.
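The (index label, Series) pairs produced by .iterrows() can be inspected directly. This is a small sketch on stand-in data; the index labels 'a', 'b' and 'c' are hypothetical and were chosen only to make the index visible:

```python
import pandas as pd

# Stand-in data; the index labels 'a', 'b', 'c' are hypothetical
df = pd.DataFrame(
    {'names': ['Gandolf', 'Gimli', 'Frodo'],
     'type': ['Wizard', 'Dwarf', 'Hobbit']},
    index=['a', 'b', 'c'],
)

# Collect the pairs so that the first one can be examined
pairs = list(df.iterrows())
index_label, row = pairs[0]

print(type(pairs[0]).__name__)  # each item is a tuple
print(index_label)              # the index label of the row
print(type(row).__name__)       # the row contents are a Series
print(row['names'], 'is a', row['type'])
```

Because each row arrives as a Series labelled by column name, values are most safely read by label, as in row['names'].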