27. Adding more data to NumPy arrays and Pandas dataframes

This post is also available as a PDF and as a Jupyter Notebook.

NumPy

Adding more rows of data

To add more rows to an existing NumPy array, use the vstack function, which can add one row or several rows at a time. New data may be a NumPy array or a list. All of the combined data must have the same number of columns.

import numpy as np

# Starting with a NumPy array
array1 = np.array([[1,2,3,4,5],
         [6,7,8,9,10],
         [11,12,13,14,15]])

# An additional 2d list
array2 = [[16,17,18,19,20],
         [21,22,23,24,25]]

# An additional single-row NumPy array
array3 = np.array([26,27,28,29,30])

# vstack returns a new array rather than changing array1 in place.
# Here the result is assigned back to array1, but a new name could be given
array1 = np.vstack([array1, array2, array3])

print(array1)

OUT:

[[ 1  2  3  4  5]
 [ 6  7  8  9 10]
 [11 12 13 14 15]
 [16 17 18 19 20]
 [21 22 23 24 25]
 [26 27 28 29 30]]
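
The requirement that all combined data have the same number of columns can be checked with a short sketch. The name short_row and its values are hypothetical, chosen only to show what happens when the column counts differ; in that case NumPy is expected to raise a ValueError.

```python
import numpy as np

array1 = np.array([[1, 2, 3, 4, 5],
                   [6, 7, 8, 9, 10]])

# Hypothetical row with only four values instead of five
short_row = [11, 12, 13, 14]

try:
    np.vstack([array1, short_row])
    stacked_ok = True
except ValueError as err:
    # The column counts (5 and 4) do not match, so the stack is rejected
    stacked_ok = False
    print("Could not stack:", err)
```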

Adding more columns of data

To add more columns to an existing NumPy array, use the hstack function, which can add one column or several columns at a time. All of the combined data must have the same number of rows.
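
A minimal sketch of hstack follows. The array names and values are illustrative assumptions rather than the original example. Note that a single new column held in a 1D array is reshaped to (rows, 1) first, because hstack cannot join a 2D array with a 1D array.

```python
import numpy as np

# Starting with a NumPy array of two rows
array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])

# An additional 2D list with the same number of rows
array2 = [[7, 8],
          [9, 10]]

# A single new column: reshape the 1D array to shape (2, 1)
array3 = np.array([11, 12]).reshape(-1, 1)

combined = np.hstack([array1, array2, array3])

print(combined)
```

The result has the original two rows and six columns, with the new columns appended on the right.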

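Pandas

Adding more columns of data

A single column can be added to an existing dataframe by assigning a list to a new column name. The list must have the same number of items as the dataframe has rows.

import pandas as pd

df1 = pd.DataFrame()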
names = ['Gandolf','Gimli','Frodo','Legolas','Bilbo']
types = ['Wizard','Dwarf','Hobbit','Elf','Hobbit']

df1['names'] = names
df1['type'] = types

print (df1)

# Add another column
magic = [10, 1, 4, 6, 4]
df1['magic'] = magic

print ('\n Added column:\n',df1)

OUT:

     names    type
0  Gandolf  Wizard
1    Gimli   Dwarf
2    Frodo  Hobbit
3  Legolas     Elf
4    Bilbo  Hobbit

 Added column:
     names    type  magic
0  Gandolf  Wizard     10
1    Gimli   Dwarf      1
2    Frodo  Hobbit      4
3  Legolas     Elf      6
4    Bilbo  Hobbit      4

We can also use concat to add multiple columns at once, in the form of another dataframe. In this case the rows of the two dataframes are matched by index label. We pass the argument axis=1 to concat to instruct it to combine by column (it defaults to axis=0, or row concatenation).

df1 = pd.DataFrame()
names = ['Gandolf','Gimli','Frodo','Legolas','Bilbo']
types = ['Wizard','Dwarf','Hobbit','Elf','Hobbit']

df1['names'] = names
df1['type'] = types

print (df1)

df2 = pd.DataFrame()

magic = [10, 1, 4, 6, 4]
aggression = [7, 10, 2, 5, 1]
stealth = [8, 2, 5, 10, 5]

df2['magic_power'] = magic
df2['aggression'] = aggression
df2['stealth'] = stealth

df1 = pd.concat([df1,df2], axis=1)
print(df1)

OUT:

     names    type
0  Gandolf  Wizard
1    Gimli   Dwarf
2    Frodo  Hobbit
3  Legolas     Elf
4    Bilbo  Hobbit

     names    type  magic_power  aggression  stealth
0  Gandolf  Wizard           10           7        8
1    Gimli   Dwarf            1          10        2
2    Frodo  Hobbit            4           2        5
3  Legolas     Elf            6           5       10
4    Bilbo  Hobbit            4           1        5
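
The two behaviours of concat described above (row concatenation by default, and matching on the index when axis=1) can be sketched as follows. The dataframes here are small illustrative assumptions, and ignore_index=True is used so that the added rows get a fresh 0, 1, 2 index.

```python
import pandas as pd

df1 = pd.DataFrame({'names': ['Gimli', 'Frodo'],
                    'type': ['Dwarf', 'Hobbit']})

# Row concatenation (default axis=0): new rows with the same columns
more_rows = pd.DataFrame({'names': ['Legolas'], 'type': ['Elf']})
rows_added = pd.concat([df1, more_rows], ignore_index=True)
print(rows_added)

# Column concatenation (axis=1) matches rows by index label.
# df1 has labels 0 and 1, df2 has labels 1 and 2, so only label 1 lines up
df2 = pd.DataFrame({'magic': [4, 6]}, index=[1, 2])
cols_added = pd.concat([df1, df2], axis=1)
print(cols_added)
```

Where an index label is missing from one of the dataframes, the corresponding cells are filled with NaN, so it is worth checking that the indexes match before combining by column.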

There is more information here: https://pandas.pydata.org/pandas-docs/stable/merging.html
