Pandas: Check memory occupied by DataFrame

19 August 2020

Code

# Import libraries
import numpy as np
import pandas as pd

# Create a DataFrame
df = pd.DataFrame({
    'num_1e6': np.random.randn(1000000),
    'string': ['one']*1000000,
    'boolean': [True, False]*500000,
    'NaN': np.nan,
    'blank': ''
})

print(df.head(3))

Output:

    num_1e6 string  boolean  NaN blank
0  0.011071    one     True  NaN      
1  0.675186    one    False  NaN      
2 -0.541657    one     True  NaN

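# Memory usage 1: shallow (default). For object columns this counts
# only the 8-byte references, not the Python objects they point to.
m1 = df.memory_usage()

# Memory usage 2: deep. Also counts the memory of the Python objects
# (for example strings) that object columns reference.
m2 = df.memory_usage(deep=True)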

# Output
print('Memory usage (default) in bytes: \n', m1, '\n\n')
print('Memory usage (deep=True) in bytes: \n', m2)

Output:

Memory usage (default) in bytes: 
 Index          128
num_1e6    8000000
string     8000000
boolean    1000000
NaN        8000000
blank      8000000
dtype: int64 


Memory usage (deep=True) in bytes: 
 Index           128
num_1e6     8000000
string     60000000
boolean     1000000
NaN         8000000
blank      61000000
dtype: int64
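
The per-column figures above can also be summed into a single total. Below is a minimal sketch, assuming a hypothetical helper named total_memory_mb (not part of pandas) and a smaller stand-in DataFrame so that it runs quickly. It only relies on memory_usage() returning a Series of byte counts.

```python
import numpy as np
import pandas as pd


def total_memory_mb(frame: pd.DataFrame, deep: bool = False) -> float:
    """Hypothetical helper: total footprint of `frame` in MB (1 MB = 1e6 bytes)."""
    # memory_usage() returns one byte count per column (plus the index);
    # summing the Series gives the footprint of the whole DataFrame.
    return frame.memory_usage(index=True, deep=deep).sum() / 1e6


# A smaller stand-in for the DataFrame above
small = pd.DataFrame({
    'num': np.random.randn(1000),
    'string': ['one'] * 1000,
})

shallow_mb = total_memory_mb(small)
deep_mb = total_memory_mb(small, deep=True)
print(f'shallow: {shallow_mb:.3f} MB, deep: {deep_mb:.3f} MB')
```

The exact numbers depend on the pandas and Python versions, so none are shown here. For a quick summary without a helper, df.info(memory_usage='deep') prints the deep total together with the column dtypes.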






Any errors in code above?
Please send a message.