List of Pandas functions in the song:
df.sort_values('feature', ascending=False): Sorts the DataFrame by the specified ‘feature’ column in descending order.
df.reset_index(): Resets the index of the DataFrame, converting the current index into a column and generating a new sequential index.
df.drop(columns=['width', 'height']): Removes the specified ‘width’ and ‘height’ columns from the DataFrame.
df.drop_duplicates(): Removes duplicate rows from the DataFrame.
df.sample(frac=0.2): Returns a random sample of 20% of the rows from the DataFrame.
df.sample(n=20): Returns a random sample of 20 rows from the DataFrame.
df.plot.hist(): Creates a histogram plot for the numerical data in the DataFrame.
df.plot.scatter(x, y): Generates a scatter plot using the specified columns ‘x’ and ‘y’ from the DataFrame.
df.dropna(): Removes rows that contain missing (NaN) values.
df.fillna(value): Fills missing (NaN) values in the DataFrame with the specified value.
Example code:
Here are the examples with their corresponding outputs. Each one assumes pandas has already been imported with import pandas as pd.
df.sort_values('feature', ascending=False):
df = pd.DataFrame({'feature': [10, 20, 15], 'name': ['A', 'B', 'C']})
df_sorted = df.sort_values('feature', ascending=False)
print(df_sorted)
Output:
   feature name
1       20    B
2       15    C
0       10    A
df.reset_index():
df = pd.DataFrame({'name': ['A', 'B', 'C']}, index=[10, 20, 30])
df_reset = df.reset_index()
print(df_reset)
Output:
   index name
0     10    A
1     20    B
2     30    C
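Sorting and resetting the index are often chained, because a sorted DataFrame keeps its old, now out-of-order index labels. The sketch below reuses the ‘feature’ data from the sorting example. The drop=True argument is not in the song's list; it is shown here as an assumption about what you usually want, namely renumbering the rows without keeping the old index as a column.

```python
import pandas as pd

df = pd.DataFrame({'feature': [10, 20, 15], 'name': ['A', 'B', 'C']})

# Sort by 'feature' in descending order, then renumber the rows 0..n-1.
# drop=True discards the old index instead of adding it as an 'index' column.
df_ranked = df.sort_values('feature', ascending=False).reset_index(drop=True)
print(df_ranked)
```

Without drop=True, the result would gain an extra ‘index’ column holding the original labels 1, 2, 0.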
df.drop(columns=['width', 'height']):
df = pd.DataFrame({'width': [10, 20], 'height': [30, 40], 'depth': [5, 10]})
df_dropped = df.drop(columns=['width', 'height'])
print(df_dropped)
Output:
   depth
0      5
1     10
df.drop_duplicates():
df = pd.DataFrame({'name': ['A', 'B', 'A'], 'value': [1, 2, 1]})
df_unique = df.drop_duplicates()
print(df_unique)
Output:
  name  value
0    A      1
1    B      2
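By default, drop_duplicates() only removes rows that match in every column. As a hedged illustration (the subset argument is not mentioned in the song, and the data here is made up for the example), the sketch below shows how to treat rows as duplicates when only the ‘name’ column repeats.

```python
import pandas as pd

# Two rows share the name 'A' but have different values.
df = pd.DataFrame({'name': ['A', 'B', 'A'], 'value': [1, 2, 3]})

# No row is a full duplicate, so the default call keeps all three rows.
df_all = df.drop_duplicates()

# subset=['name'] compares only the 'name' column;
# the first occurrence of each name is kept.
df_unique_names = df.drop_duplicates(subset=['name'])
print(df_unique_names)
```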
df.sample(frac=0.2):
df = pd.DataFrame({'name': ['A', 'B', 'C', 'D', 'E'], 'value': [1, 2, 3, 4, 5]})
df_sampled = df.sample(frac=0.2)
print(df_sampled)
Output (random selection, may vary; 20% of 5 rows is 1 row):
  name  value
0    A      1
df.sample(n=2):
df = pd.DataFrame({'name': ['A', 'B', 'C', 'D', 'E'], 'value': [1, 2, 3, 4, 5]})
df_sampled = df.sample(n=2)
print(df_sampled)
Output (random selection, may vary):
  name  value
1    B      2
4    E      5
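Because sample() picks rows at random, the output changes from run to run. If you need the same rows every time, for example in a tutorial or a test, one option is the random_state argument. It is not part of the song's list, and the seed value 42 below is an arbitrary choice for illustration.

```python
import pandas as pd

df = pd.DataFrame({'name': ['A', 'B', 'C', 'D', 'E'], 'value': [1, 2, 3, 4, 5]})

# The same seed produces the same sample on every call.
first = df.sample(n=2, random_state=42)
second = df.sample(n=2, random_state=42)

print(first.equals(second))  # True: both calls picked the same rows
```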
df.plot.hist():
df = pd.DataFrame({'data': [1, 2, 2, 3, 3, 3, 4]})
df.plot.hist()
Output: (Histogram plot will display the frequency of the values)
df.plot.scatter(x='x_col', y='y_col'):
df = pd.DataFrame({'x_col': [1, 2, 3], 'y_col': [4, 5, 6]})
df.plot.scatter(x='x_col', y='y_col')
Output: (Scatter plot will display points at (1,4), (2,5), and (3,6))
df.dropna():
df = pd.DataFrame({'name': ['A', 'B', None], 'value': [1, None, 3]})
df_cleaned = df.dropna()
print(df_cleaned)
Output:
  name  value
0    A    1.0
Only row 0 remains: row 1 is missing its value and row 2 is missing its name, so both are removed.
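By default, dropna() removes a row if any column in it is missing. As a small sketch of a gentler option (the subset argument is not in the song's list and is shown here only as an illustration), you can limit the check to the columns you actually care about.

```python
import pandas as pd

df = pd.DataFrame({'name': ['A', 'B', None], 'value': [1, None, 3]})

# Only look at the 'name' column when deciding which rows to drop.
# Row 1 is kept even though its 'value' is missing.
df_named = df.dropna(subset=['name'])
print(df_named)
```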
df.fillna(value=0):
df = pd.DataFrame({'name': ['A', 'B', None], 'value': [1, None, 3]})
df_filled = df.fillna(value=0)
print(df_filled)
Output:
  name  value
0    A    1.0
1    B    0.0
2    0    3.0
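Filling every column with the same value puts a 0 in the ‘name’ column, which is rarely what you want for text. fillna() also accepts a dictionary that maps column names to fill values. The sketch below assumes the placeholder 'unknown' for missing names; that string is a made-up choice for this example.

```python
import pandas as pd

df = pd.DataFrame({'name': ['A', 'B', None], 'value': [1, None, 3]})

# Use a different fill value per column:
# a text placeholder for 'name', and 0 for 'value'.
df_filled = df.fillna(value={'name': 'unknown', 'value': 0})
print(df_filled)
```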