Pandas function song

List of Pandas functions in the song:

  1. df.sort_values('feature', ascending=False): Sorts the DataFrame by the specified ‘feature’ column in descending order.
  2. df.reset_index(): Resets the index of the DataFrame, converting the current index into a column and generating a new sequential index.
  3. df.drop(columns=['width', 'height']): Removes the specified ‘width’ and ‘height’ columns from the DataFrame.
  4. df.drop_duplicates(): Removes duplicate rows from the DataFrame.
  5. df.sample(frac=0.2): Returns a random sample of 20% of the rows from the DataFrame.
  6. df.sample(n=20): Returns a random sample of 20 rows from the DataFrame.
  7. df.plot.hist(): Creates a histogram plot for the numerical data in the DataFrame.
  8. df.plot.scatter(x, y): Generates a scatter plot using the specified columns ‘x’ and ‘y’ from the DataFrame.
  9. df.dropna(): Removes rows that contain missing (NaN) values.
  10. df.fillna(value): Fills missing (NaN) values in the DataFrame with the specified value.
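
The methods in the list are often chained into one cleaning step. The following is a minimal sketch, assuming a small made-up DataFrame whose column names ('feature', 'width', 'height') simply mirror the ones used in the list. The `drop=True` and `random_state=0` arguments are not part of the song; they are added here only to keep the result tidy and repeatable.

```python
import pandas as pd

# Hypothetical data: one missing value (row 3) and one duplicate row (row 2 repeats row 0).
df = pd.DataFrame({
    'feature': [3, 1, 3, None, 2],
    'width': [10, 20, 10, 40, 50],
    'height': [5, 6, 5, 8, 9],
})

cleaned = (
    df.dropna()                            # remove the row with the missing 'feature'
      .drop_duplicates()                   # keep only the first of the identical rows
      .drop(columns=['width', 'height'])   # keep only the 'feature' column
      .sort_values('feature', ascending=False)
      .reset_index(drop=True)              # drop=True discards the old index instead of keeping it as a column
)
print(cleaned)

# random_state is an assumption added so the same rows are drawn on every run.
subset = cleaned.sample(n=2, random_state=0)
print(subset)
```

Each method returns a new DataFrame, which is why the calls can be stacked without modifying `df` itself.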

Example code:

Here are the examples with their corresponding outputs:

  1. df.sort_values('feature', ascending=False):
   import pandas as pd  # required by every example below
   df = pd.DataFrame({'feature': [10, 20, 15], 'name': ['A', 'B', 'C']})
   df_sorted = df.sort_values('feature', ascending=False)
   print(df_sorted)

Output:

      feature name
   1       20    B
   2       15    C
   0       10    A
  2. df.reset_index():
   df = pd.DataFrame({'name': ['A', 'B', 'C']}, index=[10, 20, 30])
   df_reset = df.reset_index()
   print(df_reset)

Output:

      index name
   0     10    A
   1     20    B
   2     30    C
  3. df.drop(columns=['width', 'height']):
   df = pd.DataFrame({'width': [10, 20], 'height': [30, 40], 'depth': [5, 10]})
   df_dropped = df.drop(columns=['width', 'height'])
   print(df_dropped)

Output:

      depth
   0      5
   1     10
  4. df.drop_duplicates():
   df = pd.DataFrame({'name': ['A', 'B', 'A'], 'value': [1, 2, 1]})
   df_unique = df.drop_duplicates()
   print(df_unique)

Output:

     name  value
   0    A      1
   1    B      2
  5. df.sample(frac=0.2):
   df = pd.DataFrame({'name': ['A', 'B', 'C', 'D', 'E'], 'value': [1, 2, 3, 4, 5]})
   df_sampled = df.sample(frac=0.2)
   print(df_sampled)

Output (random selection, may vary):

     name  value
   0    A      1
  6. df.sample(n=2) (the same call as df.sample(n=20), with n reduced to fit the 5-row example):
   df = pd.DataFrame({'name': ['A', 'B', 'C', 'D', 'E'], 'value': [1, 2, 3, 4, 5]})
   df_sampled = df.sample(n=2)
   print(df_sampled)

Output (random selection, may vary):

     name  value
   1    B      2
   4    E      5
  7. df.plot.hist():
   df = pd.DataFrame({'data': [1, 2, 2, 3, 3, 3, 4]})
   df.plot.hist()

Output: (Histogram plot will display the frequency of the values)

  8. df.plot.scatter(x='x_col', y='y_col'):
   df = pd.DataFrame({'x_col': [1, 2, 3], 'y_col': [4, 5, 6]})
   df.plot.scatter(x='x_col', y='y_col')

Output: (Scatter plot will display points at (1,4), (2,5), and (3,6))

  9. df.dropna():
   df = pd.DataFrame({'name': ['A', 'B', None], 'value': [1, None, 3]})
   df_cleaned = df.dropna()
   print(df_cleaned)

Output:

     name  value
   0    A    1.0
  10. df.fillna(value=0):
   df = pd.DataFrame({'name': ['A', 'B', None], 'value': [1, None, 3]})
   df_filled = df.fillna(value=0)
   print(df_filled)

Output:

     name  value
   0    A    1.0
   1    B    0.0
   2    0    3.0
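
In the last example the single fill value 0 also lands in the text column 'name'. When that is not wanted, fillna accepts a dict that maps each column to its own fill value. A minimal sketch on the same DataFrame, where the placeholder string 'unknown' is an assumption chosen purely for illustration:

```python
import pandas as pd

df = pd.DataFrame({'name': ['A', 'B', None], 'value': [1, None, 3]})

# A dict gives each column its own replacement for missing values.
# 'unknown' is a made-up placeholder, not something from the original example.
df_filled = df.fillna(value={'name': 'unknown', 'value': 0})
print(df_filled)
```

Here the missing name becomes 'unknown' and the missing value becomes 0.0, so each column keeps a consistent type.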

