Posts Melt and Unmelt(Pivot) in Pandas, Python
Post
Cancel

Melt and Unmelt(Pivot) in Pandas, Python

Reshaping data from long to wide or wide to long is a common task in data manipulation, and pandas provides several powerful methods to achieve this.

Converting Long to Wide

Long data is where multiple rows represent the same overservation across different conditions or categories.

Methods for Long -> Wide

1.pivot()

Reshapes data by turning unique values from a column into new column, It is commonly used when converting time-series data where rows represent repeated measurements into a wide format.

1
df.pivot(index = 'identifier', columns = 'category', values = 'value')
  • param : {‘index’ : what you want the index to be, ‘columns’ : what you want the column to be, ‘value’ : what you want the value to be}

2. pivot_table()

Similar to pivot(), but it allows for aggregation in case of duplicate entries. so this can aggregate and reshape simultaneously when you have duplicates.

1
df.pivot_table(index = 'identifier', columns = 'category', values = 'value', aggfunc = 'mean')

3. unstack()

moves an index level to columns, converting hierarchical rows into columns. it works well with MultiIndex Dataframes.

1
df_wide = df.set_index(['identifier', 'category]).unstack()

Converting Wide to Long

wide data is where each column represents a variable, condition, or category.

Methods for Wide to Long

1. melt()

column to value
Transforms wide data into a long format by combining multiple columns into two : one for the variable names and one for values.

1
df.melt(id_vars = '', var_name = '', value_name = '')

-> returns DataFrame

2. wide_to_long()

Reshapes column names with a pattern into a long format.

1
pd.wide_to_long(df, stubnames = 'measure', i = '', j = '')

in this case, there should be columns named ‘measure_1’, ‘measure_2’, … since we give ‘measure’ as stubnames.
i is the new column name that will identify each rows.
j is the new column that will contain the values.

ex) For example, if there is a row with the column name ‘measure_pink’ and the value is 7, we set stubnames as ‘measure’ and i as ‘color’, since the letters after ‘_’ appear to refer to color palettes. Then, assuming 7 represents the count obtained from some source, j should be set to ‘count’.

3. stack()

column to index it moves column into rows.

1
df.set_index('').stack().reset_index()

-> returns Series

MethodConverts FromConverts ToBest Use Case
pivot()LongWideUnique index-column combinations
pivot_table()LongWideAggregating and reshaping simultaneously
melt()WideLongTidy format where one column holds variable names
wide_to_long()WideLongPanel data or repeated measurement reshaping
stack()WideLongWide format with MultiIndex conversion
unstack()LongWideMoving index levels into columns

Key Considerations

  • pivot_table() instead of pivot() if you expect duplicates.
  • for stack() and unstack(), make sure your DataFrame is properly indexed.
  • for custom patterns, consider wide_to_long() or write custom reshaping logic using grobupy() and agg() or etc.