pandas: rename a label in a DataFrame multilevel header according to the name of the first-level header

I could not find a function that does this directly, so I flattened the columns to tuples, renamed the tuple, and rebuilt the MultiIndex:

df.columns = df.columns.values
df
Out[110]:
   (X, a)  (X, b)  (Y, a)  (Y, b)
0       1       3       4       2
1       5       7       8       6

df.rename(columns={('Y', 'b'): ('Y', 'b1')})
Out[111]:
   (X, a)  (X, b)  (Y, a)  (Y, b1)
0       1       3       4        2
1       5       7       8        6

df = df.rename(columns={('Y', 'b'): ('Y', 'b1')})
df.columns = pd.MultiIndex.from_tuples(df.columns)
df
Out[114]:
   X     Y
   a  b  a b1
0  1  3  4  2
1  5  7  8  6

Alternatively, you could do:

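# Find the position of the column to change.
i = df.columns.get_loc(('Y', 'b'))
# Rename the column. tolist() gives a fresh list of tuples, so the
# existing column index is not mutated in place.
cols = df.columns.tolist()
cols[i] = ('Y', 'b1')
df.columns = pd.MultiIndex.from_tuples(cols)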

Here is a slightly shorter alternative:

import pandas as pd

cols = pd.MultiIndex.from_tuples([('X', 'a'), ('X', 'b'), ('Y', 'a'), ('Y', 'b')])
df = pd.DataFrame([
    [1, 3, 4, 2],
    [5, 7, 8, 6]
], columns=cols)

# Here is the renaming part: tuples missing from the mapper are kept as-is.
mapper = {("Y", "b"): ("Y", "b1")}
df.columns = pd.MultiIndex.from_tuples([mapper.get(x, x) for x in df.columns])
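
The same idea can be wrapped in a small helper. This is only a minimal sketch: `rename_under` is a hypothetical name (not a pandas function), and it assumes the columns have exactly two levels.

```python
import pandas as pd


def rename_under(df, top, old, new):
    """Return a copy of df in which the second-level label `old` becomes
    `new`, but only in columns whose first-level label equals `top`.

    Hypothetical helper for illustration; assumes two-level columns.
    """
    new_cols = [
        (first, new) if (first, second) == (top, old) else (first, second)
        for first, second in df.columns
    ]
    out = df.copy()
    # Keep the level names (if any) when rebuilding the MultiIndex.
    out.columns = pd.MultiIndex.from_tuples(new_cols, names=df.columns.names)
    return out


cols = pd.MultiIndex.from_tuples([("X", "a"), ("X", "b"), ("Y", "a"), ("Y", "b")])
df = pd.DataFrame([[1, 3, 4, 2], [5, 7, 8, 6]], columns=cols)

renamed = rename_under(df, "Y", "b", "b1")
print(renamed.columns.tolist())
```

Because the helper works on a copy, the original `df` keeps its `("Y", "b")` column, and `("X", "b")` is left untouched in both.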

Suggestion : 2

I have a dataframe like this:

       X     Y
       a  b  a  b
    0  1  3  4  2
    1  5  7  8  6

I want to rename a specific column name, for example "b" to "b1", under the "Y" header. The desired result is:

       X     Y
       a  b  a b1
    0  1  3  4  2
    1  5  7  8  6

It is important that the header "b" under "X" remains unchanged, which means I can't just use rename on the label "b".
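
To see why a plain label-based rename is not enough here, the short sketch below rebuilds the example frame and renames "b" on the second level. The variable name `both` is only illustrative.

```python
import pandas as pd

cols = pd.MultiIndex.from_tuples([("X", "a"), ("X", "b"), ("Y", "a"), ("Y", "b")])
df = pd.DataFrame([[1, 3, 4, 2], [5, 7, 8, 6]], columns=cols)

# A label-based rename matches "b" wherever it occurs in the second level,
# so both ("X", "b") and ("Y", "b") are renamed.
both = df.rename(columns={"b": "b1"}, level=1)
print(both.columns.tolist())
```

The column under "X" changes as well, which is exactly what the question wants to avoid.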

Suggestion : 4

The rename_axis() method is used to rename the name of an Index or MultiIndex. In particular, the names of the levels of a MultiIndex can be specified, which is useful if reset_index() is later used to move the values from the MultiIndex to a column.

The rename() method is used to rename the labels of a MultiIndex, and is typically used to rename the columns of a DataFrame. The columns argument of rename allows a dictionary to be specified that includes only the columns you wish to rename. This method can also be used to rename specific labels of the main index of the DataFrame.

The reindex() method of Series/DataFrames can be called with another MultiIndex, or even a list or array of tuples.

The IPython session below shows several ways to construct a MultiIndex and use it as an index or as columns. It assumes `import numpy as np` and `import pandas as pd` have already been run.

In [1]: arrays = [
   ...:     ["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
   ...:     ["one", "two", "one", "two", "one", "two", "one", "two"],
   ...: ]
   ...:

In [2]: tuples = list(zip(*arrays))

In [3]: tuples
Out[3]:
[('bar', 'one'),
 ('bar', 'two'),
 ('baz', 'one'),
 ('baz', 'two'),
 ('foo', 'one'),
 ('foo', 'two'),
 ('qux', 'one'),
 ('qux', 'two')]

In [4]: index = pd.MultiIndex.from_tuples(tuples, names=["first", "second"])

In [5]: index
Out[5]:
MultiIndex([('bar', 'one'),
            ('bar', 'two'),
            ('baz', 'one'),
            ('baz', 'two'),
            ('foo', 'one'),
            ('foo', 'two'),
            ('qux', 'one'),
            ('qux', 'two')],
           names=['first', 'second'])

In [6]: s = pd.Series(np.random.randn(8), index=index)

In [7]: s
Out[7]:
first  second
bar    one       0.469112
       two      -0.282863
baz    one      -1.509059
       two      -1.135632
foo    one       1.212112
       two      -0.173215
qux    one       0.119209
       two      -1.044236
dtype: float64

In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]]

In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"])
Out[9]:
MultiIndex([('bar', 'one'),
            ('bar', 'two'),
            ('baz', 'one'),
            ('baz', 'two'),
            ('foo', 'one'),
            ('foo', 'two'),
            ('qux', 'one'),
            ('qux', 'two')],
           names=['first', 'second'])

In [10]: df = pd.DataFrame(
   ....:     [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]],
   ....:     columns=["first", "second"],
   ....: )
   ....:

In [11]: pd.MultiIndex.from_frame(df)
Out[11]:
MultiIndex([('bar', 'one'),
            ('bar', 'two'),
            ('foo', 'one'),
            ('foo', 'two')],
           names=['first', 'second'])

In [12]: arrays = [
   ....:     np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]),
   ....:     np.array(["one", "two", "one", "two", "one", "two", "one", "two"]),
   ....: ]
   ....:

In [13]: s = pd.Series(np.random.randn(8), index=arrays)

In [14]: s
Out[14]:
bar  one   -0.861849
     two   -2.104569
baz  one   -0.494929
     two    1.071804
foo  one    0.721555
     two   -0.706771
qux  one   -1.039575
     two    0.271860
dtype: float64

In [15]: df = pd.DataFrame(np.random.randn(8, 4), index=arrays)

In [16]: df
Out[16]:
                0         1         2         3
bar one -0.424972  0.567020  0.276232 -1.087401
    two -0.673690  0.113648 -1.478427  0.524988
baz one  0.404705  0.577046 -1.715002 -1.039268
    two -0.370647 -1.157892 -1.344312  0.844885
foo one  1.075770 -0.109050  1.643563 -1.469388
    two  0.357021 -0.674600 -1.776904 -0.968914
qux one -1.294524  0.413738  0.276662 -0.472035
    two -0.013960 -0.362543 -0.006154 -0.923061

In [17]: df.index.names
Out[17]: FrozenList([None, None])

In [18]: df = pd.DataFrame(np.random.randn(3, 8), index=["A", "B", "C"], columns=index)

In [19]: df
Out[19]:
first        bar                 baz                 foo                 qux
second       one       two       one       two       one       two       one       two
A       0.895717  0.805244 -1.206412  2.565646  1.431256  1.340309 -1.170299 -0.226169
B       0.410835  0.813850  0.132003 -0.827317 -0.076467 -1.187678  1.130127 -1.436737
C      -1.413681  1.607920  1.024180  0.569605  0.875906 -2.211372  0.974466 -2.006747

In [20]: pd.DataFrame(np.random.randn(6, 6), index=index[:6], columns=index[:6])
Out[20]:
first              bar                 baz                 foo
second             one       two       one       two       one       two
first second
bar   one    -0.410001 -0.078638  0.545952 -1.219217 -1.226825  0.769804
      two    -1.281247 -0.727707 -0.121306 -0.097883  0.695775  0.341734
baz   one     0.959726 -1.110336 -0.619976  0.149748 -0.732339  0.687738
      two     0.176444  0.403310 -0.154951  0.301624 -2.179861 -1.369849
foo   one    -0.954208  1.462696 -1.743161 -0.826591 -0.345352  1.314232
      two     0.690579  0.995761  2.396780  0.014871  3.357427 -0.317441
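
The rename_axis() and rename() behaviour described at the start of this suggestion is not exercised by the session above, so here is a minimal sketch of it. The frame, the column names `x` and `y`, and the new labels are made up for illustration.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame(np.arange(8).reshape(4, 2), index=index, columns=["x", "y"])

# rename_axis changes the *names* of the index levels, not the labels.
renamed_levels = df.rename_axis(index=["outer", "inner"])
print(renamed_levels.index.names)

# rename changes *labels*; only the labels listed in the dict are touched,
# and level= restricts the match to one level of the MultiIndex.
renamed_labels = df.rename(index={"one": "first_item"}, level="second")
print(renamed_labels.index.tolist())

# After reset_index, the renamed level names become ordinary column names.
flat = renamed_levels.reset_index()
print(flat.columns.tolist())
```

The split between the two methods is the point: rename_axis() is for level names, rename() is for the labels inside a level.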

Suggestion : 5

The pandas DataFrame.rename() function is used to rename a single column or multiple columns, either in place or as a new DataFrame, using a dict or a function. Column names can also be replaced by position or all at once from a list. We are often required to change the column names of a DataFrame before performing other operations on it.

To rename a single column, pass the columns={} parameter a dictionary that maps the old name to the new name. Note that when you use the columns parameter, you cannot also pass the axis parameter. The same approach renames multiple columns: list every column you want to rename in the dictionary mapping.

By default, rename() returns a new DataFrame with the updated column names. You can change this behaviour and rename in place with the inplace=True parameter.

If you are in a hurry, below are some quick examples of how to rename a column name.

# Quick examples of renaming DataFrame columns

# Rename all columns with a list.
df.columns = ['A', 'B', 'C']

# Rename a column by position. This changes the 3rd column.
# Note: this writes into the underlying array in place and may not be
# reflected reliably; assigning a new list to df.columns is safer.
df.columns.values[2] = "C"

# Rename columns using the rename() method (three equivalent forms).
df2 = df.rename({'a': 'A', 'b': 'B'}, axis=1)
df2 = df.rename({'a': 'A', 'b': 'B'}, axis='columns')
df2 = df.rename(columns={'a': 'A', 'b': 'B'})

# Rename columns in place (modifies df itself).
df.rename(columns={'a': 'A', 'b': 'B'}, inplace=True)

# Rename using a lambda function (drops the first character of each name).
df.rename(columns=lambda x: x[1:], inplace=True)

# Rename with error checking. When 'x' is not present, a KeyError is raised.
df.rename(columns={'x': 'X'}, errors="raise")

# Rename all columns using set_axis().
df2 = df.set_axis(['A', 'B', 'C'], axis=1)
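
For the rename-by-position case, a safer pattern than writing into `df.columns.values` is to build a new list of labels and assign it back. The sketch below assumes a small three-column frame invented for the example.

```python
import pandas as pd

df = pd.DataFrame({"a": [1], "b": [2], "c": [3]})

# Copy the labels into a plain list, change the one at position 2,
# and assign the whole list back so pandas builds a fresh Index.
new_cols = df.columns.tolist()
new_cols[2] = "C"
df.columns = new_cols

print(df.columns.tolist())  # ['a', 'b', 'C']
```

Assigning a complete list keeps the change visible to pandas, whereas an in-place write to the underlying array bypasses the Index object.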

Following is the syntax of the pandas.DataFrame.rename() method. By default it returns a new DataFrame with the updated labels. With inplace=True it updates the existing DataFrame in place and returns None.

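# DataFrame.rename() syntax
# Defaults are shown as given in the source; some (such as copy) differ
# between pandas versions.
DataFrame.rename(mapper=None, index=None, columns=None, axis=None,
                 copy=True, inplace=False, level=None, errors='ignore')

The examples below use the following sample DataFrame:
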
import pandas as pd
technologies = ({
   'Courses': ["Spark", "PySpark", "Hadoop", "Python", "pandas", "Oracle", "Java"],
   'Fee': [20000, 25000, 26000, 22000, 24000, 21000, 22000],
   'Duration': ['30day', '40days', '35days', '40days', '60days', '50days', '55days']
})
df = pd.DataFrame(technologies)
print(df.columns)

DataFrame.rename() accepts a dict for the columns parameter, so you just pass key-value pairs: each key is an existing column name and each value is the new name you want for it.

# Rename a Single Column
df2 = df.rename(columns = {
   'Courses': 'Courses_List'
})
print(df2.columns)

This yields the output below. As you can see, the column Courses has been renamed to Courses_List.

Index(['Courses_List', 'Fee', 'Duration'], dtype='object')
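
The other behaviours mentioned in this suggestion (renaming several columns at once, the None return value with inplace=True, and errors="raise") can be checked with a short sketch. The one-row frame and the new names `Courses_Fee` and `Length` are assumptions made for the example.

```python
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark"], "Fee": [20000], "Duration": ["30day"]})

# Rename several columns at once; unlisted columns are left alone.
df2 = df.rename(columns={"Courses": "Courses_List", "Fee": "Courses_Fee"})
print(df2.columns.tolist())

# With inplace=True the frame itself changes and the call returns None.
df3 = df.copy()
result = df3.rename(columns={"Duration": "Length"}, inplace=True)
print(result, df3.columns.tolist())

# Unknown keys are ignored by default; errors="raise" makes them a KeyError.
raised = False
try:
    df.rename(columns={"x": "X"}, errors="raise")
except KeyError as exc:
    raised = True
    print("KeyError:", exc)
```

Using errors="raise" is a cheap way to catch a misspelled column name, which the default setting would silently skip.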