change date format of pandas column (month-day-year to day-month-year)


When working with data, you will often encounter dates that are not in the format you want. For example, the dates are in “YYYY-MM-DD” format and you want them in “MM-DD-YYYY” format. This tutorial looks at how to change the format of a date column in a pandas dataframe. Each target format is stored in a new column by passing a format string to the dt.strftime() function:

  • MM-DD-YYYY: the date “1980-04-01” is represented as “04-01-1980”. Pass the format string '%m-%d-%Y'.
  • DD-MM-YYYY: the date “1980-04-01” is represented as “01-04-1980”. Pass the format string '%d-%m-%Y'.
  • Month Day, Year: the date “1980-04-01” is represented as “April 01, 1980”. Pass the format string '%B %d, %Y'.

To change the date format of a column in a pandas dataframe, you can use the pandas series dt.strftime() function. Pass the format that you want your date to have. The following is the syntax:

# change the format to DD-MM-YYYY
df['Col'] = df['Col'].dt.strftime('%d-%m-%Y')

Let’s look at the usage of this function with the help of some examples. First, let’s create a sample dataframe that we will be using throughout this tutorial.

import pandas as pd

# create a dataframe
df = pd.DataFrame({
   'Name': ['Jim', 'Dwight', 'Pam', 'Angela', 'Michael'],
   'Birthday': ['1980-04-01', '1978-06-24', '1982-10-07', '1980-12-25', '1970-02-28']
})
# show the dataframe
print(df)

Output:

      Name    Birthday
0      Jim  1980-04-01
1   Dwight  1978-06-24
2      Pam  1982-10-07
3   Angela  1980-12-25
4  Michael  1970-02-28

The “Birthday” column was created from strings, so it is of type “object” rather than datetime. Let’s convert it to datetime using the pandas to_datetime() function.

# convert to datetime
df['Birthday'] = pd.to_datetime(df['Birthday'])
# show the types
df.info()

Let’s create a new column, “Birthday2”, which stores the birthday in the MM-DD-YYYY format. That is, the date “1980-04-01” would be represented as “04-01-1980”. For this, pass the date format string '%m-%d-%Y' to the dt.strftime() function.

# date in MM-DD-YYYY format
df['Birthday2'] = df['Birthday'].dt.strftime('%m-%d-%Y')
# display the dataframe
print(df)
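
The other two formats mentioned in this tutorial (DD-MM-YYYY and Month Day, Year) follow the same pattern. The following is a minimal sketch on a shortened version of the sample dataframe; the column name “Birthday4” is only an illustrative choice, and the month name produced by %B depends on the active locale (for example “April 01, 1980” in an English locale).

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Jim', 'Dwight', 'Pam'],
    'Birthday': ['1980-04-01', '1978-06-24', '1982-10-07'],
})
# parse the strings with an explicit year-month-day format
df['Birthday'] = pd.to_datetime(df['Birthday'], format='%Y-%m-%d')

# DD-MM-YYYY; the result is a column of strings, not datetimes
df['Birthday3'] = df['Birthday'].dt.strftime('%d-%m-%Y')

# Month Day, Year; %B is the locale's full month name
# ('Birthday4' is a hypothetical column name used for illustration)
df['Birthday4'] = df['Birthday'].dt.strftime('%B %d, %Y')

print(df)
```

Note that dt.strftime() returns strings, so the new columns can no longer be used for date arithmetic; keep the original datetime column if you still need that.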

Suggestion : 2

Sample code first:

import pandas as pd
data = {
   'day': ['3-20-2019', None, '2-25-2019']
}
df = pd.DataFrame(data)

df['day'] = pd.to_datetime(df['day'])
df['day'] = df['day'].dt.strftime('%d.%m.%Y')
df[df == 'NaT'] = ''

Some comments on the above. Initially, df['day'] looks like this in the IPython interpreter:

In [56]: df['day']
Out[56]:
0    3-20-2019
1         None
2    2-25-2019
Name: day, dtype: object

After the conversion to datetime:

In [58]: df['day']
Out[58]:
0   2019-03-20
1          NaT
2   2019-02-25
Name: day, dtype: datetime64[ns]

That NaT causes problems, so after formatting we replace all its occurrences with the empty string.

In [73]: df[df == 'NaT'] = ''

In [74]: df
Out[74]:
          day
0  20.03.2019
1
2  25.02.2019

Not sure if this is the fastest way to get it done. Anyway,

import pandas as pd

df = pd.DataFrame({
   'Date': {
      0: '3-20-2019',
      1: "",
      2: "2-25-2019"
   }
})  # your dataframe
df['Date'] = pd.to_datetime(df.Date)  # convert to datetime format
# format each date, leaving missing values (NaT) as empty strings
df['Date'] = [d.strftime('%d-%m-%Y') if not pd.isnull(d) else '' for d in df['Date']]

Output:

         Date
0  20-03-2019
1
2  25-02-2019
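
A vectorized alternative may be worth trying as well. Depending on the pandas version, dt.strftime() may produce NaN rather than the string 'NaT' for missing dates, so comparing against 'NaT' is not guaranteed to match anything. The sketch below sidesteps that by masking on the parsed datetime column instead; the explicit format argument and the variable name “parsed” are assumptions made for this example.

```python
import pandas as pd

df = pd.DataFrame({'day': ['3-20-2019', None, '2-25-2019']})

# parse with an explicit month-day-year format; None becomes NaT
parsed = pd.to_datetime(df['day'], format='%m-%d-%Y')

# format every row, then blank out the rows whose parsed value is missing
df['day'] = parsed.dt.strftime('%d.%m.%Y').where(parsed.notna(), '')

print(df['day'].tolist())
```

Because the mask comes from parsed.notna(), the result does not depend on how the formatted missing value happens to be represented.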

Suggestion : 3

Note: keep in mind that this solution might be really slow for a huge DataFrame.

To start, here is the syntax that you can apply in order to extract the year and month combined:

.dt.to_period('M')

Let's create a DataFrame which has a single column, StartDate:

dates = ['2021-08-01', '2021-08-02', '2021-08-03']
df = pd.DataFrame({
   'StartDate': dates
})

To convert the string column to a datetime column, we use:

df['StartDate'] = pd.to_datetime(df['StartDate'])

Applying .dt.to_period('M') to the converted column then gives this result:

0    2021-08
1    2021-08
2    2021-08

What if you want the month first and then the year? In this case, use .dt.strftime to produce a column in MM/YYYY format, or any other format.

df['StartDate'].dt.strftime('%m/%Y')
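
Putting the two approaches side by side may help. This is a small sketch using the same StartDate data; the column names “YearMonth” and “MonthYear” are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({'StartDate': ['2021-08-01', '2021-08-02', '2021-08-03']})
df['StartDate'] = pd.to_datetime(df['StartDate'])

# year-month as a Period column (dtype period[M]); still usable for date logic
df['YearMonth'] = df['StartDate'].dt.to_period('M')

# month/year as plain strings; display only
df['MonthYear'] = df['StartDate'].dt.strftime('%m/%Y')

print(df)
```

The Period column keeps date semantics (sorting, grouping by month), while the strftime column is just text.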

Suggestion : 4

Last Updated : 04 Jul, 2022

Syntax:

strftime(format)
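
As a quick illustration of how the format argument is built from directives, here is a sketch using the standard library datetime class, whose strftime() accepts the same kind of directives that dt.strftime() uses in pandas. The sample date is arbitrary.

```python
from datetime import datetime

moment = datetime(2022, 7, 4, 9, 30)

# %d = day, %m = month, %Y = four-digit year
day_first = moment.strftime('%d-%m-%Y')

# %y = two-digit year
short_us = moment.strftime('%m/%d/%y')

# %H = 24-hour clock hour, %M = minute
with_time = moment.strftime('%Y-%m-%d %H:%M')

print(day_first, short_us, with_time)
```

Any character that is not part of a directive (hyphens, slashes, spaces) is copied into the output as-is.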

Suggestion : 5

This tutorial demonstrates how to modify the format of a datetime object in a pandas DataFrame in the Python programming language. The code below produces a DataFrame with two columns: the input column and a new column with the modified datetime format.

import pandas as pd
data = pd.DataFrame({
   'Date': ['1/11/2021', '2/12/2021', '3/13/2021', '4/14/2021']
})
data['Date'] = pd.to_datetime(data['Date'])
print(data)
#         Date
# 0 2021-01-11
# 1 2021-02-12
# 2 2021-03-13
# 3 2021-04-14
# copy so that the new column does not modify the original DataFrame
data_new1 = data.copy()
data_new1['Date_New'] = data_new1['Date'].dt.strftime('%m/%d/%Y')
print(data_new1)
#         Date    Date_New
# 0 2021-01-11  01/11/2021
# 1 2021-02-12  02/12/2021
# 2 2021-03-13  03/13/2021
# 3 2021-04-14  04/14/2021
# any separator can be used between the directives
data_new2 = data.copy()
data_new2['Date_New'] = data_new2['Date'].dt.strftime('%dxxx%mxxx%Y')
print(data_new2)
#         Date       Date_New
# 0 2021-01-11  11xxx01xxx2021
# 1 2021-02-12  12xxx02xxx2021
# 2 2021-03-13  13xxx03xxx2021
# 3 2021-04-14  14xxx04xxx2021

Suggestion : 6


import pandas as pd

raw_data = {
   'name': ['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'],
   'age': [20, 19, 22, 21],
   'favorite_color': ['blue', 'red', 'yellow', 'green'],
   'grade': [88, 92, 95, 70],
   'birth_date': ['01-02-1996', '08-05-1997', '04-28-1996', '12-16-1995']
}
df = pd.DataFrame(raw_data, index=['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'])
df
# pandas DatetimeIndex docs: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DatetimeIndex.html
# efficient way to extract the year from a string-format date
df['year'] = pd.DatetimeIndex(df['birth_date']).year
df.head()
# extract the month the same way
df['month'] = pd.DatetimeIndex(df['birth_date']).month
df.head()
# if the date comes in datetime format, we can also extract the day / month / year
# using the to_period function, where 'D', 'M', 'Y' are inputs
df['month_year'] = pd.to_datetime(df['birth_date']).dt.to_period('M')
df.head()
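
An alternative worth considering is to parse the column once and then read the parts through the .dt accessor. The sketch below assumes the birth dates are in month-day-year order, as the sample values suggest; the variable name “parsed” is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'birth_date': ['01-02-1996', '08-05-1997', '04-28-1996', '12-16-1995'],
})

# parse once with an explicit month-day-year format, then reuse the result
parsed = pd.to_datetime(df['birth_date'], format='%m-%d-%Y')

df['year'] = parsed.dt.year
df['month'] = parsed.dt.month
df['month_year'] = parsed.dt.to_period('M')

print(df)
```

Stating the format explicitly avoids any ambiguity for values such as '01-02-1996', which could otherwise be read as either January 2 or February 1.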

Suggestion : 7

The object to convert to a datetime. If a DataFrame is provided, the method expects minimally the following columns: "year", "month", "day".

array-like: DatetimeIndex (or Series with object dtype containing datetime.datetime).

However, timezone-aware inputs with mixed time offsets (for example issued from a timezone with daylight savings, such as Europe/Paris) are not successfully converted to a DatetimeIndex. Instead a simple Index containing datetime.datetime objects is returned.

When another datetime conversion error happens. For example when one of ‘year’, ‘month’, ‘day’ columns is missing in a DataFrame, or when a Timezone-aware datetime.datetime is found in an array-like of mixed time offsets, and utc=False.

>>> df = pd.DataFrame({'year': [2015, 2016],
...                    'month': [2, 3],
...                    'day': [4, 5]})
>>> pd.to_datetime(df)
0   2015-02-04
1   2016-03-05
dtype: datetime64[ns]
>>> s = pd.Series(['3/11/2000', '3/12/2000', '3/13/2000'] * 1000)
>>> s.head()
0    3/11/2000
1    3/12/2000
2    3/13/2000
3    3/11/2000
4    3/12/2000
dtype: object
>>> %timeit pd.to_datetime(s, infer_datetime_format=True)
100 loops, best of 3: 10.4 ms per loop
>>> %timeit pd.to_datetime(s, infer_datetime_format=False)
1 loop, best of 3: 471 ms per loop
>>> pd.to_datetime(1490195805, unit='s')
Timestamp('2017-03-22 15:16:45')
>>> pd.to_datetime(1490195805433502912, unit='ns')
Timestamp('2017-03-22 15:16:45.433502912')
>>> pd.to_datetime([1, 2, 3], unit='D',
...                origin=pd.Timestamp('1960-01-01'))
DatetimeIndex(['1960-01-02', '1960-01-03', '1960-01-04'],
              dtype='datetime64[ns]', freq=None)
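
For the month-day-year to day-month-year conversion this page is about, passing an explicit format to to_datetime() is probably the most predictable route. As far as I know, infer_datetime_format is deprecated from pandas 2.0 onward, so the sketch below does not rely on it. The sample values, including the deliberately invalid one, are made up for illustration.

```python
import pandas as pd

s = pd.Series(['3/11/2000', '3/12/2000', 'not a date'])

# explicit month/day/year format; unparseable entries become NaT
parsed = pd.to_datetime(s, format='%m/%d/%Y', errors='coerce')

# re-emit the dates as day-month-year strings
formatted = parsed.dt.strftime('%d-%m-%Y')

print(parsed.isna().tolist())
print(formatted[parsed.notna()].tolist())
```

With errors='coerce', bad rows are flagged as missing instead of raising, which makes it easy to inspect them before formatting.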