Specifying the format of your date strings will speed up the conversion considerably:
# .dt.weekday_name was removed in pandas 1.0; .dt.day_name() replaces it
df['Day_of_Week'] = pd.to_datetime(df['Date'], format='%m/%d/%y').dt.day_name()
Here are some benchmarks:
import io
import pandas as pd

data = io.StringIO('''\
ProductCode,Date,Receipt,Total
x1,07/29/15,101790,17.35
x2,07/29/15,103601,8.89
x3,07/29/15,103601,8.58
x4,07/30/15,101425,11.95
x5,07/29/15,101422,1.09
x6,07/29/15,101422,0.99
x7,07/29/15,101422,3
y7,08/05/15,100358,7.29
x8,08/05/15,100358,2.6
z3,08/05/15,100358,2.99
''')
df = pd.read_csv(data)

# The timings below were recorded with the older .dt.weekday_name property,
# which was removed in pandas 1.0; .dt.day_name() is its replacement.

# Without an explicit format, pandas has to infer it
%timeit pd.to_datetime(df['Date']).dt.day_name()
# => 100 loops, best of 3: 2.48 ms per loop

# With an explicit format
%timeit pd.to_datetime(df['Date'], format='%m/%d/%y').dt.day_name()
# => 1000 loops, best of 3: 507 µs per loop

# Same comparison on a frame 1000 times larger
large_df = pd.concat([df] * 1000)

%timeit pd.to_datetime(large_df['Date']).dt.day_name()
# => 1 loop, best of 3: 1.62 s per loop

%timeit pd.to_datetime(large_df['Date'], format='%m/%d/%y').dt.day_name()
# => 10 loops, best of 3: 45.9 ms per loop
An alternative would be to parse the dates while loading the CSV, especially if you need this date column often. Unfortunately, there does not seem to be a way to pass the format of the date in, and the infer_datetime_format parameter to read_csv does not seem to make a difference:
import timeit

repeat = 3
numbers = 100

# Setup code run by timeit before each measurement: builds a CSV with 1000 data rows
setup = """import pandas as pd
import io
data = io.StringIO('''\
ProductCode,Date,Receipt,Total
''' + '''\
x1,07/29/15,101790,17.35
x2,07/29/15,103601,8.89
x3,07/29/15,103601,8.58
x4,07/30/15,101425,11.95
x5,07/29/15,101422,1.09
x6,07/29/15,101422,0.99
x7,07/29/15,101422,3
y7,08/05/15,100358,7.29
x8,08/05/15,100358,2.6
z3,08/05/15,100358,2.99
''' * 100)"""


def time(statement, _setup=None):
    # Print the best total time over `repeat` runs of `numbers` executions each
    print(min(
        timeit.Timer(statement, setup=_setup or setup).repeat(
            repeat, numbers)))


time('pd.read_csv(data); data.seek(0)')
time('pd.read_csv(data, parse_dates=["Date"]); data.seek(0)')
time('pd.read_csv(data, parse_dates=["Date"],'
     'infer_datetime_format=True); data.seek(0)')
prints:
0.5536041843652657
25.298157679942697
25.34556727133409
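Given those numbers, one workaround is to let read_csv load the column as plain strings and convert it afterwards with an explicit format. The following is only a minimal sketch of that idea, assuming the dates follow %m/%d/%y; the shortened sample data and the variable name csv_text are illustrative and not part of the original benchmark:

```python
import io
import pandas as pd

# Shortened sample in the same layout as the benchmark data (illustrative only)
csv_text = """\
ProductCode,Date,Receipt,Total
x1,07/29/15,101790,17.35
x4,07/30/15,101425,11.95
y7,08/05/15,100358,7.29
"""

# Load the Date column as plain strings (no date parsing inside read_csv)
df = pd.read_csv(io.StringIO(csv_text))

# Convert afterwards with an explicit format, then derive the weekday name
df['Date'] = pd.to_datetime(df['Date'], format='%m/%d/%y')
df['Day_of_Week'] = df['Date'].dt.day_name()

print(df[['Date', 'Day_of_Week']])
```

This keeps the fast read_csv path and pays the parsing cost only once, on the column that needs it.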
Return the day of the week. It is assumed the week starts on Monday, which is denoted by 0, and ends on Sunday, which is denoted by 6. This method is available on both Series with datetime values (using the dt accessor) and DatetimeIndex. See also Series.dt.day_name, which returns the name of the day of the week.
>>> s = pd.date_range('2016-12-31', '2017-01-08', freq='D').to_series()
>>> s.dt.dayofweek
2016-12-31    5
2017-01-01    6
2017-01-02    0
2017-01-03    1
2017-01-04    2
2017-01-05    3
2017-01-06    4
2017-01-07    5
2017-01-08    6
Freq: D, dtype: int64
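To relate the integer codes to readable names, the same Series can be passed through both dt.dayofweek and dt.day_name(). This is a small illustrative sketch; the variable names numbers and names are made up for the example:

```python
import pandas as pd

s = pd.date_range('2016-12-31', '2017-01-08', freq='D').to_series()

# Integer weekday codes: Monday=0 ... Sunday=6
numbers = s.dt.dayofweek

# Matching English day names
names = s.dt.day_name()

# Show both side by side
print(pd.DataFrame({'dayofweek': numbers, 'day_name': names}))
```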
Pandas To Datetime (pd.to_datetime()) converts a string representation of a date into an actual datetime object. This matters whenever you use pandas date functionality such as resample, and it is especially useful when working with time series data. The function is called directly from the pandas library, and it is flexible enough to read most of the date formats you'll throw at it, which makes it a need-to-have in your data analysis toolkit.
1. pd.to_datetime(your_date_data, format = "Your_datetime_format")
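As a quick illustration of the format argument, here is a minimal sketch with a day-first string that would be ambiguous without a format. The sample string and the variable name ts are assumptions made for the example:

```python
import pandas as pd

# '01/02/2020' could mean January 2 or February 1; the format settles it
ts = pd.to_datetime('01/02/2020', format='%d/%m/%Y')

print(ts)         # 2020-02-01 00:00:00
print(ts.month)   # 2
```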
import pandas as pd
string_to_convert = '2020-02-01'
print('Your string: {}'.format(string_to_convert))
print('Your string_to_convert type: {}'.format(type(string_to_convert)))
print()

# Convert your string
new_date = pd.to_datetime(string_to_convert)
print('Your new date is: {}'.format(new_date))
print('Your new type is: {}'.format(type(new_date)))
Your string: 2020-02-01
Your string_to_convert type: <class 'str'>
Your new date is: 2020-02-01 00:00:00
Your new type is: <class 'pandas._libs.tslibs.timestamps.Timestamp'>
s = pd.Series(['2020-02-01',
'2020-02-02',
'2020-02-03',
'2020-02-04'
])
s
0    2020-02-01
1    2020-02-02
2    2020-02-03
3    2020-02-04
dtype: object
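The Series above still holds strings. A short sketch of the conversion step, where the name converted is chosen only for illustration:

```python
import pandas as pd

s = pd.Series(['2020-02-01',
               '2020-02-02',
               '2020-02-03',
               '2020-02-04'])

# Convert the whole Series of strings in one call
converted = pd.to_datetime(s)

print(converted.dtype)                     # a datetime64 dtype instead of strings
print(converted.dt.day_name().tolist())    # weekday names are now available via .dt
```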
December 18, 2021 | February 22, 2022
The Quick Answer: Use df['date_column'].dt.date To Extract Date from Pandas Datetime
# Extract date from datetime column in Pandas
df['Date'] = df['DateTime'].dt.date
In order to follow along with this tutorial, I have provided a sample Pandas Dataframe. Feel free to copy the code below into your favourite code editor. If you want to follow along with your own dataset, your results will of course vary.
# Loading a Sample Pandas Dataframe
import pandas as pd

df = pd.DataFrame.from_dict({
    'DateTime': ['2022-01-01 15:34:21', '2022-02-03 10:13:45',
                 '2022-03-04 12:12:45', '2022-04-03 14:45:23',
                 '2022-05-27 18:23:45'],
    'Name': ['Nik', 'Kate', 'Lou', 'Samrat', 'Jim'],
    'Age': [33, 32, 45, 37, 23]
})
df['DateTime'] = pd.to_datetime(df['DateTime'])
print(df)

# Returns:
#              DateTime    Name  Age
# 0 2022-01-01 15:34:21     Nik   33
# 1 2022-02-03 10:13:45    Kate   32
# 2 2022-03-04 12:12:45     Lou   45
# 3 2022-04-03 14:45:23  Samrat   37
# 4 2022-05-27 18:23:45     Jim   23
We can see that we have three columns, one of which contains datetime values. We can check the type of this column by using the .dtype property:
# Checking the data type of our DateTime column
print(df['DateTime'].dtype)
# Returns: datetime64[ns]
Something important to note is that the column returned by .dt.date has the object data type, because it holds Python datetime.date objects rather than pandas datetimes. We can confirm this by checking the data type of the column:
# Checking the data type of the returned column
df['Date'] = df['DateTime'].dt.date
print(df['Date'].dtype)

# Returns: object
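Let's see how we can use the .dt.normalize() method to extract a date from a datetime column. It sets the time of each value to midnight, so the result keeps a datetime data type: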
# Extract date from datetime column in Pandas
df['Date'] = df['DateTime'].dt.normalize()
print(df)

# Returns:
#              DateTime    Name  Age       Date
# 0 2022-01-01 15:34:21     Nik   33 2022-01-01
# 1 2022-02-03 10:13:45    Kate   32 2022-02-03
# 2 2022-03-04 12:12:45     Lou   45 2022-03-04
# 3 2022-04-03 14:45:23  Samrat   37 2022-04-03
# 4 2022-05-27 18:23:45     Jim   23 2022-05-27
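To make the difference between .dt.date and .dt.normalize() concrete, here is a small side-by-side sketch. The Series stamps and the names as_date and as_midnight are made up for the example and are not part of the tutorial's dataframe:

```python
import datetime
import pandas as pd

stamps = pd.Series(pd.to_datetime(['2022-01-01 15:34:21',
                                   '2022-02-03 10:13:45']))

# .dt.date yields Python datetime.date objects
as_date = stamps.dt.date

# .dt.normalize() keeps pandas datetimes, with the time reset to 00:00:00
as_midnight = stamps.dt.normalize()

print(type(as_date.iloc[0]))
print(as_midnight.dtype)
print(as_midnight.iloc[0])
```

Because the normalized column stays a datetime column, the .dt accessor and date-based operations such as resample remain available on it, which is the main reason to prefer it when further date handling is planned.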