Python: how to extract a date using regex

import datetime
import re

s = "Jason's birthday is on 1991-09-21"
# find the first YYYY-MM-DD substring
match = re.search(r'\d{4}-\d{2}-\d{2}', s)
# convert the matched text into a date object
date = datetime.datetime.strptime(match.group(), '%Y-%m-%d').date()
print(date)

Suggestion : 2

Given a string, the task is to write a Python program to extract a date from it. Another way to solve this problem is the python-dateutil library, whose parse() method can detect a date and time in a string.

Input: test_str = "gfg at 2021-01-04"
Output: 2021-01-04
Explanation: Date format string found.

Input: test_str = "2021-01-04 for gfg"
Output: 2021-01-04
Explanation: Date format string found.

Output:

The original string is: gfg at 2021-01-04
Computed date: 2021-01-04
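
The code that produced this output is not shown above. A minimal sketch that prints the same two lines, assuming a YYYY-MM-DD date and using only the standard re and datetime modules (not python-dateutil), might look like this. The helper name extract_date is hypothetical.

```python
import re
from datetime import datetime


def extract_date(test_str):
    """Hypothetical helper: return the first YYYY-MM-DD date in test_str, or None."""
    match = re.search(r"\d{4}-\d{2}-\d{2}", test_str)
    if match is None:
        return None
    # strptime raises ValueError for impossible dates such as 2021-13-40
    return datetime.strptime(match.group(), "%Y-%m-%d").date()


test_str = "gfg at 2021-01-04"
print("The original string is:", test_str)
print("Computed date:", extract_date(test_str))
```

The helper returns None when the string contains no date, so the caller can tell "no match" apart from a parsed date.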

Suggestion : 3

Author: Josh Petty Last Updated: August 31, 2021

Example: Finding formatted dates with regex

import re

# open the text file and read the data
file = open("minutes.txt", 'r')
text = file.read()
file.close()

# match a regex pattern for formatted dates
matches = re.findall(r'(\d+/\d+/\d+)', text)

print(matches)

Example: Matching dates with a regex pattern

import re

# open a text file
f = open("apple2.txt", 'r')

# extract the file's content
content = f.read()

# a regular expression pattern to match dates
pattern = r"\d{2}[/-]\d{2}[/-]\d{4}"

# find all the strings that match the pattern
dates = re.findall(pattern, content)

for date in dates:
   print(date)

f.close()

By using datetime objects, we have more control over string data read from text files. For example, we can use a datetime object to get a copy of the current date and time of our computer.

import datetime

now = datetime.datetime.now()
print(now)

Example: Creating datetime objects from file data

import re
from datetime import datetime

# open the data file
file = open("schedule.txt", 'r')
text = file.read()

# find the first day-month-year date in the text
match = re.search(r'\d+-\d+-\d{4}', text)
# create a new datetime object from the regex match
date = datetime.strptime(match.group(), '%d-%m-%Y').date()
print(f"The date of the meeting is on {date}.")
file.close()
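
Note that re.search() returns None when the text contains no date, and strptime() raises ValueError when the matched digits are not a real date, so the example above fails on such input. A hedged sketch of one way to guard against both cases follows. The sample text stands in for the contents of schedule.txt, and the helper name find_meeting_date is hypothetical.

```python
import re
from datetime import datetime

# Assumed sample text standing in for the contents of schedule.txt
text = "The meeting is on 15-10-2021 in room 4."


def find_meeting_date(text):
    """Hypothetical helper: return the first valid day-month-year date, or None."""
    match = re.search(r"\d+-\d+-\d{4}", text)
    if match is None:
        return None  # no date-like substring at all
    try:
        return datetime.strptime(match.group(), "%d-%m-%Y").date()
    except ValueError:
        return None  # looked like a date but is not one, e.g. 99-99-2021


date = find_meeting_date(text)
if date is None:
    print("No valid date found.")
else:
    print(f"The date of the meeting is on {date}.")
```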

The Python datefinder module can locate dates in a body of text. Using the find_dates() function, it’s possible to search text data for many different types of dates. Datefinder returns any dates it finds as datetime objects.

Unlike the other packages we’ve discussed in this guide, Python does not come with datefinder. The easiest way to install the datefinder module is to use pip from the command prompt.

pip install datefinder

Suggestion : 4

Example code in Python:

"^[0-9]{1,2}\\/[0-9]{1,2}\\/[0-9]{4}$"
import re

# Validate date
date_pattern = "^[0-9]{1,2}\\/[0-9]{1,2}\\/[0-9]{4}$"
re.match(date_pattern, '12/12/2022') # Returns Match object

# Extract date from a string
date_extract_pattern = "[0-9]{1,2}\\/[0-9]{1,2}\\/[0-9]{4}"
re.findall(date_extract_pattern, 'I\'m on vacation from 1/18/2021 till 1/29/2021') # Returns ['1/18/2021', '1/29/2021']

"^(?:\\d{4})-(?:\\d{2})-(?:\\d{2})T(?:\\d{2}):(?:\\d{2}):(?:\\d{2}(?:\\.\\d*)?)(?:(?:-(?:\\d{2}):(?:\\d{2})|Z)?)$"
import re

# Validate ISO date
date_pattern = "^(?:\\d{4})-(?:\\d{2})-(?:\\d{2})T(?:\\d{2}):(?:\\d{2}):(?:\\d{2}(?:\\.\\d*)?)(?:(?:-(?:\\d{2}):(?:\\d{2})|Z)?)$"
re.match(date_pattern, '2021-11-04T22:32:47.142354-10:00') # Returns Match object

# Extract ISO date from a string
date_extract_pattern = "(?:\\d{4})-(?:\\d{2})-(?:\\d{2})T(?:\\d{2}):(?:\\d{2}):(?:\\d{2}(?:\\.\\d*)?)(?:(?:-(?:\\d{2}):(?:\\d{2})|Z)?)"
re.findall(date_extract_pattern, '2017-05-23T15:02:27Z | WARN | Record not found\n2018-05-23T15:02:28Z | WARN | Project with the id ’53’ was not found') # Returns ['2017-05-23T15:02:27Z', '2018-05-23T15:02:28Z']

import datetime

# Check that a date string is a real calendar date
try:
    datetime.datetime.strptime("1-2-2022", '%d-%m-%Y')
    print("Correct date string")
except ValueError:
    print("Incorrect date string")
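
The regex patterns above only check the shape of a date, so a string such as 2/30/2021 still matches. One possible way to combine the two ideas is to collect candidates with re.findall() and keep only those that strptime() accepts. This is a sketch, not part of the original answer: the helper name extract_valid_dates and the month/day/year format are assumptions based on the vacation example.

```python
import re
from datetime import datetime

# Same shape as the extraction pattern above, written as a raw string
date_extract_pattern = r"[0-9]{1,2}/[0-9]{1,2}/[0-9]{4}"


def extract_valid_dates(text, fmt="%m/%d/%Y"):
    """Hypothetical helper: return date objects for matches that are real dates."""
    valid = []
    for candidate in re.findall(date_extract_pattern, text):
        try:
            valid.append(datetime.strptime(candidate, fmt).date())
        except ValueError:
            # shape matched, but the calendar date does not exist
            pass
    return valid


sample = "Open from 1/18/2021 till 1/29/2021, not on 2/30/2021"
print(extract_valid_dates(sample))
```

Here the third candidate is dropped because February has no 30th day.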

Suggestion : 5

In this article, we will discuss how to extract dates from a text file using Python. The text may contain several thousand lines, and you might need to extract only the dates. We will do this using regular expressions.

Since we are using regular expressions, we first need to know some basics. A regular expression is a pattern that matches strings following that pattern. There are several ways to specify patterns, and they can look complicated, but the ones used here are not.

We will use only the basic notations for creating a regex pattern for dates. Our target is to match dates that follow the format day/month/year or day-month-year, with the day and month containing 2 digits and the year containing 4 digits. Let’s now construct the pattern step by step.

You may already know that \d matches a digit. To match exactly 2 digits in a row, we specify the value 2 inside {}, so \d{2} matches two consecutive digits. The pattern for the day is \d{2}, for the month \d{2}, and for the year \d{4}. We need to combine these 3 using a ‘/’ or ‘-’, which gives \d{2}[/-]\d{2}[/-]\d{4}.
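
The step-by-step construction can be sketched in code as follows. The variable names and the sample sentence are illustrative only.

```python
import re

day = r"\d{2}"       # exactly two digits
month = r"\d{2}"     # exactly two digits
year = r"\d{4}"      # exactly four digits
separator = r"[/-]"  # either a slash or a hyphen

# Join the pieces: day, separator, month, separator, year
pattern = day + separator + month + separator + year
print(pattern)

found = re.findall(pattern, "Due 07/04/1998 or 09-05-2019, not 7/4/98")
print(found)
```

The last candidate, 7/4/98, is skipped because its day, month, and year do not have the required number of digits. Because each separator is matched independently, this pattern also accepts mixed separators such as 07/04-1998.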

The hard part is over and the rest of the work is simple.

import re

# Open the file that you want to search
f = open("doc.txt", "r")

# Will contain the entire content of the file as a string
content = f.read()

# The regex pattern that we created
pattern = r"\d{2}[/-]\d{2}[/-]\d{4}"

# Will return all the strings that are matched
dates = re.findall(pattern, content)

 

import re

# Open the file that you want to search
f = open("doc.txt", "r")

# Will contain the entire content of the file as a string
content = f.read()

# The regex pattern that we created
pattern = r"\d{2}[/-]\d{2}[/-]\d{4}"

# Will return all the strings that are matched
dates = re.findall(pattern, content)

for date in dates:
    # Split on whichever separator the match uses
    if "-" in date:
        day, month, year = map(int, date.split("-"))
    else:
        day, month, year = map(int, date.split("/"))
    # Keep only plausible day and month values
    if 1 <= day <= 31 and 1 <= month <= 12:
        print(date)
f.close()

For example, if the content of the text file is as follows:

My name is XXX. I was born on 07/04/1998 in YYY city.
I graduated from ZZZ college on 09-05-2019.
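
As a rough check, the same logic can be run on this sample without a file. The sketch below assumes the dates are written without spaces (07/04/1998 and 09-05-2019) and holds the text in a string instead of reading doc.txt. The helper name valid_dates is hypothetical, and re.split() is swapped in for the if/else so that either separator is handled in one step.

```python
import re

# Sample text held in a string instead of being read from doc.txt (assumption)
content = (
    "My name is XXX. I was born on 07/04/1998 in YYY city.\n"
    "I graduated from ZZZ college on 09-05-2019.\n"
)

pattern = r"\d{2}[/-]\d{2}[/-]\d{4}"


def valid_dates(content):
    """Hypothetical helper: return matches whose day and month are in range."""
    result = []
    for date in re.findall(pattern, content):
        # Split on either separator in one step
        day, month, year = map(int, re.split(r"[/-]", date))
        if 1 <= day <= 31 and 1 <= month <= 12:
            result.append(date)
    return result


print(valid_dates(content))
```

The range check is only a coarse filter: it still accepts dates such as 31/02/2020, which datetime.strptime() would reject.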