Not an answer for pd.to_datetime, but there's another package, dateparser, which can handle dates in multiple languages.
import dateparser
df['Datum'] = df['Datum'].apply(dateparser.parse)
     Tag      Datum
0  Tag 1 1971-03-07
1  Tag 2 1970-11-29
How do I use pd.to_datetime to parse German dates? Is the problem connected to the fact that the German input does not use a consistent month format (the full name "März" in one row, the abbreviation "Nov." in the other)?
I am trying to parse the following dataframe
IN:
import pandas as pd
d = {
'Tag': ['Tag 1', 'Tag 2'],
'Datum': ['07. März 1971', '29. Nov. 1970']
}
df = pd.DataFrame(data = d)
OUT:
     Tag          Datum
0  Tag 1  07. März 1971
1  Tag 2  29. Nov. 1970
I understand that I need to use locale to be able to specify the format:
import locale
and I have found multiple settings which I have tried out:
# locale.setlocale(locale.LC_ALL, "german")
# locale.setlocale(locale.LC_ALL, 'deu_deu')
# locale.setlocale(locale.LC_ALL, 'de_DE')
locale.setlocale(locale.LC_ALL, 'de_DE.utf8')
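Locale names differ between operating systems, which is why settings such as "german" or 'de_DE.utf8' may or may not be accepted on a given machine. As a minimal sketch of a locale-independent alternative (not the pandas or dateparser approach), the month names can be mapped by hand using only the standard library. The names GERMAN_MONTHS and parse_german_date are hypothetical, and the table assumes that the first three letters of a German month name identify it, so regional variants such as "Jänner" are not covered.

```python
import re
import unicodedata
from datetime import date

# Hypothetical lookup table: the first three letters (lowercased) of each
# German month name are enough to tell the twelve months apart.
GERMAN_MONTHS = {
    "jan": 1, "feb": 2, "mär": 3, "apr": 4, "mai": 5, "jun": 6,
    "jul": 7, "aug": 8, "sep": 9, "okt": 10, "nov": 11, "dez": 12,
}

# day, a dot, a month name made of letters, an optional dot, a 4-digit year
_PATTERN = re.compile(r"^\s*(\d{1,2})\.\s*([^\W\d_]+)\.?\s*(\d{4})\s*$")


def parse_german_date(text):
    """Parse strings such as '07. März 1971' or '29. Nov. 1970' into a date."""
    # Normalize so that a decomposed 'a' + combining diaeresis becomes 'ä'.
    text = unicodedata.normalize("NFC", text)
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"unrecognized date format: {text!r}")
    day, month_name, year = match.groups()
    key = month_name.lower()[:3]
    if key not in GERMAN_MONTHS:
        raise ValueError(f"unknown month name: {month_name!r}")
    return date(int(year), GERMAN_MONTHS[key], int(day))


parsed = [parse_german_date(s) for s in ["07. März 1971", "29. Nov. 1970"]]
```

Assuming pandas is available, the same function could be applied to the column with df['Datum'].apply(parse_german_date), in the same way as any other per-value parser.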
11/20/2021
string dateInput = "Jan 1, 2009";
var parsedDate = DateTime.Parse(dateInput);
Console.WriteLine(parsedDate);
// Displays the following output on a system whose culture is en-US:
// 1/1/2009 00:00:00
Dim MyString As String = "Jan 1, 2009"
Dim MyDateTime As DateTime = DateTime.Parse(MyString)
Console.WriteLine(MyDateTime)
' Displays the following output on a system whose culture is en-US:
'       1/1/2009 00:00:00
You can also explicitly define the culture whose formatting conventions are used when you parse a string. You specify one of the standard DateTimeFormatInfo objects returned by the CultureInfo.DateTimeFormat property. The following example uses a format provider to parse a German string into a DateTime. It creates a CultureInfo representing the de-DE culture. That CultureInfo object ensures successful parsing of this particular string, and it overrides whatever culture is set in the CurrentCulture of the CurrentThread.
var cultureInfo = new CultureInfo("de-DE");
string dateString = "12 Juni 2008";
var dateTime = DateTime.Parse(dateString, cultureInfo);
Console.WriteLine(dateTime);
// The example displays the following output:
// 6/12/2008 00:00:00
In the following example, the DateTime.ParseExact method is passed a string object to parse, followed by a format specifier, followed by a CultureInfo object. With the "D" format specifier, this ParseExact call can only parse strings that follow the long date pattern of the en-US culture.
var cultureInfo = new CultureInfo("en-US");
string[] dateStrings = {
" Friday, April 10, 2009",
"Friday, April 10, 2009"
};
foreach(string dateString in dateStrings) {
try {
var dateTime = DateTime.ParseExact(dateString, "D", cultureInfo);
Console.WriteLine(dateTime);
} catch (FormatException) {
Console.WriteLine("Unable to parse '{0}'", dateString);
}
}
// The example displays the following output:
// Unable to parse ' Friday, April 10, 2009'
// 4/10/2009 00:00:00
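For readers coming from the pandas question, a rough Python analogue of this exact-format behavior is datetime.strptime, which also rejects input that does not match the format string from its first character. This is only a sketch and not equivalent to the .NET API: the format string LONG_DATE_FORMAT and the helper try_parse_exact are hypothetical names, and the %A and %B directives match English names only under the assumption that the process runs with the default "C" (or an English) LC_TIME locale.

```python
from datetime import datetime

# Hypothetical stand-in for the en-US long date pattern ("D" in .NET).
# Assumption: LC_TIME is the default "C" locale, so %A and %B are English.
LONG_DATE_FORMAT = "%A, %B %d, %Y"


def try_parse_exact(text, fmt=LONG_DATE_FORMAT):
    """Return a datetime if text matches fmt exactly, otherwise None."""
    try:
        return datetime.strptime(text, fmt)
    except ValueError:
        # Raised for any mismatch, including leading whitespace.
        return None


date_strings = [" Friday, April 10, 2009", "Friday, April 10, 2009"]
results = [try_parse_exact(s) for s in date_strings]

for text, result in zip(date_strings, results):
    if result is None:
        print(f"Unable to parse {text!r}")
    else:
        print(result)
```

As in the C# example, the string with the leading space is rejected and the second string is parsed.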
You can do this with clock. It takes the same approach as readr, where you can supply a locale object containing the localized month and weekday names. clock_labels("de") doesn't seem to exactly match the abbreviations you want, but you can just overwrite them with the ones you need.

Since you have month-precision input, use year_month_day_parse() to parse into a month-precision calendar type, then set the day to whatever you want (here I chose 1, the first of the month) and convert to Date.

Your other option is to use readr::parse_datetime(), where you can supply a vector of month names in locale(); locale("de") comes with German names built in.

But setting exact = TRUE or calling strptime(x = monthly_german, format = "%b %Y") also results in NAs, and using parse_date_time2 is not an option either, since its C parser only understands English month names and wrongly interprets some of my months.
monthly_german <- c("Feb 1999", "Mär 2000", "Mai 2001", "Mär 2001", "Dez 2001", "Mär 2002", "Sep 2002", "Nov 2002", "Mai 2003", "Feb 2004", "Sep 2004", "Nov 2004", "Sep 2005", "Nov 2005", "Nov 2006", "Jun 2007", "Feb 2008", "Nov 2008", "Feb 2009", "Sep 2009", "Nov 2009", "Sep 2010", "Nov 2010", "Mär 2012", "Jun 2012", "Nov 2012", "Mär 2013", "Sep 2013", "Nov 2013", "Feb 2014", "Mai 2014", "Sep 2014", "Nov 2014", "Jun 2015", "Feb 2016")
parse_date_time(x = monthly_german,
orders = "b Y")
data.frame(original = monthly_german,
parsed = parse_date_time2(x = monthly_german,
orders = "b Y"))
       orig     parsed
1  Feb 1999 1999-02-01
2  Mär 2000 2000-05-01
3  Mai 2001 2001-05-01
4  Mär 2001 2001-05-01
5  Dez 2001 2001-12-01
6  Mär 2002 2002-05-01
7  Sep 2002 2002-09-01
8  Nov 2002 2002-11-01
9  Mai 2003 2003-05-01
10 Feb 2004 2004-02-01
11 Sep 2004 2004-09-01
12 Nov 2004 2004-11-01
13 Sep 2005 2005-09-01
14 Nov 2005 2005-11-01
15 Nov 2006 2006-11-01
16 Jun 2007 2007-06-01
17 Feb 2008 2008-02-01
18 Nov 2008 2008-11-01
19 Feb 2009 2009-02-01
20 Sep 2009 2009-09-01
21 Nov 2009 2009-11-01
22 Sep 2010 2010-09-01
23 Nov 2010 2010-11-01
24 Mär 2012 2012-05-01
25 Jun 2012 2012-06-01
26 Nov 2012 2012-11-01
27 Mär 2013 2013-05-01
28 Sep 2013 2013-09-01
29 Nov 2013 2013-11-01
30 Feb 2014 2014-02-01
31 Mai 2014 2014-05-01
32 Sep 2014 2014-09-01
33 Nov 2014 2014-11-01
34 Jun 2015 2015-06-01
35 Feb 2016 2016-02-01
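In the output above, every "Mär" entry comes back as month 05 instead of 03. For comparison, here is a minimal Python sketch of the idea of supplying the month names explicitly instead of relying on an English-only parser. It is not the readr, clock, or lubridate implementation; GERMAN_MONTH_ABBR and parse_year_month are hypothetical names, and the abbreviation list is an assumption chosen to match the spellings in monthly_german.

```python
from datetime import date

# Assumed abbreviations, chosen to match the spellings used in monthly_german.
GERMAN_MONTH_ABBR = ["Jan", "Feb", "Mär", "Apr", "Mai", "Jun",
                     "Jul", "Aug", "Sep", "Okt", "Nov", "Dez"]


def parse_year_month(text, month_abbr=GERMAN_MONTH_ABBR, day=1):
    """Parse 'Mon YYYY' with an explicit list of month abbreviations.

    The input has month precision only, so the day is set explicitly
    (the first of the month by default).
    """
    lookup = {name.lower(): number
              for number, name in enumerate(month_abbr, start=1)}
    month_text, year_text = text.split()
    try:
        month = lookup[month_text.lower()]
    except KeyError:
        raise ValueError(f"unknown month abbreviation: {month_text!r}") from None
    return date(int(year_text), month, day)


sample = ["Feb 1999", "Mär 2000", "Mai 2001", "Dez 2001"]
parsed = [parse_year_month(s) for s in sample]
```

Because the lookup is built from the supplied list, "Mär" and "Mai" resolve to different months, and an unknown abbreviation raises an error instead of being silently misread.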
> month(as.yearmon(c("Mär 2000", "März 2000")))
[1] 3 3
> month(as.yearmon(c("Mar 2000", "March 2000")))
[1] NA NA
> Sys.setlocale(locale = "en_GB.utf8")
[1] "LC_CTYPE=en_GB.utf8;LC_NUMERIC=C;LC_TIME=en_GB.utf8;LC_COLLATE=en_GB.utf8;LC_MONETARY=en_GB.utf8;LC_MESSAGES=de_CH.UTF-8;LC_PAPER=de_CH.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=de_CH.UTF-8;LC_IDENTIFICATION=C"
> month(as.yearmon(c("Mar 2000", "March 2000")))
[1] NA NA