How can I replace Unicode escape sequences with Turkish characters in a text file with Python?

You can read a file containing a JSON object with the json.load function. This returns a Python object with the escaped characters decoded. Writing it again with json.dump and passing ensure_ascii=False as an argument writes the object back to a file without encoding Turkish characters as escape sequences. An example:

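import json

# Read the escaped JSON, then write it back without \uXXXX escapes.
with open('input.txt', 'r', encoding='utf-8') as inp:
    in_as_obj = json.load(inp)
with open('output.txt', 'w', encoding='utf-8') as out:
    json.dump(in_as_obj, out, ensure_ascii=False)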
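
To see what ensure_ascii changes without touching any files, here is a minimal in-memory sketch. The sample JSON string is made up for illustration and is not taken from the asker's data.

```python
import json

# A JSON document as it might appear in the input file: Turkish letters
# are stored as \uXXXX escape sequences (illustrative sample text).
escaped = '{"text": "Ya\\u011f\\u0131z \\u015eof\\u00f6r"}'

obj = json.loads(escaped)                        # escapes are decoded while parsing
ascii_only = json.dumps(obj)                     # default: non-ASCII is escaped again
readable = json.dumps(obj, ensure_ascii=False)   # Turkish letters are kept as-is

print(ascii_only)
print(readable)
```

With the default settings the output is escaped again, so the round trip only helps when ensure_ascii=False is passed on the way out.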

Your file isn't really a JSON file, but instead a file containing multiple JSON objects. If each JSON object is on its own line, you can try the following:

import json

with open('input.txt', 'r', encoding='utf-8') as inp, \
        open('output.txt', 'w', encoding='utf-8') as out:
    for line in inp:
        if not line.strip():
            # Copy blank lines through unchanged.
            out.write(line)
            continue
        in_as_obj = json.loads(line)
        json.dump(in_as_obj, out, ensure_ascii=False)
        out.write('\n')

But in your case it's probably better to write unescaped JSON to the file in the first place. Try replacing your on_data method with the following (untested):

def on_data(self, raw_data):
    data = json.loads(raw_data)
    print(json.dumps(data, ensure_ascii=False))

If you instead want to replace Turkish characters with their closest ASCII letters, you can use a translation table:

# For Turkish Character
translationTable = str.maketrans("ğĞıİöÖüÜşŞçÇ", "gGiIoOuUsScC")

yourText = "Pijamalı Hasta Yağız Şoföre Çabucak Güvendi"
yourText = yourText.translate(translationTable)

print(yourText)

Suggestion : 2

Here are the Unicode characters and the Turkish characters which I want to replace.

I tried two different approaches:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import re

dosya = open('veri.txt', 'r')

for line in dosya:
    match = re.search(line, "\u011f")
    if (match):
        replace("\u011f", "ğ")

dosya.close()

and:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

f1 = open('veri.txt', 'r')
f2 = open('veri2.txt', 'w')

for line in f1:
    f2.write = (line.replace('\u011f', 'ğ'))
    f2.write = (line.replace('\u011e', 'Ğ'))
    f2.write = (line.replace('\u0131', 'ı'))
    f2.write = (line.replace('\u0130', 'İ'))
    f2.write = (line.replace('\u00f6', 'ö'))
    f2.write = (line.replace('\u00d6', 'Ö'))
    f2.write = (line.replace('\u00fc', 'ü'))
    f2.write = (line.replace('\u00dc', 'Ü'))
    f2.write = (line.replace('\u015f', 'ş'))
    f2.write = (line.replace('\u015e', 'Ş'))
    f2.write = (line.replace('\u00e7', 'ç'))
    f2.write = (line.replace('\u00c7', 'Ç'))

f1.close()
f2.close()
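
Neither attempt works as written. The first passes its arguments to re.search in the wrong order and calls an undefined replace function. The second assigns to f2.write instead of calling it, and in Python 3 a plain literal such as '\u011f' is already the character ğ rather than the six-character escape text. If the file is plain text rather than JSON, one possible approach is to decode every literal \uXXXX sequence with a single regular expression. This is a sketch under that assumption; the helper name decode_unicode_escapes is hypothetical, and escapes written as surrogate pairs are not handled.

```python
import re

# Matches a literal backslash, the letter 'u', and exactly four hex digits.
ESCAPE_PATTERN = re.compile(r'\\u([0-9a-fA-F]{4})')


def decode_unicode_escapes(text):
    # Replace each literal \uXXXX sequence with the character it names.
    return ESCAPE_PATTERN.sub(lambda m: chr(int(m.group(1), 16)), text)


# Raw string, so the backslashes stay in the text as they would in the file.
sample = r'Ya\u011f\u0131z \u015eof\u00f6r'
decoded = decode_unicode_escapes(sample)
print(decoded)
```

Applied line by line while copying veri.txt to veri2.txt, this covers all twelve Turkish letters without listing them one at a time.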

Suggestion : 3

If you want to read or write a text file with Python, you first need to open the file. To open a file, you can use Python's built-in open() function. A file object does not itself contain readable text; to read it as text, you need to use the .read() method. To write something to a newly opened text file, you can use the .write() method. We haven't fully discussed Python modules and for loops yet, but once you're comfortable with these concepts, it's helpful to know how to work with all the files in a directory.

>>> open('sample-file.txt', encoding='utf-8')
<_io.TextIOWrapper name='sample-file.txt' mode='r' encoding='utf-8'>
>>> open('sample-file.txt', mode='r', encoding='utf-8').read()
'This text file is now open and being read!'
>>> open('a-new-file.txt', mode='w', encoding='utf-8')
<_io.TextIOWrapper name='a-new-file.txt' mode='w' encoding='utf-8'>
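
The calls above leave the files open. A minimal sketch of the same write-then-read cycle using with blocks, which close the file automatically, might look like this. The temporary directory is an assumption made only so the example cleans up after itself.

```python
import os
import tempfile

text = "Pijamalı Hasta Yağız Şoföre Çabucak Güvendi"

# Work in a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sample-file.txt")

    # Write the Turkish text as UTF-8.
    with open(path, mode="w", encoding="utf-8") as handle:
        handle.write(text)

    # Read it back with the same encoding.
    with open(path, mode="r", encoding="utf-8") as handle:
        read_back = handle.read()

print(read_back)
```

Passing encoding='utf-8' on both sides matters here: the platform default encoding may not be able to represent letters such as ğ or ş.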

Suggestion : 4

With the 'xmlcharrefreplace' error handler, the unencodable character is replaced by an appropriate XML/HTML numeric character reference, which is a decimal form of the Unicode code point with the format &#num;. It is implemented in xmlcharrefreplace_errors(). Both the 'xmlcharrefreplace' and the 'namereplace' error handling apply to encoding within text encodings only.

>>> 'German ß, ♬'.encode(encoding='ascii', errors='backslashreplace')
b'German \\xdf, \\u266c'
>>> 'German ß, ♬'.encode(encoding='ascii', errors='xmlcharrefreplace')
b'German &#223;, &#9836;'
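
The same handlers can be tried on Turkish text. The sketch below uses an illustrative sample word and also shows that ASCII bytes produced by 'backslashreplace' can be turned back into the original string with the unicode_escape codec, which is the direction the question asks about.

```python
word = "Yağız"

# Three ways of forcing non-ASCII letters into ASCII bytes.
backslash = word.encode("ascii", errors="backslashreplace")
xmlref = word.encode("ascii", errors="xmlcharrefreplace")
named = word.encode("ascii", errors="namereplace")

# Reverse the backslash form: decode the escapes back into characters.
restored = backslash.decode("unicode_escape")

print(backslash)
print(xmlref)
print(named)
print(restored)
```

Note that unicode_escape is only safe on pure ASCII input like this; bytes that already contain UTF-8 encoded Turkish letters would be misread.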