Efficient way to change the header of a file in Python


Python handles various file operations. When reading a file, you can start from the first line, or you can skip the header row and start from line 2. This article shows how to skip the header row (the first line) and start reading from line 2, using a sample.txt file and a sample.csv file.

The first method reads sample.txt and uses next() to skip the header, so that reading starts from line 2.

Sample Text File //sample.txt

Student Details of Class X
David, 18, Science
Amy, 19, Commerce
Frank, 19, Commerce
Mark, 18, Arts
John, 18, Science

Sample CSV File //sample.csv

Student Details of Class X
David 18 Science
Amy 19 Commerce
Frank 19 Commerce
Mark 18 Arts
John 18 Science

Note: If you want to print the header later, store it instead of discarding it: use header_line = next(f) or header_line = f.readline(). Both calls return the header line and advance the file object to line 2.

# open the file; the with statement closes it automatically
with open("sample.txt") as f:
    # skip the header so that reading starts from line 2
    next(f)
    for line in f:
        print(line)
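The note about keeping the header can be sketched as follows. This is a minimal illustration, not part of the original article: the helper name split_header and the temporary demo file are assumptions made for the example.

```python
import os
import tempfile


def split_header(path, encoding="utf-8"):
    """Return (header, body_lines) for a text file, without trailing newlines.

    split_header is a hypothetical helper name used only for this sketch.
    """
    with open(path, encoding=encoding) as f:
        # next(f, "") returns the first line, or "" if the file is empty
        header = next(f, "").rstrip("\n")
        # the file object now continues from line 2
        body = [line.rstrip("\n") for line in f]
    return header, body


# Demo on a throwaway file that mimics sample.txt
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "sample.txt")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write("Student Details of Class X\nDavid, 18, Science\nAmy, 19, Commerce\n")
    header, body = split_header(demo_path)
    print(header)
    print(body)
```

Passing a default to next() avoids a StopIteration error when the file is empty.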

We use the sample.txt file to read the contents. This method imports islice from the itertools module. Here islice() is called with three arguments: the first is the file object to read from, the second is the index at which reading starts (1 skips the first line), and the third is the stop position, where None means "read to the end of the file". This is an efficient and Pythonic way of solving the problem, and it extends to an arbitrary number of header lines. Because it works on any iterator, it also works for in-memory uploaded files that are iterated like file objects.

from itertools import islice

# open the file; the with statement closes it automatically
with open("sample.txt") as f:
    # start at index 1, i.e. skip the first line
    for line in islice(f, 1, None):
        print(line)
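To illustrate the point about several header lines and in-memory files, here is a small sketch. The function name lines_after_header and the two-line header in the demo data are assumptions for illustration; io.StringIO stands in for an uploaded file.

```python
import io
from itertools import islice


def lines_after_header(file_obj, header_lines=1):
    """Yield lines from file_obj, skipping the first header_lines lines.

    lines_after_header is a hypothetical helper name used for this sketch.
    """
    # islice(iterable, start, stop): stop=None means "until exhausted"
    for line in islice(file_obj, header_lines, None):
        yield line.rstrip("\n")


# io.StringIO plays the role of an in-memory upload (assumption for the demo)
upload = io.StringIO(
    "Student Details of Class X\n"
    "name, age, stream\n"
    "David, 18, Science\n"
    "Amy, 19, Commerce\n"
)
rows = list(lines_after_header(upload, header_lines=2))
print(rows)
```

If the file has fewer lines than header_lines, the generator simply yields nothing.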

We use the sample.csv file to read the contents. This method wraps the file in csv.reader, skips the header row with next(), and prints the remaining rows starting from line 2. It can also be useful when reading the contents of multiple CSV files.

import csv

# open the file; newline='' is recommended for files used with the csv module
with open("sample.csv", 'r', newline='') as r:
    rr = csv.reader(r)
    # skip the header row
    next(rr)
    for row in rr:
        print(row)
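The remark about multiple CSV files could look like the sketch below. The helper name rows_without_headers, the file names, and the column names are hypothetical and exist only for the demonstration.

```python
import csv
import os
import tempfile


def rows_without_headers(paths, encoding="utf-8"):
    """Yield the data rows of every CSV file in paths, skipping each header row.

    rows_without_headers is a hypothetical helper name for this sketch.
    """
    for path in paths:
        with open(path, newline="", encoding=encoding) as f:
            reader = csv.reader(f)
            # discard the header; the None default avoids StopIteration on empty files
            next(reader, None)
            yield from reader


# Demo: build two small CSV files that share the same header
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    demo_files = [
        ("a.csv", ["David", "18", "Science"]),
        ("b.csv", ["Amy", "19", "Commerce"]),
    ]
    for name, student in demo_files:
        path = os.path.join(tmp, name)
        with open(path, "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerow(["name", "age", "stream"])
            writer.writerow(student)
        paths.append(path)
    all_rows = list(rows_without_headers(paths))
    print(all_rows)
```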

Suggestion : 2

There are various methods available on the file object for reading. We can use the read(size) method to read up to size characters. If the size parameter is not specified, it reads and returns everything up to the end of the file. The read() method returns a newline as '\n'. Once the end of the file is reached, further reads return an empty string. Alternatively, we can use the readline() method to read individual lines of a file. This method reads up to and including the newline character.
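The behaviour of read(size), readline() and reading at the end of the file can be checked with a short sketch. The temporary file and its contents are assumptions made for the demo.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("first line\nsecond line\n")

    with open(path, encoding="utf-8") as f:
        chunk = f.read(5)            # reads the first 5 characters
        rest_of_line = f.readline()  # reads up to and including the newline
        remainder = f.read()         # no size given: reads to the end of the file
        at_eof = f.read()            # at end of file, read() returns ''

print(repr(chunk), repr(rest_of_line), repr(remainder), repr(at_eof))
```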

Python has a built-in open() function to open a file. This function returns a file object, also called a handle, as it is used to read or modify the file accordingly.

>>> f = open("test.txt")  # open file in current directory
>>> f = open("C:/Python38/README.txt")  # specifying full path

We can also specify the mode when opening a file:

f = open("test.txt")  # equivalent to 'r' or 'rt'
f = open("test.txt", 'w')  # write in text mode
f = open("img.bmp", 'r+b')  # read and write in binary mode

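The default encoding is platform dependent, so code that relies on it can behave differently on different systems. Hence, when working with files in text mode, it is highly recommended to specify the encoding type.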

f = open("test.txt", mode='r', encoding='utf-8')

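Calling close() by hand is not entirely safe: if an exception occurs while working with the file, the code exits without closing it. A safer way is to use a try...finally block.

f = open("test.txt", encoding='utf-8')
try:
    # perform file operations
    data = f.read()
finally:
    # runs even if an exception was raised above
    f.close()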

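The best way to close a file is to use the with statement, which closes the file when the block is exited, even if an exception occurs. We don't need to explicitly call the close() method. It is done internally.

with open("test.txt", encoding='utf-8') as f:
    # perform file operations
    data = f.read()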
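A quick way to convince yourself that the with statement closes the file, even when an exception occurs, is to inspect the closed attribute afterwards. The sketch below uses a temporary file and a deliberately raised ValueError, both of which are assumptions for the demonstration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.txt")

    # normal exit from the with block
    with open(path, "w", encoding="utf-8") as f:
        f.write("hello\n")
    closed_after_with = f.closed

    # exit caused by an exception inside the with block
    try:
        with open(path, encoding="utf-8") as g:
            raise ValueError("simulated failure")
    except ValueError:
        pass
    closed_after_error = g.closed

print(closed_after_with, closed_after_error)
```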

Suggestion : 3

To write data into a CSV file, follow these steps. First, open the CSV file for writing (w mode) by using the open() function. Second, create a CSV writer object by calling the writer() function of the csv module. Third, write data to the CSV file by calling the writerow() or writerows() method of the CSV writer object. Finally, close the file once you have finished writing.

The following code illustrates the above steps:

import csv

# open the file in the write mode
f = open('path/to/csv_file', 'w')

# create the csv writer
writer = csv.writer(f)

# write a row to the csv file
writer.writerow(row)

# close the file
f.close()

It’ll be shorter if you use the with statement so that you don’t need to call the close() method to explicitly close the file:

import csv

# open the file in the write mode
with open('path/to/csv_file', 'w') as f:
    # create the csv writer
    writer = csv.writer(f)

    # write a row to the csv file
    writer.writerow(row)

The following illustrates how to write UTF-8 characters to a CSV file:

import csv

# open the file in the write mode
with open('path/to/csv_file', 'w', encoding='UTF8') as f:
    # create the csv writer
    writer = csv.writer(f)

    # write a row to the csv file
    writer.writerow(row)

On Windows, the resulting CSV file may contain an extra blank line between rows. To remove the blank line, you pass the keyword argument newline='' to the open() function as follows:

import csv

header = ['name', 'area', 'country_code2', 'country_code3']
data = ['Afghanistan', 652090, 'AF', 'AFG']

with open('countries.csv', 'w', encoding='UTF8', newline='') as f:
    writer = csv.writer(f)

    # write the header
    writer.writerow(header)

    # write the data
    writer.writerow(data)

The following uses the writerows() method to write multiple rows into the countries.csv file:

import csv

header = ['name', 'area', 'country_code2', 'country_code3']
data = [
    ['Albania', 28748, 'AL', 'ALB'],
    ['Algeria', 2381741, 'DZ', 'DZA'],
    ['American Samoa', 199, 'AS', 'ASM'],
    ['Andorra', 468, 'AD', 'AND'],
    ['Angola', 1246700, 'AO', 'AGO']
]

with open('countries.csv', 'w', encoding='UTF8', newline='') as f:
    writer = csv.writer(f)

    # write the header
    writer.writerow(header)

    # write multiple rows
    writer.writerows(data)
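Coming back to the question in the title, one possible way to change the header of an existing CSV file without loading it all into memory is to write the new header to a temporary file, stream the rest of the data across, and then swap the files. This is a sketch under stated assumptions, not a method given in the excerpts above: the function name replace_header and the new column names are hypothetical, and the old header is assumed to occupy exactly one physical line (no quoted fields containing newlines).

```python
import csv
import os
import shutil
import tempfile


def replace_header(path, new_header, encoding="utf-8"):
    """Replace the first line of the CSV file at path with new_header.

    replace_header is a hypothetical helper; it assumes the old header
    is a single physical line.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # write to a temporary file in the same directory, then swap it into place
    with open(path, newline="", encoding=encoding) as src, \
            tempfile.NamedTemporaryFile("w", newline="", encoding=encoding,
                                        dir=directory, delete=False) as dst:
        src.readline()                        # skip the old header line
        csv.writer(dst).writerow(new_header)  # write the new header
        shutil.copyfileobj(src, dst)          # copy the remaining data in chunks
    os.replace(dst.name, path)


# Demo on a throwaway copy of countries.csv
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "countries.csv")
    with open(demo, "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["name", "area", "country_code2", "country_code3"])
        writer.writerow(["Albania", 28748, "AL", "ALB"])
    replace_header(demo, ["country", "area_km2", "iso2", "iso3"])
    with open(demo, newline="", encoding="utf-8") as f:
        result = list(csv.reader(f))
    print(result)
```

Since only the header is parsed and the remaining data is copied in chunks, memory use stays small even for large files. The file still has to be rewritten, because a header of a different length cannot be overwritten in place.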

Suggestion : 4

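The tarfile module also works with headers, in this case the headers stored inside tar archives. The pax_headers attribute of a TarInfo object is a dictionary containing key-value pairs of an associated pax extended header, while the pax_headers attribute of a TarFile object is a dictionary containing key-value pairs of pax global headers. A TarInfo object also provides some convenient query methods. The details of character conversion in tarfile are controlled by the encoding and errors keyword arguments of the TarFile class.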

$ python -m tarfile -c monty.tar spam.txt eggs.txt
$ python -m tarfile -c monty.tar life-of-brian_1979/
$ python -m tarfile -e monty.tar
$ python -m tarfile -e monty.tar other-dir/
$ python -m tarfile -l monty.tar

import tarfile
tar = tarfile.open("sample.tar.gz")
tar.extractall()
tar.close()
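The pax_headers dictionary and the TarInfo query methods mentioned above can be tried out with an in-memory archive. This is only a sketch: the member name, its contents and the "comment" header value are made up for the demonstration.

```python
import io
import tarfile

# build a small pax-format archive in memory
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w", format=tarfile.PAX_FORMAT) as tar:
    payload = b"spam"
    info = tarfile.TarInfo(name="spam.txt")
    info.size = len(payload)
    # extra key-value pairs stored in this member's pax extended header
    info.pax_headers = {"comment": "demo"}
    tar.addfile(info, io.BytesIO(payload))

# read the archive back and inspect its members
buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r") as tar:
    members = tar.getmembers()
    # isfile() and isdir() are two of the TarInfo query methods
    summary = [(m.name, m.isfile(), m.isdir(), m.size) for m in members]
    headers = members[0].pax_headers

print(summary)
print(headers)
```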