Add columns to a DataFrame calculated by for loops in Python

As I understand it, i, j and bina are not part of your df. Build an array for each of them, with each array element representing one row. Once all rows for i, j and bina are ready, you can concatenate them with df like this:

>>> new_df = pd.DataFrame(data={'i': i, 'j': j, 'bina': bina},
...                       columns=['i', 'j', 'bina'])
>>> pd.concat([df, new_df], axis=1)

Alternatively, once you have collected all the data for 'i', 'j' and 'bina' in separate arrays, you can assign each array directly as a new column:

>>> df['i'] = i
>>> df['j'] = j
>>> df['bina'] = bina
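
Both approaches above can be sketched end to end as follows. This is only an illustration: the starting frame, the column names a and b, and the multiplication used to fill bina are all assumptions, since the original question's df and calculation are not shown.

```python
import pandas as pd

# Hypothetical starting frame; the original question's df is not shown.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Collect one value per row in plain lists while looping.
i_vals, j_vals, bina_vals = [], [], []
for row in df.itertuples(index=False):
    i_vals.append(row.a)
    j_vals.append(row.b)
    bina_vals.append(row.a * row.b)  # placeholder calculation

# Option 1: build a second frame and concatenate it column-wise.
new_df = pd.DataFrame({"i": i_vals, "j": j_vals, "bina": bina_vals}, index=df.index)
combined = pd.concat([df, new_df], axis=1)

# Option 2: assign each list directly as a new column.
df["i"] = i_vals
df["j"] = j_vals
df["bina"] = bina_vals

print(combined)
```

Either way the lists are filled first and attached to the frame once, which keeps the loop body free of DataFrame writes.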

Typically you add columns to a DataFrame using its built-in __setitem__(), which you access with []. For example:

import pandas as pd

df = pd.DataFrame()

df["one"] = [1, 1, 1]
df["two"] = [2, 2, 2]
df["three"] = [3, 3, 3]

print(df)

# Output:
#    one  two  three
# 0    1    2      3
# 1    1    2      3
# 2    1    2      3

# Column names as they are before the loop adds new ones
list_ib = df.columns.values

for i in list_ib:
    for j in list_ib:
        if i == j:
            break
        else:
            bina = df[i] * df[j]
            # Add new column which is the result of multiplying columns i and j together
            df['bina_' + str(i) + '_' + str(j)] = bina

print(df)

# Output:
#    one  two  three  bina_two_one  bina_three_one  bina_three_two
# 0    1    2      3             2               3               6
# 1    1    2      3             2               3               6
# 2    1    2      3             2               3               6
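
The nested loop with break visits each pair of distinct columns exactly once. A possible alternative, offered as a sketch rather than as the original answer's method, is to let itertools.combinations generate those pairs. The name base_cols is introduced here for illustration.

```python
from itertools import combinations

import pandas as pd

df = pd.DataFrame({"one": [1, 1, 1], "two": [2, 2, 2], "three": [3, 3, 3]})

# Snapshot the column names first so the loop never sees the columns it adds.
base_cols = list(df.columns)

# combinations() yields each unordered pair once:
# (one, two), (one, three), (two, three).
for a, b in combinations(base_cols, 2):
    df[f"bina_{b}_{a}"] = df[a] * df[b]

print(df)
```

The f-string keeps the same bina_<later>_<earlier> naming as the loop above, so the resulting columns match.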

Suggestion : 2

Last Updated : 09 Jun, 2022

Output:

   First_name  Last_name  Marks  Results
0         Ram      Kumar     12     Fail
1       Mohan     Sharma     52     Pass
2        Tina        Ali     36     Pass
3       Jeetu     Gandhi     85     Pass
4       Meera     Kumari     23     Fail

   First_name  Last_name  Marks  Result
0         Ram      Kumar     12    Fail
1       Mohan     Sharma     52    Pass
2        Tina        Ali     36    Pass
3       Jeetu     Gandhi     85    Pass
4       Meera     Kumari     23    Fail
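
The code that produced this output did not survive in the text. A minimal sketch that reproduces the table is given here, assuming the Result column is derived from Marks with a pass mark of 33. That threshold is a guess: any value above 23 and at most 36 gives the same table.

```python
import pandas as pd

df = pd.DataFrame({
    "First_name": ["Ram", "Mohan", "Tina", "Jeetu", "Meera"],
    "Last_name": ["Kumar", "Sharma", "Ali", "Gandhi", "Kumari"],
    "Marks": [12, 52, 36, 85, 23],
})

PASS_MARK = 33  # assumed threshold, not stated in the source

# Build the new column's values in a for loop, then assign them in one step.
result = []
for marks in df["Marks"]:
    result.append("Pass" if marks >= PASS_MARK else "Fail")
df["Result"] = result

print(df)
```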

Suggestion : 3

This tutorial demonstrates how to add new columns to a pandas DataFrame within a for loop in Python. It contains one example, which illustrates how to use a for loop to append new variables to a pandas DataFrame. The example DataFrame created in the code below consists of five data points and three columns.

import pandas as pd                          # Load pandas

data = pd.DataFrame({'x1': range(5, 10),     # Create pandas DataFrame
                     'x2': range(10, 15),
                     'x3': range(20, 25)})
print(data)                                  # Print pandas DataFrame

for i in range(1, 4):                        # Append columns within for loop
    data[i] = i * 3
print(data)                                  # Print updated DataFrame
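
Assigning inside the loop works well for a handful of columns. One variant, shown here as a sketch only, collects the new columns in a dict and attaches them in a single pd.concat call. The column names new1 to new3 are made up for this illustration and are not part of the tutorial.

```python
import pandas as pd

data = pd.DataFrame({"x1": range(5, 10), "x2": range(10, 15), "x3": range(20, 25)})

# Collect the new columns in a dict first ("new1".."new3" are made-up names).
new_cols = {}
for i in range(1, 4):
    new_cols[f"new{i}"] = [i * 3] * len(data)

# Attach them all at once, aligned on the existing index.
data = pd.concat([data, pd.DataFrame(new_cols, index=data.index)], axis=1)
print(data)
```

Using string names also avoids mixing integer and string column labels in the same frame.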

Suggestion : 4

To create a new column, use the [] brackets with the new column name on the left side of the assignment. The calculation of the values is done element-wise. This means all values in the given column are multiplied by the value 1.882 at once. You do not need to use a loop to iterate over each of the rows! Create a new column by assigning the output to the DataFrame with a new column name in between the []. The rename() function can be used for both row labels and column labels. Provide a dictionary with the current names as keys and the new names as values to update the corresponding names.

In [1]: import pandas as pd

In [2]: air_quality = pd.read_csv("data/air_quality_no2.csv", index_col=0, parse_dates=True)

In [3]: air_quality.head()
Out[3]:
                     station_antwerp  station_paris  station_london
datetime
2019-05-07 02:00:00              NaN            NaN            23.0
2019-05-07 03:00:00             50.5           25.0            19.0
2019-05-07 04:00:00             45.0           27.7            19.0
2019-05-07 05:00:00              NaN           50.4            16.0
2019-05-07 06:00:00              NaN           61.9             NaN

In [4]: air_quality["london_mg_per_cubic"] = air_quality["station_london"] * 1.882

In [5]: air_quality.head()
Out[5]:
                     station_antwerp  station_paris  station_london  london_mg_per_cubic
datetime
2019-05-07 02:00:00              NaN            NaN            23.0               43.286
2019-05-07 03:00:00             50.5           25.0            19.0               35.758
2019-05-07 04:00:00             45.0           27.7            19.0               35.758
2019-05-07 05:00:00              NaN           50.4            16.0               30.112
2019-05-07 06:00:00              NaN           61.9             NaN                  NaN

In [6]: air_quality["ratio_paris_antwerp"] = (
   ...:     air_quality["station_paris"] / air_quality["station_antwerp"]
   ...: )
   ...:

In [7]: air_quality.head()
Out[7]:
                     station_antwerp  station_paris  station_london  london_mg_per_cubic  ratio_paris_antwerp
datetime
2019-05-07 02:00:00              NaN            NaN            23.0               43.286                  NaN
2019-05-07 03:00:00             50.5           25.0            19.0               35.758             0.495050
2019-05-07 04:00:00             45.0           27.7            19.0               35.758             0.615556
2019-05-07 05:00:00              NaN           50.4            16.0               30.112                  NaN
2019-05-07 06:00:00              NaN           61.9             NaN                  NaN                  NaN

In [8]: air_quality_renamed = air_quality.rename(
   ...:     columns={
   ...:         "station_antwerp": "BETR801",
   ...:         "station_paris": "FR04014",
   ...:         "station_london": "London Westminster",
   ...:     }
   ...: )
   ...:

In [9]: air_quality_renamed.head()
Out[9]:
                     BETR801  FR04014  London Westminster  london_mg_per_cubic  ratio_paris_antwerp
datetime
2019-05-07 02:00:00      NaN      NaN                23.0               43.286                  NaN
2019-05-07 03:00:00     50.5     25.0                19.0               35.758             0.495050
2019-05-07 04:00:00     45.0     27.7                19.0               35.758             0.615556
2019-05-07 05:00:00      NaN     50.4                16.0               30.112                  NaN
2019-05-07 06:00:00      NaN     61.9                 NaN                  NaN                  NaN
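
The element-wise behaviour can be checked without the CSV file. The sketch below uses a small stand-in frame holding only the station_london values from the output above, and compares the vectorised multiplication with an explicit loop. The column name london_loop is hypothetical and exists only for the comparison.

```python
import pandas as pd

# Small stand-in for the air-quality data; the CSV itself is not reproduced here.
air_quality = pd.DataFrame({"station_london": [23.0, 19.0, 19.0, 16.0, None]})

# Vectorised: one expression multiplies every row at once (NaN stays NaN).
air_quality["london_mg_per_cubic"] = air_quality["station_london"] * 1.882

# Equivalent explicit loop, shown only for comparison.
looped = [value * 1.882 for value in air_quality["station_london"]]
air_quality["london_loop"] = looped

print(air_quality)
```

Both columns hold the same values, so the loop adds nothing except extra code and slower execution on large frames.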

Suggestion : 5

The DataFrame class provides a member function iteritems() (named items() in newer versions; iteritems() was removed in pandas 2.0). It yields an iterator which can be used to iterate over all the columns of a DataFrame. For each column in the DataFrame, it yields a tuple containing the column name and the column contents as a Series. To iterate over the columns of a DataFrame by index, we can iterate over a range from 0 to the number of columns and, for each index, select the column contents using iloc[].

Let’s first create a DataFrame:

import pandas as pd

# List of tuples
employees = [('jack', 34, 'Sydney'),
             ('Riti', 31, 'Delhi'),
             ('Aadi', 16, 'New York'),
             ('Mohit', 32, 'Delhi'),
             ]

# Create a DataFrame object
empDfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City'], index=['a', 'b', 'c', 'd'])

    Name  Age      City
a   jack   34    Sydney
b   Riti   31     Delhi
c   Aadi   16  New York
d  Mohit   32     Delhi

Let’s use items() (called iteritems() before pandas 2.0) to iterate over the columns of the DataFrame created above:

# Yields a tuple of column name and Series for each column in the DataFrame
for (columnName, columnData) in empDfObj.items():
    print('Column Name : ', columnName)
    print('Column Contents : ', columnData.values)

Suppose we want to iterate over only two columns, Name and City, of the DataFrame created above. To do that, we can select just those columns from the DataFrame and then iterate over them:

# Iterate over two given columns only from the DataFrame
for column in empDfObj[['Name', 'City']]:
    # Select column contents by column name using the [] operator
    columnSeriesObj = empDfObj[column]
    print('Column Name : ', column)
    print('Column Contents : ', columnSeriesObj.values)

Since DataFrame.columns returns a sequence of column names, we can iterate over these names in reverse order and, for each one, select the column contents by name:

# Iterate over the sequence of column names in reverse order
for column in reversed(empDfObj.columns):
    # Select column contents by column name using the [] operator
    columnSeriesObj = empDfObj[column]
    print('Column Name : ', column)
    print('Column Contents : ', columnSeriesObj.values)
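
Iterating over the columns by index with iloc[] is described in this suggestion but no code for it is shown. A minimal sketch, rebuilding the same example frame so that it runs on its own, might look like this (the loop variable name position is an illustrative choice):

```python
import pandas as pd

empDfObj = pd.DataFrame(
    [("jack", 34, "Sydney"), ("Riti", 31, "Delhi"),
     ("Aadi", 16, "New York"), ("Mohit", 32, "Delhi")],
    columns=["Name", "Age", "City"],
    index=["a", "b", "c", "d"],
)

# Walk the column positions 0 .. n-1 and pull each column out with iloc.
for position in range(empDfObj.shape[1]):
    columnSeriesObj = empDfObj.iloc[:, position]
    print("Column Number : ", position)
    print("Column Contents : ", columnSeriesObj.values)
```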