python pandas : create new rows from values of a list column


You can pass the list constructor as an aggregating function along axis 1 (columns), so that it is applied to each row:

data['New_column'] = data.agg(list, axis = 1)

Outputs:

         C1    C2    C3    C4               New_column
     0  0.98  1.25  1.30  1.00   [0.98, 1.25, 1.3, 1.0]
     1  1.10  0.99  1.41  0.99  [1.1, 0.99, 1.41, 0.99]

Alternatively, build the list for each row in a loop and add the result as a new column with DataFrame.insert():

import pandas as pd

def append_new_column():
    data = {
        "C1": [0.98, 1.10],
        "C2": [1.25, 0.99],
        "C3": [1.3, 1.41],
        "C4": [1.00, 0.99]
    }
    data = pd.DataFrame(data)
    new_column = []
    for i in range(len(data)):
        # Collect the first four values of row i as a plain Python list
        new_column.append(data.iloc[i, 0:4].tolist())
    # Insert the lists as the last column
    data.insert(len(data.columns), "New Column", new_column, True)
    return data
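
The two snippets above pack each row into a list. The question in the title goes the other way: turning the values of a list column into separate rows. A minimal sketch of that direction, assuming pandas 0.25 or later where DataFrame.explode() is available; the frame and the column names "id" and "values" are made up for illustration:

```python
import pandas as pd

# Hypothetical frame with one list-valued column
df = pd.DataFrame({
    "id": [1, 2],
    "values": [[0.98, 1.25], [1.10, 0.99, 1.41]],
})

# explode() emits one row per list element and repeats the other columns
long_df = df.explode("values").reset_index(drop=True)
print(long_df)
```

The exploded column keeps the object dtype, so a follow-up astype(float) may be needed before doing arithmetic on it.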

Suggestion : 2

This article discusses how to add a single row or multiple rows to a DataFrame using DataFrame.append(), loc[] and iloc[]. append() also accepts a list of Series whose index matches the DataFrame's column names, which adds several rows in one call. Assigning to loc[] with a new index label (for example 'k') appends a row under that label. Assigning a list to iloc[] at an existing position (for example the 3rd row) writes the list into that row, replacing its current contents.

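Pandas provides DataFrame.append() to add rows to a DataFrame. The method was deprecated in pandas 1.4 and removed in pandas 2.0, so the examples in this section need an older pandas version. Its signature is: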

DataFrame.append(other, ignore_index = False, verify_integrity = False, sort = None)

Suppose the DataFrame df contains the following rows:

    Name  Age       City    Country
a   jack   34     Sydeny  Australia
b   Riti   30      Delhi      India
c  Vikas   31     Mumbai      India
d  Neelu   32  Bangalore      India
e   John   16   New York         US
f   Mike   17  las vegas         US

Let’s add a new row to the above DataFrame by passing a dictionary:

# Pass the row elements as key-value pairs to the append() function
mod_df = df.append({
      'Name': 'Sahil',
      'Age': 22
   },
   ignore_index = True)

print('Modified Dataframe')
print(mod_df)

A complete example that adds a dictionary as a row to the DataFrame is as follows:

import pandas as pd

# List of Tuples
students = [('jack', 34, 'Sydeny', 'Australia'),
   ('Riti', 30, 'Delhi', 'India'),
   ('Vikas', 31, 'Mumbai', 'India'),
   ('Neelu', 32, 'Bangalore', 'India'),
   ('John', 16, 'New York', 'US'),
   ('Mike', 17, 'las vegas', 'US')
]

#Create a DataFrame object
df = pd.DataFrame(students,
   columns = ['Name', 'Age', 'City', 'Country'],
   index = ['a', 'b', 'c', 'd', 'e', 'f'])

print('Original Dataframe')
print(df)

# Pass the row elements as key-value pairs to the append() function
mod_df = df.append({
      'Name': 'Sahil',
      'Age': 22
   },
   ignore_index = True)

print('Modified Dataframe')
print(mod_df)

Output:

Original Dataframe
    Name  Age       City    Country
a   jack   34     Sydeny  Australia
b   Riti   30      Delhi      India
c  Vikas   31     Mumbai      India
d  Neelu   32  Bangalore      India
e   John   16   New York         US
f   Mike   17  las vegas         US
Modified Dataframe
    Name  Age       City    Country
0   jack   34     Sydeny  Australia
1   Riti   30      Delhi      India
2  Vikas   31     Mumbai      India
3  Neelu   32  Bangalore      India
4   John   16   New York         US
5   Mike   17  las vegas         US
6  Sahil   22        NaN        NaN
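
On pandas 2.0 and later, where DataFrame.append() is no longer available, the same result can be obtained with pd.concat(). The following is a sketch under that assumption, using a shortened version of the students data; wrapping the dictionary in a one-row DataFrame is one possible approach, not the only one:

```python
import pandas as pd

students = [('jack', 34, 'Sydeny', 'Australia'),
            ('Riti', 30, 'Delhi', 'India')]
df = pd.DataFrame(students,
                  columns=['Name', 'Age', 'City', 'Country'],
                  index=['a', 'b'])

# Wrap the dictionary in a one-row DataFrame; missing columns become NaN
new_row = pd.DataFrame([{'Name': 'Sahil', 'Age': 22}])

# ignore_index=True renumbers the rows 0..n-1, as append(..., ignore_index=True) did
mod_df = pd.concat([df, new_row], ignore_index=True)
print(mod_df)
```

To add several rows at once, put all of them into the list passed to pd.DataFrame() before concatenating, rather than calling pd.concat() in a loop.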

Suggestion : 3

I’m interested in the age of the Titanic passengers. To select a single column, use square brackets [] with the column name of the column of interest. Each column in a DataFrame is a Series. As a single column is selected, the returned object is a pandas Series. We can verify this by checking the type of the output. To select several columns, as in titanic[["Age", "Sex"]], the inner square brackets define a Python list with column names, whereas the outer brackets are used to select the data from a pandas DataFrame, as in the single-column case.

In [1]: import pandas as pd

In [2]: titanic = pd.read_csv("data/titanic.csv")

In [3]: titanic.head()
Out[3]:
   PassengerId  Survived  Pclass                                               Name     Sex  ...  Parch            Ticket     Fare  Cabin  Embarked
0            1         0       3                            Braund, Mr. Owen Harris    male  ...      0         A/5 21171   7.2500    NaN         S
1            2         1       1  Cumings, Mrs. John Bradley (Florence Briggs Th...  female  ...      0          PC 17599  71.2833    C85         C
2            3         1       3                             Heikkinen, Miss. Laina  female  ...      0  STON/O2. 3101282   7.9250    NaN         S
3            4         1       1       Futrelle, Mrs. Jacques Heath (Lily May Peel)  female  ...      0            113803  53.1000   C123         S
4            5         0       3                           Allen, Mr. William Henry    male  ...      0            373450   8.0500    NaN         S

[5 rows x 12 columns]

In [4]: ages = titanic["Age"]

In [5]: ages.head()
Out[5]:
0    22.0
1    38.0
2    26.0
3    35.0
4    35.0
Name: Age, dtype: float64

In [6]: type(titanic["Age"])
Out[6]: pandas.core.series.Series

In [7]: titanic["Age"].shape
Out[7]: (891,)

In [8]: age_sex = titanic[["Age", "Sex"]]

In [9]: age_sex.head()
Out[9]:
    Age     Sex
0  22.0    male
1  38.0  female
2  26.0  female
3  35.0  female
4  35.0    male
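
The session above depends on a local data/titanic.csv file. As a self-contained sketch of the same distinction, the snippet below uses a small hand-made frame (three rows taken from the output above, stored under the hypothetical name sample) to show that single brackets return a Series and double brackets return a DataFrame:

```python
import pandas as pd

# Small stand-in for the Titanic data, so no CSV file is needed
sample = pd.DataFrame({
    "Name": ["Braund, Mr. Owen Harris", "Heikkinen, Miss. Laina", "Allen, Mr. William Henry"],
    "Age": [22.0, 26.0, 35.0],
    "Sex": ["male", "female", "male"],
})

ages = sample["Age"]              # single brackets: one-dimensional Series
age_sex = sample[["Age", "Sex"]]  # list of names: two-dimensional DataFrame
age_only = sample[["Age"]]        # a one-element list still gives a DataFrame

print(type(ages).__name__, ages.shape)
print(type(age_sex).__name__, age_sex.shape)
print(type(age_only).__name__, age_only.shape)
```

The shapes make the difference visible: a Series has a one-element shape tuple, while a DataFrame always reports both rows and columns.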

Suggestion : 4

By using df.loc[index] = list you can add a list as a row to the DataFrame at a specified index label. To add it at the end, use len(df) as the label, which is one past the last row when the DataFrame has the default integer index. The example below adds the list ["Hyperion", 27000, "60days", 2000] to the end of the pandas DataFrame. Use df.iloc[1] = list to write the list into the second position (positions start from zero); note that this overwrites the existing row rather than inserting a new one. Using append() you can also add a Series or a one-row DataFrame as a row. Together, DataFrame.loc[], DataFrame.iloc[] and DataFrame.append() let you add or replace a row at a chosen position or index.

# New list to add as a row at the end of the DataFrame
new_row = ["Hyperion", 27000, "60days", 2000]
df.loc[len(df)] = new_row

# Overwrites the row at the second position
df.iloc[1] = new_row

# Using append() (pandas < 2.0) with a one-row DataFrame
new_row = ["Bigdata", 27000, "40days", 2800]
df2 = df.append(pd.DataFrame([new_row],
      columns = ["Courses", "Fee", "Duration", "Discount"]),
   ignore_index = True)

# Using append() with a Series whose index matches the DataFrame's columns
df2 = df.append(pd.Series(new_row, index = ["Courses", "Fee", "Duration", "Discount"]),
   ignore_index = True)

Let’s create a DataFrame with a few rows and columns and execute some examples and validate the results. Our DataFrame contains column names Courses, Fee, Duration and Discount.

import pandas as pd
technologies = {
   'Courses': ["Spark", "PySpark", "Hadoop", "Python", "Pandas"],
   'Fee': [22000, 25000, 23000, 24000, 26000],
   'Duration': ['30days', '50days', '35days', '40days', '55days'],
   'Discount': [1000, 2300, 1000, 1200, 2500]
}
df = pd.DataFrame(technologies)
print(df)

This yields the output below.

   Courses    Fee Duration  Discount
0    Spark  22000   30days      1000
1  PySpark  25000   50days      2300
2   Hadoop  23000   35days      1000
3   Python  24000   40days      1200
4   Pandas  26000   55days      2500

After the list ["Hyperion", 27000, "60days", 2000] is added with df.loc[len(df)], the DataFrame looks like this:

    Courses    Fee Duration  Discount
0     Spark  22000   30days      1000
1   PySpark  25000   50days      2300
2    Hadoop  23000   35days      1000
3    Python  24000   40days      1200
4    Pandas  26000   55days      2500
5  Hyperion  27000   60days      2000

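Use df.iloc[1] = new_row to write a list into the second position of the DataFrame (positions start from zero). This replaces the row that is currently there; the number of rows does not change.

# New list to write into the DataFrame
new_row = ["Oracle", 20000, "60days", 2000]
# iloc[] overwrites the existing row at position 1
df.iloc[1] = new_row
print(df)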
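
The difference between appending with loc[] and overwriting with iloc[] is easy to miss, so here is a minimal sketch that checks the row counts. It uses a cut-down two-column version of the courses data; the last step, which really inserts a row at position 1 by concatenating slices, is an illustrative approach rather than something taken from the text above:

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop"],
    "Fee": [22000, 25000, 23000],
})

# loc[] with a new label adds a row (len(df) is unused with the default index)
appended = df.copy()
appended.loc[len(appended)] = ["Hyperion", 27000]

# iloc[] at an existing position replaces that row
overwritten = df.copy()
overwritten.iloc[1] = ["Oracle", 20000]

# To insert without losing a row, concatenate the slices around the new row
new_row = pd.DataFrame([["Oracle", 20000]], columns=df.columns)
inserted = pd.concat([df.iloc[:1], new_row, df.iloc[1:]], ignore_index=True)

print(len(appended), len(overwritten), len(inserted))
```

Working on copies keeps the three variants independent, which makes the comparison of their lengths meaningful.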

Suggestion : 5

There are multiple ways to get a Python list from a pandas DataFrame, depending on what sort of list you want to create. To quickly get a list in which each item represents a row of the DataFrame, you can use the tolist() function, as in df.values.tolist(). Here, df.values returns the NumPy representation of the DataFrame df, which tolist() then converts to a list of lists, with each inner list representing one row. The following are some of the ways to get a list from a pandas DataFrame, explained with examples.

First, let’s create a dataframe of a sample stock portfolio that we’ll be using throughout this tutorial.

import pandas as pd

data = {
   'Name': ['Microsoft Corporation', 'Google, LLC', 'Tesla, Inc.',
      'Apple Inc.', 'Netflix, Inc.'
   ],
   'Symbol': ['MSFT', 'GOOG', 'TSLA', 'AAPL', 'NFLX'],
   'Industry': ['Tech', 'Tech', 'Automotive', 'Tech', 'Entertainment'],
   'Shares': [100, 50, 150, 200, 80]
}

df = pd.DataFrame(data)
print(df)

Output:

                    Name Symbol       Industry  Shares
0  Microsoft Corporation   MSFT           Tech     100
1            Google, LLC   GOOG           Tech      50
2            Tesla, Inc.   TSLA     Automotive     150
3             Apple Inc.   AAPL           Tech     200
4          Netflix, Inc.   NFLX  Entertainment      80

As mentioned above, you can quickly get a list from a dataframe using the tolist() function.

ls = df.values.tolist()
print(ls)
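
To make the shape of the result concrete, here is a small sketch on a hypothetical two-row frame (not the portfolio above). It uses DataFrame.to_numpy(), which the pandas documentation recommends over the .values attribute, and produces the same nested list:

```python
import pandas as pd

small = pd.DataFrame({"Symbol": ["MSFT", "GOOG"], "Shares": [100, 50]})

# One inner list per row, in column order
rows = small.to_numpy().tolist()
print(rows)  # [['MSFT', 100], ['GOOG', 50]]
```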

You can also use the tolist() function on individual columns of a DataFrame to get a list of column values.

# list with each item representing a column
ls = []
for col in df.columns:
    # convert pandas series to list
    col_ls = df[col].tolist()
    # append column list to ls
    ls.append(col_ls)
# print the created list
print(ls)

You can also create a list by iterating over the rows of the DataFrame.

ls = []
# iterate over the rows
for i, row in df.iterrows():
    # create a list representing the dataframe row
    row_ls = [row['Name'], row['Symbol'], row['Industry'], row['Shares']]
    # append row list to ls
    ls.append(row_ls)

print(ls)
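
Both loops above can be written more compactly. The sketch below is one possible shorthand, shown on a reduced two-column frame: itertuples(index=False) yields each row as a tuple without building a Series per row as iterrows() does, and a list comprehension over df.columns collects the column lists.

```python
import pandas as pd

df = pd.DataFrame({
    "Symbol": ["MSFT", "GOOG", "TSLA"],
    "Shares": [100, 50, 150],
})

# Row-wise: one list per row
rows = [list(t) for t in df.itertuples(index=False)]

# Column-wise: one list per column
cols = [df[c].tolist() for c in df.columns]

print(rows)
print(cols)
```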