You can pass the list constructor as an aggregating function along axis 1 (columns), so that it is applied to each row:
data['New_column'] = data.agg(list, axis = 1)
Outputs:
     C1    C2    C3    C4               New_column
0  0.98  1.25  1.30  1.00   [0.98, 1.25, 1.3, 1.0]
1  1.10  0.99  1.41  0.99  [1.1, 0.99, 1.41, 0.99]
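The one-liner above assumes an existing DataFrame named data. A minimal self-contained sketch, assuming the sample values from the output shown; value_columns is a name introduced here for illustration, and restricting the aggregation to those columns keeps the snippet safe to re-run after New_column exists:

```python
import pandas as pd

# Sample data matching the output shown above
data = pd.DataFrame({
    "C1": [0.98, 1.10],
    "C2": [1.25, 0.99],
    "C3": [1.30, 1.41],
    "C4": [1.00, 0.99],
})

# Hypothetical helper name: the columns whose values go into each list
value_columns = ["C1", "C2", "C3", "C4"]

# Apply the list constructor to each row (axis=1) of the selected columns
data["New_column"] = data[value_columns].agg(list, axis=1)

print(data)
```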
You can also use the insert() method to add a new column at a chosen position; in the function below it is placed after the last existing column. This should solve your issue.
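import pandas as pd

def append_new_column():
    data = {
        "C1": [0.98, 1.10],
        "C2": [1.25, 0.99],
        "C3": [1.3, 1.41],
        "C4": [1.00, .99]
    }
    data = pd.DataFrame(data)
    # Collect the values of the first four columns of each row (as NumPy arrays)
    new_column = []
    for i in range(len(data)):
        new_column.append(data.iloc[i, 0:4].values)
    # Insert as the last column; the final True is allow_duplicates
    data.insert(len(data.columns), "New Column", new_column, True)
    return data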
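The explicit loop is not strictly needed. A possible alternative sketch, assuming every column should go into the list: to_numpy().tolist() produces one plain Python list per row, which can then be passed to insert(). The name row_lists is introduced here for illustration.

```python
import pandas as pd

data = pd.DataFrame({
    "C1": [0.98, 1.10],
    "C2": [1.25, 0.99],
    "C3": [1.3, 1.41],
    "C4": [1.00, 0.99],
})

# to_numpy() gives a 2-D array; tolist() turns it into one Python list per row
row_lists = data.to_numpy().tolist()

# Place the new column after the last existing column
data.insert(len(data.columns), "New Column", row_lists)

print(data)
```

Unlike the loop version, each cell here holds a Python list rather than a NumPy array.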
In this article, we will discuss how to add or append a single row or multiple rows to a dataframe using dataframe.append() or loc and iloc. We can also pass a list of series to dataframe.append() to append multiple rows; for example, we can create a list of series with the same column names as the dataframe. We can set a row at a specific position using the iloc[] attribute, for example writing a list into the 3rd row of the dataframe (this overwrites the row already at that position). Assigning a list through loc[] with a new index label such as ‘k’ appends a new row to the dataframe under that label.
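Pandas Dataframe provides a function dataframe.append() to add rows to a dataframe. Note that append() was deprecated in pandas 1.4 and removed in pandas 2.0, so the append() examples here need an older pandas version; pd.concat() is the replacement. Its signature is: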
DataFrame.append(other, ignore_index = False, verify_integrity = False, sort = None)
Name Age City Country
a jack 34 Sydeny Australia
b Riti 30 Delhi India
c Vikas 31 Mumbai India
d Neelu 32 Bangalore India
e John 16 New York US
f Mike 17 las vegas US
Let’s add a new row to the above dataframe by passing a dictionary, i.e.
# Pass the row elements as key value pairs to append() function
mod_df = df.append({'Name': 'Sahil', 'Age': 22}, ignore_index=True)
print('Modified Dataframe')
print(mod_df)
The complete example to add a dictionary as a row to the dataframe is as follows:
import pandas as pd

# List of Tuples
students = [('jack', 34, 'Sydeny', 'Australia'),
            ('Riti', 30, 'Delhi', 'India'),
            ('Vikas', 31, 'Mumbai', 'India'),
            ('Neelu', 32, 'Bangalore', 'India'),
            ('John', 16, 'New York', 'US'),
            ('Mike', 17, 'las vegas', 'US')]

# Create a DataFrame object
df = pd.DataFrame(students,
                  columns=['Name', 'Age', 'City', 'Country'],
                  index=['a', 'b', 'c', 'd', 'e', 'f'])
print('Original Dataframe')
print(df)

# Pass the row elements as key value pairs to append() function
mod_df = df.append({'Name': 'Sahil', 'Age': 22}, ignore_index=True)
print('Modified Dataframe')
print(mod_df)
Output:
Original Dataframe
Name Age City Country
a jack 34 Sydeny Australia
b Riti 30 Delhi India
c Vikas 31 Mumbai India
d Neelu 32 Bangalore India
e John 16 New York US
f Mike 17 las vegas US
Modified Dataframe
Name Age City Country
0 jack 34 Sydeny Australia
1 Riti 30 Delhi India
2 Vikas 31 Mumbai India
3 Neelu 32 Bangalore India
4 John 16 New York US
5 Mike 17 las vegas US
6 Sahil 22 NaN NaN
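On pandas 2.0 and later, DataFrame.append() no longer exists and the call above raises an AttributeError. A minimal sketch of the same step with pd.concat(), assuming a shortened version of the students data; new_row is a name chosen here for illustration:

```python
import pandas as pd

students = [('jack', 34, 'Sydeny', 'Australia'),
            ('Riti', 30, 'Delhi', 'India'),
            ('Vikas', 31, 'Mumbai', 'India')]
df = pd.DataFrame(students,
                  columns=['Name', 'Age', 'City', 'Country'],
                  index=['a', 'b', 'c'])

# Wrap the dictionary in a one-row DataFrame, then concatenate.
# Columns missing from the dictionary (City, Country) are filled with NaN.
new_row = pd.DataFrame([{'Name': 'Sahil', 'Age': 22}])
mod_df = pd.concat([df, new_row], ignore_index=True)

print(mod_df)
```

As with append(..., ignore_index=True), the original labels a, b, c are replaced by a fresh integer index.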
I’m interested in the age of the Titanic passengers. To select a single column, use square brackets [] with the column name of the column of interest. Each column in a DataFrame is a Series. As a single column is selected, the returned object is a pandas Series. We can verify this by checking the type of the output. To select multiple columns, use a list of column names within the selection brackets: the inner square brackets define a Python list with column names, whereas the outer brackets are used to select the data from a pandas DataFrame, as in the single-column case.
In[1]: import pandas as pd
In[2]: titanic = pd.read_csv("data/titanic.csv")
In[3]: titanic.head()
Out[3]:
   PassengerId  Survived  Pclass                                               Name     Sex  ...  Parch            Ticket     Fare  Cabin  Embarked
0            1         0       3                            Braund, Mr. Owen Harris    male  ...      0         A/5 21171   7.2500    NaN         S
1            2         1       1  Cumings, Mrs. John Bradley (Florence Briggs Th...  female  ...      0          PC 17599  71.2833    C85         C
2            3         1       3                             Heikkinen, Miss. Laina  female  ...      0  STON/O2. 3101282   7.9250    NaN         S
3            4         1       1       Futrelle, Mrs. Jacques Heath (Lily May Peel)  female  ...      0            113803  53.1000   C123         S
4            5         0       3                           Allen, Mr. William Henry    male  ...      0            373450   8.0500    NaN         S
[5 rows x 12 columns]
In[4]: ages = titanic["Age"]
In[5]: ages.head()
Out[5]:
0 22.0
1 38.0
2 26.0
3 35.0
4 35.0
Name: Age, dtype: float64
In[6]: type(titanic["Age"])
Out[6]: pandas.core.series.Series
In[7]: titanic["Age"].shape
Out[7]: (891,)
In[8]: age_sex = titanic[["Age", "Sex"]]
In[9]: age_sex.head()
Out[9]:
Age Sex
0 22.0 male
1 38.0 female
2 26.0 female
3 35.0 female
4 35.0 male
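The difference between single and double brackets can be checked without the CSV file. A small sketch, assuming a hypothetical three-row stand-in for the Titanic data (shortened names, same Age and Sex values as the first rows above); age_only is a name introduced here for illustration:

```python
import pandas as pd

# Hypothetical stand-in for the first rows of the Titanic data
titanic = pd.DataFrame({
    "Name": ["Braund", "Cumings", "Heikkinen"],
    "Age": [22.0, 38.0, 26.0],
    "Sex": ["male", "female", "female"],
})

ages = titanic["Age"]              # single brackets -> Series
age_sex = titanic[["Age", "Sex"]]  # list of names -> DataFrame
age_only = titanic[["Age"]]        # one-element list -> still a DataFrame

print(type(ages).__name__, ages.shape)
print(type(age_sex).__name__, age_sex.shape)
print(type(age_only).__name__, age_only.shape)
```

A Series is one-dimensional, so its shape has a single entry, while a DataFrame selection keeps two dimensions even when only one column is requested.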
By using df.loc[index] = list you can append a list as a row to the DataFrame at a specified index label. To add at the end, use len(df) as the label, which is one past the last label when the DataFrame has the default integer index. The example below adds the list ["Hyperion",27000,"60days",2000] to the end of the pandas DataFrame. Use df.iloc[1] = list to write the list into the second position of the DataFrame (positions start from zero); this overwrites the existing row rather than inserting a new one. Using append() you can also append a Series as a row to the DataFrame. This article covers how to add a list as a row to a pandas DataFrame using the DataFrame.loc[], DataFrame.iloc[] and DataFrame.append() methods.
# New list to append as a row to the DataFrame
# (named new_row so that it does not shadow Python's built-in list)
new_row = ["Hyperion", 27000, "60days", 2000]
df.loc[len(df)] = new_row

# Overwrites the row at the second position
df.iloc[1] = new_row

# Using append() (pandas versions before 2.0 only)
new_row = ["Bigdata", 27000, "40days", 2800]
df2 = df.append(pd.DataFrame([new_row], columns=["Courses", "Fee", "Duration", "Discount"]), ignore_index=True)

# A Series object whose index matches the DataFrame columns
df2 = df.append(pd.Series(new_row, index=["Courses", "Fee", "Duration", "Discount"]), ignore_index=True)
Let’s create a DataFrame with a few rows and columns, execute some examples, and validate the results. Our DataFrame contains the column names Courses, Fee, Duration and Discount.
import pandas as pd
technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop", "Python", "Pandas"],
    'Fee': [22000, 25000, 23000, 24000, 26000],
    'Duration': ['30days', '50days', '35days', '40days', '55days'],
    'Discount': [1000, 2300, 1000, 1200, 2500]
}
df = pd.DataFrame(technologies)
print(df)
Yields below output.
   Courses    Fee Duration  Discount
0    Spark  22000   30days      1000
1  PySpark  25000   50days      2300
2   Hadoop  23000   35days      1000
3   Python  24000   40days      1200
4   Pandas  26000   55days      2500
Appending the list ["Hyperion", 27000, "60days", 2000] with df.loc[len(df)] yields the below output.
    Courses    Fee Duration  Discount
0     Spark  22000   30days      1000
1   PySpark  25000   50days      2300
2    Hadoop  23000   35days      1000
3    Python  24000   40days      1200
4    Pandas  26000   55days      2500
5  Hyperion  27000   60days      2000
Use df.iloc[1] = list to write the list into the second position of the DataFrame, as positions start from zero. This replaces the row that is already there; it does not insert a new one.
# New list to write into the DataFrame
new_row = ["Oracle", 20000, "60days", 2000]

# Using iloc[]: overwrites the row at position 1
df.iloc[1] = new_row
print(df)
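Since iloc[] assignment overwrites a row, actually inserting a row at a given position needs a different approach. One possible sketch, assuming a shortened version of the courses data: split the DataFrame at the target position and concatenate the pieces around the new row. insert_row is a hypothetical helper written for this example, not a pandas API.

```python
import pandas as pd

df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Hadoop"],
    'Fee': [22000, 25000, 23000],
    'Duration': ['30days', '50days', '35days'],
    'Discount': [1000, 2300, 1000],
})

def insert_row(frame, position, values):
    """Return a new DataFrame with `values` inserted before `position`.

    Hypothetical helper; it assumes `values` has one entry per column,
    in column order, and it renumbers the index from zero.
    """
    new_row = pd.DataFrame([values], columns=frame.columns)
    parts = [frame.iloc[:position], new_row, frame.iloc[position:]]
    return pd.concat(parts, ignore_index=True)

inserted = insert_row(df, 1, ["Oracle", 20000, "60days", 2000])
print(inserted)
```

The original df is left unchanged; the helper returns a new DataFrame with the existing rows shifted down by one from the insertion point.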
There are multiple ways to get a Python list from a pandas dataframe, depending on what sort of list you want to create. To quickly get a list with each item representing a row in the dataframe, you can use the tolist() function, as in df.values.tolist(). Here df.values returns the NumPy representation of the dataframe df, which is then converted to a list using tolist(); the result is a list of lists, with each inner list representing a row in the dataframe. The following are some of the ways to get a list from a pandas dataframe, explained with examples.
First, let’s create a dataframe of a sample stock portfolio that we’ll be using throughout this tutorial.
import pandas as pd
data = {
    'Name': ['Microsoft Corporation', 'Google, LLC', 'Tesla, Inc.',
             'Apple Inc.', 'Netflix, Inc.'],
    'Symbol': ['MSFT', 'GOOG', 'TSLA', 'AAPL', 'NFLX'],
    'Industry': ['Tech', 'Tech', 'Automotive', 'Tech', 'Entertainment'],
    'Shares': [100, 50, 150, 200, 80]
}
df = pd.DataFrame(data)
print(df)
Output:
                    Name Symbol       Industry  Shares
0  Microsoft Corporation   MSFT           Tech     100
1            Google, LLC   GOOG           Tech      50
2            Tesla, Inc.   TSLA     Automotive     150
3             Apple Inc.   AAPL           Tech     200
4          Netflix, Inc.   NFLX  Entertainment      80
As mentioned above, you can quickly get a list from a dataframe using the tolist() function.
ls = df.values.tolist()
print(ls)
You can also use the tolist() function on individual columns of a dataframe to get a list of column values.
# list with each item representing a column
ls = []
for col in df.columns:
    # convert pandas series to list
    col_ls = df[col].tolist()
    # append column list to ls
    ls.append(col_ls)
# print the created list
print(ls)
You can also create a list by iterating through the rows of the dataframe.
ls = []
# iterate over the rows
for i, row in df.iterrows():
    # create a list representing the dataframe row
    row_ls = [row['Name'], row['Symbol'], row['Industry'], row['Shares']]
    # append row list to ls
    ls.append(row_ls)
print(ls)
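The row-wise and column-wise lists can also be built without spelling out each column name. A short sketch of some alternatives, assuming a reduced version of the portfolio data; the variable names are chosen here for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'Symbol': ['MSFT', 'GOOG', 'TSLA'],
    'Industry': ['Tech', 'Tech', 'Automotive'],
    'Shares': [100, 50, 150],
})

# One list per row, via the NumPy representation
rows_from_values = df.values.tolist()

# One list per row, via itertuples(); index=False leaves out the index label
rows_from_tuples = [list(row) for row in df.itertuples(index=False)]

# One list per column, keyed by column name
columns_as_lists = df.to_dict(orient='list')

print(rows_from_values)
print(rows_from_tuples)
print(columns_as_lists)
```

Both row-wise variants give the same list of lists here; the dictionary form keeps the column names attached to each list of values.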