You can make copies of each slice, like this:
def process(x):
    new = []
    for d in x:
        d = d.copy()  # each one is now a copy
        d.iloc[1, 0] = 0
        d.iloc[1, 2] = 0
        new.append(d)
    return new
Change your code and the process function call as follows to get your required output. I also call copy() inside the for loop so that each subset of the dataframe is independent of later changes. In your version, the assignments write through to the original df, which is why the zeros show up in the other dataframes in the dfs list:
for col in range(df.shape[1] - 2):
    for row in range(df.shape[0] - 2):
        dfs.append(df.iloc[row:row + 3, col:col + 3].copy())
dfs = process(dfs)
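As a quick check of the copy-based fix, here is a minimal sketch. It assumes the 6x6 values printed in the question (typed in by hand rather than regenerated from the random seed) and repeats the process function so that it runs on its own.

```python
import pandas as pd

# Assumption: these values are copied by hand from the question's printed
# dataframe instead of being regenerated with np.random.seed(2).
df = pd.DataFrame([
    [1, 1, 4, 3, 4, 1],
    [3, 2, 4, 3, 5, 5],
    [5, 4, 5, 3, 4, 4],
    [3, 2, 3, 5, 4, 1],
    [5, 4, 2, 3, 1, 5],
    [5, 3, 5, 3, 2, 1],
])
original = df.copy()  # kept only to verify that df is left untouched


def process(frames):
    """Zero out positions (1, 0) and (1, 2) in a copy of each 3x3 window."""
    out = []
    for d in frames:
        d = d.copy()
        d.iloc[1, 0] = 0
        d.iloc[1, 2] = 0
        out.append(d)
    return out


# Collect the 16 overlapping 3x3 windows, each as an independent copy.
dfs = []
for col in range(df.shape[1] - 2):
    for row in range(df.shape[0] - 2):
        dfs.append(df.iloc[row:row + 3, col:col + 3].copy())

dfs = process(dfs)
print(dfs[0])
print(df.equals(original))  # True: the source dataframe is unchanged
```

Because every window is copied before it is modified, the zeros written into one window cannot show up in df or in any overlapping window.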
Long story short, you are a victim of chained indexing, which can lead to bad things happening: a slice taken with iloc can still refer to the data of the original df, so assigning into one slice can also change df and every slice that overlaps it.
Lastly, note that dfs = process(dfs) is actually fine; you don't need to make a copy of the enclosing list.
I am creating a dataframe like this.
import numpy as np
import pandas as pd

np.random.seed(2)
df = pd.DataFrame(np.random.randint(1, 6, (6, 6)))
out[]
   0  1  2  3  4  5
0  1  1  4  3  4  1
1  3  2  4  3  5  5
2  5  4  5  3  4  4
3  3  2  3  5  4  1
4  5  4  2  3  1  5
5  5  3  5  3  2  1
Splitting the dataframe into 3x3 matrices like below gives 16 matrices.
dfs = []
for col in range(df.shape[1] - 2):
    for row in range(df.shape[0] - 2):
        dfs.append(df.iloc[row:row + 3, col:col + 3])
Let's print:
dfs[0]
1 1 4
3 2 4
5 4 5
dfs[1]
3 2 4
5 4 5
3 2 3
.
.
.
dfs[15]
5 4 1
3 1 5
3 2 1
My expected output is:
dfs[0]
1 1 4
0 2 0
5 4 5
But what my function returns is:
dfs[0]
1 1 4
0 0 0
0 0 0
dfs[1]
0 0 0
0 0 0
0 0 0
In this section, we will focus on the final point: namely, how to slice, dice, and generally get and set subsets of pandas objects. The primary focus will be on Series and DataFrame as they have received more development attention in this area.
pandas aligns all AXES when setting Series and DataFrame from .loc, and .iloc.
This will not modify df because the column alignment is before value assignment.
.loc, .iloc, and also [] indexing can accept a callable as indexer. The callable must be a function with one argument (the calling Series or DataFrame) that returns valid output for indexing.
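To make the callable-indexer point concrete, here is a small sketch. The dataframe and its column names are made up for illustration; each lambda receives the calling DataFrame and returns something that is valid for that indexer.

```python
import pandas as pd

# Hypothetical example data, not taken from the text above.
df = pd.DataFrame({"A": [1, -2, 3], "B": [4, 5, 6]})

# .loc with a callable that returns a boolean mask.
positive = df.loc[lambda d: d["A"] > 0]

# .iloc with a callable that returns a list of column positions.
first_col = df.iloc[:, lambda d: [0]]

# [] with a callable that returns a column label.
b_col = df[lambda d: "B"]

print(positive)
print(first_col)
print(b_col)
```

Callables are mostly useful in method chains, where the intermediate result has no name that a plain boolean mask could refer to.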
In [1]: dates = pd.date_range('1/1/2000', periods=8)
In [2]: df = pd.DataFrame(np.random.randn(8, 4),
   ...:                   index=dates, columns=['A', 'B', 'C', 'D'])
   ...:
In [3]: df
Out[3]:
                   A         B         C         D
2000-01-01  0.469112 -0.282863 -1.509059 -1.135632
2000-01-02  1.212112 -0.173215  0.119209 -1.044236
2000-01-03 -0.861849 -2.104569 -0.494929  1.071804
2000-01-04  0.721555 -0.706771 -1.039575  0.271860
2000-01-05 -0.424972  0.567020  0.276232 -1.087401
2000-01-06 -0.673690  0.113648 -1.478427  0.524988
2000-01-07  0.404705  0.577046 -1.715002 -1.039268
2000-01-08 -0.370647 -1.157892 -1.344312  0.844885
In [4]: s = df['A']
In [5]: s[dates[5]]
Out[5]: -0.6736897080883706
In [6]: df
Out[6]:
                   A         B         C         D
2000-01-01  0.469112 -0.282863 -1.509059 -1.135632
2000-01-02  1.212112 -0.173215  0.119209 -1.044236
2000-01-03 -0.861849 -2.104569 -0.494929  1.071804
2000-01-04  0.721555 -0.706771 -1.039575  0.271860
2000-01-05 -0.424972  0.567020  0.276232 -1.087401
2000-01-06 -0.673690  0.113648 -1.478427  0.524988
2000-01-07  0.404705  0.577046 -1.715002 -1.039268
2000-01-08 -0.370647 -1.157892 -1.344312  0.844885
In [7]: df[['B', 'A']] = df[['A', 'B']]
In [8]: df
Out[8]:
                   A         B         C         D
2000-01-01 -0.282863  0.469112 -1.509059 -1.135632
2000-01-02 -0.173215  1.212112  0.119209 -1.044236
2000-01-03 -2.104569 -0.861849 -0.494929  1.071804
2000-01-04 -0.706771  0.721555 -1.039575  0.271860
2000-01-05  0.567020 -0.424972  0.276232 -1.087401
2000-01-06  0.113648 -0.673690 -1.478427  0.524988
2000-01-07  0.577046  0.404705 -1.715002 -1.039268
2000-01-08 -1.157892 -0.370647 -1.344312  0.844885
In [9]: df[['A', 'B']]
Out[9]:
                   A         B
2000-01-01 -0.282863  0.469112
2000-01-02 -0.173215  1.212112
2000-01-03 -2.104569 -0.861849
2000-01-04 -0.706771  0.721555
2000-01-05  0.567020 -0.424972
2000-01-06  0.113648 -0.673690
2000-01-07  0.577046  0.404705
2000-01-08 -1.157892 -0.370647
In [10]: df.loc[:, ['B', 'A']] = df[['A', 'B']]
In [11]: df[['A', 'B']]
Out[11]:
                   A         B
2000-01-01 -0.282863  0.469112
2000-01-02 -0.173215  1.212112
2000-01-03 -2.104569 -0.861849
2000-01-04 -0.706771  0.721555
2000-01-05  0.567020 -0.424972
2000-01-06  0.113648 -0.673690
2000-01-07  0.577046  0.404705
2000-01-08 -1.157892 -0.370647
In [12]: df.loc[:, ['B', 'A']] = df[['A', 'B']].to_numpy()
In [13]: df[['A', 'B']]
Out[13]:
                   A         B
2000-01-01  0.469112 -0.282863
2000-01-02  1.212112 -0.173215
2000-01-03 -0.861849 -2.104569
2000-01-04  0.721555 -0.706771
2000-01-05 -0.424972  0.567020
2000-01-06 -0.673690  0.113648
2000-01-07  0.404705  0.577046
2000-01-08 -0.370647 -1.157892
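The difference between the aligned assignment and the to_numpy() variant is easier to see on a tiny frame. The sketch below uses made-up integer data (an assumption, not the random frame from the session) and compares the two on separate copies.

```python
import pandas as pd

# Hypothetical small frame so the swap is easy to read.
df = pd.DataFrame({"A": [1, 2, 3], "B": [10, 20, 30]})

# Assigning a DataFrame through .loc aligns on column labels first,
# so "A" receives the old "A" and "B" the old "B": nothing changes.
aligned = df.copy()
aligned.loc[:, ["B", "A"]] = aligned[["A", "B"]]

# A raw NumPy array carries no labels, so values are assigned by position:
# "B" receives the old "A" values and "A" the old "B" values.
swapped = df.copy()
swapped.loc[:, ["B", "A"]] = swapped[["A", "B"]].to_numpy()

print(aligned)
print(swapped)
```

Dropping the labels with to_numpy() is what turns the assignment into a real swap.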
In [14]: sa = pd.Series([1, 2, 3], index=list('abc'))
In [15]: dfa = df.copy()
In most cases, think of 'key' as the same as 'name.' Pandas is telling you that it cannot find your column name. The preferred method is to make sure your column name is in your dataframe.
Preferred Option: Make sure that your column label (or row label) is in your dataframe!
It's best to head back upstream with your code and debug where your expectations and dataframe columns mismatch.
Now let's try to call a column that is in our dataframe and one that is NOT in our dataframe.
1. df.get('your_column', default=value_if_no_column)
import pandas as pd

df = pd.DataFrame([('Foreign Cinema', 'Restaurant'),
                   ('Liho Liho', 'Restaurant'),
                   ('500 Club', 'bar'),
                   ('The Square', 'bar')],
                  columns=('name', 'type'))
df
# Is in our dataframe
df['name']
0    Foreign Cinema
1         Liho Liho
2          500 Club
3        The Square
Name: name, dtype: object
# Is not in our dataframe
df['food']
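The last lookup raises KeyError: 'food' because there is no such column. The sketch below rebuilds the same dataframe and shows three ways to deal with a possibly missing column: a membership check, DataFrame.get with a default, and catching the exception. The variable names are only for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [("Foreign Cinema", "Restaurant"),
     ("Liho Liho", "Restaurant"),
     ("500 Club", "bar"),
     ("The Square", "bar")],
    columns=("name", "type"),
)

# Option 1: check that the label exists before indexing.
has_food = "food" in df.columns

# Option 2: DataFrame.get returns the default instead of raising.
food = df.get("food", default=None)

# Option 3: catch the KeyError explicitly.
try:
    df["food"]
    raised = False
except KeyError:
    raised = True

print(has_food, food, raised)  # False None True
```

The membership check is usually the clearest signal that the expected columns and the actual columns have drifted apart somewhere upstream.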