Use groupby, followed by describe:
df.groupby('first_tag').viewcount.describe()
Out[89]:
           count  mean  std   min   25%   50%   75%   max
first_tag
Excel        1.0  83.0  NaN  83.0  83.0  83.0  83.0  83.0
pandas       2.0  87.0  0.0  87.0  87.0  87.0  87.0  87.0
python       1.0  78.0  NaN  78.0  78.0  78.0  78.0  78.0
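If only the min, max and mean are needed, describe can be swapped for agg. The following is a minimal sketch under stated assumptions: it rebuilds the sample rows from the question (leaving out the masked title and tags columns), and the `wanted` list is a hypothetical name for the tags of interest.

```python
import pandas as pd

# Sample rows copied from the question; title and tags are masked there,
# so they are left out of this sketch.
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "viewcount": [78, 87, 87, 83],
    "answercount": [2, 1, 1, 0],
    "first_tag": ["python", "pandas", "pandas", "Excel"],
})

# Hypothetical list of tags to keep (the question names these three).
wanted = ["python", "pandas", "dataframe"]
subset = df[df["first_tag"].isin(wanted)]

# One row per tag, with only the three requested statistics as columns.
stats = subset.groupby("first_tag")["viewcount"].agg(["min", "max", "mean"])
print(stats)
```

Tags that do not occur in the data (here, dataframe) simply do not appear in the result, and rows with other tags (here, Excel) are dropped by the isin filter.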
Now I want to get the min, max and avg of viewcount for each of the tags python, pandas and dataframe. I have made a separate dataset in which first_tag is python, pandas or dataframe, but I don't know how to get the min, max and avg of viewcount for each tag.
My data set is like this
id  viewcount  title  answercount  tags  first_tag
1   78         **     2            **    python
2   87         **     1            **    pandas
3   87         **     1            **    pandas
4   83         **     0            **    Excel
Count consecutive values and average/min/max time for each group of values
First, make sure that the index uses the datetime type:
df.index = pd.to_datetime(df.index)
Then compute the following helper columns:
- the first element of each stretch (first_stretch)
- the last element of each stretch (last_stretch)
- the group each stretch belongs to (stretch_group)
- the time difference in seconds from the first value (timedelta)
- the time difference in seconds between consecutive rows (timediff)
- the cumulated time in seconds within each stretch (cum_diff)
df['first_stretch'] = df['col1'] & df['col1'].shift(1).fillna(0).eq(0)
df['last_stretch'] = (df['col1'] - df['col1'].shift(-1)).eq(1)
df['stretch_group'] = df['first_stretch'].cumsum().mask(~df['col1'].astype(bool))
df['timedelta'] = (df.index - df.index[0]).total_seconds().astype(int)
df['timediff'] = df['timedelta'].diff(1).fillna(0).astype(int)
df['cum_diff'] = df.groupby('stretch_group')['timediff'].cumsum() * df['col1']
                     col1  col2  first_stretch  last_stretch  stretch_group  timedelta  timediff  cum_diff
datetime
2021-05-24 00:09:22     1     0           True         False            1.0          0         0         0
2021-05-24 00:09:24     1     0          False          True            1.0          2         2         2
2021-05-24 00:09:25     0     1          False         False            NaN          3         1         0
2021-05-24 00:09:26     1     0           True          True            2.0          4         1         1
2021-05-24 00:09:27     0     0          False         False            NaN          5         1         0
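The aggregation step is not shown above, so the following is only a sketch of one way to turn per-stretch times into min, max and mean for col1. It assumes the duration of a stretch is the sum of the time differences of its rows, which corresponds to the cum_diff value on the last row of each stretch. The names `is_one`, `durations` and `summary` are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample index and col1 values taken from the table above.
idx = pd.to_datetime([
    "2021-05-24 00:09:22",
    "2021-05-24 00:09:24",
    "2021-05-24 00:09:25",
    "2021-05-24 00:09:26",
    "2021-05-24 00:09:27",
])
df = pd.DataFrame({"col1": [1, 1, 0, 1, 0]}, index=idx)
df.index.name = "datetime"

# True on rows that belong to a stretch of 1s.
is_one = df["col1"].eq(1)

# A stretch starts where the current row is 1 and the previous row is not.
first_stretch = is_one & ~is_one.shift(1, fill_value=False)

# Number the stretches 1, 2, ...; rows outside a stretch get NaN.
stretch_group = first_stretch.cumsum().where(is_one)

# Seconds elapsed since the previous row (0 for the very first row).
timediff = df.index.to_series().diff().dt.total_seconds().fillna(0)

# Total seconds per stretch, using only the rows inside a stretch.
durations = timediff[is_one].groupby(stretch_group[is_one]).sum()

# Summary statistics over all stretches of col1.
summary = durations.agg(["min", "max", "mean"])
print(summary)
```

For the sample data the two stretches last 2 and 1 seconds, so the summary is min 1.0, max 2.0 and mean 1.5. The same steps can be repeated for col2.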
You could try this:
def count_secs(ser):
    # Seconds from the first to the last row of a stretch, counting both ends
    return (ser.index[-1] - ser.index[0]).seconds + 1

def min_max_mean(col):
    # No 1s in this column: nothing to measure
    if 1 not in col.values:
        return 0, 0, 0
    # Number each run of consecutive 1s
    groups = (col != col.shift(1))[col.eq(1)].cumsum()
    counts = groups.groupby(groups.values).apply(count_secs)
    return counts.min(), counts.max(), counts.mean()

df = df.apply(min_max_mean, axis='index')
df.index = ['min', 'max', 'mean']
Result for df
                     col1  col2
datetime
2021-05-24 00:09:22     1     0
2021-05-24 00:09:24     1     0
2021-05-24 00:09:25     0     1
2021-05-24 00:09:26     1     0
2021-05-24 00:09:27     0     0
is
      col1  col2
min    1.0   1.0
max    3.0   1.0
mean   2.0   1.0
Result:
      col1  col2
min    1.0   1.0
max    2.0   1.0
mean   1.5   1.0
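To check the min_max_mean approach against the sample data, the frame can be rebuilt by hand and passed through the two functions. This is a self-contained sketch; the `result` name is illustrative, and the functions are repeated here only so the block runs on its own. Note that count_secs counts both endpoints, so the stretch from 00:09:22 to 00:09:24 counts as 3 seconds.

```python
import pandas as pd

def count_secs(ser):
    # Seconds from the first to the last row of a stretch, counting both ends
    return (ser.index[-1] - ser.index[0]).seconds + 1

def min_max_mean(col):
    # No 1s in this column: nothing to measure
    if 1 not in col.values:
        return 0, 0, 0
    # Number each run of consecutive 1s, keeping only the rows equal to 1
    groups = (col != col.shift(1))[col.eq(1)].cumsum()
    counts = groups.groupby(groups.values).apply(count_secs)
    return counts.min(), counts.max(), counts.mean()

# Sample data rebuilt from the table shown in the answer.
idx = pd.to_datetime([
    "2021-05-24 00:09:22",
    "2021-05-24 00:09:24",
    "2021-05-24 00:09:25",
    "2021-05-24 00:09:26",
    "2021-05-24 00:09:27",
])
df = pd.DataFrame({"col1": [1, 1, 0, 1, 0], "col2": [0, 0, 1, 0, 0]}, index=idx)
df.index.name = "datetime"

# Apply column by column, then label the three result rows.
result = df.apply(min_max_mean, axis="index")
result.index = ["min", "max", "mean"]
print(result)
```

Assigning to `result` instead of overwriting df keeps the original data available for further checks.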