Questions tagged [top-n]
298 questions
196
votes
4 answers
Pandas get topmost n records within each group
Suppose I have pandas DataFrame like this:
>>> df = pd.DataFrame({'id':[1,1,1,2,2,2,2,3,4],'value':[1,2,3,1,2,3,4,1,1]})
>>> df
id value
0 1 1
1 1 2
2 1 3
3 2 1
4 2 2
5 2 3
6 2 4
7 3 1
8 …
Roman Pekar
- 92,153
- 25
- 168
- 181
159
votes
6 answers
Oracle SELECT TOP 10 records
I have an big problem with an SQL Statement in Oracle. I want to select the TOP 10 Records ordered by STORAGE_DB which aren't in a list from an other select statement.
This one works fine for all records:
SELECT DISTINCT
APP_ID,
NAME,
…
opHASnoNAME
- 18,735
- 24
- 93
- 138
59
votes
2 answers
Evaluation & Calculate Top-N Accuracy: Top 1 and Top 5
I have come across few (Machine learning-classification problem) journal papers mentioned about evaluate accuracy with Top-N approach. Data was show that Top 1 accuracy = 42.5%, and Top-5 accuracy = 72.5% in the same training, testing condition.
I…
D_9268
- 829
- 1
- 7
- 16
41
votes
1 answer
How to see top n entries of term-document matrix after tfidf in scikit-learn
I am new to scikit-learn, and I was using TfidfVectorizer to find the tfidf values of terms in a set of documents. I used the following code to obtain the same.
vectorizer = TfidfVectorizer(stop_words=u'english',ngram_range=(1,5),lowercase=True)
X =…
Amrith Krishna
- 2,366
- 3
- 29
- 50
21
votes
2 answers
Oracle SQL query: Retrieve latest values per group based on time
I have the following table in an Oracle DB
id date quantity
1 2010-01-04 11:00 152
2 2010-01-04 11:00 210
1 2010-01-04 10:45 132
2 2010-01-04 10:45 318
4 2010-01-04 10:45 122
1 2010-01-04 10:30 …
Tom
- 1,613
- 4
- 17
- 24
16
votes
5 answers
Oracle SQL - How to Retrieve highest 5 values of a column
How do you write a query where only a select number of rows are returned with either the highest or lowest column value.
i.e. A report with the 5 highest salaried employees?
Trevor
- 225
- 1
- 2
- 5
13
votes
2 answers
In MariaDB how do I select the top 10 rows from a table?
I just read online that MariaDB (which SQLZoo uses), is based on MySQL. So I thought that I can use ROW_NUMBER() function
However, when I try this function in SQLZoo :
SELECT * FROM (
SELECT * FROM route
) TEST7
WHERE ROW_NUMBER() < 10
then I…
Caffeinated
- 10,270
- 37
- 107
- 197
12
votes
2 answers
Find names of top-n highest-value columns in each pandas dataframe row
I have the following dataframe:
id p1 p2 p3 p4
1 0 9 1 4
2 0 2 3 4
3 1 3 10 7
4 1 5 3 1
5 2 3 7 10
I need to reshape the data frame in a way that for each id it will have the top 3 columns with…
chessosapiens
- 2,436
- 9
- 27
- 49
11
votes
6 answers
Tidyverse: filtering n largest groups in grouped dataframe
I want to filter the n largest groups based on count, and then do some calculations on the filtered dataframe
Here is some data
Brand <- c("A","B","C","A","A","B","A","A","B","C")
Category <- c(1,2,1,1,2,1,2,1,2,1)
Clicks <-…
Shinobi_Atobe
- 1,214
- 1
- 9
- 29
11
votes
2 answers
SUM of only TOP 10 rows
I have a query where I am only selecting the TOP 10 rows, but I have a SUM function in there that is still taking the sum of all the rows (disregarding the TOP 10). How do I get the total of only the top 10 rows?
Here is my SUM function :
SUM(…
Cfw412
- 111
- 1
- 1
- 3
10
votes
2 answers
How to find column-index of top-n values within each row of huge dataframe
I have a dataframe of format: (example data)
Metric1 Metric2 Metric3 Metric4 Metric5
ID
1 0.5 0.3 0.2 0.8 0.7
2 0.1 0.8 0.5 0.2 0.4
3 0.3 0.1 0.7 0.4 0.2 …
tfcoe
- 330
- 1
- 13
10
votes
4 answers
How to get top n companies from a data frame in decreasing order
I am trying to get the top 'n' companies from a data frame.Here is my code below.
data("Forbes2000", package = "HSAUR")
sort(Forbes2000$profits,decreasing=TRUE)
Now I would like to get the top 50 observations from this sorted vector.
Teja
- 11,878
- 29
- 80
- 137
8
votes
5 answers
Finding top N columns for each row in data frame
given a data frame with one descriptive column and X numeric columns, for each row I'd like to identify the top N columns with the higher values and save it as rows on a new dataframe.
For example, consider the following data frame:
df =…
Diego
- 31,278
- 18
- 81
- 126
8
votes
4 answers
Oracle Analytic function for min value in grouping
I'm new to working with analytic functions.
DEPT EMP SALARY
---- ----- ------
10 MARY 100000
10 JOHN 200000
10 SCOTT 300000
20 BOB 100000
20 BETTY 200000
30 ALAN 100000
30 TOM 200000
30 JEFF 300000
I want the department…
Travis Heseman
- 10,909
- 8
- 34
- 46
7
votes
1 answer
Is there a way to get the nlargest items per group in dask?
I have the following dataset:
location category percent
A 5 100.0
B 3 100.0
C 2 50.0
4 13.0
D 2 75.0
3 59.0
4 …
whisperstream
- 1,596
- 1
- 16
- 25