0

I have a table like this:

CREATE TABLE test (
  ID SERIAL PRIMARY KEY,
  user_id INT,
  createdAt DATE,
  status_id INT
);

INSERT INTO test VALUES
  (1, 12, '2020-01-01', 4),
  (2, 12, '2020-01-03', 7),
  (3, 12, '2020-01-06', 7),
  (4, 13, '2020-01-02', 5),
  (5, 13, '2020-01-03', 6),
  (6, 14, '2020-03-03', 8),
  (7, 13, '2020-03-04', 4),
  (8, 15, '2020-04-04', 7),
  (9, 14, '2020-03-02', 6),
  (10, 14, '2020-03-10', 5),
  (11, 13, '2020-04-10', 8);

this is my fiddle

In that table there's id for the id of each transaction, user_id was the user, createdAt was the date of transaction happen, and status_id was the status for every transaction (in this case status_id 4, 5, 6, 8 are approved transaction)

I want to find out max, min, avg different day for every each transaction on every users who doing transaction between '2020-02-01' and '2020-04-01' with >1 transaction approved on that period

This is my query:

SELECT MIN(diff) AS `MIN`, MAX(diff) AS `MAX`, SUM(diff) / COUNT(DISTINCT user_id) AS `AVG`
FROM (
  SELECT ID, user_id, DATEDIFF((SELECT t2.createdAt FROM test t2 WHERE t2.user_id = t1.user_id AND t1.createdAt <= t2.createdAt AND t2.id <> t1.id LIMIT 1), t1.createdAt) AS diff
  FROM test t1
  where 
  status_id in (4, 5, 6, 8)
  HAVING SUM(t1.user_id BETWEEN '2020-02-01' AND '2020-04-01')
                  AND SUM(t1.user_id >= '2020-02-01') > 1 

) DiffTable
WHERE diff IS NOT NULL

but it's said:

In aggregated query without GROUP BY, expression #1 of SELECT list contains nonaggregated column 'fiddle_KDQIQDMUZEIOVXFHRZPY.t1.ID'; this is incompatible with sql_mode=only_full_group_by

what should i do?

this is my fiddle

expected result

+-----+-----+---------+
| MAX | MIN | AVERAGE |
+-----+-----+---------+
|  36 |   1 |      22 |
+-----+-----+---------+

explanation :

- the user_id who have approval transaction on 2020-02-01 until 2020-04-01 and user_id who have transaction more than 1 are user_id 13 & 14
- the maximum of different day on 2020-02-01 until 2020-04-01 are user_id 13 which the different day for each transaction happen in 2020-03-04 and doing next transaction again in 2020-04-10
- the minimum day of different day of each transaction are user_id 14 who doing transaction on 2020-03-02 and next transaction 2020-03-03
- average are 22 days (sum of different day on user_Id 13 & 14 / amount of user_id who fit on this condition) 
Nick
  • 118,076
  • 20
  • 42
  • 73
18Man
  • 540
  • 2
  • 13
  • When you say `different day`, do you mean you want the min/max/average number of days between transactions for a user? – Nick Mar 09 '20 at 05:28
  • not for a user, but all of the user who fit in this condition – 18Man Mar 09 '20 at 06:41

1 Answers1

1

You need to do the GROUPing outside of your subquery; the subquery should just be used to limit the selected transactions to those which have the desired status_id value and dates within the required range. Then you can select users with more than one transaction in the period in the outer query:

SELECT user_id,
       COUNT(*) AS transactions, 
       MIN(diff) AS `MIN`, 
       MAX(diff) AS `MAX`, 
       SUM(diff) / COUNT(diff) AS `AVG`
FROM (
  SELECT user_id, DATEDIFF((SELECT MIN(t2.createdAt)
                            FROM test t2
                            WHERE t2.user_id = t1.user_id
                              AND t1.createdAt < t2.createdAt
                              AND t2.status_id in (4, 5, 6, 8)
                            ), t1.createdAt) AS diff
  FROM test t1
  WHERE status_id in (4, 5, 6, 8)
    AND createdAt BETWEEN '2020-02-01' AND '2020-04-01'
) DiffTable
WHERE diff IS NOT NULL
GROUP BY user_id
HAVING COUNT(*) > 1

Output (for your fiddle):

user_id     transactions    MIN     MAX     AVG
14          2               1       7       4.0000

Demo on dbfiddle

If you want the values based on all transactions that occurred over that period, rather than by user_id, you can simply remove the GROUP BY and HAVING clauses:

SELECT COUNT(*) AS transactions, 
       MIN(diff) AS `MIN`, 
       MAX(diff) AS `MAX`, 
       SUM(diff) / COUNT(diff) AS `AVG`
FROM (
  SELECT user_id, DATEDIFF((SELECT MIN(t2.createdAt)
                            FROM test t2
                            WHERE t2.user_id = t1.user_id
                              AND t1.createdAt < t2.createdAt
                              AND t2.status_id in (4, 5, 6, 8)
                            ), t1.createdAt) AS diff
  FROM test t1
  WHERE status_id in (4, 5, 6, 8)
    AND createdAt BETWEEN '2020-02-01' AND '2020-04-01'
) DiffTable
WHERE diff IS NOT NULL

Output:

transactions    MIN     MAX     AVG
3               1       37      15.0000

Demo on dbfiddle

Note that there a couple of issues with the existing subquery in the DATEDIFF computation: a LIMIT without ORDER BY is not guaranteed to give the expected results, and there is no conditioning on status_id. I've fixed both those issues in my updated queries.

Nick
  • 118,076
  • 20
  • 42
  • 73
  • this is almost right sir, accidentally only user_id 14 who match with my situation instead all of the users who match with this condition, pls look at my new fiddle on this https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=8a357120e975e2740fb9a1f1620d7890 the minimun days should be in user_id 14, but the maximum different date should be users_Id 13 not 14, basically this question are the continous from my question before https://stackoverflow.com/questions/60562167/how-to-count-timediff-for-each-users-mysql, but the different is this question are including several conditions – 18Man Mar 09 '20 at 06:38
  • 1
    @FachryDzaky please see my edit. This includes transactions for any user which occurred in the designated time period. Note that I've made some updates to your `DATEDIFF` computation as well to correct a couple of issues. – Nick Mar 09 '20 at 07:34
  • sorry im new on mysql, the expected result was right, but can you tell me which part i should change when the conditions for user_id are change, usually i used having clause to find out several conditions for user_id , example like this user_id had more than one transaction, on next time, maybe i search the user_id who only transaction one time in the date range, so which part of having clause i must edited it, anyway thankyou for your answer @Nick – 18Man Mar 09 '20 at 07:37
  • yes sir, but in this case which part i should change if i met another case for the user, example like the user who only transaction once, or the user who doing transaction more than once, usually i used having count, but when hving count didnt used, i confused which part i should change to find out new cases for any other condition on users – 18Man Mar 09 '20 at 07:47
  • the average shouldn't 15 because user_id 15 are had status_id 7 which not counted because it's not approval transaction – 18Man Mar 09 '20 at 07:51
  • 1
    @FachryDzaky the average should be 15, it includes 1 and 7 for user 14, and 37 for user 13, so 3 transactions which add to 45 and average would be 45/3 = 15 – Nick Mar 09 '20 at 08:15
  • oh sure sir my bad, thankyou very much sir, can you tell me which part i should change/add if i faced several cases for any users_id such as the users_Id who doing transaction for more than one or the users_Id who transaction just only once in time range, usually i used having sum but on your query which part i should add or change for that having sum – 18Man Mar 09 '20 at 08:21
  • 1
    @FachryDzaky if you use the original query in my answer as a subquery (with slight modifications to just generate a sum of diffs instead of an average) you can get the same results as the second query in my answer by aggregating over that. This then allows you to put constraints e.g. `HAVING COUNT(*)>1` to restrict to users with more than 1 transaction) on the inner query. See https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=2e7680d2d350e31e6a6772a6242db841 – Nick Mar 09 '20 at 08:38
  • thankyou very much sir for your help, have a nice day sir @Nick – 18Man Mar 09 '20 at 08:55
  • 1
    @FachryDzaky no worries. I'm glad I could help. – Nick Mar 09 '20 at 09:04
  • sir im sorry, can you please check my new fiddle https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=2adbd41f461c8382630f27ad8cabcefa i've put having count(*) = 1 for the date range, its should be zero because there's no any user_id who transaction just one time on the time range. but why it's not zero for the result, pls take a look at the last fiddle – 18Man Mar 09 '20 at 09:05
  • 1
    @FachryDzaky I think you need to add `AND t2.createdAt BETWEEN '2020-02-01' AND '2020-04-01'` to the `DATEDIFF` subquery. https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=871f445bb8f6c9d088959d4cce8bfa01 – Nick Mar 09 '20 at 09:11
  • ah sir thankyou very much, sorry for lot of asking, have a nice day sir – 18Man Mar 09 '20 at 09:12