0

Scenario

I want to obtain the registered languages the user has along with all the user details, my current problem is that values are getting duplicated when I GROUP_CONCAT the target column. Can someone explain me why the values are being duplicated.

My tables

// users
+------+-------------+------------+----------------+--------------+
|  id  |  firstname  |  lastname  |      email     |  created_at  |
+------+-------------+------------+----------------+--------------+
|  10  |    John     |     Doe    |  john@doe.com  |  2014-10-21  |
+------+-------------+------------+----------------+--------------+

// teachers
+------+-----------+------------------+-------------------+
|  id  |  user_id  |  years_teaching  |  months_teaching  |
+------+-----------+------------------+-------------------+
|  35  |     10    |        3         |         6         |
+------+-----------+------------------+-------------------+

// teacher_languages
+------+--------------+---------------+----------+---------+
|  id  |  teacher_id  |  language_id  |  status  |  views  |
+------+--------------+---------------+----------+---------+
|  52  |      35      |       5       |     1    |    80   |
+------+--------------+---------------+----------+---------+
|  53  |      35      |       7       |     1    |    40   |
+------+--------------+---------------+----------+---------+

// translators
+------+-----------+------------------------+---------------------+
|  id  |  user_id  |  certified_translator  |  accept_travelling  |
+------+-----------+------------------------+---------------------+
|  23  |     10    |            0           |          1          |
+------+-----------+------------------------+---------------------+

// translator_languages
+------+-----------------+---------------+----------+---------+
|  id  |  translator_id  |  language_id  |  status  |  views  |
+------+-----------------+---------------+----------+---------+
|  52  |       23        |       5       |     1    |    27   |
+------+-----------------+---------------+----------+---------+
|  53  |       23        |       7       |     1    |    82   |
+------+-----------------+---------------+----------+---------+

// languages
+------+-----------------+------------+
|  id  |  language_text  |  language  |
+------+-----------------+------------+
|   5  |     English     |    EN      |
+------+-----------------+------------+
|   7  |      French     |    FR      |
+------+-----------------+------------+

Current query

This is the query that produces the duplicate values

SELECT 
  u.*,
  // teacher languages
  GROUP_CONCAT(l.language_text) AS teacher_languages,
  GROUP_CONCAT(tl.status) AS teacher_languages_status,
  GROUP_CONCAT(tl.views) AS teacher_language_views
  // translator languages
  GROUP_CONCAT(trl_txt.language_text) AS translator_languages,
  GROUP_CONCAT(trl.status) AS translator_languages_status,
  GROUP_CONCAT(trl.views) AS translator_language_views
FROM
  users u
LEFT JOIN 
  teachers t ON t.user_id = u.id
LEFT JOIN 
  teacher_languages tl ON tl.teacher_id = t.id
LEFT JOIN 
  languages l ON l.id = tl.language_id
LEFT JOIN 
  translators tr ON tr.user_id = u.id
LEFT JOIN 
  translator_languages trl ON trl.teacher_id = t.id
LEFT JOIN 
  languages trl_txt ON trl_txt.id = trl.language_id
GROUP BY
  u.id
ORDER BY
  u.created_at

Results

[0] =>
  [id]                         => 10
  [firstname]                  => 'John'
  [lastname]                   => 'Doe'
  [email]                      => 'john@doe.com'
  [created_at]                 => '2014-10-21'
  [teacher_languages]          => 'English, English, French, French'
  [teacher_language_status]    => '1,1,1,1'
  [teacher_language_views]     => '80,40,80,40'
  [translator_languages]       => 'English, French, English, French'
  [translator_language_status] => '1,1,1,1'
  [translator_language_views]  => '27,82,27,82'
CupOfJoe
  • 499
  • 1
  • 5
  • 13
  • possible duplicate of [MySQL DISTINCT on a GROUP\_CONCAT()](http://stackoverflow.com/questions/3083499/mysql-distinct-on-a-group-concat) – NMK Oct 21 '14 at 05:05
  • 1
    I cannot use `DISTINCT` because the order of the values are in order of the views and status, therefore if I do a `DISTINCT` on the status I will get 1 value in return if all are set to 1 (TRUE). – CupOfJoe Oct 21 '14 at 05:09

1 Answers1

1

Q: Can someone explain me why the values are being duplicated.

A: To better understand why values are duplicated, run query without GROUP_CONCAT and GROUP BY. Your query then produces some sort of Cartesian product: because each teacher and each translator has two languages in the result you get all possible combinations:

teacher_languages  translator_languages
-----------------  --------------------
English            English
English            French
French             English
French             French

GROUP_CONCAT just concatenates all column values and you get:

[teacher_languages]          => 'English, English, French, French'
[translator_languages]       => 'English, French, English, French'

One of possible solutions to avoid duplicates is to use subqueries:

SELECT 
  u.*,
  a.teacher_languages, a.teacher_languages_status, a.teacher_language_views,
  b.translator_languages, b.translator_languages_status, b.translator_language_views
FROM users u
LEFT JOIN (
  SELECT 
    t.user_id,
    -- teacher languages
    GROUP_CONCAT(tl_txt.language_text) AS teacher_languages,
    GROUP_CONCAT(tl.status) AS teacher_languages_status,
    GROUP_CONCAT(tl.views) AS teacher_language_views
  FROM teachers t
  LEFT JOIN teacher_languages tl ON tl.teacher_id = t.id
  LEFT JOIN languages tl_txt ON tl_txt.id = tl.language_id
  GROUP BY t.user_id
) a ON a.user_id = u.id
LEFT JOIN (
  SELECT 
    tr.user_id,
    -- translator languages
    GROUP_CONCAT(trl_txt.language_text) AS translator_languages,
    GROUP_CONCAT(trl.status) AS translator_languages_status,
    GROUP_CONCAT(trl.views) AS translator_language_views
  FROM translators tr
  LEFT JOIN translator_languages trl ON trl.teacher_id = tr.id
  LEFT JOIN languages trl_txt ON trl_txt.id = trl.language_id
  GROUP BY tr.user_id
) b ON b.user_id = u.id
ORDER BY u.created_at
Rimas
  • 5,684
  • 2
  • 22
  • 37
  • thank you very much! this worked like a charm, I even used the same strategy to obtain other information that I wasn't sure how to bring it out from the database. thanks again! – CupOfJoe Oct 22 '14 at 15:57