Write the most efficient implementation of the function RemoveDuplication()

Question

so i am studying and i have this question , Write the most efficient implementation of the function RemoveDuplication(), which removes any duplication in the list. Assume that the list is sorted but may have duplication. So, if the list is originally  <2, 2, 5, 6, 6, 6, 9>, your function should make it <2, 5, 6, 9>.

the code that i thought of to remove the duplication is this right here , i wanted to know , if there is more efficient ways of removing the duplications in the list

template <class T>
void DLList<T>:: RemoveDuplication()
{
    for(DLLNode<T>*ptr = head; ptr!=NULL; ptr=ptr->next)
        while (ptr->val == ptr->next->val)
        {
            ptr->next->next->prev = ptr;
            ptr->next = ptr->next->next;
        }
}

Define _efficient_ please. Also your code removes adjacent duplicate values only, is you list already sorted at this point? — πάντα ῥεῖ, May 30 '15 at 14:16
Why not just use [`std::unique`](http://en.cppreference.com/w/cpp/algorithm/unique)? — Baum mit Augen, May 30 '15 at 14:17
@πάνταῥεῖ yes the question assumes the list is already sorted but may have duplication , like the example i provided — Abdarahman Hajabi PyroStreak, May 30 '15 at 14:21
@Abdarahman Hajabi PyroStreak It is entirely a wrong code. As for efficiency then there is nothing better except to traverse the list sequentially. — Vlad from Moscow, May 30 '15 at 14:21
@VladfromMoscow hmm i am not following you , isn't my code traversing the list sequentially , i am starting from the head and moving till i reach the point after the tail , my ptr->next value will change upon reaching duplicate values ? — Abdarahman Hajabi PyroStreak, May 30 '15 at 14:32
@Abdarahman Hajabi PyroStreak Traversing list can be done differently either correctly or incorrectly. For example what will be the value of ptr->next->next if the list contains one or two nodes? And what will be with duplicated nodes? — Vlad from Moscow, May 30 '15 at 14:45
so i guess i need to add another case as if i had a list with one or two nodes ? , other than that i think as with the example i provided it should work fine but i have to change ptr!=NULL into a ptr->next!=NULL — Abdarahman Hajabi PyroStreak, May 30 '15 at 15:08

score 1 · Accepted Answer · answered May 30 '15 at 14:25

1

It looks like your code will run in O(n) which is good for an algorithm. It is probably not going to be any more efficient, because you'll have to visit every item to delete it.

If you don't want to delete duplicate objects though, but want to return an new list containing the non-duplicate objects, you could make it slightly faster by making it O(m) where m is the amount of unique numbers, which is smaller or equal to n. But i couldn't think up any way to do this.

Recapping, it is possible to be slightly faster, but this is hard and the improvement is negligible.

ps. Dont forget to delete stuff when you take it out of your list ;)

answered May 30 '15 at 14:25

David van rijn

1,802
7
20

for the deletion should i add another pointer as in DLLNode* ptr2 = ptr; and then deleting it after i am done with the edit on the ptr->next value in the while loop – Abdarahman Hajabi PyroStreak May 30 '15 at 14:34
No what i meant by delete stuff is that now your program has a memory leak. You remove a `DLList` item from the list, but you don't call `delete item`. So it will be floating around in memory. So you should create a pointer pointing to the deleted item and then call `delete ptr` on it. – David van rijn May 30 '15 at 14:38
hmm i think i somewhat understood what you said , so if i did what i suggested that means i will only be deleting the pointer to that item but not the item itself ? – Abdarahman Hajabi PyroStreak May 30 '15 at 14:43
yes. That is, i am assuming that you do call something like `new DLList<..>` and as a rule of thumb, for every `new` there has to be a `delete`. (with some exceptions) – David van rijn May 30 '15 at 14:45
well the idea is am making a list in my main and filling lets say the values i provided in the example then i want to call this RemoveDuplication() function that i made , and the list should be with no duplication , assuming its this way , does this only mean that i have to delete the ptr only since i am not using a "new" ? – Abdarahman Hajabi PyroStreak May 30 '15 at 14:52
No, if you don't use new you should be fine. Have a look at [this](http://stackoverflow.com/questions/3673998/what-is-difference-between-instantiating-an-object-using-new-vs-without) its all about heap vs stack allocation – David van rijn May 30 '15 at 15:07

score 0 · Answer 2 · answered May 30 '15 at 15:58

0

I think O(n) is ok. However the most important thing is your program will crash :-)

for(DLLNode<T>*ptr = head; ptr!=NULL; ptr=ptr->next)
    while (ptr->val == ptr->next->val)

The code is deferencing ptr->next without checking that it is != NULL. Hence, the algorithm should crash when it reaches the last element of the list.

And now an optimization question for you: how to make the program correct without testing for ptr AND ptr->next at each iteration ?

answered May 30 '15 at 15:58

Guillaume

91
4

yes i have actually noticed this problem and modified prt!=NULL , into ptr->next!=NULL , i think that will fix the problem since this way ptr will stop before reaching the last iteration but it will still check for the last element , i have no idea tbh, i am still busy studying atm and that was the only way i thought of for now , but i'd like to see if there is a more efficient way better than O(n) that i could have probably missed – Abdarahman Hajabi PyroStreak May 30 '15 at 16:37
testing for `ptr->next!=NULL` is good but then won't the program crash on empty lists ? – Guillaume May 30 '15 at 18:41
yep that's true i forgot to edit the question on assuming the list is not empty, or i can just add extra code to test if its empty or not , also further testing i should also add cases in case it has only one node or two nodes only – Abdarahman Hajabi PyroStreak May 30 '15 at 19:11

Write the most efficient implementation of the function RemoveDuplication()

2 Answers2