Greedy conditional algorithm

Question

I have a list of k artistes mapped to their respective music videos that they have starred in. This is represented in a multidimensional array:

musicvid_arr = 
[["MUSICVID 1", 2014, ["ARTISTE 1", "ARTISTE 2", "ARTISTE 3"]], 
["MUSICVID 2", 2014, ["ARTISTE 4", "ARTISTE 1", "ARTISTE 9", "ARTISTE 10"]], 
["MUSICVID 3", 1935, ["ARTISTE 2", "ARTISTE 10", "ARTISTE 6"]], 
["MUSICVID 4", 2010, ["ARTISTE 1", "ARTISTE 2", "ARTISTE 3"]], 
["MUSICVID 5", 2009, ["ARTISTE 4", "ARTISTE 1", "ARTISTE 9", "ARTISTE 2", "ARTISTE 6", "ARTISTE 5"]], 
["MUSICVID 6", 2014, ["ARTISTE 18", "ARTISTE 10", "ARTISTE 6", "ARTISTE 2"]], 
["MUSICVID 7", 2014, ["ARTISTE 9", "ARTISTE 2", "ARTISTE 3", "ARTISTE 0", "ARTISTE 9"]], 
["MUSICVID 8", 2000, ["ARTISTE 8", "ARTISTE 3", "ARTISTE 9", "ARTISTE 11", "ARTISTE 2", "ARTISTE 1"]],
["MUSICVID 9", 2014, ["ARTISTE 21", "ARTISTE 0", "ARTISTE 6"]], 
["MUSICVID 10", 2014, ["ARTISTE 12", "ARTISTE 2", "ARTISTE 3"]], 
["MUSICVID 11", 2013, ["ARTISTE 14", "ARTISTE 1", "ARTISTE 9", "ARTISTE 12"]], 
["MUSICVID 12", 2014, ["ARTISTE 2"]]]

I want to create a method get_artistes that takes the parameters: k , r, and musicvid_arr:

def get_artistes(k, r, musicvid_arr)
  # the code here
end

where

k: the number of artistes to return
r: the least number of artistes found in the return array of k artistes that must appear in each music video for the music video to be counted/valid

This method should return a list of artistes. If k = 3:

["ARTISTE 1", "ARTISTE 2", "ARTISTE 9"]

This image gives a better understanding of how k and r affect the most number. With reference to the image above,

# for r = 1 , it would have 11 valid music videos.
# for r = 2 , it would have 6 valid music videos.
# for r = 3 , it would have 3 valid music videos.

No matter what r and k we pass to this method, we want an array of artistes that have the most number of valid music videos.

What would be an effective and efficient approach on tackling this problem?

I attempted to do this via the following algorithm. I do not think that it is the most effective. With big datasets, it takes very long to run.

def get_artistes(k, r, musicvid_arr)
    musicvid_arr=musicvid_arr.select{|t| t[2].size>=r} 
    artiste_arr = musicvid_arr.map.reduce({}){|a,vs|vs[2].each{|v|(a[v]||= [])<< vs[0]};a}.to_a.sort_by{|x| -x[1].count}
    output = []
    for i in 0...k
        output << artiste_arr[i]
    end
    return output
end

Thanks for the edit, Sawa. I think i missed a lot of corner cases too, using the method that I tried. — JQuack, Mar 06 '15 at 13:54

score 1 · Answer 1 · answered Mar 06 '15 at 16:03

Often Jr. Programmers doing something new see the problem as complex and feel the solution should have equal complexity.

Regardless of the complexity of any problem when writing code, the solution, should be clear and readable. If you can't speak your code aloud it's too complicated.

You should create a class ex: ParseArtists with multiple methods that clearly state each step in the process of parsing artists. This class should have the single responsibility of parsing artists with descriptive variable names, and small obvious methods.

Greedy conditional algorithm

1 Answers1