When radix sort is implemented with a stable sort on each digit (counting sort, specifically), its best-case and worst-case running times are usually both given as Theta(d(n+k)), where n is the number of keys to be sorted, d is the number of digits in each key, and k is the number of values each digit can take (usually 10, for the digits 0 to 9).
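For concreteness, here is a minimal Python sketch of the variant described above: least-significant-digit radix sort with one stable counting-sort pass per digit. It assumes non-negative integer keys and a known digit count; the names `counting_sort_by_digit`, `radix_sort`, `num_digits`, and `base` are illustrative choices, not taken from any library. The comments mark where the n and k terms would come from in each pass.

```python
def counting_sort_by_digit(keys, exp, base=10):
    """Stable counting sort of keys on one digit.

    exp selects the digit: 1 for units, base for the next digit, and so on.
    Assumes non-negative integers (an assumption of this sketch).
    """
    counts = [0] * base
    for key in keys:                    # n steps: tally each digit value
        counts[(key // exp) % base] += 1
    for digit in range(1, base):        # k - 1 steps: prefix sums give end positions
        counts[digit] += counts[digit - 1]
    output = [0] * len(keys)
    for key in reversed(keys):          # n steps: reverse scan keeps ties in order
        digit = (key // exp) % base
        counts[digit] -= 1
        output[counts[digit]] = key
    return output


def radix_sort(keys, num_digits, base=10):
    """LSD radix sort: num_digits (d) stable passes, least significant digit first."""
    exp = 1
    for _ in range(num_digits):         # d passes
        keys = counting_sort_by_digit(keys, exp, base)
        exp *= base
    return keys
```

For example, `radix_sort([170, 45, 75, 90, 802, 24, 2, 66], num_digits=3)` returns `[2, 24, 45, 66, 75, 90, 170, 802]`.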
Despite my research, I have been unable to find a good explanation of what distinguishes the "best" case from the "worst" case for radix sort. Can somebody please explain what constitutes a "best" case and a "worst" case for radix sort? And can you then prove that both are in Theta(d(n+k))?