If we want to use data.table, on which operations in R are generally faster, we can first merge A1 and A2, setting by the "by" field as "Type" in order to retain all of the variables after we merge. Then order by worker and type, the data.table way:
A1 <- data.table(
Type = c('A', 'B', 'C', 'D', 'K'),
Worker = c('SHARON', 'RUN', 'JACK', 'NICK', 'EMILY')
)
A2 <- data.table(
Type = sample(c('A', 'B', 'C', 'D', 'K'), size = 100, replace = TRUE),
Rate = rnorm(100, mean = 0.8, sd = 1)
)
K = merge(A1, A2, by = "Type", all.x = TRUE)
setorder(K, "Worker", "Type")
Then we get the following results:
K
Type Worker Rate
1: K EMILY 3.03436918
2: K EMILY -0.43926123
3: K EMILY 1.51949381
4: K EMILY 0.68492265
5: K EMILY 2.21078017
6: K EMILY 2.83141791
7: K EMILY -0.19588209
8: K EMILY -0.93980497
9: K EMILY 0.55941370
10: K EMILY 0.68850875
11: K EMILY 0.07640872
12: K EMILY 1.93482533
13: K EMILY 1.18638298
14: K EMILY 0.76216663
15: K EMILY 0.28404814
16: K EMILY 0.82054706
17: K EMILY 0.51706340
18: K EMILY -0.52274670
19: K EMILY 1.44324550
20: K EMILY -0.53222671
21: K EMILY 0.45271088
22: K EMILY -1.77868504
23: K EMILY 0.04365219
24: K EMILY 2.03788999
25: K EMILY -0.60246720
26: C JACK 1.98340019
27: C JACK 0.80640231
28: C JACK 1.42432058
...