
Let's say I have two vectors, x = c(1:10) and y = c(20:30), and two longer vectors, n1 = c(100:200) and n2 = c(200:300).

I want to draw samples (with replacement, of course) from x with the sizes given in n1, and from y with the sizes given in n2: first a sample of size 100 and store its sum, then one of size 101 and store its sum, and so on up to 200, and then the same for y. Is there a way to do this without a for loop?

I want to end up with a matrix that has in the first column the sums of the samples of x and the second column the sums of the samples of y.

HamTheAstroChimp
manos92

2 Answers


Nothing wrong with a loop here... or a loop hidden in an sapply:

x_n1_sums = sapply(n1, function(z) sum(sample(x, size = z, replace = TRUE)))
y_n2_sums = sapply(n2, function(z) sum(sample(y, size = z, replace = TRUE)))

Note that these are just slightly condensed ways of writing out a loop. You could just as well do:

x_n1_sums = integer(length(n1))
for(i in seq_along(n1)) {
  x_n1_sums[i] = sum(sample(x, size = n1[i], replace = TRUE))
}
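
To get the two-column matrix the question asks for, one option is to cbind the two results. A minimal sketch using the question's vectors (the seed value is arbitrary and only there for reproducibility):

```r
# Inputs as given in the question
x  <- 1:10
y  <- 20:30
n1 <- 100:200
n2 <- 200:300

set.seed(1)  # arbitrary seed, an assumption for reproducibility only

# One sum per requested sample size, as in the sapply calls above
x_n1_sums <- sapply(n1, function(z) sum(sample(x, size = z, replace = TRUE)))
y_n2_sums <- sapply(n2, function(z) sum(sample(y, size = z, replace = TRUE)))

# First column: sums of the samples of x; second column: sums of the samples of y
sums <- cbind(x = x_n1_sums, y = y_n2_sums)
dim(sums)  # 101 rows (one per sample size), 2 columns
```

This only works as a matrix because n1 and n2 have the same length; otherwise a list would be the safer container.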
Gregor Thomas
  • can these two commands you wrote x_n1_sums and y_n2_sums be summarized to one command, because I have to do this almost 300 times? – manos92 Dec 15 '20 at 15:52
  • Thank you very much for your answer!!! – manos92 Dec 15 '20 at 15:53
  • I mean, you can write a function that takes arguments, `foo = function(n, vec) {sapply(n, function(z) sum(sample(vec, size = z, replace = TRUE)))}` and then call `foo(n1, x)`, `foo(n2, y)`. But since `n1 != n2` and `x != y`, it seems pointless to put those calculations together. If you have a lot of variables, hopefully you are using a data structure like a `data.frame` [or a list](https://stackoverflow.com/questions/17499013/how-do-i-make-a-list-of-data-frames/24376207#24376207) rather than sequentially named variables... If your variables are in a data structure, looping over them is easy. – Gregor Thomas Dec 15 '20 at 15:55
  • I see. You are right. Thank you again! – manos92 Dec 15 '20 at 16:07
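
As a rough illustration of the data-structure suggestion in the comments, the pairs can be kept in a list and looped over with lapply. The list `pairs` and its element names `a`, `b`, `n` and `vec` are made up for this sketch; `foo` is the function from the comment above.

```r
# Hypothetical container: each element pairs the sample sizes with the vector to sample from
pairs <- list(
  a = list(n = 100:200, vec = 1:10),
  b = list(n = 200:300, vec = 20:30)
)

# foo as defined in the comment: one sum per sample size in n
foo <- function(n, vec) {
  sapply(n, function(z) sum(sample(vec, size = z, replace = TRUE)))
}

# Apply foo to every pair; the result is a named list of sum vectors
res <- lapply(pairs, function(p) foo(p$n, p$vec))
lengths(res)
```

With ~300 pairs, adding a pair means adding one list element rather than a new pair of variable names.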

You could put Gregor's sapply into a Map call and use mget to grab your ~300 variable pairs from your workspace.
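
For reference, mget looks up several objects by name and returns them as a named list, which is what Map can then iterate over. A small sketch with two made-up pairs of variables:

```r
# Sequentially named variables, as they might sit in the workspace
n1 <- 100:200; n2 <- 200:300
x1 <- 1:10;    x2 <- 21:30

# paste0 builds the names "n1", "n2" and "x1", "x2"; mget fetches the objects
ns <- mget(paste0("n", 1:2))
xs <- mget(paste0("x", 1:2))

names(ns)  # "n1" "n2"
```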

Solution 1

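set.seed(42)
## inner function: z is one sample size taken from n, x is the vector to sample from
res <- Map(function(n, x) sapply(n, function(z, x) sum(sample(x, size=z, replace=TRUE)), x), 
           mget(paste0("n", 1:3)), mget(paste0("x", 1:3)))

Result (the listing below was generated with an earlier form of this call whose inner function sampled from 1:n instead of from x, so it shows the shape of the output only; with the call above the values are far smaller, around 550 for the first element of $n1)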

res
# $n1
#  [1]  4978  5092  5283  5388  5420  5738  5573  5919  5722  6529  6316
# [12]  6139  6449  6397  6468  6509  6363  7292  7265  7402  7021  6942
# [23]  7692  7582  7822  7291  8556  8094  7981  8754  9135  8475  8248
# [34]  9393  9019  9526  9644  9259  8970 10132  9607  9387 10784  9289
# [45] 10754 10587 11221 11124 10608 10179 11316 11518 11484 11543 12143
# [56] 12989 12735 12596 12967 10797 11470 12179 13665 12289 14454 13615
# [67] 15095 13114 14614 15732 14359 13981 13165 16073 15491 13936 15536
# [78] 15344 14836 17143 16005 16773 16240 16941 15804 17564 17379 17322
# [89] 17345 19950 17827 18920 18596 16919 17882 18726 18110 19984 19318
# [100] 19663 20269
# 
# $n2
#  [1] 20295 21295 20518 19210 21409 22386 21364 20367 19515 23105 23371
# [12] 23842 23468 23933 22296 22832 22672 25237 25101 23490 23267 25867
# [23] 24151 24247 25679 24107 25360 24675 25567 25607 27670 26662 28478
# [34] 28203 26617 28843 26182 27908 28560 27190 29840 31566 30309 28351
# [45] 28916 30026 29089 31159 30579 31835 32682 31770 32310 32142 31964
# [56] 31421 31044 32155 32557 33202 34205 34566 34153 33589 31831 33486
# [67] 37466 34722 36257 34910 35638 37128 35913 36201 40031 37720 38232
# [78] 38718 39940 37769 36714 41539 41617 37948 41666 42407 40008 40601
# [89] 43563 41941 42636 43073 39286 43103 43510 44785 43444 43998 45678
# [100] 46244 47021
# 
# $n3
#  [1] 46006 47640 45236 45229 46936 44924 45400 49306 48429 48752 46101
# [12] 47612 52132 47411 46958 46082 50105 51873 49299 46963 52396 53336
# [23] 50490 53146 52912 56005 51678 52715 52141 56446 56304 55270 56233
# [34] 56514 58445 51637 55880 57316 57663 59554 54236 54819 61929 58387
# [45] 56888 58575 58663 59174 62947 65705 59852 62723 62621 58702 60114
# [56] 63817 61939 62426 63192 62476 65098 65568 67466 70020 64776 65646
# [67] 68086 69753 68597 70598 74347 69225 68329 75894 69546 71321 69488
# [78] 74288 71575 69625 69752 67965 71759 74306 74373 75003 73970 71771
# [89] 73239 75570 75231 78031 78555 78968 76116 76290 79374 79989 77838
# [100] 79819 81109

Solution 2

Or, if your n and x vectors are stored as the columns of two matrices (see comment):

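set.seed(42)
## inner function: z is one sample size taken from n, x is the vector to sample from
res <- mapply(function(n, x) sapply(n, function(z, x) sum(sample(x, size=z, replace=TRUE)), x), 
           as.data.frame(N), as.data.frame(X))

Result (the listing below was generated with an earlier form of this call whose inner function sampled from 1:n instead of from x, so it shows the shape of the output only; with the call above the values are far smaller, around 550 in the first row of column n1)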

res
# n1    n2    n3
# [1,]  4557 21390 46224
# [2,]  4730 20593 45036
# [3,]  5304 20686 45713
# [4,]  5181 20838 48072
# [5,]  5782 19631 48738
# [6,]  5027 22297 47345
# [7,]  5764 21091 47339
# [8,]  5600 21230 45641
# [9,]  5716 21020 48194
# [10,]  6255 21748 47226
# [11,]  6281 23919 50694
# [12,]  6368 20777 49840
# [13,]  6067 23884 47946
# [14,]  6710 21705 51508
# [15,]  6846 23811 52632
# [16,]  6950 23932 46732
# [17,]  6586 24824 50691
# [18,]  6980 22466 50747
# [19,]  6718 25260 52010
# [20,]  6866 23362 52200
# [21,]  7842 25005 49955
# [22,]  7095 24620 53393
# [23,]  7024 25055 52922
# [24,]  7750 24493 50034
# [25,]  8019 26124 50297
# [26,]  7981 27113 53247
# [27,]  8103 25540 51848
# [28,]  8243 26843 51241
# [29,]  8302 25642 53940
# [30,]  8632 26775 55112
# [31,]  8198 26909 54559
# [32,]  9259 28335 54736
# [33,]  9045 26858 56934
# [34,]  8687 26882 55029
# [35,]  9023 28942 59308
# [36,]  8459 29333 56032
# [37,]  9819 28123 57040
# [38,]  8929 26026 57613
# [39,]  9357 29223 52860
# [40,] 11236 30784 59318
# [41,]  9697 27775 59480
# [42,]  9961 30569 59968
# [43,]  9281 31688 56250
# [44,] 10422 30423 58201
# [45,] 10135 28990 59129
# [46,] 10941 29856 57587
# [47,] 11777 31513 58765
# [48,] 11028 32374 55899
# [49,] 11023 32168 62379
# [50,] 10413 30418 59354
# [51,] 11867 33008 58512
# [52,] 11000 31445 62448
# [53,] 11757 33587 61652
# [54,] 11934 33203 59858
# [55,] 12179 32286 62068
# [56,] 12195 31595 67006
# [57,] 12023 33316 63311
# [58,] 13444 32450 62873
# [59,] 12739 31884 63376
# [60,] 12699 35333 63013
# [61,] 12239 32621 66080
# [62,] 13751 33761 63407
# [63,] 13294 34145 67968
# [64,] 13674 35789 66286
# [65,] 15203 35859 66017
# [66,] 13367 36746 65226
# [67,] 13804 35473 68551
# [68,] 14338 36512 66777
# [69,] 13511 38545 68397
# [70,] 13519 34987 64400
# [71,] 13768 33394 66711
# [72,] 14611 36195 66209
# [73,] 14667 35841 64381
# [74,] 14900 36682 68143
# [75,] 16008 38334 68719
# [76,] 15006 37555 71335
# [77,] 15403 35518 71890
# [78,] 16159 38154 72890
# [79,] 16971 40389 74732
# [80,] 15804 39832 71549
# [81,] 16988 41786 73404
# [82,] 16489 39934 70910
# [83,] 17238 40637 69625
# [84,] 15947 40782 74477
# [85,] 16265 40399 76900
# [86,] 17685 41937 76328
# [87,] 17165 43287 78224
# [88,] 18084 38838 71176
# [89,] 16233 43162 74248
# [90,] 16713 43828 78293
# [91,] 18397 41304 78578
# [92,] 18574 42549 76102
# [93,] 18425 45310 73165
# [94,] 17854 43327 79458
# [95,] 19946 43807 78581
# [96,] 17291 44097 74152
# [97,] 19718 45748 78085
# [98,] 19708 47821 79974
# [99,] 19734 44316 81048
# [100,] 19268 45343 79915
# [101,] 21176 43944 85171

Note that I've used mapply here instead of Map: mapply simplifies the result to a matrix, whereas Map always returns a list, so use Map as above if you want a list. If your matrices are transposed, use e.g. as.data.frame(t(N)).
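
A minimal sketch of that difference, using the data from the "Data" section below. The helper name `sums_for` and the transposed matrix `Nt` are made up for this illustration.

```r
n1 <- 100:200; n2 <- 200:300; n3 <- 300:400
x1 <- 1:10;    x2 <- 21:30;   x3 <- 31:40
N <- cbind(n1, n2, n3)
X <- cbind(x1, x2, x3)

# Hypothetical helper: z is a single sample size, v the vector to sample from
sums_for <- function(n, v) {
  sapply(n, function(z) sum(sample(v, size = z, replace = TRUE)))
}

# mapply simplifies equal-length results into a matrix (one column per pair)
as_matrix <- mapply(sums_for, as.data.frame(N), as.data.frame(X))

# Map never simplifies: same computation, but a named list of vectors
as_list <- Map(sums_for, as.data.frame(N), as.data.frame(X))

# If the sizes are stored row-wise instead, transpose before converting
Nt <- t(N)
from_t <- mapply(sums_for, as.data.frame(t(Nt)), as.data.frame(X))

dim(as_matrix)
names(as_list)
```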


Data:

## vectors solution 1
n1=c(100:200)
n2=c(200:300)
n3=c(300:400)
x1=c(1:10)
x2=c(21:30)
x3=c(31:40)

## matrices solution 2
N <- cbind(n1, n2, n3)
X <- cbind(x1, x2, x3)
jay.sf