Contrary to what the answer above claims:
Architecture-specific extensions can easily be used by a static compiler. Visual C++, for example, lets you toggle SIMD extensions on and off with the /arch compiler option (e.g. /arch:AVX2).
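As a minimal sketch of how a statically compiled program uses such an extension: the function name sum_floats and the loop structure are my own illustration, and it assumes the compiler predefines __AVX2__ when the extension is enabled (GCC/Clang with -mavx2, MSVC with /arch:AVX2), falling back to plain scalar code otherwise:

```cpp
#include <cstddef>

#ifdef __AVX2__
#include <immintrin.h>

// AVX2 path: compiled only when the build targets that extension.
float sum_floats(const float* data, std::size_t n) {
    __m256 acc = _mm256_setzero_ps();
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8)                 // 8 floats per iteration
        acc = _mm256_add_ps(acc, _mm256_loadu_ps(data + i));
    float lanes[8];
    _mm256_storeu_ps(lanes, acc);
    float total = lanes[0] + lanes[1] + lanes[2] + lanes[3]
                + lanes[4] + lanes[5] + lanes[6] + lanes[7];
    for (; i < n; ++i)                         // leftover elements
        total += data[i];
    return total;
}
#else
// Portable scalar fallback: what a generic build would ship.
float sum_floats(const float* data, std::size_t n) {
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i)
        total += data[i];
    return total;
}
#endif
```

An installer (or a runtime CPUID check) can then pick whichever of the two builds matches the machine.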
Cache sizes are usually similar for processors of a given generation. Intel, for example, typically uses an L1 cache of 32 kB per core, an L2 cache of 256 kB to 1 MB per core, and a shared L3 cache of several MB.
Optimizing for memory size would only be necessary if you are, for some reason, writing a massive program that can use over 4 GB of memory.
This may actually be an optimization for which a JIT compiler is useful. However, you can create more threads than there are cores: on CPUs with more cores those threads will run on separate cores, and on CPUs with fewer cores they will simply share cores. I also think it's quite safe to assume that a CPU has 4 cores.
Still, even multi-core optimization doesn't make a JIT compiler necessary, because a program's installer can check the number of cores available and install the version of the program best optimized for that machine's core count.
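A statically compiled binary can also just query the core count at run time and size its thread pool accordingly; a minimal sketch (the function name run_workers and the fallback value of 4 are my own):

```cpp
#include <thread>
#include <vector>
#include <atomic>

// Spawn one trivial worker per hardware thread and return how many ran.
unsigned run_workers() {
    unsigned cores = std::thread::hardware_concurrency();
    if (cores == 0) cores = 4;          // the count may be unreported; assume 4
    std::atomic<unsigned> counter{0};
    std::vector<std::thread> workers;
    for (unsigned i = 0; i < cores; ++i)
        workers.emplace_back([&counter] { ++counter; });
    for (auto& t : workers)
        t.join();
    return counter;
}
```

This adapts to the CPU without any JIT involvement, though unlike the installer approach it cannot change which instructions were compiled in.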
I do not think that JIT compilation results in better performance than static compilation. You can always create multiple versions of your code, each optimized for a specific device. The only kind of optimization I can think of where JIT code can come out faster is when the code that processes your input can be reorganized to be faster for the most common case (which the JIT compiler might discover at run time) at the cost of being slower for the rarer cases. Even then, you can perform that optimization by hand; the static compiler simply cannot discover it on its own.
For example, let's say that you can perform an optimization on a mathematical algorithm that produces wrong results for the values 0-100, but works for all higher numbers. You notice that the values 0-100 can easily be pre-calculated, so you do this:
switch(num) {
    case 0: {
        //code
    }
    //...until case 100
}
//main calculation code
However, this is inefficient (assuming the switch statement is not compiled to a jump table), since the cases 0-100 are rarely entered: those values can be found mentally, without the help of a computer. A JIT might be able to discover (upon seeing that values in the range 0-100 are rarely entered) that this is more efficient:
if(num < 101) {
    switch(num) {
        //...same as the code above
    }
}
//main calculation code
In this version of the code, only one if is executed in the most common case, instead of the roughly 100 comparisons the common case would have to fall through without the guard (if the switch statement is implemented as a series of ifs); the rare cases 0-100 pay only one extra comparison.
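The same guard-then-table pattern can be written out as a runnable sketch. Here the function name triangular, the 0-10 cutoff, and the formula are all illustrative stand-ins: both paths compute 0+1+...+num so the sketch can be verified, whereas in the scenario above the main calculation would be the optimized code that is only valid for the larger inputs:

```cpp
#include <cstdint>

// Values 0-10 come from a pre-calculated table (the rare cases);
// everything else goes through the main calculation.
uint64_t triangular(uint32_t num) {
    static const uint64_t table[11] = {0, 1, 3, 6, 10, 15, 21, 28, 36, 45, 55};
    if (num < 11)                              // one comparison guards all rare cases
        return table[num];
    return (uint64_t)num * (num + 1) / 2;      // main calculation code
}
```

A table lookup is what a compiler would likely emit for a dense switch anyway; the point of the guard is that the common case skips the rare-case machinery entirely with a single comparison.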