Questions tagged [performance]

For questions pertaining to the measurement or improvement of code and application efficiency.

The performance of applications is often a paramount concern for mission-critical systems. If your question pertains to optimization, whether it be database queries, algorithms, reducing network/transactional overhead, resource contention, or anything that deals with speed or capacity, consider using this tag.

A good question states the performance goals that need to be achieved, as well as any other restrictions. Optimizing without measuring is not performance work but most likely personal entertainment; expect a question without goals or measurements to be treated as such.

For many programs, the way performance scales is described in big O notation, which classifies how an algorithm's resource requirements (such as time or memory) change in response to a change in the input size.
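
As a rough illustration of what big O notation describes, the sketch below counts the steps taken by a linear scan (O(n)) and a binary search (O(log n)) as the input grows. The function names are hypothetical and the step counts are a stand-in for real resource usage, not a benchmark.

```python
def linear_search_steps(items, target):
    """Return the number of comparisons a linear scan makes."""
    steps = 0
    for value in items:
        steps += 1
        if value == target:
            break
    return steps


def binary_search_steps(items, target):
    """Return the number of loop iterations a binary search makes on sorted input."""
    lo, hi = 0, len(items) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if items[mid] == target:
            break
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps


for n in (1_000, 10_000, 100_000):
    data = list(range(n))
    target = n - 1  # last element: the worst case for the linear scan
    print(n, linear_search_steps(data, target), binary_search_steps(data, target))
```

The linear count grows in proportion to n, while the binary search count grows by only a few steps each time n is multiplied by ten.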

This tag can also represent system performance, which is one of the key non-functional requirements of an application or system.

The two main measures of performance are

  • Throughput (how many operations complete in a given time frame). Examples of units: transactions per second (TPS), megabytes per second (MB/s), gigabits per second (Gb/s), and messages, requests, or pages per second.
  • Latency (how long for an action). For example, seek time of 8 ms and search time of 100 ms.
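
A minimal sketch of how both measures might be collected from the same run, assuming a single-threaded loop. `handle_request` is a hypothetical stand-in for whatever operation is being measured, and `measure` is an illustrative helper, not part of any library.

```python
import time


def handle_request(payload):
    """Hypothetical unit of work standing in for a real request handler."""
    return sum(payload)


def measure(operation, payloads):
    """Run operation on each payload.

    Returns (throughput in operations per second, list of per-call latencies in seconds).
    """
    latencies = []
    start = time.perf_counter()
    for payload in payloads:
        t0 = time.perf_counter()
        operation(payload)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    throughput = len(payloads) / elapsed if elapsed > 0 else float("inf")
    return throughput, latencies


payloads = [list(range(1_000))] * 2_000
throughput, latencies = measure(handle_request, payloads)
print(f"throughput: {throughput:,.0f} ops/s")
print(f"mean latency: {sum(latencies) / len(latencies) * 1e6:.1f} us")
```

Throughput is computed over the whole run, while latency is recorded per call; the two can move independently once work is batched or run concurrently.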

Latency is often qualified with a statistical measure. Note that latencies usually don't follow a normal distribution and have very high upper limits compared with the average latency, so the standard deviation is rarely a useful summary.

  • Average latency. The average of all the latencies.
  • Typical or median latency. The value that half of the measured latencies fall at or below. This is usually 50% to 90% of the average latency. As this is usually the lowest of these figures, it is often the one reported by vendors.
  • Percentile latency. The value that the latency is less than or equal to N% of the time. For example, the 99th percentile is the latency that is not exceeded 99 times out of 100.
  • Worst or maximum latency. The highest latency measured.
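
The four measures above can be sketched as follows. The sample values are made up for illustration, `latency_summary` is a hypothetical helper, and the percentile uses the simple nearest-rank method, which is only one of several common definitions.

```python
import statistics


def latency_summary(latencies_ms, percentile=99):
    """Summarize latency samples with the average, median, a percentile and the maximum."""
    ordered = sorted(latencies_ms)
    # Nearest-rank percentile: the smallest sample such that at least
    # `percentile` percent of all samples are less than or equal to it.
    rank = max(1, (percentile * len(ordered) + 99) // 100)
    return {
        "average": statistics.mean(ordered),
        "median": statistics.median(ordered),
        f"p{percentile}": ordered[rank - 1],
        "max": ordered[-1],
    }


# Made-up samples: mostly fast calls, a few slower ones, one extreme outlier.
samples_ms = [10] * 90 + [20] * 9 + [500]
summary = latency_summary(samples_ms)
for name, value in summary.items():
    print(f"{name}: {value} ms")
```

With these samples the single outlier pulls the average above the median and leaves the maximum far above both, which is the skew the note above describes.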

When seeking to improve performance: prototype and measure first, optimize only if and where needed.
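
One way to follow this advice in Python is to time candidate implementations with the standard `timeit` module before deciding whether anything needs optimizing. The two functions below are hypothetical candidates chosen only for illustration; the sketch makes no claim about which is faster on a given interpreter.

```python
import timeit


def concat_with_plus(n):
    """Build a string by repeated concatenation."""
    result = ""
    for i in range(n):
        result += str(i)
    return result


def concat_with_join(n):
    """Build the same string with a single join."""
    return "".join(str(i) for i in range(n))


# Check that both candidates agree before comparing their speed.
assert concat_with_plus(100) == concat_with_join(100)

CALLS = 20
for candidate in (concat_with_plus, concat_with_join):
    # Take the best of several repeats to reduce noise from other processes.
    best = min(timeit.repeat(lambda: candidate(10_000), number=CALLS, repeat=5))
    print(f"{candidate.__name__}: {best / CALLS * 1e3:.3f} ms per call")
```

If the measured difference is irrelevant next to the stated performance goal, the simpler version can stay.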

93482 questions
35
votes
2 answers

Numpy: What is special about division by 0.5?

This answer of @Dunes states that, due to pipelining, there is (almost) no difference between floating-point multiplication and division. However, from my experience with other languages I would expect the division to be slower. My small test looks…
ead
35
votes
8 answers

IE11 XMLHttpRequest really slow performance

I have an angular material SPA web site that performs very well in Chrome, Firefox and Edge, but it lags massively in IE11. I am aware of angular material issues with animations and styles in IE11 and have made several changes to improve general…
35
votes
7 answers

Efficiently detect sign-changes in python

I want to do exactly what this guy did: Python - count sign changes. However, I need to optimize it to run super fast. In brief, I want to take a time series and tell every time it crosses zero (changes sign). I want to record the time in…
chriscauley
35
votes
6 answers

Reusing the same curl handle. Big performance increase?

In a PHP script I am doing a lot of different cURL GET requests (a hundred) to different URLs. Will reusing the same curl handle from curl_init improve the performance, or is it negligible compared to the response time of the cURL requests? I am…
benjisail
35
votes
11 answers

Will enabling XDebug on a production server make PHP slower?

The title pretty much says it all...is it a bad idea ? I'd like to have the enhanced debug messages that XDebug provides on the server. [edit] Just to make things clear. I'm aware there are security risks involved. Perhaps I should complement my…
Andrei
35
votes
5 answers

What is the efficient way to count set bits at a position or lower?

Given std::bitset<64> bits with any number of bits set and a bit position X (0-63), what is the most efficient way to count bits at position X or lower, or return 0 if the bit at X is not set? Note: If the bit is set the return will always be at least…
Glenn Teitelbaum
35
votes
2 answers

Significant FMA performance anomaly experienced in the Intel Broadwell processor

Code1: vzeroall mov rcx, 1000000 startLabel1: vfmadd231ps ymm0, ymm0, ymm0 vfmadd231ps ymm1, ymm1, ymm1 vfmadd231ps ymm2, ymm2, ymm2 vfmadd231ps ymm3, ymm3, ymm3 vfmadd231ps ymm4, ymm4, ymm4 vfmadd231ps ymm5,…
User9973
35
votes
2 answers

Are scalar and strict types in PHP7 a performance enhancing feature?

Since PHP7 we can now use scalar typehint and ask for strict types on a per-file basis. Are there any performance benefits from using these features? If yes, how? Around the interwebs I've only found conceptual benefits, such as: more precise…
igorsantos07
35
votes
3 answers

deque.popleft() and list.pop(0). Is there performance difference?

deque.popleft() and list.pop(0) seem to return the same result. Is there any performance difference between them and why?
Bin
35
votes
7 answers

Is there any modern review of solutions to the 10000 client/sec problem

(Commonly called the C10K problem) Is there a more contemporary review of solutions to the c10k problem (Last updated: 2 Sept 2006), specifically focused on Linux (epoll, signalfd, eventfd, timerfd..) and libraries like libev or libevent? Something…
gdamjan
35
votes
1 answer

Comparing BSXFUN and REPMAT

Few questions were asked before on comparisons between bsxfun and repmat for performance. One of them was: Matlab - bsxfun no longer faster than repmat?. This one tried to investigate performance comparisons between repmat and bsxfun, specific to…
Divakar
35
votes
15 answers

Most efficient algorithm for merging sorted IEnumerable

I have several huge sorted enumerable sequences that I want to merge. These lists are manipulated as IEnumerable but are already sorted. Since the input lists are sorted, it should be possible to merge them in one pass, without re-sorting anything. I…
franck
35
votes
3 answers

Most efficient way for a lookup/search in a huge list (python)

-- I just parsed a big file and I created a list containing 42.000 strings/words. I want to query [against this list] to check if a given word/string belongs to it. So my question is: What is the most efficient way for such a lookup? A first…
user229269
35
votes
5 answers

calendar.getInstance() or calendar.clone()

I need to make a copy of a given date hundreds of times (I cannot pass by reference). I am wondering which of the two options below is better: newTime=Calendar.getInstance().setTime(originalDate); or newTime=originalDate.clone(); Performance is of…
Aravind Yarram
35
votes
2 answers

Is Swift really slow at dealing with numbers?

As I was playing around with a swift tutorial, I started to write a custom isPrime method to check if a given Int is prime or not. After writing it I realized it was working properly but found it a bit slow to perform isPrime on some quite large…
apouche