
Due to performance issues, I'd like to attempt to convert a Free Pascal function (SHA1Update, from the SHA1 unit) to assembly. I use Free Pascal 2.6.4 and Lazarus 1.2.4.

The reason is that I have a loop (repeat...until) that reads 64 KB blocks of raw data from disk into a buffer and then hashes each block. Without the hashing, I can read the disk at 4 GB per minute. With the hashing, it slows to just over 1 GB per minute. So someone suggested converting the hashing routine to assembly.

I am a below-average programmer in high-level languages, let alone assembly, but the potential for performance improvement is driving me to at least enquire.

So my question is: is there a program or script that can take a procedure or function and magically convert it to assembly that I can then compile using the Free Pascal compiler? I know it can be done for C/C++, even using a web-based system like this one.

Gizmo_the_Great
  • Many, if not most, native code compilers translate to an intermediate assembly source representation of the compiled code. But the fact that they do says nothing about how efficient the generated code is, which is probably the core of your question. Being represented as intermediate assembly source doesn't automatically make the program faster. Faster than what? – Deleted User Jul 04 '14 at 14:28
  • Fair comment. Extra detail added to question. – Gizmo_the_Great Jul 04 '14 at 14:36
  • Before going to the trouble of trying to convert a fairly large and complicated bit of security-sensitive code to assembly language, I would suggest that you examine the rest of your program for the cause of the slowdown. In particular, you should run a profiler or at least do some timings yourself to see where the time is being spent. You say that you're a "below average programmer," so it's quite likely that there are things you can do to speed up your own code without jumping feet first into the deep well of assembly language and cryptographic code. – Jim Mischel Jul 04 '14 at 15:00
  • If you're not an assembly expert, there's very little chance that you can create better code than a compiler. If you can do asynchronous I/O, double buffering would be an alternative. Also keep in mind that if you run one tool after another on the same file, the second one may be much faster due to the file system cache. – molbdnilo Jul 04 '14 at 15:09
  • Gents, I have run the timers already. I even tried the code without the timers to see if it was the timers that were slowing it down! This is how I know that without the hashing it is 3 times faster, reading at 4 GB per minute instead of 1.22 GB. And every test was done with a power-off of the PC and hardware in between. But yes, fair points re the assembly. Maybe I was overconfident. Think I will give it a miss. – Gizmo_the_Great Jul 04 '14 at 15:16
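
The timing comparison discussed in these comments can be made more precise by timing the reads and the hashing separately inside the same loop. The sketch below is in Python rather than Free Pascal, purely to illustrate the structure; `timed_hash` and its parameters are hypothetical names, and the 64 KB block size is taken from the question. It is not the asker's code and its numbers say nothing about the Free Pascal SHA1 unit.

```python
import hashlib
import time

BLOCK_SIZE = 64 * 1024  # 64 KB blocks, as described in the question


def timed_hash(path, block_size=BLOCK_SIZE):
    """Hash a file with SHA-1, accumulating read time and hash time separately.

    Hypothetical helper for illustration only.
    """
    sha1 = hashlib.sha1()
    read_seconds = 0.0
    hash_seconds = 0.0
    # buffering=0 disables Python's own buffering, so each read() goes to the OS
    with open(path, "rb", buffering=0) as f:
        while True:
            t0 = time.perf_counter()
            block = f.read(block_size)
            t1 = time.perf_counter()
            read_seconds += t1 - t0
            if not block:  # empty result means end of file
                break
            sha1.update(block)
            hash_seconds += time.perf_counter() - t1
    return sha1.hexdigest(), read_seconds, hash_seconds
```

If the hashing total dominates, the hash routine is the bottleneck; if the read total dominates, changing the hash implementation is unlikely to help much. As molbdnilo notes above, repeated runs on the same file may be served from the file system cache, so the read figure should be interpreted with that in mind.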

2 Answers


Assembly is indeed what you would use to optimise selected sequences of code. But a native code compiler already generates machine code, usually via an intermediate assembly source representation that is then run through an assembler. So using a compiler to "magically convert" one section of your code to assembly, and then linking that to the rest of the program, gains you about nothing over simply compiling the whole program: you are using the same compiler for the conversion, after all. From that angle, a compiler is exactly such a program that "magically converts it to assembly".

For optimisation purposes you would want to hand-code those sections, and you need to be good at it. Many compilers nowadays generate code that performs better than code written by non-experts, for various reasons. One is that target CPUs differ greatly in what the best-performing code for them looks like, and the rules that determine how efficient code for a specific CPU must look are often extremely complex. As a hand coder, you need to know those differences to write code that performs well. Many compilers have this knowledge built in, and can therefore tailor the code they generate to a particular CPU architecture or model.

Often, much better performance gains can be achieved by choosing more efficient algorithms. A better algorithm coded in a high-level language usually outperforms a less suitable algorithm hand-coded in assembly. I would therefore look into making the hashing process itself faster by considering alternative, faster algorithms, rather than trying to improve speed with assembly at this stage. Consider assembly optimisation a last step, to be taken when other means of speeding up your code have been exhausted.
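
One way to check whether a different algorithm would help is to measure raw throughput on a block of the size you actually use. The following is a minimal sketch in Python (chosen only for illustration, not Free Pascal); `throughput_mb_per_s` and `compare` are hypothetical names, and the algorithms listed are simply ones available in Python's standard library, not the libraries mentioned in this thread. CRC-32 is a checksum rather than a cryptographic hash and is included only as a speed reference.

```python
import hashlib
import time
import zlib


def throughput_mb_per_s(update, data, repeats=20):
    """Feed `data` to `update` repeatedly and return throughput in MB/s."""
    start = time.perf_counter()
    for _ in range(repeats):
        update(data)
    # Guard against a zero interval on very coarse clocks
    elapsed = max(time.perf_counter() - start, 1e-9)
    return (len(data) * repeats) / (1024 * 1024) / elapsed


def compare(block_size=64 * 1024):
    """Measure a few standard-library algorithms on one zero-filled block."""
    data = bytes(block_size)
    results = {}
    for name in ("sha1", "sha256", "blake2b"):
        hasher = hashlib.new(name)
        results[name] = throughput_mb_per_s(hasher.update, data)
    # Not a cryptographic hash; shown only for comparison
    results["crc32"] = throughput_mb_per_s(zlib.crc32, data)
    return results
```

The absolute figures depend on the machine and on each implementation, so they do not carry over to another language's library; the point is only that the same measurement can be repeated for each candidate before committing to one.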

Deleted User
  • Thanks for the explanation, which is useful. I have already explored other libraries (DCPCrypt, SuperFastHash, XXHash), but there are issues with them, which is why I want to stick with the built-in Free Pascal SHA1 unit. I've also looked at different methods of calling CreateFile with various flags, and at compiler options, one or two of which have made a difference. So the assembly route was the last course of action to take, and I hope that whoever marked my question down didn't do so on the assumption that I asked the question without first considering the bigger picture. – Gizmo_the_Great Jul 04 '14 at 15:12
  • One way to speed up your code is to speed up the disk I/O. For instance, use a memory mapping of the file/disk via `CreateFileMapping()` and `MapViewOfFile()`. That will usually give you much faster access to the data than using `ReadFile()` into a buffer. Another option is to use worker threads, Overlapped I/O, or an I/O Completion Port to prepare additional buffers of data in the background while an earlier buffer is being hashed, so your hashing is not waiting on disk I/O. Learn how to use Parallel Processing. – Remy Lebeau Jul 04 '14 at 15:59
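
The worker-thread idea in the last comment can be sketched as a background reader that fills a small bounded queue while the main thread hashes. This is an illustration in Python, not the Win32 Overlapped I/O or memory-mapping calls named above, and not Free Pascal; `hash_with_reader_thread` and `max_pending` are hypothetical names. It relies on Python's `hashlib` releasing the interpreter lock while hashing large buffers, so reading and hashing can overlap.

```python
import hashlib
import queue
import threading

BLOCK_SIZE = 64 * 1024  # 64 KB blocks, as described in the question


def hash_with_reader_thread(path, block_size=BLOCK_SIZE, max_pending=4):
    """SHA-1 a file while a background thread reads the next blocks."""
    blocks = queue.Queue(maxsize=max_pending)  # bounded, so memory use stays small
    errors = []

    def reader():
        try:
            with open(path, "rb") as f:
                while True:
                    block = f.read(block_size)
                    blocks.put(block)  # an empty bytes object marks end of file
                    if not block:
                        break
        except OSError as exc:
            errors.append(exc)
            blocks.put(b"")  # unblock the consumer on failure

    worker = threading.Thread(target=reader, daemon=True)
    worker.start()

    sha1 = hashlib.sha1()
    while True:
        block = blocks.get()
        if not block:
            break
        sha1.update(block)  # hashing overlaps with the reader's next read
    worker.join()
    if errors:
        raise errors[0]
    return sha1.hexdigest()
```

With this arrangement the total time tends toward the slower of the two stages rather than their sum, so the gain is largest when reading and hashing take similar amounts of time.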

As @Bushmills already explained, your code is converted to assembly automatically by the Free Pascal compiler before it produces the machine code in the Portable Executable (*.exe) format.

What you need is not assembly language as such, but hand-optimized code written in assembly language. This is a task for an experienced assembly programmer. You can 1) become an assembly language expert yourself; this Stack Overflow question can give you some starting points: A good NASM/FASM tutorial?

My guess is that any programmer can become an assembly language expert (for either CISC or RISC architectures) in about a year, depending on previous experience, the courses taken, and eagerness. For processor-neutral theoretical background, I'd recommend Donald Knuth's MMIX lectures.

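You should be able to 2) see the intermediate assembly files produced by the Free Pascal compiler by following the instructions in this discussion: http://free-pascal-general.1045716.n5.nabble.com/Assembler-file-generate-by-compiler-td5710837.html (in short, the compiler's `-a` option keeps the generated assembler file, and `-al` also lists the source lines in it).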

If you really want to move on in a reasonable time frame, then I'd suggest that you create a Minimal, Complete and Verifiable example and 3) ask for a code review at a code review site, where more experienced programmers will look at your code and propose changes. These sites should be good candidates:

Those sites are designed especially to help beginner and intermediate programmers with problems like yours.

xmojmr