0

For a game I am currently working on a particle system. Particles are objects with only position data. These particles have a method that update their position by using a vector field they are embedded into. The particles are inside an array

To update all particles in one physical step, I currently use a foreach:

foreach (particle p in ParList) p.update;

Where the update method is just doing two bitshifts and two additions to the original position. Now I wonder how many particles my system can handle and if I can optimize it to raise that number.

Looking into how foreach works, I found it is just a basic for loop, with doing a comparision and an addition to the index number.

I want to do what a for loop does without the check if the index number is >= 0 and without reducing the index number by one.

These two operations are not much usually, but in my case they take roughly 1/3 of the number of operations. So I wonder if I can do it this way:

switch (Parlist.Length)
{
    case 1024:
        ParList[1023].update;
        goto case 1023;
    case 1023:
        ParList[1022].update;
        goto case 1022;
    //and so on until
    case 1:
        ParList[0].update;
        break;
}

While it looks horrible, and I am aware this is not how it should be done, first tests make it look like I actually can rise performance quite much here. I would like to put this away into a class and access it in a more general manner like the foreach syntax is translated into a for loop. I would it to end up like this way:

eachcase (particle p in ParList)
{
    //instructions using p
}

which is translated into this:

switch (Parlist.Length)
{
    case 1024:
        //reference to instructions using Parlist[1023]
        goto case 1023;
    //and so on until
    case 1:
        //reference to instructions using Parlist[0]
        break;
}

How do I build such custom structures? Is this possible in C#?

If I am able to make this working, I also would like to implement a custom break condition like this:

eachcase (particle p in ParList, p.x == 0 && p.z == 0)
{
    //instructions using p
}

which is translated into this:

switch (Parlist.Length)
{
    case 1024:
        if (/*break condition*/) break;
        //reference to instructions using Parlist[1023]
        goto case 1023;
    //and so on until
    case 1:
        if (/*break condition*/) break;
        //reference to instructions using Parlist[0]
        break;
}

I read about Lambda expressions which can help me here but I am not sure how to connect it to a usual object array.

Urben
  • 25
  • 5
  • Why not just use "break;" within the for-loop, or am i missing something? How exactly do you explain the performance going up? – blackforest-tom Nov 25 '19 at 12:16
  • Read here : https://stackoverflow.com/questions/365615/in-net-which-loop-runs-faster-for-or-foreach – CodeFromItaly Nov 25 '19 at 12:18
  • @matteocracco I am not sure how this is related to me, I already know _foreach_ is based on _for_. I try to create something better here by using switch case. – Urben Nov 25 '19 at 12:24
  • @blackforest-tom A _for_ loop uses an _int_ as the index number, and after every step it adds 1 to it and checks if it is still not a negative number. These two operations are not needed in my _switch case_ idea. – Urben Nov 25 '19 at 12:25
  • oh! just omit them. for(int i = 0;;) - is that what you are looking for? – blackforest-tom Nov 25 '19 at 12:35
  • 1
    Btw, foreach uses iterators (c# yield) on your `List<>`, whereas you probably want a real array and a for-loop with indices. Did you try out that approach in terms of performance? – blackforest-tom Nov 25 '19 at 13:00

3 Answers3

1

Looking into how foreach works, I found it is just a basic for loop, with doing a comparison and an addition to the index number.

That's how it works for arrays because that's the most efficient way of navigating arrays, but that's not how it works in general. It is optimized by the compiler to use the index for arrays. Other collection types will have a GetEnumerator() method that will return some object that has methods to navigate through the collection. How it does that is completely defined by the enumerator's implementation. For List that enumeration is just incrementing the index and using the indexer (very similar to the array, but with the small overhead of the enumerator object).

I don't see how your "switch" statement is an improvement. All you're doing is hard-coding a loop into N statements. I find it hard to believe that the bounds checking is a significant amount of your program time. You have an enormous amount of duplicate code, the limit is finite (foreach loops could loop forever if the underlying collection is infinite), and it's much harder (IMHO) to understand the intent.

Now I wonder how many particles my system can handle

That's easy. Set it to the maximum size of the underlying collection type (~2 billion elements if Particle is a reference type, possibly less if it's a value type). If your system can handle that, then you're good to go. If not, then reduce the size until you find your limit.

and if I can optimize it to raise that number.

Well before you can optimize you have to figure out where the inefficiencies are. Find the operations that take up the most program time and work on those first.

I would not assume that the foreach has an efficiency problem. If you measure the performance of your program and determine that the foreach itself (the loop, not what each iteration is doing), then start looking for alternatives.

If your process is CPU bound (meaning that it pegs a single logical processor at 100% then you might benefit from parallelization. You might try things like PLinq or TPL and see if the benefit of parallelization outweighs the overhead it creates.

D Stanley
  • 139,271
  • 11
  • 154
  • 219
1

What you are trying to do, is either manual loop unrolling or building a variation of Duff's device.

Both will probably, with modern compilers and CPU architectures, give you very little speedup, and possibly make your code slower on different CPUs.

They will, however, definitely make your code harder to read.

If you, after careful measuring, really need more performance here are some things you can try (ordered by perf. gain per amount of work):

  1. Mark your .update function/getter with [MethodImpl(MethodImplOptions.AggressiveInlining)]

    • This instructs the compiler to inline your function
    • The reason this might help is that it may help the compiler do #4 for you
  2. Make sure your Particle class is a value type (struct) and they are stored in an array or ArrayList

    • This will help the CPU with prefetching. Fetching a piece data from RAM ("cache miss") will take ~300 times as long as a floating point operation. Making sure the data is in the right order will allow the CPU to prefetch this data before it is needed and possibly provide significant speedups.
  3. SIMD (Somewhat Advanced)

    • This uses SIMD instructions which execute operations on 2-8 values at once, providing an almost equivalent (2x-8x) speedup.

    class Particles { // No Idea what your particles actually do, but it should be implementable like this
        Vector<float> positions;
        Vector<float> motion;
    
        Particles(unsigned n) { positions = new Vector(n * 3); motion = new Vector(n*3); }
    
        void update() {
            positions += motion; // Really, really, really efficient.
        }
    };
    
  4. Multithreading

  5. (Run your calculations on a GPU?)
Benno Straub
  • 1,436
  • 14
  • 20
-2

Try Select(), it is more efficient than loops https://www.reddit.com/r/csharp/comments/4xb0d9/why_is_linqs_select_better_than_foreach/

ParList = ParList.Select(s => s.update).ToArray();

PeterL
  • 48
  • 7
  • 2
    That article is comparing Linq to the `ForEach()` _method_ on `List`, not the `foreach` _statement_. `Select` does essenatially the same thing that `foreach` does, so there should not be any significant efficiency - in fact it may be _less_ efficient since you have the overhead of an iterator object. – D Stanley Nov 25 '19 at 14:04
  • Both `ForEach()` and `foreach` is essentially a for loop, the `Select()` itself is fast because of the deferred execution. – PeterL Nov 25 '19 at 14:14
  • 1
    Neither of those is "essentially a for loop" in all cases, but that's a different argument. `Select` is "essentially a `foreach` loop" also since it creates an enumerator and enumerates the underlying collection (deference is not material here since you're calling `ToArray()` immediately after it). Plus yours has the disadvantage of copying the results to a _new_ array, which could significantly affect performance if the collection is large. – D Stanley Nov 25 '19 at 14:16
  • 1
    As D Stanley mentioned, deferred execution will not provide any performance gain in this scenario. You can read more about deferred execution here: https://blogs.msdn.microsoft.com/charlie/2007/12/10/linq-and-deferred-execution/ – Knoop Nov 25 '19 at 15:21