There have been several good answers both as to general trial division algorithm using lists (@pad) and in choice of an array for a sieving data structure using the Sieve of Eratosthenes (SoE) (@John Palmer and @Jon Harrop). However, @pad's list algorithm isn't particularly fast and will "blow up" for larger sieving ranges and @John Palmer's array solution is somewhat more complex, uses more memory than necessary, and uses external mutable state so is not different than if the program were written in an imperative language such as C#.
EDIT_ADD: I've edited the below code (old code with line comments) modifying the sequence expression to avoid some function calls so as to reflect more of an "iterator style" and while it saved 20% of the speed it still doesn't come close to that of a true C# iterator which is about the same speed as the "roll your own enumerator" final F# code. I've modified the timing information below accordingly. END_EDIT
The following true SoE program only uses 64 KBytes of memory to sieve primes up to a million (due to only considering odd numbers and using the packed bit BitArray) and still is almost as fast as @John Palmer's program at about 40 milliseconds to sieve to one million on a i7 2700K (3.5 GHz), with only a few lines of code:
open System.Collections
let primesSoE top_number=
let BFLMT = int((top_number-3u)/2u) in let buf = BitArray(BFLMT+1,true)
let SQRTLMT = (int(sqrt (double top_number))-3)/2
let rec cullp i p = if i <= BFLMT then (buf.[i] <- false; cullp (i+p) p)
for i = 0 to SQRTLMT do if buf.[i] then let p = i+i+3 in cullp (p*(i+1)+i) p
seq { for i = -1 to BFLMT do if i<0 then yield 2u
elif buf.[i] then yield uint32(3+i+i) }
// seq { yield 2u; yield! seq { 0..BFLMT } |> Seq.filter (fun i->buf.[i])
// |> Seq.map (fun i->uint32 (i+i+3)) }
primesSOE 1000000u |> Seq.length;;
Almost all of the elapsed time is spent in the last two lines enumerating the found primes due to the inefficiency of the sequence run time library as well as the cost of enumerating itself at about 28 clock cycles per function call and return with about 16 function calls per iteration. This could be reduced to only a few function calls per iteration by rolling our own iterator, but the code is not as concise; note that in the following code there is no mutable state exposed other than the contents of the sieving array and the reference variable necessary for the iterator implementation using object expressions:
open System
open System.Collections
open System.Collections.Generic
let primesSoE top_number=
let BFLMT = int((top_number-3u)/2u) in let buf = BitArray(BFLMT+1,true)
let SQRTLMT = (int(sqrt (double top_number))-3)/2
let rec cullp i p = if i <= BFLMT then (buf.[i] <- false; cullp (i+p) p)
for i = 0 to SQRTLMT do if buf.[i] then let p = i+i+3 in cullp (p*(i+1)+i) p
let nmrtr() =
let i = ref -2
let rec nxti() = i:=!i+1;if !i<=BFLMT && not buf.[!i] then nxti() else !i<=BFLMT
let inline curr() = if !i<0 then (if !i= -1 then 2u else failwith "Enumeration not started!!!")
else let v = uint32 !i in v+v+3u
{ new IEnumerator<_> with
member this.Current = curr()
interface IEnumerator with
member this.Current = box (curr())
member this.MoveNext() = if !i< -1 then i:=!i+1;true else nxti()
member this.Reset() = failwith "IEnumerator.Reset() not implemented!!!"
interface IDisposable with
member this.Dispose() = () }
{ new IEnumerable<_> with
member this.GetEnumerator() = nmrtr()
interface IEnumerable with
member this.GetEnumerator() = nmrtr() :> IEnumerator }
primesSOE 1000000u |> Seq.length;;
The above code takes about 8.5 milliseconds to sieve the primes to a million on the same machine due to greatly reducing the number of function calls per iteration to about three from about 16. This is about the same speed as C# code written in the same style. It's too bad that F#'s iterator style as I used in the first example doesn't automatically generate the IEnumerable boiler plate code as C# iterators do, but I guess that is the intention of sequences - just that they are so damned inefficient as to speed performance due to being implemented as sequence computation expressions.
Now, less than half of the time is expended in enumerating the prime results for a much better use of CPU time.