23

I have a function that takes a lazy ByteString and should return a list of strict ByteStrings (the laziness should be transferred to the list type of the output).

import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as L
csVals :: L.ByteString -> [B.ByteString]

I want to do this for a couple of reasons: several lexing functions require strict ByteStrings, and I can guarantee that the strict ByteStrings returned by csVals above are very small.

How do I go about "strictifying" ByteStrings without chunking them?

Update0

I want to take a Lazy ByteString, and make one strict ByteString containing all its data.

Matt Joiner
  • 6
    What is your problem with [`toChunks`](http://hackage.haskell.org/packages/archive/bytestring/latest/doc/html/Data-ByteString-Lazy.html#v%3atoChunks)? From the initial glimpse it looks like it preserves laziness. – Mikhail Glushenkov Oct 19 '11 at 01:17
  • @Matt Joiner: Maybe you should write a lexer yourself, or force evaluation of the results using DeepSeq. – wuxb Oct 19 '11 at 02:04
  • @Matt Joiner: there is a Lazy version: 'Data.ByteString.Lex.Lazy.Double' in the same package. – wuxb Oct 19 '11 at 02:06
  • @Matt Joiner: so you want chunks of a specified size? Possibly repeated calls to splitAt? Note that toChunks generates strict ByteStrings that are of maximum size (except for possibly the last one). – ivanm Oct 19 '11 at 02:45
  • @MikhailGlushenkov: toChunks returns a list of strict ByteStrings. I want them all in one. – Matt Joiner Oct 19 '11 at 02:56
  • @WuXingbo: I have switched to the Lazy readDouble for now, thanks. My question still stands however. – Matt Joiner Oct 19 '11 at 02:56
  • 7
    There's a misunderstanding here -- a lazy bytestring *is* just a list of chunks (i.e. strict bytestrings), essentially. `toChunks` exposes that structure. To put the list all in one strict bytestring, there's no other way than `concat . toChunks` (or the equiv). In many typical cases, the list will have a single element -- in those cases `concat . toChunks` will be relatively efficient as well. – sclv Oct 19 '11 at 15:43
  • @sclv: What you describe is what I'm after. – Matt Joiner Oct 20 '11 at 01:16

5 Answers

39

The bytestring package now exports a toStrict function:

http://hackage.haskell.org/packages/archive/bytestring/0.10.2.0/doc/html/Data-ByteString-Lazy.html#v:toStrict

This might not be exactly what you want, but it certainly answers the question in the title of this post :)
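As a minimal sketch of how toStrict could be applied to the function in the question: the version below assumes the values are separated by commas (byte 44), which the question does not actually state, and it needs bytestring 0.10 or later for L.toStrict.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as L

-- Assumption: fields are separated by commas (byte 44).
-- L.split produces its result list lazily, and each lazy field is
-- converted to a strict ByteString only when that list element is demanded.
csVals :: L.ByteString -> [B.ByteString]
csVals = map L.toStrict . L.split 44
```

Because each field is small, the copy made by toStrict for each one stays cheap, and the laziness lives in the list rather than in the individual ByteStrings.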

ocharles
17

As @sclv said in the comments above, a lazy ByteString is just a list of strict ByteStrings. There are two approaches to converting a lazy ByteString to a strict one (source: the Haskell mailing list discussion about adding a toStrict function). The relevant code from the email thread is below.

First, the relevant imports:

import qualified Data.ByteString               as B
import qualified Data.ByteString.Internal      as BI
import qualified Data.ByteString.Lazy          as BL
import qualified Data.ByteString.Lazy.Internal as BLI
import           Foreign.ForeignPtr
import           Foreign.Ptr

Approach 1 (same as @sclv):

toStrict1 :: BL.ByteString -> B.ByteString
toStrict1 = B.concat . BL.toChunks

Approach 2:

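toStrict2 :: BL.ByteString -> B.ByteString
-- Fast paths: an empty or single-chunk input needs no copying.
toStrict2 BLI.Empty = B.empty
toStrict2 (BLI.Chunk c BLI.Empty) = c
-- Otherwise allocate one buffer of the total length and copy every chunk into it.
toStrict2 lb = BI.unsafeCreate len $ go lb
  where
    -- Total number of bytes across all chunks.
    len = BLI.foldlChunks (\l sb -> l + B.length sb) 0 lb

    go  BLI.Empty                   _   = return ()
    go (BLI.Chunk (BI.PS fp s l) r) ptr =
        withForeignPtr fp $ \p -> do
            -- Copy this chunk's l bytes, then advance the destination pointer.
            BI.memcpy ptr (p `plusPtr` s) (fromIntegral l)
            go r (ptr `plusPtr` l)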

If performance is a concern, I recommend reading the email thread mentioned above. It includes criterion benchmarks as well, and toStrict2 is faster than toStrict1 in those benchmarks.
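Before comparing speed, it may be worth checking that a hand-written conversion agrees with the library's toStrict on a multi-chunk input. The sketch below does this for Approach 1; the names sample and agrees and the sample bytes are made up for illustration, and BL.toStrict requires bytestring 0.10 or later.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as BL

-- Approach 1 from above: concatenate the strict chunks.
toStrict1 :: BL.ByteString -> B.ByteString
toStrict1 = B.concat . BL.toChunks

-- Hypothetical sample: a lazy ByteString built from three strict chunks.
sample :: BL.ByteString
sample = BL.fromChunks [B.pack [1, 2], B.pack [3], B.pack [4, 5, 6]]

-- True when the hand-written conversion matches the library function.
agrees :: Bool
agrees = toStrict1 sample == BL.toStrict sample
```

The same comparison could be made against toStrict2, with the caveat that it depends on the Internal modules, whose exports vary between bytestring versions.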

Sal
5

If the lazy ByteString in question is <= the maximum size of a strict ByteString:

toStrict = fromMaybe SB.empty . listToMaybe . toChunks

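toChunks returns the strict chunks that make up the lazy ByteString as they are, so this definition takes only the first chunk; it returns all the data only when the lazy ByteString consists of a single chunk.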

If the size of your lazy ByteString is larger than what a strict ByteString can be, then this isn't possible: that's exactly what lazy ByteStrings are for.
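A small check of what the first-chunk definition above returns on an input made of more than one chunk, compared with the library's toStrict. The names firstChunk and twoChunks are hypothetical, and the two-chunk input is built by hand with fromChunks purely for illustration.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as BL
import Data.Maybe (fromMaybe, listToMaybe)

-- The definition from this answer, renamed to avoid clashing with BL.toStrict.
firstChunk :: BL.ByteString -> B.ByteString
firstChunk = fromMaybe B.empty . listToMaybe . BL.toChunks

-- Hypothetical input made of two strict chunks.
twoChunks :: BL.ByteString
twoChunks = BL.fromChunks [B.pack [1, 2], B.pack [3, 4]]

-- firstChunk keeps only the bytes of the first chunk,
-- whereas BL.toStrict copies every chunk into one strict ByteString.
firstOnly, everything :: [Int]
firstOnly  = map fromIntegral (B.unpack (firstChunk twoChunks))
everything = map fromIntegral (B.unpack (BL.toStrict twoChunks))
```

On a single-chunk input the two results coincide, which is the case this answer relies on.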

ivanm
2

Data.ByteString.Lazy.Char8 now has toStrict and fromStrict functions.
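A short sketch of the two functions used together, assuming bytestring 0.10 or later; roundTrip is a made-up name for illustration.

```haskell
import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Lazy.Char8 as BLC

-- Wrap a strict ByteString as a lazy one, then copy it back to a strict one.
roundTrip :: BC.ByteString -> BC.ByteString
roundTrip = BLC.toStrict . BLC.fromStrict
```

Going through a lazy ByteString and back should leave the contents unchanged, so roundTrip behaves like the identity on its input.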

Jeffrey Benjamin Brown
1

You can also use blaze-builder to build a strict ByteString from a lazy one:

import Blaze.ByteString.Builder (fromLazyByteString, toByteString)  -- from blaze-builder

toStrict :: BL.ByteString -> BS.ByteString
toStrict = toByteString . fromLazyByteString

It should be efficient.

s9gf4ult