How to model relation efficiently in Haskell with mutable data

Question

I have a problem similar to this. However, it involves LOTS of update and I'm not sure IxSet is the solution.

Basically I'm writing an application to optimize a warehouse layout. There is no database or anything; it's just plain data to manipulate and generate a file at the end. A warehouse is made of shelves of different sizes; shelves contains boxes of different sizes; and the goal is to find the best arrangement (or at least, a good one), where to put boxes so they all fit.

The basic model (which doesn't really matter) is :

data Dimension = Dimension {length::Double, width::Double, height::Double}
data Box = Box  {boxId :: Int,  boxDim:: Dimension }
data Shelf = Shelf {shelfId :: Int, shelfDim :: Dimension, postion :: ... }

Now, the first problem is that there is a shelves graph. Some shelves are back to back. I need to know it because the depth of one shelf can be adjusted (which modify in the opposite way the back shelf). I also need to know the opposite shelf and the next one.

What is the most efficient way to model this?

My first thought is:

data Shelf = Shelf { ... , backShelf :: Maybe Shelf 
                         , frontShelf :: Maybe Shelf
                         , nextShelf :: Maybe Shelf
                   }

Now, data are immutable in Haskell, so how can I update a Shelf? I mean, imagine I have a list of Shelf's; if I modify one, then I need to update all the copies of it?

My second thought is to use ids instead:

newtype ShelfId = Int
data Shelf = Shelf { ..., backShelfId :: Maybe ShelfId ... }

Or should I use external graphs? Something like

 back, front, next :: [(shelfId, shelfId)]

The second problem how to model the fact a box belongs to a shelf.

Possible approaches that come to mind are:

  data Box = Box { ..., shelf :: Shelf }

or

  data Box = Box { ..., shelfId :: Maybe ShelfId }

or

  data Shelf = Shelf { ..., boxes = [Box] }

or

  data Shelf = Shelf { ..., boxIds = [BoxId] }

An external graph

 boxToShelf :: [(BoxId, ShelfId)]

Also, as I said, the operations involve lots of updates; I might add boxes one by one to each shelf, which can be really inefficient with immutable data.

The other solution I thought would be to use STRef or equivalent:

data Box = { ..., shelf :: STRef Shelf }

It feels like using pointer, so it's probably no a good idea.

Update

This is not an XY problem neither a toy problem. This is a realworld application for a real warehouse (arount 100 shelves and 3000 of boxes). It needs to be able to draw and load existing data (boxes and their location). The optimisation is only a small part of it and will probably be semi-manual.

So my question is about representing relation between mutable objects not basic combinatory problems.

"Also, as I said, the operations involves lots of updates , I might add boxes one by one to each shelf which can be really unefficiant with unmutable data." - Not necessarily. If written properly, immutable data structures can be very efficient. If written poorly, mutable data structures can be very inefficient. What makes you think it's not efficient? I would think it'd be better to build it immutable, profile it, get it fast if you can, then start thinking about mutability. — bheklilr, Dec 12 '14 at 18:10
It would be very helpful if you gave an example problem with a small number of shelves and boxes and what the solution is and explain why that solution is the best one. — ErikR, Dec 12 '14 at 18:15
I'm sure , but how ? Example , in scenario `data Shelf = ... boxes :: [Box], backShelf :: Maybe Shelf} . Modifying a box involves modifying the shelf containing the box, as well as all the shelf refering to it, and the shelves refering to them etc ... — mb14, Dec 12 '14 at 18:17
@user5402 I'm focusing in here on the data structure, not the optimisation algorithm. However an simple example is when you have a shelf of width 9 (A) and another of width (B). 3 boxes of width 3 (small) and 2 of width 4 (big).4x8 boxes. You need to fit the 3 small in A and the 2 big in B. There are also lots of constraints about boxes : for example boxes same category , needs to be kept together or form a chain of continuous shelf. — mb14, Dec 12 '14 at 18:27
See also [How do you represent a graph in Haskell?](http://stackoverflow.com/q/9732084/791604). — Daniel Wagner, Dec 12 '14 at 18:42
Sounds like the problem could be amenable to a conventional combinatorial search approach. How big is your problem - i.e. number of shelves and boxes? — ErikR, Dec 12 '14 at 18:51
I remember this from somewhere. Oh yes, [here we go](https://www.reddit.com/r/haskell/comments/2p911v/how_to_deal_with_shared_object_and_immutability/). Trying to farm as many internet points as possible, I see. — Shoe, Dec 14 '14 at 16:29

ErikR · Answer 1 · 2014-12-12T21:28:20.630

Knowing more about how the optimization algorithm works would help.

At the heart of the problem is a data structure which keeps track of which boxes are on which shelves (and vice-versa). Let's call this a "configuration".

A combinatorial search algorithm creates new configurations from old ones as it explores the space of all possible configurations. At any one time there are several configurations in memory - one for each stack frame of the recursive search.

On the other hand, an algorithm like local search just has one (global) data structure which it mutates using heuristics or randomness until it finds a good enough solution.

What is your algorithm most like?

Update: Note that there may not be a single representation which works for all of your use cases. For storing the data you only need the map from shelves to boxes. For display you might find it handy to also have the reverse map (boxes -> shelves.) And for optimization you might need model the problem with mutable arrays for efficient updates.

Update 2: I would try the presistent data structure approach and see how well it works.

type BoxId = Int
type ShelfId = Int

data Shelf = Shelf { ... just the dimensions of the shelf ... }
data Box   = Box { ... just the dimensions of the box ... }

data Configuration = {
    shelves       :: IntMap Shelf,    -- map from shelf id to shelf characterstics
    boxes         :: IntMap Box,      -- map from box id to box characteristics
    boxesForShelf :: IntMap [BoxId],  -- map from shelf id to box ids
    shelfForBox   :: IntMap ShelfId   -- map from box id to shelf id (or 0)
}

Then write this function:

assignBox :: BoxId -> ShelfId -> Configuration -> Configuration

For efficiency you can also write something like:

assignBoxes :: [BoxId] -> ShelfId -> Configuration -> Configuration

and feel free to write other optimized functions for other mass updates to a Configuration which come up through your use cases.

You might find it handy to have the BoxId in the Box structure (and ditto for the ShelfId/Shelf structure)... but you don't necessarily need it. But the relationships between the boxes and shelves are better handled with separate maps.

I defined boxesForShelf as a IntMap [BoxId] simply because it sounds like there will only be a small number of boxes on each shelf. Maybe that's not valid.

Hope this helps.

The algorithm would probably semi-manual. The user might drag-and drop boxes of the same category, which will fill up automatically (chosing the best orientation) a chain of shelf (bays) etc ... The main constraints is as it's real warehouse. You can't just shift 1000 of boxes because the algorithm has decided so. Somes boxes need to stay were they are even though it's not optimal. — mb14, Dec 12 '14 at 21:02
It's about 20 boxes per shelf, so not much. Is there anything much efficient that map (like Array as there is no gap)? — mb14, Dec 12 '14 at 21:55
@mb14, `Data.IntMap` is a Patricia trie. It's not as compact as an array, but it allow efficient modification persistently. Mutable arrays or vectors are always an option, but using them will push your algorithm into `ST` or `IO`, limiting flexibility. If you decide to use mutable arrays/vectors, you should use the "unboxed" versions both for compactness and to avoid slowing down the garbage collector. — dfeuer, Dec 13 '14 at 15:17

gxtaillon · Answer 2 · 2014-12-14T16:55:16.157

Why not use persistent? I've put together a sample application in the form of a cabal package for you to use here https://github.com/gxtaillon/Shelves.

Persistent follows the guiding principles of type safety and concise, declarative syntax. Some other nice features are:

Database-agnostic. There is first class support for PostgreSQL, SQLite, MySQL and MongoDB, with experimental Redis support.

Convenient data modeling. Persistent lets you model relationships and use them in type-safe ways. The default type-safe persistent API does not support joins, allowing support for a wider number of storage layers. Joins and other SQL specific functionality can be achieved through using a raw SQL layer (with very little type safety). An additional library, Esqueleto, builds on top of the Persistent data model, adding type-safe joins and SQL functionality.

Automatically perform database migrations

As soon as you will know what data needs to be stored, you will have a working database and will be able to start working on that optimization algorithm without worrying about performance, scalability or reinventing the wheel.

models - A file containing the definition of your database schema

Shelf
    hrId Text
    length Double
    width Double
    height Double
    UniqueShelf hrId
    deriving Show
Box
    hrId Text
    length Double
    width Double
    height Double
    UniqueBox hrId
    deriving Show
Storage
    shelfId ShelfId
    boxId BoxId
    UniqueStorage shelfId boxId
    deriving Show

Model.hs - Where you import `models` and generate the corresponding types

{-# LANGUAGE EmptyDataDecls             #-}
{-# LANGUAGE FlexibleContexts           #-}
{-# LANGUAGE GADTs                      #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses      #-}
{-# LANGUAGE OverloadedStrings          #-}
{-# LANGUAGE QuasiQuotes                #-}
{-# LANGUAGE TemplateHaskell            #-}
{-# LANGUAGE TypeFamilies               #-}
module Model where
import Database.Persist.Quasi
import Database.Persist.TH
import ClassyPrelude

share [mkPersist sqlSettings, mkMigrate "migrateAll"]
    $(persistFileWith upperCaseSettings "models")

Main.hs - A sample application

{-# LANGUAGE OverloadedStrings #-}
module Main where

import Model
import Control.Monad.IO.Class  (liftIO)
import Database.Persist.Sqlite hiding (master, Connection)

main :: IO ()
main = runSqlite ":memory:" $ do
    runMigration migrateAll

    myShelfId <- insert $ Shelf "ABCD.123" 10.0 1.5 2.0
    thatBoxId <- insert $ Box "ZXY.098" 1.0 1.0 1.0
    thisBoxId <- insert $ Box "ZXY.099" 2.0 1.0 1.0

    _ <- insert $ Storage myShelfId thatBoxId
    _ <- insert $ Storage myShelfId thisBoxId

    myStorage <- selectList [StorageShelfId ==. myShelfId] []
    liftIO $ print (myStorage :: [Entity Storage])

    update myShelfId [ShelfWidth +=. 0.5]

    thatBox <- get thatBoxId
    liftIO $ print (thatBox :: Maybe Box)
    myShelf <- get myShelfId
    liftIO $ print (myShelf :: Maybe Shelf)

Which would output something along those lines:

Migrating: [...]
[Entity {entityKey = StorageKey {unStorageKey = SqlBackendKey {unSqlBackendKey = 1}}, entityVal = Storage {storageShelfId = ShelfKey {unShelfKey = SqlBackendKey {unSqlBackendKey = 1}}, storageBoxId = BoxKey {unBoxKey = SqlBackendKey {unSqlBackendKey = 1}}}},Entity {entityKey = StorageKey {unStorageKey = SqlBackendKey {unSqlBackendKey = 2}}, entityVal = Storage {storageShelfId = ShelfKey {unShelfKey = SqlBackendKey {unSqlBackendKey = 1}}, storageBoxId = BoxKey {unBoxKey = SqlBackendKey {unSqlBackendKey = 2}}}}]
Just (Box {boxHrId = "ZXY.098", boxLength = 1.0, boxWidth = 1.0, boxHeight = 1.0})
Just (Shelf {shelfHrId = "ABCD.123", shelfLength = 10.0, shelfWidth = 2.0, shelfHeight = 2.0})

score 0 · Answer 3 · answered Dec 12 '14 at 20:06

So let's take the simplest case: rectangles which need to hold some rectangles, with no stacking allowed. Then we can provide a new "shelf" when we put a rectangle in the old "shelf":

newtype Box = Box Int Int Int deriving (Eq, Ord, Show)
newtype Shelf = Shelf Int Int Int deriving Show
type ShelfID = Int
type BoxID = Int
type ShelfBox = (ShelfID, BoxID)

fitsOn :: (Int, Box) -> (Int, Shelf) -> Maybe (ShelfID, Shelf)
fitsOn (bid, Box bw bh) (sid, Shelf sw sh) 
   | bw <= sw && bh <= sh = Just (sid, Shelf (sw - bw) sh)
   | otherwise = Nothing

Probably it's most efficient to do a depth-first search starting with the widest boxes:

import Data.IntMap.Strict (IntMap, (!))
import Data.IntMap.Strict as IntMap
import Data.List (sort)

collect (mx : mxs) = case mx of 
    Just x -> x : collect mxs
    Nothing -> collect mxs

-- need to feed something like `IntMap.fromList $ zip [0..] $ sort bs` to `boxes`:
search :: IntMap Box -> IntMap Shelf -> [ShelfBox] -> Maybe [ShelfBox]
search boxes shelves acc 
    | IntMap.empty boxes = Just acc
    | otherwise = case collect $ (map process) options of
        [] -> Nothing
        (x : xs) -> Just x
    where (box, boxes') = IntMap.deleteFindMax boxes
          options = collect [box `fitsOn` shelf | shelf <- IntMap.toList shelves]
          process (sid, s') = search boxes' (IntMap.insert sid s') ((sid, fst box) : acc)

Now how can we put, say, two shelves above each other with total height H but independent heights otherwise? We can write the two together onto our list of shelves:

vert_shelves other_shelves h w = [Shelf w (h - i) : Shelf w i : other_shelves | i <- [0..h - 1]]

If you then want boxes-on-boxes, you'll start yielding two rectangles from fitsOn (above & beside) and trying to aggregate the "above" box into any other boxes which come from the same shelf and have the same elevation, which will take a bit of redesigning this thing. You might also want an indefinite number of boxes, which would be a bit tricky without rewriting how those maybes get passed around.

This is not a XY problem. I'm not asking for a solution to optimisation problem but how to represent real world data. See my update — mb14, Dec 12 '14 at 20:21
Also, the `[ShelfBox]` representation doesn't prevent a box to belong to more than one shelf. — mb14, Dec 12 '14 at 20:40
Just the working-through of the toy example above reveals that probably the data structures you want are something like a map from ShelfID to a list of BoxIDs, a map from ShelfID to the floor dimensions of the shelf, a list of unshelved boxes, and each shelf probably needs a position in space. Your mutable-depth requirement can perhaps be stored best by having a Shelf live in a 2D ShelfSpace box which holds both of them? SQL databases would enforce "A box doesn't belong to more than one shelf" well, but that's less natural in Haskell, I think. You'll probably want that in the logic. — CR Drost, Dec 12 '14 at 20:52

score 0 · Answer 4 · answered Dec 14 '14 at 16:21

It sounds like you need a Relational Database Management System (RDBMS). If you don't want to implement one in Haskell (which you probably don't and shoudn't) then I suggest you use the excellent, free Postgres and communicate with it from Haskell in a typesafe and composable manner using Opaleye. [Disclaimer: I wrote Opaleye.]

How to model relation efficiently in Haskell with mutable data

Update

4 Answers4

models - A file containing the definition of your database schema

Model.hs - Where you import models and generate the corresponding types

Main.hs - A sample application

Model.hs - Where you import `models` and generate the corresponding types