Optimal transport on large networks: a practitioner guide

Arthur Charpentier

Alfred Galichon

Lucas Vernet

04 June 2019

Abstract

This page is a (online) guide to illustrate the article (available from Arvix), that presents a set of tools for the modeling of a spatial allocation problem in a large geographic market and gives examples of applications. In our settings, the market is described by a network that maps the cost of travel between each pair of adjacent locations. Two types of agents are located at the nodes of this network. The buyers choose the most competitive sellers depending on their prices and the cost to reach them. Their utility is assumed additive in both these quantities. Each seller, taking as given other sellers prices, sets her own price to have a demand equal to the one we observed. We give a linear programming formulation for the equilibrium conditions. After formally introducing our model we apply it on two examples: prices offered by petrol stations and quality of services provided by maternity wards (only the later is described here for privacy issues). These examples illustrate the applicability of our model to aggregate demand, rank prices and estimate cost structure over the network. We insist on the possibility of applications to large scale data sets using modern linear programming solvers such as Gurobi.

Contact : Arthur Charpentier charpentier.arthur@uqam.ca

rm(list=ls())

We will use some some network packages (mainly to illustrate)

library(igraph)
library(shp2graph)

as well as some packages for spatial analysis, and visualization

library(geosphere)
library(rgdal)
library(leaflet)
library(scales)
library(lwgeom)
library(shapefiles)
library(sp)
library(rgeos)
library(osmdata)
library(sf)
library(ggmap)
library(maptools)

For technical reasons, we need additional packages

library(Matrix)
library(tripack)
library(RColorBrewer)
library(units)
library(pals)
library(classInt)

In the applications, we will use gurobi. It is a commercial optimization solver for linear programming, quadratic programming (among many other), that is available for free in academia. Installation can be a little bit complicated (see the vignette online),

library('gurobi')

We will consider three applications in this guide

the toy example just to explain the four components in our approach : the nodes and the edges (a road network for instance), the location of sellers and buyers. The first part is to match locations of sellers and buyers with nodes. Then we solve the linear programming problem.
the metro example to illustrate the use of a real network (here the metro in Paris), with a single buyer and several sellers (hospitals here)
the maternity example to illustrate on read data, with all maternities in France (sellers) and all pregnant women (buyers).

1 The Theoretical Model

The geometry of the geographic market is described by a connected graph $G=\left(V,E\right)$ where $V$ is a set of vertices and $E$ is a set of directed edges. Vertices are the different locations; edges are road segments linking two adjacent locations.

Agents are located at the vertices of our graph (if not, as we will see, we will match them with the closest vertice).We make the following assumptions on their consumption preferences:

(A1) Consumers with the same location have the same outside option,
(A2) Consumers have the same cost of travel along a given edge,
(A3) There is no capacity constraints on the supply side.

A path $\pi_{vw}$ between two nodes $v,w\in V$ is a sequence of connected edges starting in $v$ and ending in $w$. We note $\Pi_{vw}$ the set of paths between $v$ and $w$.

For each edge $e\in E$, we define $c_{e}$ the cost of travel along this arc. A seller located in $w\in V$ sells at an endogenous price $p_{w}$. Therefore a consumer located in $v$ chooses to buy from the seller located in $w$ that minimizes his total cost $p_{v}$ \[ p_{v}=\underset{w\in V\text{, }\pi_{vw}\in\Pi_{vw}}{\min}\left[p_{w}+\underset{e\in\pi_{vw}}{\sum}c_{e}\right] \]

Let’s denote$u_{v}$ the reservation utility of agents located in $v$ and $p_{v}$ the price for these agents (including the cost of transportation). Therefore, if $p_{v}>u_{v}$ a consumer in $v$ his outside option, else he buys from the most competitive seller depending on his location in the market.

This outside option can be modeled by a fictitious node $0$, where the price of the commodity is set to $p_{0}=0$ and the cost of travel is $c_{v0}=u_{v}$.

Hence the excess demand $z_{v}$ at a node $v\in V$ does not depend on the price. We are solving a matching problem over a network between sellers and consumers.

At equilibrium each consumer chooses the most competitive seller depending on his location; furthermore the demand received by each seller must equal his supply. If we define $\mu_{e}$ the number of agents traveling along edge $e$, these equilibrium conditions can be written: \[ \left\{ \begin{array}{l} \forall vw\in E\text{, }p_{w}-p_{v}\leq c_{vw}\\ \forall vw\in E\text{ such that }\mu_{vw}>0\text{, }p_{w}-p_{v}=c_{vw}\\ \forall v\in V\text{, }\underset{w\in V}{\sum}\mu_{vw}-\underset{u\in V}{\sum}\mu_{uv}=z_{v} \end{array}\right. \]

These equations are the complementary slackness conditions for a minimum cost flow problem over network $G$. Hence, if the graph is connected and if there isn’t any loop of negative cost in $G$, there exists an equilibrium which is the solution of the min-cost flow problem \[ \begin{array}{cl} \mu=\text{argmin}\underset{e\in E}{\sum}c_{e}\mu_{e}\\ & \mu\text{ s.t. }\forall v\in V\text{, }\underset{w\in V}{\sum}\mu_{vw}-\underset{u\in V}{\sum}\mu_{uv}=z_{v} \end{array} \]

Models used in trade theory usually use iceberg transport costs. These costs are linear in distance and paid in the transported good once arrived - hence the metaphor of the iceberg that melts as you transport it.

Our model can incorporate these costs applying a log-transformation to the equilibrium conditions. Let’s define for all $vw\in E$ iceberg costs $\tau_{vw}$. An equilibrium $\left(\mu,p\right)$ should satisfy \[ \left\{ \begin{array}{l} \forall vw\in E\text{, }\left(1-\tau_{vw}\right)p_{w}\leq p_{v}\\ \forall vw\in E\text{ such that }\mu_{vw}>0\text{, }\left(1-\tau_{vw}\right)p_{w}=p_{v}\\ \forall v\in V\text{, }\underset{w\in V}{\sum}\mu_{vw}-\underset{u\in V}{\sum}\mu_{uv}=z_{v} \end{array}\right. \] which becomes \[ \left\{ \begin{array}{l} \forall vw\in E\text{, }\log p_{w}-\log p_{v}\leq-\log\left(1-\tau_{vw}\right)\\ \forall vw\in E\text{ such that }\mu_{vw}>0\text{, }\log p_{w}-\log p_{v}=-\log\left(1-\tau_{vw}\right)\\ \forall v\in V\text{, }\underset{w\in V}{\sum}\mu_{vw}-\underset{u\in V}{\sum}\mu_{uv}=z_{v} \end{array}\right. \]

Therefore this problem is equivalent to the one presented earlier with $\forall vw\in E$, $c_{vw}=-\log\left(1-\tau_{vw}\right)$, and replacing $p$ by its logarithm.

2 A Toy Example

2.1 Networks and Shapefiles

Consider a shapefile of roads, or train lines : each road is a collection of segments, or connected nodes. Our shapefile is, so fact, only in a dataframe dd

dd <- data.frame(Id=c(1,1,1,2,2,2,3,3,4,4),
                 X=c(3,5,8,5,7,8,3,5,5,6),
                 Y=c(9,8,3,4,7,4,8,9,7,8))

Here are the (given) knots and the four roads

par(mfrow=c(1,2))
plot(dd$X,dd$Y,col="white",xlab="",ylab="")
for(i in unique(dd[,1])) lines(dd$X[dd$Id==i],dd$Y[dd$Id==i])
points(dd$X,dd$Y,pch=19,col="red")
library(RColorBrewer)
darkcols = brewer.pal(8, "Dark2")
plot(dd$X,dd$Y,xlab="",ylab="")
for(i in unique(dd[,"Id"])) lines(dd$X[dd$Id==i],dd$Y[dd$Id==i],col=darkcols[i],lwd=2)

Let us create a shapefile from those segments

Show code for the extend_shp() function

dd2 <- extend_shp(dd)

Intersections of roads are now added, as knots of the network

par(mfrow=c(1,2))
plot(dd$X,dd$Y,col="white",xlab="",ylab="")
for(i in unique(dd[,1])) lines(dd$X[dd$Id==i],dd$Y[dd$Id==i])
points(dd2[["knots"]]$X,dd2[["knots"]]$Y,pch=19,col="blue")
points(dd$X,dd$Y,pch=19,col="red")
plot(dd2[["shp"]]$X,dd2[["shp"]]$Y,col="white",xlab="",ylab="",axes=FALSE,xlim=c(2.6,8.4))
text(dd2[["knots"]]$X,dd2[["knots"]]$Y,dd2[["knots"]]$NO)
points(dd2[["knots"]]$X,dd2[["knots"]]$Y,cex=3)

We can also visualize all knots of the network

library(RColorBrewer)
darkcols = brewer.pal(8, "Dark2")
plot(dd2[["shp"]]$X,dd2[["shp"]]$Y,col="white",xlab="",ylab="",axes=FALSE)
for(i in 1:(nrow(dd2$connected)/2)) segments(dd2[["knots"]]$X[dd2[["connected"]]$knot1[i]],dd2[["knots"]]$Y[dd2[["connected"]]$knot1[i]],
                                             dd2[["knots"]]$X[dd2[["connected"]]$knot2[i]],dd2[["knots"]]$Y[dd2[["connected"]]$knot2[i]],
                                             col=darkcols[dd2[["connected"]]$Id[i]])
points(dd2[["knots"]]$X,dd2[["knots"]]$Y,pch=19,col="black",cex=3)
text(dd2[["knots"]]$X,dd2[["knots"]]$Y,dd2[["knots"]]$NO,col="white")

2.2 Adding Agents

We can now create agents, buyers and sellers

agents=data.frame(X=c(5.5,3.5,6.5,6.5,4.5),
                  Y=c(4,9,8,5.5,9),
                  Type=as.factor(c("Demand","Demand","Supply","Supply","Supply")))
forme=c(15,19,0,1)
plot(dd2[["shp"]]$X,dd2[["shp"]]$Y,col="white",xlab="",ylab="",axes=FALSE)
for(i in 1:(nrow(dd2$connected)/2)) segments(dd2[["knots"]]$X[dd2[["connected"]]$knot1[i]],dd2[["knots"]]$Y[dd2[["connected"]]$knot1[i]],
                                             dd2[["knots"]]$X[dd2[["connected"]]$knot2[i]],dd2[["knots"]]$Y[dd2[["connected"]]$knot2[i]])
points(dd2[["knots"]]$X,dd2[["knots"]]$Y,pch=19,col="white",cex=3)
points(dd2[["knots"]]$X,dd2[["knots"]]$Y,pch=1,col="black",cex=3)
text(dd2[["knots"]]$X,dd2[["knots"]]$Y,dd2[["knots"]]$NO,col="black")
points(agents$X,agents$Y,pch=forme[agents$Type],col=darkcols[agents$Type],cex=3)

2.3 Minimal Distance (Euclidean)

Here, we want to connect the buyers (green squares) and the sellers (orange bullets). Let use visualize, as a starting point, euclidean distances

plot(dd2[["shp"]]$X,dd2[["shp"]]$Y,col="white",xlab="",ylab="",axes=FALSE)
for(i in which(agents$Type=="Demand")) segments(agents[i,"X"],agents[i,"Y"],agents[agents$Type=="Supply","X"],agents[agents$Type=="Supply","Y"])
points(agents$X,agents$Y,pch=forme[agents$Type],col=darkcols[agents$Type],cex=3)

We use the closest distance to match demand and supply

close_supply <- function(coord,knots=agents[agents$Type=="Supply",]){
  knots[which.min(distHaversine( coord[,c("X","Y")] ,  knots[,c("X","Y")] , r =6378.137)),]
}
list_demand <- which(agents$Type=="Demand")
close_supply(agents[list_demand[1],])

FALSE     X   Y   Type
FALSE 4 6.5 5.5 Supply

plot(dd2[["shp"]]$X,dd2[["shp"]]$Y,col="white",xlab="",ylab="",axes=FALSE)
for(i in which(agents$Type=="Demand")) segments(agents[i,"X"],agents[i,"Y"],agents[agents$Type=="Supply","X"],agents[agents$Type=="Supply","Y"])
for(i in list_demand){
     supply=unlist(close_supply(agents[i,]))
     segments(agents[i,"X"],agents[i,"Y"],supply["X"],supply["Y"],col=darkcols[3],lwd=6)
}
points(agents$X,agents$Y,pch=forme[agents$Type],col=darkcols[agents$Type],cex=3.2)
text(agents$X,agents$Y,1:nrow(agents),pch=forme[agents$Type],col="white")

or with a more classical bipartite graph with buyers on the left (demand), and sellers on the right (supply)

Show code for the bipartite graph

2.4 Minimal Distance on a Network

From now on, let us forget about that shortest Euclidean distance, and let us consider the distance on the network (using roads).

2.4.1 Agents Location and Knots

The first step is to match the agents with a knot of the network (we use here the Euclidean distance)

close_knot=function(coord,knots=dd2$knots){
  knots[which.min(distHaversine( coord[,c("X","Y")] ,  knots[,c("X","Y")] , r =6378.137)),]
}
shift_agents=agents
shift_agents$Xk=shift_agents$X
shift_agents$Yk=shift_agents$Y
shift_agents$NO=0
for(i in 1:nrow(shift_agents)) shift_agents[i,c("Xk","Yk","NO")]=close_knot(agents[i,])
shift_agents

FALSE     X   Y   Type       Xk       Yk NO
FALSE 1 5.5 4.0 Demand 5.000000 4.000000  7
FALSE 2 3.5 9.0 Demand 3.000000 9.000000  1
FALSE 3 6.5 8.0 Supply 6.000000 8.000000 13
FALSE 4 6.5 5.5 Supply 6.263158 5.894737  6
FALSE 5 4.5 9.0 Supply 5.000000 9.000000 11

plot(dd2[["shp"]]$X,dd2[["shp"]]$Y,col="white",xlab="",ylab="",axes=FALSE)
for(i in 1:(nrow(dd2$connected)/2)) segments(dd2[["knots"]]$X[dd2[["connected"]]$knot1[i]],dd2[["knots"]]$Y[dd2[["connected"]]$knot1[i]],
                                             dd2[["knots"]]$X[dd2[["connected"]]$knot2[i]],dd2[["knots"]]$Y[dd2[["connected"]]$knot2[i]])
points(dd2[["knots"]]$X,dd2[["knots"]]$Y,pch=19,col="white",cex=3)
points(dd2[["knots"]]$X,dd2[["knots"]]$Y,pch=1,col="black",cex=3)
points(agents$Xk,agents$Yk,pch=forme[agents$Type],col=darkcols[agents$Type],cex=3)
text(dd2[["knots"]]$X,dd2[["knots"]]$Y,dd2[["knots"]]$NO,col="black")
points(shift_agents$Xk,shift_agents$Yk,pch=forme[shift_agents$Type],col=darkcols[shift_agents$Type],cex=3.2)
points(shift_agents$X,shift_agents$Y,pch=forme[2+as.numeric(shift_agents$Type)],col=darkcols[as.numeric(shift_agents$Type)],cex=3)

2.4.2 Distance on a Network

Then we compute the distance between nodes, manually here (we will use igraph and gurobi in the next section)

plot(0:1,0:1,col="white",xlab="",ylab="",axes=FALSE)
text(.125,.5,"DEMAND",srt=90,col=darkcols[1])
text(.875,.5,"SUPPLY",srt=270,col=darkcols[2])
i=which(agents$Type=="Demand")
y1=rev(seq(1/(2*length(i)),1-1/(2*length(i)),by=1/(2*length(i)))[seq(1,2*length(i),by=2)])
i=which(agents$Type=="Supply")
y2=rev(seq(1/(2*length(i)),1-1/(2*length(i)),by=1/(2*length(i)))[seq(1,2*length(i),by=2)])
id=which(agents$Type=="Demand")
for(i in 1:sum(agents$Type=="Demand")){
  d=distHaversine( agents[id[i],c("X","Y")] ,  agents[agents$Type=="Supply",c("X","Y")] , r =6378.137)
  segments(.25,y1[i],.75,y2,col=c("black",darkcols[3])[1+(d==min(d))],lwd=c(1,6)[1+(d==min(d))])
  text(.5,(y2+y1[i])/2,round(d),pos=1,col=c("black",darkcols[3])[1+(d==min(d))])
}
i=which(agents$Type=="Demand")
points(rep(.25,length(i)),y1,
       pch=forme[1],col=darkcols[1],cex=3.2)
text(rep(.25,length(i)),y1,i,col="white")
i=which(agents$Type=="Supply")
points(rep(.75,length(i)),y2,
       pch=forme[2],col=darkcols[2],cex=3.2)
text(rep(.75,length(i)),y2,i,col="white")

2.5 Using `igraph`

One can use library(igraph) (see http://igraph.org) to find the closest supply location

Demand = shift_agents[shift_agents$Type=="Demand",c("NO","X","Y")]
Supply = shift_agents[shift_agents$Type=="Supply",c("NO","X","Y")]
Nodes = dd2$knots[,c("NO","X","Y")]
Arcs = dd2$connected
library(igraph)
iarcs <- make_graph(edges = as.vector(rbind(Arcs[,1],Arcs[,2])),directed=TRUE)
plot(iarcs,layout=as.matrix(dd2$knots[,c("X","Y")]),edge.arrow.size=.2)

iarcs = set_edge_attr(iarcs,"distance", value=Arcs[,"distance"])
cat("The graph is connex :", is.connected(iarcs,mode="strong"),"\n")

FALSE The graph is connex : TRUE

If the graph is not connex, we need to keep the largest strongly connected subgraph

if(!is.connected(iarcs,mode="strong")){
nodestokeep=which(clusters(iarcs,mode="strong")[[1]]==which(clusters(iarcs,mode="strong")[[2]]==max(clusters(iarcs,mode="strong")[[2]])))
nodestokeep
Arcs = Arcs[(Arcs$knot1 %in% nodestokeep)&(Arcs$knot2 %in% nodestokeep),]
Nodes = Nodes[Nodes$NO %in% nodestokeep, ]
iarcs = graph.edgelist(as.matrix(Arcs[,c("knot1","knot2")]), directed=TRUE)
iarcs = set_edge_attr(iarcs,"distance", value=Arcs[,"distance"])
cat("The graph is connex :", is.connected(iarcs,mode="strong"),"\n")
plot(iarcs,layout=as.matrix(dd2$knots[dd2$knots$NO%in%nodestokeep,c("X","Y")]),edge.arrow.size=.2)}

We need to keep the largest strongly connected subgraph (this technical assumption will be important later on):

AP=all_shortest_paths(iarcs,
from=Demand[1,"NO"],
to=Supply[3,"NO"])
L=AP$res[[1]]
n=length(L)
V(iarcs)$color="yellow"
V(iarcs)$color[L[2:(n-1)]]="light blue"
V(iarcs)$color[L[c(1,n)]]="blue"
plot(iarcs,layout=as.matrix(dd2$knots[,c("X","Y")]),edge.arrow.size=.2)

dist_graph = function(i,j) dd2$connected[(dd2$connected$knot1==i)&(dd2$connected$knot2==j),"distance"][1]

D=function(dem,sup){
AP=all_shortest_paths(iarcs,
from=Demand[dem,"NO"],
to=Supply[sup,"NO"],weights=dd2$connected$distance)
AP_path = AP$res[[1]]
d=0
for(u in 1:(length(AP_path)-1)) d=d+dist_graph(AP_path[u],AP_path[u+1])
d}
DF=expand.grid(dem=1:nrow(Demand),sup=1:nrow(Supply))
M=matrix(Vectorize(function(i) D(DF[i,"dem"],DF[i,"sup"]))(1:(nrow(Demand)*nrow(Supply))),nrow(Demand),nrow(Supply))
M

FALSE          [,1]     [,2]     [,3]
FALSE [1,] 542.9799 253.2018 772.7032
FALSE [2,] 425.6273 519.5013 246.6023

idx=apply(M,1,which.min)
for(i in 1:length(idx)){
AP=all_shortest_paths(iarcs,
from=Demand[i,"NO"],
to=Supply[idx[i],"NO"])
L=AP$res[[1]]
n=length(L)
V(iarcs)$color="yellow"
V(iarcs)$color[L[2:(n-1)]]="light blue"
V(iarcs)$color[L[c(1,n)]]="blue"
plot(iarcs,layout=as.matrix(dd2$knots[,c("X","Y")]),edge.arrow.size=.2,
     main=paste("Shortest Path, demand ",Demand[i,"NO"],", supply ",Supply[idx[i],"NO"],sep=""))
}

2.6 Using `gurobi`

One can also use library(gurobi) (see http://gurobi.com/) to solve a linear programing problem. Very generally, the gurobi interface enables to solve problems of the form $\mathbf{x}^{\text{T}}Q\mathbf{x}+c^{\text{T}}\mathbf{x}+\alpha$ subject to bound constraints $\mathbf{\ell}\leq\mathbf{x}\leq \text{u}$, some quadratic constraints $\mathbf{x}^{\text{T}}Qc\mathbf{x}+q^{\text{T}}\mathbf{x}\leq\beta$, some integrality constraints, some ordered set constraints, etc.

Here,the syntax will be the following (for gurobi functions)

a sparse matrix $\nabla$ - Nabla - of size the number of edges (raws) times the number of nodes (columns). Each raw contains zeros, except for the starting node (+1) and the arrival node (-1). Note that $|\nabla^{\text{T}}|$ is the incidence matrix of the network,
a cost vector $\Phi$ - Phi - of size the number of edges. Here, we use a linear cost function, so $\Phi$ is the vector of length (in km) of edges,
an in/out flow vector $n$ - n - of size the number of nodes
an objective called modelsense : here we want to maximize the flow.

result = gurobi ( list(A=t(Nabla), obj=Phi, modelsense='max', rhs=n, sense='=', start=matrix(0,nbArcs,1)), params=NULL)

The outputs of the gurobi call will be

the flows on each edge $\pi$, pi = result$x
the prices for each node $p$ (in the same unit as matrix $\Phi$)

library(gurobi)
arcs = dd2$connected
nbArcs=nrow(arcs)
Nabla =  sparseMatrix(i=1:nbArcs,j=arcs[,"knot1"],x=-1) + 
           sparseMatrix(i=1:nbArcs,j=arcs[,"knot2"],x=1)
Phi = -arcs[,"distance"]
n=rep(0,nrow(dd2$knots))  
n[Supply$NO]=+nrow(Demand)
n[Demand$NO]=-nrow(Supply)
sum(n)

FALSE [1] 0

result = gurobi ( list(A=t(Nabla),obj=Phi,modelsense='max',rhs=n,sense='=',start=matrix(0,nbArcs,1)), params=NULL)

FALSE Optimize a model with 13 rows, 26 columns and 52 nonzeros
FALSE Coefficient statistics:
FALSE   Matrix range     [1e+00, 1e+00]
FALSE   Objective range  [6e+01, 6e+02]
FALSE   Bounds range     [0e+00, 0e+00]
FALSE   RHS range        [2e+00, 3e+00]
FALSE Presolve removed 10 rows and 20 columns
FALSE Presolve time: 0.01s
FALSE Presolved: 3 rows, 6 columns, 12 nonzeros
FALSE 
FALSE Iteration    Objective       Primal Inf.    Dual Inf.      Time
FALSE        0   -1.6954446e+03   4.000000e+00   0.000000e+00      0s
FALSE        2   -1.9682155e+03   0.000000e+00   0.000000e+00      0s
FALSE 
FALSE Solved in 2 iterations and 0.01 seconds
FALSE Optimal objective -1.968215514e+03

result

FALSE $status
FALSE [1] "OPTIMAL"
FALSE 
FALSE $runtime
FALSE [1] 0.01120687
FALSE 
FALSE $itercount
FALSE [1] 2
FALSE 
FALSE $baritercount
FALSE [1] 0
FALSE 
FALSE $nodecount
FALSE [1] 0
FALSE 
FALSE $objval
FALSE [1] -1968.216
FALSE 
FALSE $x
FALSE  [1] 3 1 0 1 0 0 3 0 0 0 2 0 2 0 0 0 0 1 0 0 0 0 0 0 0 0
FALSE 
FALSE $slack
FALSE  [1] 0 0 0 0 0 0 0 0 0 0 0 0 0
FALSE 
FALSE $rc
FALSE  [1]    0.0000    0.0000 -383.6504    0.0000 -383.6521    0.0000    0.0000
FALSE  [8]    0.0000    0.0000    0.0000    0.0000    0.0000    0.0000 -246.6023
FALSE [15] -246.8586 -912.8559 -161.8896    0.0000 -750.9680 -506.4036 -295.1692
FALSE [22] -703.7146 -246.8586 -246.6023 -117.6091 -195.9041
FALSE 
FALSE $pi
FALSE  [1]  246.7304715  123.4293099    0.0000000 -264.6027722  -80.9448213
FALSE  [6]  110.8812242  364.0830417  -36.7033854 -388.5606944  246.8586198
FALSE [11]    0.1281483  -22.1402962 -178.8968596
FALSE 
FALSE $vbasis
FALSE  [1]  0  0 -1  0 -1  0  0  0  0  0  0  0  0 -1 -1 -1 -1  0 -1 -1 -1 -1 -1
FALSE [24] -1 -1 -1
FALSE 
FALSE $cbasis
FALSE  [1] -1 -1  0 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
FALSE 
FALSE $objbound
FALSE [1] -1968.216

pi = result$x
distance = -result$objval
prices = result$pi

library(pals)
COL = rev(brewer.rdbu(13))
V(iarcs)$color=COL[rank(prices)]
plot(iarcs,layout=as.matrix(dd2$knots[,c("X","Y")]),edge.arrow.size=.1)

The legend for those prices is here

par(mfrow=c(1,1))
plot(1:13,rep(1,13),col=COL,pch=15,cex=9,ylim=c(.98,1.02),xlab="",ylab="",axes=FALSE)
text(1:13,rep(1,13),round(sort(prices)),cex=.9,col=c(rep("white",3),rep("black",7),rep("white",3)))

FALSE     X   Y   Type       Xk       Yk NO        price
FALSE 1 5.5 4.0 Demand 5.000000 4.000000  7  364.0830417
FALSE 2 3.5 9.0 Demand 3.000000 9.000000  1  246.7304715
FALSE 3 6.5 8.0 Supply 6.000000 8.000000 13 -178.8968596
FALSE 4 6.5 5.5 Supply 6.263158 5.894737  6  110.8812242
FALSE 5 4.5 9.0 Supply 5.000000 9.000000 11    0.1281483

3 Subway in Paris

For this application, we will use network data of Paris’ Metro with a dataset of edges/arcs and a dataset with locations of nodes,

arcs = read.csv(file = "arcs-metro.csv",sep=";", header=FALSE) 
dim(arcs)

FALSE [1] 868   3

nodes = read.csv("nodes-metro.csv",sep=";", header=FALSE) # loads the data
dim(nodes)

FALSE [1] 366   6

subarcs=arcs[(arcs$V1<303)&(arcs$V2<303),]
plot(nodes[1:302,c("V3","V4")],pch=19,xlab="",ylab="",axes=FALSE)
for(i in 1:nrow(arcs)) segments(nodes[subarcs$V1[i],"V3"],nodes[subarcs$V1[i],"V4"],
                                nodes[subarcs$V2[i],"V3"],nodes[subarcs$V2[i],"V4"])

The following code can be used to visualize our network, with background information obtained from OpenStreetMap, as described in caRtosm’s github page.

#library(OpenStreetMap)
#library(devtools)
#install_github("ropensci/osmdata")
library(osmdata)
library(sf)
library(units)
library(scales)
#map=openmap(c(lat=48.9,lon=2.26),c(lat=48.8,lon=2.45))
#map=openproj(map,projection="+init=epsg:4326")
#plot(map)
# see https://rcarto.github.io/caRtosm/index.html
q0 = opq(bbox = c(2.2247, 48.8188, 2.4611, 48.9019)) 
q1 = add_osm_feature(opq = q0, key = 'name', value = "Paris")
res1 = osmdata_sf(q1)

Show code to import Paris background information

The following function use use to create a background to visualize the metro network

back_paris=function(a=1,background=FALSE){
plot(paris, col=alpha("#D9D0C9",a), border = NA, add=background)
plot(parc, col=alpha("#CDEBB2",a), border = NA, add=TRUE)
plot(seine, col=alpha("#AAD3DF",a), add=TRUE, lwd = 4)
plot(st_geometry(quartier), col = NA,lty = 2, lwd = 0.2, add=TRUE)
}

back_paris()
points(nodes[1:302,c("V3","V4")],pch=19)
for(i in 1:nrow(arcs)) segments(nodes[subarcs$V1[i],"V3"],nodes[subarcs$V1[i],"V4"],
                                nodes[subarcs$V2[i],"V3"],nodes[subarcs$V2[i],"V4"])

Consider for instance locations of some hospitals in Paris (one could actually use the geocode function of library(ggmap), from the adresses of hospitals, but here, we have locations, whith longitudes - lon - and latitudes - lat)

library(ggmap)
adresses <- c("47 boulevard de l'hopital Paris","20 rue leblanc Paris","2 rue ambroise pare paris","1 place parvis notre dame paris","149 rue de sevres paris","1 avenue claude vellafaux paris","184 faubourg saint antoine paris","46 rue henri huchard paris")
# hospital <- lapply(adresses,geocode)
hospital = list(c(lon=2.3610258,lat=48.8401821),
                c(lon=2.2719068,lat=48.838736),
                c(lon=2.3507367,lat=48.8824963),
                c(lon=2.346142,lat=48.8539541),
                c(lon=2.3125916,lat=48.8455934),
                c(lon=2.3676277,lat=48.873488),
                c(lon=2.3806986,lat=48.8500029),
                c(lon=2.3300414,lat=48.8982232))

and consider someone sick, located Place Denfert-Rochereau

adresses <- c("Place Denfert-Rochereau, Paris, France")
# sick <- lapply(adresses,geocode)
sick = list(c(lat=48.8342694,lon=2.3300964))

We can plot them on the map

back_paris()
points(nodes[1:302,c("V3","V4")],pch=19)
for(i in 1:nrow(arcs)) segments(nodes[subarcs$V1[i],"V3"],nodes[subarcs$V1[i],"V4"],
nodes[subarcs$V2[i],"V3"],nodes[subarcs$V2[i],"V4"],col="grey")
points(unlist(hospital)[names(unlist(hospital))=="lon"],unlist(hospital)[names(unlist(hospital))=="lat"],pch=forme[2],col=darkcols[2],cex=2)
points(unlist(sick)[names(unlist(sick))=="lon"],unlist(sick)[names(unlist(sick))=="lat"],
       pch=forme[1],col=darkcols[1],cex=2)

Let us move all those locations to the closest subsway station (the nodes of our network)

Show code

back_paris()
points(nodes[1:302,c("V3","V4")],pch=19)
for(i in 1:nrow(arcs)) segments(nodes[subarcs$V1[i],"V3"],nodes[subarcs$V1[i],"V4"],
                                nodes[subarcs$V2[i],"V3"],nodes[subarcs$V2[i],"V4"],col="grey")
points(newhospital[,"X"],newhospital[,"Y"],pch=forme[2],col=darkcols[2],cex=2)
points(newsick[,"X"],newsick[,"Y"],pch=forme[1],col=darkcols[1],cex=2)

3.1 Using `igraph`

Using library(igraph), it is possible to compute distances, for instance to get to the first hospital in our list (from the location of the sick person)

library(igraph)
g <- make_graph(edges = as.vector(rbind(arcs[,1],arcs[,2])),directed=FALSE)
sp <- shortest_paths(g, newsick[1,"NO"], newhospital[1,"NO"], weights= arcs[,3])
pth <- as.numeric(sp$vpath[[1]])
back_paris()
points(nodes[,c("V3","V4")],pch=19)
for(i in 1:nrow(arcs)) segments(nodes[arcs$V1[i],"V3"],nodes[arcs$V1[i],"V4"],
                                nodes[arcs$V2[i],"V3"],nodes[arcs$V2[i],"V4"],col="grey")
points(nodes[pth,c("V3","V4")],col=darkcols[3],pch=19)
for(i in 1:(length(pth)-1)) segments(nodes[pth[i],"V3"],nodes[pth[i],"V4"],
                                nodes[pth[i+1],"V3"],nodes[pth[i+1],"V4"],col=darkcols[3],lwd=5)
PCH=rep(1,nrow(newhospital))
PCH[1]=forme[2]
points(newhospital[,"X"],newhospital[,"Y"],pch=PCH,col=darkcols[2],cex=2,lwd=2)
points(newsick[,"X"],newsick[,"Y"],pch=forme[1],col=darkcols[1],cex=2)

or the third one

library(igraph)
g <- make_graph(edges = as.vector(rbind(arcs[,1],arcs[,2])),directed=FALSE)
sp <- shortest_paths(g, newsick[1,"NO"], newhospital[3,"NO"], weights= arcs[,3])
pth <- as.numeric(sp$vpath[[1]])
back_paris()
points(nodes[,c("V3","V4")],pch=19)
for(i in 1:nrow(arcs)) segments(nodes[arcs$V1[i],"V3"],nodes[arcs$V1[i],"V4"],
                                nodes[arcs$V2[i],"V3"],nodes[arcs$V2[i],"V4"],col="grey")
points(nodes[pth,c("V3","V4")],col=darkcols[3],pch=19)
for(i in 1:(length(pth)-1)) segments(nodes[pth[i],"V3"],nodes[pth[i],"V4"],
                                nodes[pth[i+1],"V3"],nodes[pth[i+1],"V4"],col=darkcols[3],lwd=5)
PCH=rep(1,nrow(newhospital))
PCH[3]=forme[2]
points(newhospital[,"X"],newhospital[,"Y"],pch=PCH,col=darkcols[2],cex=2,lwd=2)
points(newsick[,"X"],newsick[,"Y"],pch=forme[1],col=darkcols[1],cex=2)

Distances are respectively

distances(g, newsick[1,"NO"], newhospital[1,"NO"], weights= arcs[,3])

FALSE          [,1]
FALSE [1,] 2872.404

distances(g, newsick[1,"NO"], newhospital[3,"NO"], weights= arcs[,3])

FALSE          [,1]
FALSE [1,] 6029.571

On a graph, we get

plot(0:1,0:1,col="white",xlab="",ylab="",axes=FALSE)
text(.125,.5,"DEMAND",srt=90,col=darkcols[1])
text(.875,.5,"SUPPLY",srt=270,col=darkcols[2])
agents=data.frame(X=c(newhospital$X,newsick$X),Y=c(newhospital$Y,newsick$Y),
                  Type=c(rep("Supply",nrow(newhospital)),c(rep("Demand",nrow(newsick)))))
i=which(agents$Type=="Demand")
y1=rev(seq(1/(2*length(i)),1-1/(2*length(i)),by=1/(2*length(i)))[seq(1,2*length(i),by=2)])
i=which(agents$Type=="Supply")
y2=rev(seq(1/(2*length(i)),1-1/(2*length(i)),by=1/(2*length(i)))[seq(1,2*length(i),by=2)])
id=which(agents$Type=="Demand")
for(i in 1:sum(agents$Type=="Demand")){
  d=rep(NA,sum(agents$Type=="Supply"))
  for(j in 1:sum(agents$Type=="Supply")){
    d[j]=distances(g, newsick[i,"NO"], newhospital[j,"NO"], weights= arcs[,3])
  }
  segments(.25,y1[i],.75,y2,col=c("black",darkcols[3])[1+(d==min(d))],lwd=c(1,6)[1+(d==min(d))])
  text(.5,(y2+y1[i])/2,round(d),pos=1,col=c("black",darkcols[3])[1+(d==min(d))])
}
i=which(agents$Type=="Demand")
points(rep(.25,length(i)),y1,
       pch=forme[1],col=darkcols[1],cex=3.2)
text(rep(.25,length(i)),y1,i,col="white")
i=which(agents$Type=="Supply")
points(rep(.75,length(i)),y2,
       pch=forme[2],col=darkcols[2],cex=3.2)
text(rep(.75,length(i)),y2,i,col="white")

Here is the closest hopital (here the th one), for our sick agent

which(d==min(d))

FALSE [1] 5

library(igraph)
g <- make_graph(edges = as.vector(rbind(arcs[,1],arcs[,2])),directed=FALSE)
sp <- shortest_paths(g, newsick[1,"NO"], newhospital[which.min(d),"NO"], weights= arcs[,3])
pth <- as.numeric(sp$vpath[[1]])
back_paris()
points(nodes[,c("V3","V4")],pch=19)
for(i in 1:nrow(arcs)) segments(nodes[arcs$V1[i],"V3"],nodes[arcs$V1[i],"V4"],
                                nodes[arcs$V2[i],"V3"],nodes[arcs$V2[i],"V4"],col="grey")
points(nodes[pth,c("V3","V4")],col=darkcols[3],pch=19)
for(i in 1:(length(pth)-1)) segments(nodes[pth[i],"V3"],nodes[pth[i],"V4"],
                                nodes[pth[i+1],"V3"],nodes[pth[i+1],"V4"],col=darkcols[3],lwd=5)
PCH=rep(1,nrow(newhospital))
PCH[which(d==min(d))]=forme[2]
points(newhospital[,"X"],newhospital[,"Y"],pch=PCH,col=darkcols[2],cex=2,lwd=2)
points(newsick[,"X"],newsick[,"Y"],pch=forme[1],col=darkcols[1],cex=2)

3.2 Using `gurobi`

As discussed in the previous example, it is necessary to reshape the data to fit with inputs of the gurobi function. Very generally, the gurobi interface enables to solve problems of the form $\mathbf{x}^{\text{T}}Q\mathbf{x}+c^{\text{T}}\mathbf{x}+\alpha$ subject to bound constraints $\mathbf{\ell}\leq\mathbf{x}\leq \text{u}$, some quadratic constraints $\mathbf{x}^{\text{T}}Qc\mathbf{x}+q^{\text{T}}\mathbf{x}\leq\beta$, some integrality constraints, some ordered set constraints, etc. We have here

a sparse matrix $\nabla$ - Nabla - of size the number of edges (raws) times the number of nodes (columns). Each raw contains zeros, except for the starting node (+1) and the arrival node (-1). Note that $|\nabla^{\text{T}}|$ is the incidence matrix of the network,
a cost vector $\Phi$ - Phi - of size the number of edges. Here, we use a linear cost function, so $\Phi$ is the vector of length (in km) of edges,
an in/out flow vector $n$ - n - of size the number of nodes
an objective called modelsense : here we want to maximize the flow.

result = gurobi ( list(A=t(Nabla), obj=Phi, modelsense='max', rhs=n, sense='=', start=matrix(0,nbArcs,1)), params=NULL)

The outputs of the gurobi call will be

the flows on each edge $\pi$, pi = result$x
the prices for each node $p$ (in the same unit as matrix $\Phi$)

  originNode = newsick[1,"NO"]
  destinationNode = newhospital[3,"NO"]
  nbNodes = max(arcs[,"V1"]) #dim(nodes)[1]
  nbArcs = dim(arcs)[1]
  n = rep(0,nbNodes)
  n[c(originNode,destinationNode)]=c(-1,1)
  Nabla =  sparseMatrix(i=1:nbArcs,j=arcs[,"V1"],x=-1) + 
           sparseMatrix(i=1:nbArcs,j=arcs[,"V2"],x=1)
  Phi = -arcs[,3]
  
  result = gurobi ( list(A=t(Nabla),obj=Phi,modelsense='max',rhs=n,sense='=',start=matrix(0,nbArcs,1)), params=NULL)

FALSE Optimize a model with 366 rows, 868 columns and 1736 nonzeros
FALSE Coefficient statistics:
FALSE   Matrix range     [1e+00, 1e+00]
FALSE   Objective range  [6e+00, 6e+03]
FALSE   Bounds range     [0e+00, 0e+00]
FALSE   RHS range        [1e+00, 1e+00]
FALSE Presolve removed 300 rows and 600 columns
FALSE Presolve time: 0.00s
FALSE Presolved: 66 rows, 268 columns, 536 nonzeros
FALSE 
FALSE Iteration    Objective       Primal Inf.    Dual Inf.      Time
FALSE        0   -0.0000000e+00   2.000000e+00   0.000000e+00      0s
FALSE       31   -6.0295714e+03   0.000000e+00   0.000000e+00      0s
FALSE 
FALSE Solved in 31 iterations and 0.00 seconds
FALSE Optimal objective -6.029571414e+03

  result$status

FALSE [1] "OPTIMAL"

  (distance = -result$objval)

FALSE [1] 6029.571

  pi = result$x
  which(pi>0)

FALSE [1] 231 276 763 852 854 856 859 862

  cont = TRUE
  i = originNode
  eqpath = which(pi>0)
  rank = 0
  nodespath=c(0)
  nodespath[rank+1]=i
  while(cont)
  { 
    rank = rank+1
    leavingi = which(Nabla[,i]==-1)
    a = intersect(eqpath,leavingi)[1]
    j = which(Nabla[a,]==1)[1]
    i = j
    nodespath[rank+1]=i
    if(j==destinationNode) {cont<-FALSE}  
  }
nodespath

FALSE [1]  78 359 360 361 362 316 363  93  37

back_paris()
points(nodes[,c("V3","V4")],pch=19)
for(i in 1:nrow(arcs)) segments(nodes[arcs$V1[i],"V3"],nodes[arcs$V1[i],"V4"],
                                nodes[arcs$V2[i],"V3"],nodes[arcs$V2[i],"V4"],col="grey")
points(nodes[nodespath,c("V3","V4")],col=darkcols[3],pch=19)
for(i in 1:(length(nodespath)-1)) segments(nodes[nodespath[i],"V3"],nodes[nodespath[i],"V4"],      nodes[nodespath[i+1],"V3"],nodes[nodespath[i+1],"V4"],col=darkcols[3],lwd=5)
PCH=rep(1,nrow(newhospital))
PCH[3]=forme[2]
points(newhospital[,"X"],newhospital[,"Y"],pch=PCH,col=darkcols[2],cex=2,lwd=2)
points(newsick[,"X"],newsick[,"Y"],pch=forme[1],col=darkcols[1],cex=2)

3.3 Voronoi sets

To conclude, notice that it is also possible to create some sort of Voronoi diagrams. The classical one is based on euclidean distances,

back_paris()
points(newhospital[,"X"],newhospital[,"Y"],pch=forme[2],col=darkcols[2],cex=2)
library(tripack)
loc=agents[agents$Type=="Supply",c("X","Y")]
names(loc)=c("x","y")
m=voronoi.mosaic(loc)
plot(m,add=TRUE,col="red",lwd=2)

affect1=function(x,y){
  which.min(distHaversine( c(x,y) ,  loc[,c("x","y")] , r =6378.137))
}
vx=seq(2.214296,2.479586,length=251)
vy=seq(48.786979,48.930752,length=251)
vz1=matrix(NA,251,251)
for(i in 1:251){
  for(j in 1:251){
    vz1[i,j]=affect1(vx[i],vy[j])
  } 
}
lightcols <- brewer.pal(8, "Pastel2")
image(vx,vy,vz1,xlim=c(2.214296,2.479586),ylim=c(48.786979,48.930752),xlab="",ylab="",axes=FALSE,col=lightcols)
points(newhospital[,"X"],newhospital[,"Y"],pch=forme[2],col=darkcols[2],cex=2)
back_paris(a=.2,background = TRUE)

Now, if we use subway-based distances

Show code

image(vx,vy,vz2,xlim=c(2.26,2.45),ylim=c(48.8,48.9),xlab="",ylab="",axes=FALSE,col=lightcols)
points(newhospital[,"X"],newhospital[,"Y"],pch=forme[2],col=darkcols[2],cex=2)
back_paris(a=.2,background = TRUE)

3.4 Min-Cost Flow

Assume there is one sick person in each metro station. Suppose also that hospitals admit the same number of sick people.

  nbNodes = max(arcs[,"V1"]) 
  nbArcs = dim(arcs)[1]
  n = rep(-1,nbNodes)
  n[newhospital[,"NO"]]=-(sum(n)+length(newhospital[,"NO"]))/length(newhospital[,"NO"])
  n[30:120]

FALSE  [1] -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 44.75 -1.00 -1.00 -1.00
FALSE [12] -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00
FALSE [23] -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00
FALSE [34] -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00
FALSE [45] -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00
FALSE [56] -1.00 -1.00 44.75 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00
FALSE [67] -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00
FALSE [78] -1.00 -1.00 -1.00 -1.00 44.75 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00
FALSE [89] -1.00 -1.00 -1.00

  sum(n)

FALSE [1] 0

Then we use a min-cost flow algorithm

  Nabla =  sparseMatrix(i=1:nbArcs,j=arcs[,"V1"],x=-1) + 
           sparseMatrix(i=1:nbArcs,j=arcs[,"V2"],x=1)
  Phi = -arcs[,3]
  
  result = gurobi ( list(A=t(Nabla),obj=Phi,modelsense='max',rhs=n,sense='=',start=matrix(0,nbArcs,1)), params=NULL)

FALSE Optimize a model with 366 rows, 868 columns and 1736 nonzeros
FALSE Coefficient statistics:
FALSE   Matrix range     [1e+00, 1e+00]
FALSE   Objective range  [6e+00, 6e+03]
FALSE   Bounds range     [0e+00, 0e+00]
FALSE   RHS range        [1e+00, 4e+01]
FALSE Presolve removed 147 rows and 294 columns
FALSE Presolve time: 0.00s
FALSE Presolved: 219 rows, 574 columns, 1148 nonzeros
FALSE 
FALSE Iteration    Objective       Primal Inf.    Dual Inf.      Time
FALSE        0   -1.0169517e+06   7.030000e+02   0.000000e+00      0s
FALSE      317   -1.7186206e+06   0.000000e+00   0.000000e+00      0s
FALSE 
FALSE Solved in 317 iterations and 0.00 seconds
FALSE Optimal objective -1.718620640e+06

pi = result$x
distance = -result$objval
prices = result$pi

COL=brewer.pal(n = 11, name = "RdBu")
B=quantile(prices,(0:11)/11)
B[1]=B[1]-1
rang_prices = cut(prices,breaks=B,labels=COL)

The black points below are the hospitals… colors here are related to the price, from (dark) red to (dark) blue

back_paris()
points(newhospital[,"X"],newhospital[,"Y"],pch=forme[2],cex=2)
points(nodes[,c("V3","V4")],pch=19,col=as.character(COL))

The legend for those prices is here

par(mfrow=c(1,1))
COL=brewer.pal(n = 11, name = "RdBu")
plot(1:11,rep(1,11),col=COL,pch=15,cex=9,ylim=c(.98,1.02),xlab="",ylab="",axes=FALSE)
lbl = paste("[",round(B[1:11]),";",round(B[1+(1:11)]),"]",sep="")
text(1:11,rep(1,11),lbl,cex=.85)
text(1:11,rep(1,11),lbl,cex=.85,col=c(rep("white",3),rep("black",7),rep("white",3)))

par(mfrow=c(1,1))

Show code

  nbNodes = max(arcs[,"V1"]) 
  nbArcs = dim(arcs)[1]
  n = rep(1,nbNodes)
  n[newhospital[,"NO"]]=-(sum(n)-length(newhospital[,"NO"]))/length(newhospital[,"NO"])
  n[30:120]

FALSE  [1]   1.00   1.00   1.00   1.00   1.00   1.00   1.00 -44.75   1.00   1.00
FALSE [11]   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00
FALSE [21]   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00
FALSE [31]   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00
FALSE [41]   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00
FALSE [51]   1.00   1.00   1.00   1.00   1.00   1.00   1.00 -44.75   1.00   1.00
FALSE [61]   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00
FALSE [71]   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00
FALSE [81]   1.00 -44.75   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00
FALSE [91]   1.00

  sum(n)

FALSE [1] 0

  Nabla =  sparseMatrix(i=1:nbArcs,j=arcs[,"V1"],x=-1) + 
           sparseMatrix(i=1:nbArcs,j=arcs[,"V2"],x=1)
  Phi = -arcs[,3] 
  result = gurobi ( list(A=t(Nabla),obj=Phi,modelsense='max',rhs=n,sense='=',start=matrix(0,nbArcs,1)), params=NULL)

FALSE Optimize a model with 366 rows, 868 columns and 1736 nonzeros
FALSE Coefficient statistics:
FALSE   Matrix range     [1e+00, 1e+00]
FALSE   Objective range  [6e+00, 6e+03]
FALSE   Bounds range     [0e+00, 0e+00]
FALSE   RHS range        [1e+00, 4e+01]
FALSE Presolve removed 147 rows and 294 columns
FALSE Presolve time: 0.00s
FALSE Presolved: 219 rows, 574 columns, 1148 nonzeros
FALSE 
FALSE Iteration    Objective       Primal Inf.    Dual Inf.      Time
FALSE        0   -1.0169517e+06   6.760000e+02   0.000000e+00      0s
FALSE      319   -1.7186206e+06   0.000000e+00   0.000000e+00      0s
FALSE 
FALSE Solved in 319 iterations and 0.00 seconds
FALSE Optimal objective -1.718620640e+06

pi = result$x
distance = -result$objval
prices = result$pi

COL=brewer.pal(n = 11, name = "RdBu")
B=quantile(prices,(0:11)/11)
B[1]=B[1]-1
rang_prices = cut(prices,breaks=B,labels=COL)

back_paris()
points(newhospital[,"X"],newhospital[,"Y"],pch=forme[2],cex=2)
points(nodes[,c("V3","V4")],pch=19,col=as.character(COL))

4 Natality

In this section, we plan to match pregnent women (demand) and maternities (supply), in France.

4.1 Preliminary work on the datasets

Shapefile of roads, where roads are the edges of the network, and intersections are the nodes, obtained from ign. Two files arcs.csv and nodes.csv were created.
Maternities details where obtained from data.gouv (to get the locations of all maternitives) and data.drees.sante.gouv.fr to get yearly statistics. From those files, three files were created: PERINAT_2016r.csv with birth counts, ID_2016r.csv with information about maternities, and etalab_stock_et_20181231.csv with spatial locations of hospitals,
Births in France, where obtained from data.gouv, and has been transfered to two files naissance_commune.csv and naissance_arrondissement.csv

For visualisation, we will use various shapefiles

Shapefiles of communes (cities) where obtained from actualitix.com but one can also use data.gouv
Shapefiles of departements where obtained from actualitix.com There is one shapefile per departement

4.1.1 Road network

arcsFrance = read.csv("Arcs.csv")
nodesFrance = read.csv("Nodes.csv")
birthsFrance = read.csv("Births.csv")
arcsFrance = arcsFrance[sort.list(arcsFrance[,"ID_RTE500"]), ]

4.1.2 Birth dataset

The birth file contains the number of births per city (ZIP code) in France. A first step can be to visualize those data,

Show code

We can get the number of birth per city (commune) even within Paris (per arrondissement)

Show code

For the region Île-de-France (containing the city of Paris, north-central and the most populous one in France, with 18.2% of the population). Colors are here just based on raw number of births, per commune (red is law while blue is high)

plot(dp75,xlim=c(1.5,3.5),ylim=c(48,49.25),col=x75)
plot(dp77,add=TRUE,col=x77)
plot(dp78,add=TRUE,col=x78)
plot(dp91,add=TRUE,col=x91)
plot(dp92,add=TRUE,col=x92)
plot(dp93,add=TRUE,col=x93)
plot(dp94,add=TRUE,col=x94)
plot(dp95,add=TRUE,col=x95)

It is also possible to zoom in, on Paris (75), Hauts de Seine (92), Seine Saint-Denis (93) and https://en.wikipedia.org/wiki/Val-de-Marne (94), the three départements with a frontier with Paris

Show code

plot(dp75,xlim=c(2.2,2.7),ylim=c(48.7,49),col=x75)
plot(dp92,add=TRUE,col=x92)
plot(dp93,add=TRUE,col=x93)
plot(dp94,add=TRUE,col=x94)

4.1.3 Births and Network

Consider the Hauts de Seine (92) departement, we can visualize the 36 communes (the polygons) and all the node (road intersections)

dp92 <- spTransform(dp92, CRS("+init=epsg:2154"))
plot(dp92,col=x92)
idx=floor(as.numeric(as.character(nodesFrance$INSEE_COMM))/1000)==92
points(nodesFrance[idx,"POINT_X"],nodesFrance[idx,"POINT_Y"])

For each polygon, we have the number of birth, for a given year. It is then possible to allocate them to knots inside each polygon.

4.1.4 Maternity dataset

Information about maternities is downloaded from data.gouv

Show code

LOC2="PERINAT_2016r.csv"
LOC3="ID_2016r.csv"
LOC4="etalab_stock_et_20181231.csv"
base2=read.csv2(LOC2)
base3=read.csv2(LOC3)
base4=read.csv2(LOC4)
dim(base4)

FALSE [1] 96956    42

stupidcomma = Vectorize(function(x){
    x=as.character(x)
    for(i in 1:10){x=sub(",","",as.character(x))}
    return(as.numeric(x))})
base4$coordx = stupidcomma(base4$coordx)
base4$coordy = stupidcomma(base4$coordy)

dim(base2)

FALSE [1] 664  49

base = base2[, c("FI","ACCUN","ACCMU")]
base = merge(base,base3,by.x="FI", by.y="fi")
dim(base)

FALSE [1] 664  73

base = base[, c("FI","ACCUN","ACCMU","dep","COMINSEE")]
base = merge(base,base4,by.x="FI", by.y="nofinesset")
dim(base)

FALSE [1] 653  46

base = base[, c("FI","ACCUN","ACCMU","dep","COMINSEE","coordx","coordy")]
base=base[!is.na(base$coordx),]
dim(base)

FALSE [1] 653   7

listdep = c(paste("0",1:9,sep=""),10:95)
base=base[base$dep %in% listdep,]
dim(base)

FALSE [1] 626   7

base$birth = base$ACCUN + base$ACCMU
base=base[!is.na(base$birth),]

mindist = Vectorize(function(i){
  nodesFrance[which.min((base$coordx[i]-nodesFrance[,"POINT_X"])^2+
  (base$coordy[i]-nodesFrance[,"POINT_Y"])^2),"ID_RTE500"]
})
base$ID_RTE500 = base$NEAR_FID = mindist(1:nrow(base))
maternitesFrance = base

We have a collection of 578 maternities, that can be ploted below

str(maternitesFrance)

FALSE 'data.frame': 578 obs. of  10 variables:
FALSE  $ FI       : Factor w/ 664 levels "010000024","010000032",..: 1 2 3 4 6 7 8 9 10 11 ...
FALSE  $ ACCUN    : int  2112 413 566 801 1233 882 1089 599 600 574 ...
FALSE  $ ACCMU    : int  40 5 3 5 18 12 12 7 8 7 ...
FALSE  $ dep      : Factor w/ 101 levels "01","02","03",..: 1 1 1 1 2 2 2 2 2 2 ...
FALSE  $ COMINSEE : Factor w/ 1794 levels "01004","01034",..: 3 2 11 1 32 28 33 23 22 32 ...
FALSE  $ coordx   : num  871098 908660 902284 882724 719368 ...
FALSE  $ coordy   : num  6569398 6521504 6578385 6542588 6973708 ...
FALSE  $ birth    : int  2152 418 569 806 1251 894 1101 606 608 581 ...
FALSE  $ NEAR_FID : int  151065 194305 206357 90300 39307 90743 42038 15097 158438 193937 ...
FALSE  $ ID_RTE500: int  151065 194305 206357 90300 39307 90743 42038 15097 158438 193937 ...

FR = readRDS("FRA_adm0.rds")
FR = spTransform(FR, CRS("+init=epsg:2154"))
P1 = FR@polygons[[1]]@Polygons[[355]]@coords
plot(P1,type="n",xlab="",ylab="",axes=FALSE,asp=1)
polygon(P1)
dpt=NULL
for(i in 1:nrow(maternitesFrance)){
  noeud = maternitesFrance$NEAR_FID[i]
  idx = which(nodesFrance$ID_RTE500 == noeud)
  points(nodesFrance[idx,"POINT_X"],nodesFrance[idx,"POINT_Y"])
  dpt=c(dpt,floor(as.numeric(as.character(nodesFrance$INSEE_COMM[idx]))/1000))
}

Per département, the total number of maternities is the following

table(dpt)

FALSE dpt
FALSE  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 21 22 23 24 25 26 
FALSE  4  6  5  2  2 11  4  3  2  3  4  5 17  8  4  4  7  4  4  5  6  1  4  4  6 
FALSE 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 
FALSE  7  5  9  6  8  2 12  8  7  3  7 10  3  3  4  8  2  7  5  3  4  1  5  6  7 
FALSE 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 
FALSE  3  3  6  3  7  8  3 22  6  4 12  4  7  4  3  8  8 16  1  5  3  5  7 19 11 
FALSE 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 
FALSE 10 11  3  4  4  3  9  7  7  3  4  5  3  1 13 13 14  9 11

4.2 Matching Homes and Maternities

The first step is a transformation of the edge file (some roads are one way, while others are two way - we duplicate the two way road)

data_edge = arcsFrance
data_node = nodesFrance
data_birth = birthsFrance
data_mater = maternitesFrance
n=nrow(data_edge)
arcs=data_edge[,c("ID_RTE500","ID_RTE500_","ID_RTE500_","LONGUEUR","SENS")]
names(arcs)[2:3]=c("START","END")
arcs[,"END"]=c(arcs[2:n,"END"],NA)
arcs$index = c(arcs[2:n,"ID_RTE500"]==arcs[1:(n-1),"ID_RTE500"],FALSE)
arcs[,"SENS"] = (data_edge[,"SENS"]=="Double sens")*1 + 
                (data_edge[,"SENS"]=="Sens inverse")*2 +
                (data_edge[,"SENS"]=="Sens unique")*3
arcs = arcs[arcs$index,1:5]

Then transformation of the node file (names of nodes)

nodes=data_node[,c("ID_RTE500","POINT_X","POINT_Y","INSEE_COMM","POPULATION","SUPERFICIE")]
nodes[,"INSEE_COMM"]=as.numeric(as.character(nodes[,"INSEE_COMM"]))
nodes=nodes[!is.na(nodes[,"INSEE_COMM"]),]
  # pb with corse e.g. "2A259"

A transformation of the birth (demand) file (again, the names of variables)

names(data_birth)[which(names(data_birth)=="CODGEO")]="INSEE_COMM"
data_birth[,"INSEE_COMM"]=as.numeric(as.character(data_birth[,"INSEE_COMM"]))
data_birth=data_birth[!is.na(data_birth[,"INSEE_COMM"]),]
nodes=merge(nodes,data_birth[,c("INSEE_COMM","NAISD14")],all.x=TRUE)

And a transformation of the maternity (supply) file

names(data_mater)[which(names(data_mater)=="NEAR_FID")]="ID_RTE500"
data_mater=data_mater[order(data_mater$ID_RTE500),]
data_mater$index = c(data_mater[2:nrow(data_mater),"ID_RTE500"]!=data_mater[1:(nrow(data_mater)-1),"ID_RTE500"],TRUE)
data_mater=data_mater[data_mater$index,]
data_mater=data_mater[,-9]
#nodes=merge(nodes,data_mater[,c("ID_RTE500","birth","V5","V6","V7","V8","V9","V10","V11","NEAR_DIST")],all.x=TRUE)
nodes=merge(nodes,data_mater,all.x=TRUE)

We do not have homes of pregnent mothers, only their city - code postal. Sometimes, nodes can be on two cities. For technical reasons, it is necessary to keep nodes only once.

numNodes=nrow(nodes)
idx=which(nodes[numNodes:2,"ID_RTE500"]==nodes[(numNodes-1):1,"ID_RTE500"])
length(idx)

FALSE [1] 980

for (i in idx) {
    nodes[numNodes-i,]=nodes[numNodes-i+1,]
}
nodes=nodes[!duplicated(nodes),]

dim(arcs)

FALSE [1] 320763      5

idx=which((arcs[,"START"] %in% nodes[,"ID_RTE500"])&(arcs[,"END"] %in% nodes[,"ID_RTE500"]))
length(idx)

FALSE [1] 310742

arcs=arcs[idx,]
arcs3=arcs[which(arcs[,"SENS"]==3),]
arcs2=arcs[which(arcs[,"SENS"]==2),c(1,3,2,4,5)]
names(arcs2)[2:3]=names(arcs3)[2:3]
arcs1=arcs[which(arcs[,"SENS"]==1),c(1,3,2,4,5)]
names(arcs1)[2:3]=names(arcs3)[2:3]
arcs=rbind(arcs3,arcs2,arcs1,arcs[which(arcs[,"SENS"]==1),])
dim(arcs)

FALSE [1] 611002      5

Let us check the connexity of the graph

iarcs = graph.edgelist(as.matrix(arcs[,c("START","END")]), directed=TRUE)
iarcs = set_vertex_attr(iarcs,"x", value=nodes[,"POINT_X"])
iarcs = set_vertex_attr(iarcs,"y", value=nodes[,"POINT_Y"])
iarcs = set_edge_attr(iarcs,"distance", value=arcs[,"LONGUEUR"])
cat("The graph is connex :", is.connected(iarcs,mode="strong"),"\n")

FALSE The graph is connex : FALSE

Unfortunately, the graph is not connected. We will keep the largest strongly connected subgraph,

Show code

if (!is.connected(iarcs,mode="strong")) {
  nodestokeep=which(clusters(iarcs,mode="strong")[[1]]==which(clusters(iarcs,mode="strong")[[2]]==max(clusters(iarcs,mode="strong")[[2]])))
  arcs = arcs[(arcs$START %in% nodestokeep)&(arcs$END %in% nodestokeep),]
  nodes = nodes[nodes$ID_RTE500 %in% nodestokeep, ]
      nouvelle_valeur = rep(NA,max(nodestokeep))
      nouvelle_valeur[nodestokeep]=1:length(nodestokeep)
      arcs$START=nouvelle_valeur[arcs$START]
      arcs$END=nouvelle_valeur[arcs$END]
      nodes$ID_RTE500=nouvelle_valeur[nodes$ID_RTE500]
iarcs = graph.edgelist(as.matrix(arcs[,c("START","END")]), directed=TRUE)
iarcs = set_vertex_attr(iarcs,"x", value=nodes[,"POINT_X"])
iarcs = set_vertex_attr(iarcs,"y", value=nodes[,"POINT_Y"])
iarcs = set_edge_attr(iarcs,"distance", value=arcs[,"LONGUEUR"])
cat("The graph is connex :", is.connected(iarcs,mode="strong"),"\n")
}

FALSE The graph is connex : TRUE

numMat=sum(!is.na(nodes[,"birth"]))

We have now 564 maternities.

Now, we need to balance the flow : the number of pregnent mothers should be equal to the number of deliveries in maternies, again for technical reasons. But the first step is to allocate births within a city among all the nodes. We will do that uniformely

T=table(nodes$INSEE_COMM)
nodes_insee = data.frame(INSEE_COMM=as.numeric(as.character(names(T))),
                         NBNODES = T)[,c(1,3)]
nodes=merge(nodes,nodes_insee,all.x=TRUE)
nodes$NAISSANCE = nodes$NAISD14 / nodes$NBNODES.Freq
nodes[is.na(nodes[,"NAISSANCE"]),"NAISSANCE"]=0

For instance, in the Hauts de Seine region, we have

dp92 <- spTransform(dp92, CRS("+init=epsg:2154"))
plot(dp92,border="grey",lwd=2)
idx=floor(as.numeric(as.character(nodes$INSEE_COMM))/1000)==92
class <- classIntervals(nodes[idx,"NAISSANCE"], 7, style="quantile")
colcode <- findColours(class, colours)
points(nodes[idx,"POINT_X"],nodes[idx,"POINT_Y"],col=colcode)

Observe that the number of births per node is not necessarily an integer, but it is not (technically at least) a major issue here.

Let us now balance to insure that the in-coming flow equals the out-coming flow

numBirths <- sum(nodes$NAISSANCE)
numBirthsInMat <- sum(data_birth$NAISD14[!is.na(data_birth$NAISD14)])

Number of births in the demand dataset (number of pregnent mothers) is 7.3893310^{5}, while the number of births according to the maternity dataset should be 806341. The difference is here significant (9.1%), but since the goal here is to illustrate the technique, let us no discuss further this issue here.

nodes$NAISSANCE <- nodes$NAISSANCE*numBirthsInMat/numBirths

This will be our balanced in-coming flows

BalancedIncomingFlows <- nodes$NAISSANCE

In order to allocate mothers to hospital, define the following shortest path function shortestPath(arcs, nodes, originNode, destinationNode), based on gurobi.

Show code for the shortestPath() function

For instance, consider two random nodes, 41 as the starting one, and 50115 as the destination one. Here is the shortest road,

FR = readRDS("FRA_adm0.rds")
FR = spTransform(FR, CRS("+init=epsg:2154"))
P1 = FR@polygons[[1]]@Polygons[[355]]@coords
plot(P1,type="n",xlab="",ylab="",axes=FALSE)
polygon(P1)
nodespath=shortestPath(arcs,nodes,41,50115,viz=TRUE)

FALSE Optimize a model with 209105 rows, 605318 columns and 1210636 nonzeros
FALSE Coefficient statistics:
FALSE   Matrix range     [1e+00, 1e+00]
FALSE   Objective range  [1e-02, 4e+01]
FALSE   Bounds range     [0e+00, 0e+00]
FALSE   RHS range        [1e+00, 1e+00]
FALSE 
FALSE Concurrent LP optimizer: dual simplex and barrier
FALSE Showing barrier log only...
FALSE 
FALSE Presolve removed 46809 rows and 93850 columns
FALSE Presolve time: 2.19s
FALSE Presolved: 162296 rows, 511468 columns, 1022930 nonzeros
FALSE 
FALSE Ordering time: 0.00s
FALSE 
FALSE Barrier statistics:
FALSE  AA' NZ     : 2.519e+05
FALSE  Factor NZ  : 2.634e+06 (roughly 300 MBytes of memory)
FALSE  Factor Ops : 9.147e+07 (less than 1 second per iteration)
FALSE  Threads    : 1
FALSE 
FALSE                   Objective                Residual
FALSE Iter       Primal          Dual         Primal    Dual     Compl     Time
FALSE    0  -6.07058023e+05  1.27199149e+00  4.37e+00 2.60e+00  1.51e+01     3s
FALSE    1  -4.55634553e+05 -2.60680691e+01  5.72e-01 4.99e+00  4.08e+00     3s
FALSE    2  -2.14918787e+05 -4.83183251e+01  2.04e-02 2.84e-01  5.55e-01     3s
FALSE    3  -3.53126137e+04 -1.04241845e+02  2.02e-03 5.19e-02  7.74e-02     4s
FALSE    4  -1.42523713e+04 -2.11013242e+02  7.07e-04 2.36e-02  2.94e-02     4s
FALSE    5  -5.25404704e+03 -3.04486913e+02  2.10e-04 1.42e-02  1.02e-02     4s
FALSE    6  -2.94902504e+03 -4.90216105e+02  9.32e-05 1.09e-02  4.98e-03     4s
FALSE    7  -2.00978381e+03 -6.19302115e+02  5.03e-05 5.12e-03  2.78e-03     5s
FALSE    8  -1.50040656e+03 -6.49211353e+02  2.84e-05 3.34e-03  1.70e-03     5s
FALSE    9  -1.31510556e+03 -6.66366203e+02  2.09e-05 2.23e-03  1.29e-03     5s
FALSE   10  -1.19957750e+03 -6.87874872e+02  1.64e-05 1.98e-03  1.02e-03     5s
FALSE   11  -1.09070090e+03 -6.92678533e+02  1.24e-05 3.36e-03  7.89e-04     5s
FALSE   12  -9.27352798e+02 -7.06689238e+02  6.60e-06 1.51e-03  4.36e-04     6s
FALSE   13  -7.48638047e+02 -7.09681943e+02  9.05e-07 8.38e-04  7.69e-05     6s
FALSE   14  -7.14444353e+02 -7.13293883e+02  8.96e-09 8.45e-06  2.25e-06     6s
FALSE   15  -7.13341458e+02 -7.13338658e+02  1.18e-11 1.10e-13  5.48e-09     6s
FALSE   16  -7.13339997e+02 -7.13339997e+02  2.78e-11 1.10e-13  1.25e-14     7s
FALSE 
FALSE Barrier solved model in 16 iterations and 6.54 seconds
FALSE Optimal objective -7.13339997e+02
FALSE 
FALSE Crossover log...
FALSE 
FALSE   161975 DPushes remaining with DInf 0.0000000e+00                 7s
FALSE        0 DPushes remaining with DInf 6.2022124e-12                 9s
FALSE 
FALSE        1 PPushes remaining with PInf 0.0000000e+00                 9s
FALSE        0 PPushes remaining with PInf 0.0000000e+00                 9s
FALSE 
FALSE   Push phase complete: Pinf 0.0000000e+00, Dinf 6.2022124e-12      9s
FALSE 
FALSE Iteration    Objective       Primal Inf.    Dual Inf.      Time
FALSE   161979   -7.1334000e+02   0.000000e+00   0.000000e+00      9s
FALSE 
FALSE Solved with barrier
FALSE Solved in 161979 iterations and 9.24 seconds
FALSE Optimal objective -7.133399975e+02

Here, maternities are distributed as follows

plot(P1,type="n",xlab="",ylab="",axes=FALSE, asp=1)
polygon(P1)
points(nodes[!is.na(nodes$birth),"POINT_X"],nodes[!is.na(nodes$birth),"POINT_Y"],pch=19,col=rgb(0,0,1,.4))

For the min-cost flow analysis, use the following code

nbNodes = max(arcs[,"START"])
nbArcs = dim(arcs)[1]
n = nodes$NAISSANCE
n[!is.na(nodes$birth)] = n[!is.na(nodes$birth)]-sum(nodes$NAISSANCE)/sum(!is.na(nodes$birth))
sum(n)

FALSE [1] 2.388758e-10

The in-coming and out-coming flows are here equal so we can use gurobi()

Nabla =  sparseMatrix(i=1:nbArcs,j=arcs[,"START"],x=-1) + 
           sparseMatrix(i=1:nbArcs,j=arcs[,"END"],x=1)
Phi = -arcs[,"LONGUEUR"]
  
result = gurobi ( list(A=t(Nabla),obj=Phi,modelsense='max',rhs=n,sense='=',start=matrix(0,nbArcs,1)), params=NULL)

FALSE Optimize a model with 209105 rows, 605318 columns and 1210636 nonzeros
FALSE Coefficient statistics:
FALSE   Matrix range     [1e+00, 1e+00]
FALSE   Objective range  [1e-02, 4e+01]
FALSE   Bounds range     [0e+00, 0e+00]
FALSE   RHS range        [5e-02, 1e+03]
FALSE 
FALSE Concurrent LP optimizer: dual simplex and barrier
FALSE Showing barrier log only...
FALSE 
FALSE Presolve removed 12131 rows and 23783 columns
FALSE Presolve time: 1.78s
FALSE Presolved: 196974 rows, 581535 columns, 1163064 nonzeros
FALSE 
FALSE Ordering time: 0.00s
FALSE 
FALSE Barrier statistics:
FALSE  AA' NZ     : 2.890e+05
FALSE  Factor NZ  : 3.003e+06 (roughly 340 MBytes of memory)
FALSE  Factor Ops : 9.776e+07 (less than 1 second per iteration)
FALSE  Threads    : 1
FALSE 
FALSE                   Objective                Residual
FALSE Iter       Primal          Dual         Primal    Dual     Compl     Time
FALSE    0  -1.47600095e+09 -1.18765362e+05  8.80e+03 1.77e+00  2.60e+04     3s
FALSE    1  -1.06983833e+09 -3.10921683e+06  1.52e+03 3.34e+00  6.88e+03     3s
FALSE    2  -4.72815511e+08 -5.82705536e+06  1.15e+02 1.49e-01  9.65e+02     4s
FALSE    3  -1.04226535e+08 -1.20454452e+07  1.31e+01 1.28e-02  1.66e+02     4s
FALSE    4  -6.10772196e+07 -1.69639690e+07  5.62e+00 5.50e-03  7.78e+01     4s
FALSE    5  -4.50557406e+07 -2.08870328e+07  2.98e+00 5.16e-03  4.22e+01     4s
FALSE    6  -3.85653932e+07 -2.22695869e+07  1.91e+00 2.92e-03  2.84e+01     5s
FALSE    7  -3.73820196e+07 -2.31777180e+07  1.72e+00 2.58e-03  2.47e+01     5s
FALSE    8  -3.25649020e+07 -2.38170487e+07  9.47e-01 1.92e-03  1.52e+01     5s
FALSE    9  -2.93672014e+07 -2.49830366e+07  4.48e-01 9.39e-04  7.60e+00     5s
FALSE   10  -2.77523198e+07 -2.54155134e+07  1.99e-01 6.13e-04  4.05e+00     5s
FALSE   11  -2.72155369e+07 -2.58793383e+07  1.18e-01 2.86e-04  2.31e+00     6s
FALSE   12  -2.70252887e+07 -2.60649004e+07  8.97e-02 1.65e-04  1.66e+00     6s
FALSE   13  -2.66488543e+07 -2.61851594e+07  3.55e-02 8.88e-05  8.03e-01     6s
FALSE   14  -2.65325627e+07 -2.62911117e+07  1.74e-02 4.14e-05  4.18e-01     6s
FALSE   15  -2.64802350e+07 -2.63627024e+07  9.40e-03 1.13e-05  2.03e-01     7s
FALSE   16  -2.64480568e+07 -2.63829883e+07  5.07e-03 4.82e-06  1.12e-01     7s
FALSE   17  -2.64299368e+07 -2.63951109e+07  2.65e-03 1.73e-06  6.02e-02     7s
FALSE   18  -2.64170805e+07 -2.64031819e+07  9.52e-04 6.86e-10  2.40e-02     7s
FALSE   19  -2.64112650e+07 -2.64074925e+07  2.23e-04 1.72e-10  6.52e-03     8s
FALSE   20  -2.64097741e+07 -2.64086210e+07  5.88e-05 5.27e-11  1.99e-03     8s
FALSE   21  -2.64093544e+07 -2.64089704e+07  1.62e-05 1.83e-11  6.64e-04     8s
FALSE   22  -2.64092458e+07 -2.64091285e+07  5.75e-06 3.69e-12  2.02e-04     8s
FALSE   23  -2.64091952e+07 -2.64091670e+07  1.15e-06 8.53e-13  4.88e-05     9s
FALSE   24  -2.64091831e+07 -2.64091807e+07  1.10e-06 2.87e-10  4.21e-06     9s
FALSE   25  -2.64091817e+07 -2.64091814e+07  9.73e-07 5.49e-10  4.07e-07     9s
FALSE   26  -2.64091816e+07 -2.64091816e+07  3.00e-07 1.20e-10  1.02e-07     9s
FALSE   27  -2.64091816e+07 -2.64091816e+07  1.25e-07 3.02e-11  3.26e-08     9s
FALSE   28  -2.64091816e+07 -2.64091816e+07  4.09e-08 7.53e-12  9.87e-09    10s
FALSE 
FALSE Barrier solved model in 28 iterations and 9.68 seconds
FALSE Optimal objective -2.64091816e+07
FALSE 
FALSE Crossover log...
FALSE 
FALSE     2406 DPushes remaining with DInf 0.0000000e+00                10s
FALSE        0 DPushes remaining with DInf 9.3240346e-12                11s
FALSE 
FALSE     1477 PPushes remaining with PInf 0.0000000e+00                11s
FALSE        0 PPushes remaining with PInf 0.0000000e+00                11s
FALSE 
FALSE   Push phase complete: Pinf 0.0000000e+00, Dinf 9.2009386e-12     11s
FALSE 
FALSE Iteration    Objective       Primal Inf.    Dual Inf.      Time
FALSE     3886   -2.6409182e+07   0.000000e+00   0.000000e+00     12s
FALSE 
FALSE Solved with barrier
FALSE Solved in 3886 iterations and 11.85 seconds
FALSE Optimal objective -2.640918159e+07

#result
pi = result$x
distance = -result$objval
prices = result$pi

plot(P1,type="n",xlab="",ylab="",axes=FALSE,xlim=c(100000,1100000),asp=1)
polygon(P1)
library(pals)
library(scales)
COL = alpha(rev(brewer.rdbu(100)),.5)
price_rank = rank(prices)*100/length(prices)
points(nodes[,"POINT_X"],nodes[,"POINT_Y"],pch=19,cex=.2,col=COL[1+floor(price_rank)])

We can actually zoom in, e.g. on South-East part of France

FR2=readRDS("FRA_adm2.rds")
FR2 = spTransform(FR2, CRS("+init=epsg:2154"))
plot(FR2,xlim=c(750000,1100000),ylim=c(6200000,6380000),col="grey",border="white",lwd=2)
points(nodes[,"POINT_X"],nodes[,"POINT_Y"],pch=19,cex=.35,col=COL[1+floor(price_rank)])

or on Paris

FR2=readRDS("FRA_adm2.rds")
FR2 = spTransform(FR2, CRS("+init=epsg:2154"))
plot(FR2,xlim=c(630000,660000),ylim=c(6840000,6880000),col="grey",border="white",lwd=2)
points(nodes[,"POINT_X"],nodes[,"POINT_Y"],pch=19,cex=.7,col=COL[1+floor(price_rank)])

4.3 Overall Perspective

One can also obtain easily chloropleth maps, but it is also possible to smooth, to visualize prices

dpt_nodes = data.frame(dpt=floor(nodes$INSEE_COMM/1000),price=prices)
dpt_data  = aggregate(dpt_nodes$price,by=list(dpt_nodes$dpt),FUN=mean)
names(dpt_data) = c("department","price")
FR2@data = data.frame(FR2@data, 
                      dpt_data[match(as.numeric(as.character(FR2@data[,"CCA_2"])), 
                      dpt_data[,"department"]),])

Let us define a grid on France

Show code

where we keep only points within the French territory.

Let us use k-nearest neighbors, on the k closest nodes (here k=20)

library(geosphere)
simbase2=as.data.frame(simbase)
names(df1)=c("long1","lat1")
df1$indice=indic
knn=function(i,k=20){
d=distHaversine(grille2[i,1:2],(simbase2[,c("long2","lat2")]), r=6378.137)
  r=rank(d)
  ind=which(r<=k)
  mean(simbase2[ind,"Y"])
}
df1$y=Vectorize(knn)(1:nrow(df1))
df1$z=df1$y
df1$z[indic==0]=NA
mz=matrix(df1$z, taille+1, taille+1)
cols = rev(brewer.rdbu(40))

image(seq(min(simbase$long1),max(simbase$long1),length=taille+1),
      seq(min(simbase$lat1),max(simbase$lat1),length=taille+1),mz,axes=FALSE,col=cols,xlab="",ylab="", asp=1)
polygon(P1,lwd=3)

Again, we can zoom in, for instance on the south-east of France, with maternities in black dots

image(seq(min(simbase$long1),max(simbase$long1),length=taille+1),
      seq(min(simbase$lat1),max(simbase$lat1),length=taille+1),mz,axes=FALSE,col=cols,
      xlim=c(750000,1100000),ylim=c(6200000,6380000),xlab="",ylab="", asp=1)
plot(FR2,border="white",lwd=2,add=TRUE)
points(nodes[!is.na(nodes$birth),"POINT_X"],nodes[!is.na(nodes$birth),"POINT_Y"],pch=1)

or on Paris

image(seq(min(simbase$long1),max(simbase$long1),length=taille+1),
      seq(min(simbase$lat1),max(simbase$lat1),length=taille+1),mz,axes=FALSE,col=cols,
      xlim=c(630000,660000),ylim=c(6840000,6880000),xlab="",ylab="", asp=1)
plot(FR2,border="white",lwd=2,add=TRUE)
points(nodes[!is.na(nodes$birth),"POINT_X"],nodes[!is.na(nodes$birth),"POINT_Y"],pch=1)

5 Generic R Function(s)

5.1 Dataset Preparation

We need four datasets here as inputs

the nodes dataset nodes, with locations (latitude, longitude)
the edge dataset arcs, with (directed) connexions between nodes
the supply dataset supply, with supply locations (latitude, longitude)
the demand dataset demand, with demand locations (latitude, longitude)

The tidy(arcs,nodes,supply,demand) function that reshape datasets * to make sure that supply and demand location are located on one node * to make sure that the graph is connected * to balance supply and demand (or skew symmetry and conservation) : the total flow entering must equal the total flow leaving

It will return four clean datasets arcs, nodes, supply and demand.

Show code for the tidy() function

Show code for the pricing() function

5.2 Application to Paris Region

borders = matrix(c(630000,680000,6840000,6880000),2,2)
plot(P1,type="n",xlab="",ylab="",axes=FALSE, asp=1)
polygon(P1,col="grey",border="white")
abline(v=borders[,1],col="red")
abline(h=borders[,2],col="red")
rect(borders[1,1],borders[1,2],borders[2,1],borders[2,2],
     col="red",density=20,border=NA)

plot(FR2,xlim=c(630000,680000),ylim=c(6840000,6880000),col="grey",border="white",lwd=2)
abline(v=borders[,1],col="red")
abline(h=borders[,2],col="red")
rect(borders[1,1],borders[1,2],borders[2,1],borders[2,2],
     col="red",density=20,border=NA)

Consider the sub-graph of nodes and edges, located in that rectangle,

index_sub_nodes = (nodes$POINT_X >= borders[1,1]) & (nodes$POINT_X <= borders[2,1]) &
                  (nodes$POINT_Y >= borders[1,2]) & (nodes$POINT_Y <= borders[2,2])
list_INSEE = unique(nodes[index_sub_nodes,"INSEE_COMM"])

There are 333 ‘cities’ remaining.

sub_nodes = nodes[nodes$INSEE_COMM %in% list_INSEE,]
list_ID = unique(sub_nodes$ID_RTE500)
sub_arcs = arcs[(arcs$START %in% list_ID)&(arcs$END %in% list_ID),]

Just to check, just look at

Show code

sub_supply= maternitesFrance[maternitesFrance$NEAR_FID %in% list_ID,]
sub_demand = birthsFrance[birthsFrance$CODGEO %in% list_INSEE,]

Show code

tidy_data = tidy(arcs = sub_arcs, nodes = sub_nodes,
                 supply = sub_supply, demand = sub_demand)

FALSE The graph is connex : TRUE 
FALSE Number of births in the dataset 110502.7 
FALSE Number of births in the original dataset 116341

sub_arcs = tidy_data$arcs
sub_nodes = tidy_data$nodes
sub_supply = tidy_data$supply
sub_demand = tidy_data$demand
prices = pricing(arcs = sub_arcs, nodes = sub_nodes,
                 supply = sub_supply, demand = sub_demand)

FALSE Optimize a model with 5870 rows, 16763 columns and 33526 nonzeros
FALSE Coefficient statistics:
FALSE   Matrix range     [1e+00, 1e+00]
FALSE   Objective range  [2e-02, 1e+01]
FALSE   Bounds range     [0e+00, 0e+00]
FALSE   RHS range        [5e-01, 2e+03]
FALSE 
FALSE Concurrent LP optimizer: dual simplex and barrier
FALSE Showing barrier log only...
FALSE 
FALSE Presolve removed 682 rows and 1227 columns
FALSE Presolve time: 0.02s
FALSE Presolved: 5188 rows, 15536 columns, 31066 nonzeros
FALSE 
FALSE Ordering time: 0.00s
FALSE 
FALSE Barrier statistics:
FALSE  AA' NZ     : 7.980e+03
FALSE  Factor NZ  : 6.870e+04 (roughly 9 MBytes of memory)
FALSE  Factor Ops : 1.128e+06 (less than 1 second per iteration)
FALSE  Threads    : 1
FALSE 
FALSE                   Objective                Residual
FALSE Iter       Primal          Dual         Primal    Dual     Compl     Time
FALSE    0  -1.89663320e+07 -1.81853711e+04  1.30e+04 5.44e-01  8.25e+03     0s
FALSE    1  -1.16300912e+07 -1.35587838e+05  1.38e+03 5.55e-01  1.85e+03     0s
FALSE    2  -4.48074363e+06 -2.71303299e+05  1.51e+02 2.67e-02  3.01e+02     0s
FALSE    3  -1.19086829e+06 -4.47468278e+05  1.39e+01 3.18e-03  4.91e+01     0s
FALSE    4  -8.43879661e+05 -5.67403872e+05  4.93e+00 8.15e-04  1.80e+01     0s
FALSE    5  -7.48400630e+05 -6.05003501e+05  2.47e+00 2.40e-04  9.32e+00     0s
FALSE    6  -6.91290527e+05 -6.27283866e+05  9.79e-01 2.37e-04  4.16e+00     0s
FALSE    7  -6.71611362e+05 -6.42826488e+05  4.55e-01 2.22e-15  1.87e+00     0s
FALSE    8  -6.60203379e+05 -6.49436873e+05  1.49e-01 4.44e-15  6.97e-01     0s
FALSE    9  -6.55943204e+05 -6.52343538e+05  4.02e-02 3.55e-15  2.33e-01     0s
FALSE   10  -6.54753489e+05 -6.53489592e+05  1.24e-02 2.66e-15  8.19e-02     0s
FALSE   11  -6.54356385e+05 -6.53900187e+05  4.42e-03 2.66e-15  2.95e-02     0s
FALSE   12  -6.54203504e+05 -6.54048087e+05  1.57e-03 5.33e-15  1.01e-02     0s
FALSE   13  -6.54125050e+05 -6.54113314e+05  7.72e-05 2.66e-15  7.58e-04     0s
FALSE   14  -6.54118609e+05 -6.54118322e+05  2.67e-06 6.06e-15  1.85e-05     0s
FALSE   15  -6.54118348e+05 -6.54118344e+05  5.06e-07 3.71e-14  2.23e-07     0s
FALSE   16  -6.54118346e+05 -6.54118345e+05  5.47e-08 8.09e-12  6.01e-08     0s
FALSE   17  -6.54118346e+05 -6.54118346e+05  3.98e-08 8.83e-12  6.60e-09     0s
FALSE 
FALSE Barrier solved model in 17 iterations and 0.11 seconds
FALSE Optimal objective -6.54118346e+05
FALSE 
FALSE Crossover log...
FALSE 
FALSE        0 DPushes remaining with DInf 9.7102881e-14                 0s
FALSE 
FALSE       43 PPushes remaining with PInf 0.0000000e+00                 0s
FALSE        0 PPushes remaining with PInf 0.0000000e+00                 0s
FALSE 
FALSE   Push phase complete: Pinf 0.0000000e+00, Dinf 9.3897112e-14      0s
FALSE 
FALSE Iteration    Objective       Primal Inf.    Dual Inf.      Time
FALSE      266   -6.5411835e+05   0.000000e+00   0.000000e+00      0s
FALSE 
FALSE Solved with barrier
FALSE Solved in 266 iterations and 0.12 seconds
FALSE Optimal objective -6.541183460e+05

COL = alpha(rev(brewer.rdbu(100)),.5)
price_rank = rank(prices$prices)/length(prices$prices)*100

FR2=readRDS("FRA_adm2.rds")
FR2 = spTransform(FR2, CRS("+init=epsg:2154"))

plot(FR2,xlim=c(630000,660000),ylim=c(6840000,6880000),col="grey",border="white",lwd=2)
points(sub_nodes[,"POINT_X"],sub_nodes[,"POINT_Y"],pch=19,cex=.7,
       col=COL[1+floor(price_rank)])

If we focus only on maternity (there are 24 maternities in this region), we obtain the following map,

list_mater=which((!is.na(sub_nodes$birth)))
plot(FR2,xlim=borders[,1],ylim=borders[,2],col="grey",border="white",lwd=2)
COL = alpha(rev(brewer.rdbu(length(list_mater))),.5)
points(sub_nodes[list_mater,"POINT_X"],sub_nodes[list_mater,"POINT_Y"],pch=19,cex=1.25,col=COL[rank(prices$prices[list_mater])])

5.3 Application to South-East France

borders = matrix(c(650000,1100000,6200000,6680000),2,2)
plot(P1,type="n",xlab="",ylab="",axes=FALSE, asp=1)
polygon(P1,col="grey",border="white")
abline(v=borders[,1],col="red")
abline(h=borders[,2],col="red")
rect(borders[1,1],borders[1,2],borders[2,1],borders[2,2],
     col="red",density=20,border=NA)

plot(FR2,xlim=borders[,1],ylim=borders[,2],col="grey",border="white",lwd=2)
abline(v=borders[,1],col="red")
abline(h=borders[,2],col="red")
rect(borders[1,1],borders[1,2],borders[2,1],borders[2,2],
     col="red",density=20,border=NA)

Consider the sub-graph of nodes and edges, located in that rectangle,

index_sub_nodes = (nodes$POINT_X >= borders[1,1]) & (nodes$POINT_X <= borders[2,1]) &
                  (nodes$POINT_Y >= borders[1,2]) & (nodes$POINT_Y <= borders[2,2])
list_INSEE = unique(nodes[index_sub_nodes,"INSEE_COMM"])

There are 8216 ‘cities’ remaining.

sub_nodes = nodes[nodes$INSEE_COMM %in% list_INSEE,]
list_ID = unique(sub_nodes$ID_RTE500)
sub_arcs = arcs[(arcs$START %in% list_ID)&(arcs$END %in% list_ID),]

Just to check, just look at

Show code

sub_supply= maternitesFrance[maternitesFrance$NEAR_FID %in% list_ID,]
sub_demand = birthsFrance[birthsFrance$CODGEO %in% list_INSEE,]

Show code

tidy_data = tidy(arcs = sub_arcs, nodes = sub_nodes,
                 supply = sub_supply, demand = sub_demand)

FALSE The graph is connex : TRUE 
FALSE Number of births in the dataset 194645.2 
FALSE Number of births in the original dataset 166786

sub_arcs = tidy_data$arcs
sub_nodes = tidy_data$nodes
sub_supply = tidy_data$supply
sub_demand = tidy_data$demand
prices = pricing(arcs = sub_arcs, nodes = sub_nodes,
                 supply = sub_supply, demand = sub_demand)

FALSE Optimize a model with 54659 rows, 153672 columns and 307344 nonzeros
FALSE Coefficient statistics:
FALSE   Matrix range     [1e+00, 1e+00]
FALSE   Objective range  [2e-02, 3e+01]
FALSE   Bounds range     [0e+00, 0e+00]
FALSE   RHS range        [4e-02, 1e+03]
FALSE 
FALSE Concurrent LP optimizer: dual simplex and barrier
FALSE Showing barrier log only...
FALSE 
FALSE Presolve removed 4112 rows and 8086 columns
FALSE Presolve time: 0.33s
FALSE Presolved: 50547 rows, 145586 columns, 291170 nonzeros
FALSE 
FALSE Ordering time: 0.00s
FALSE 
FALSE Barrier statistics:
FALSE  AA' NZ     : 7.206e+04
FALSE  Factor NZ  : 6.719e+05 (roughly 80 MBytes of memory)
FALSE  Factor Ops : 1.334e+07 (less than 1 second per iteration)
FALSE  Threads    : 1
FALSE 
FALSE                   Objective                Residual
FALSE Iter       Primal          Dual         Primal    Dual     Compl     Time
FALSE    0  -5.61828325e+08 -8.30472890e+04  1.31e+04 4.20e-01  2.19e+04     1s
FALSE    1  -3.08601732e+08 -5.75633028e+05  2.14e+03 1.55e+00  5.24e+03     1s
FALSE    2  -1.00590964e+08 -1.27715354e+06  2.33e+02 7.65e-02  7.75e+02     1s
FALSE    3  -1.84404219e+07 -3.22411685e+06  2.02e+01 1.07e-02  1.08e+02     1s
FALSE    4  -1.07419964e+07 -4.70198719e+06  7.04e+00 1.24e-03  4.22e+01     1s
FALSE    5  -8.73530199e+06 -5.21040093e+06  3.92e+00 2.66e-14  2.45e+01     1s
FALSE    6  -7.59473454e+06 -5.53732445e+06  2.19e+00 9.97e-04  1.43e+01     1s
FALSE    7  -6.69170040e+06 -5.77811554e+06  8.32e-01 6.45e-04  6.32e+00     1s
FALSE    8  -6.31232895e+06 -5.93059995e+06  2.62e-01 3.38e-14  2.64e+00     1s
FALSE    9  -6.21548519e+06 -6.03195473e+06  1.27e-01 2.84e-14  1.27e+00     1s
FALSE   10  -6.16542662e+06 -6.06108786e+06  6.28e-02 2.84e-14  7.20e-01     1s
FALSE   11  -6.13968723e+06 -6.08461867e+06  3.35e-02 3.02e-14  3.80e-01     1s
FALSE   12  -6.12198476e+06 -6.09740897e+06  1.31e-02 2.49e-14  1.70e-01     1s
FALSE   13  -6.11590533e+06 -6.10382655e+06  6.08e-03 2.66e-14  8.34e-02     1s
FALSE   14  -6.11172127e+06 -6.10723039e+06  1.66e-03 2.84e-14  3.10e-02     1s
FALSE   15  -6.11032547e+06 -6.10926869e+06  2.72e-04 3.20e-14  7.30e-03     1s
FALSE   16  -6.11005855e+06 -6.10977519e+06  5.79e-05 2.84e-14  1.96e-03     1s
FALSE   17  -6.11000406e+06 -6.10991873e+06  1.98e-05 2.66e-14  5.89e-04     1s
FALSE   18  -6.10998100e+06 -6.10996058e+06  4.65e-06 2.66e-14  1.41e-04     1s
FALSE   19  -6.10997312e+06 -6.10997176e+06  2.67e-07 2.49e-14  9.39e-06     2s
FALSE   20  -6.10997217e+06 -6.10997209e+06  5.21e-07 1.52e-11  5.63e-07     2s
FALSE   21  -6.10997216e+06 -6.10997215e+06  2.01e-07 3.20e-14  7.68e-08     2s
FALSE   22  -6.10997216e+06 -6.10997216e+06  2.85e-08 4.25e-12  8.76e-09     2s
FALSE 
FALSE Barrier solved model in 22 iterations and 1.65 seconds
FALSE Optimal objective -6.10997216e+06
FALSE 
FALSE Crossover log...
FALSE 
FALSE        0 DPushes remaining with DInf 2.9092145e-12                 2s
FALSE 
FALSE      192 PPushes remaining with PInf 0.0000000e+00                 2s
FALSE        0 PPushes remaining with PInf 0.0000000e+00                 2s
FALSE 
FALSE   Push phase complete: Pinf 0.0000000e+00, Dinf 2.8958363e-12      2s
FALSE 
FALSE Iteration    Objective       Primal Inf.    Dual Inf.      Time
FALSE      832   -6.1099722e+06   0.000000e+00   0.000000e+00      2s
FALSE 
FALSE Solved with barrier
FALSE Solved in 832 iterations and 1.94 seconds
FALSE Optimal objective -6.109972161e+06

COL = alpha(rev(brewer.rdbu(100)),.5)
price_rank = rank(prices$prices)/length(prices$prices)*100

FR2=readRDS("FRA_adm2.rds")
FR2 = spTransform(FR2, CRS("+init=epsg:2154"))

plot(FR2,xlim=borders[,1],ylim=borders[,2],col="grey",border="white",lwd=2)
points(sub_nodes[,"POINT_X"],sub_nodes[,"POINT_Y"],pch=19,cex=.3,
       col=COL[1+floor(price_rank)])

If we focus only on maternity (there are 163 maternities in this region - or 156 depending on the dataset we use…), we obtain the following map,

list_mater=which((!is.na(sub_nodes$birth)))
plot(FR2,xlim=borders[,1],ylim=borders[,2],col="grey",border="white",lwd=2)
COL = alpha(rev(brewer.rdbu(length(list_mater))),.5)
points(sub_nodes[list_mater,"POINT_X"],sub_nodes[list_mater,"POINT_Y"],pch=19,cex=1.25,col=COL[rank(prices$prices[list_mater])])

5.4 Border Issue

5.4.1 In/Out Difference

We did mention it previously, but it is necessary here that the sum of deliveries in (pregnant women, birth registered in France) equals the sum of deliveries out (birth in hospitals), \[ \sum_{\text{node } i:\text{ pregnant woman}}\overline{d}_i = \sum_{\text{node } j:\text{ maternity}}d_j \] Unfortunately, because we use two different sources, both not not equal. So it is necessay to use a correction \[ \tilde{d}_k=\frac{\displaystyle{\sum_{\text{node } j:\text{ maternity}}d_j}}{\displaystyle{\sum_{\text{node } i:\text{ pregnant woman}}\overline{d}_i}}\cdot\overline{d}_k \] Unfortunately, this approximation is uniform over all France. To overcome that issue, local pricing was considered.

5.4.2 Local Approaches

Using the generic function mentioned previously, it is possible to compute prices on any territory. The strategy was to dived France in small territories, and to compute prices on each of them (the area below). In order to avoid border problems when we reconcile, we did actually use slighly larger regions.

pic_ani = function(x,y){
P0=P1[seq(1,nrow(P1),by=10),]
plot(P1,type="n",xlab="",ylab="",axes=FALSE,asp=1)
polygon(P0,col=rgb(1,0,0,.2))
VX=seq(min(P1[,1]),max(P1[,1]),length=13)
abline(v=VX,col="white")
VY=seq(min(P1[,2]),max(P1[,2]),length=13)
abline(h=VY,col="white")
abline(v=VX[c(1,5,9,13)])
abline(h=VY[c(1,5,9,13)])
i=x
j=y
rect(VX[i],VY[j],VX[i+4],VY[j+4],border=NA,col=rgb(0,0,1,.3))
rect(VX[max(i-1,1)],VY[max(j-1,1)],
     VX[min(i+5,13)],VY[min(j+5,13)],border=NA,col=rgb(0,0,1,.3))
}

Optimal transport on large networks: a practitioner guide

Arthur Charpentier

Alfred Galichon

Lucas Vernet

04 June 2019

1 The Theoretical Model

2 A Toy Example

2.1 Networks and Shapefiles

2.2 Adding Agents

2.3 Minimal Distance (Euclidean)

2.4 Minimal Distance on a Network

2.4.1 Agents Location and Knots

2.4.2 Distance on a Network

2.5 Using igraph

2.6 Using gurobi

3 Subway in Paris

3.1 Using igraph

3.2 Using gurobi

3.3 Voronoi sets

3.4 Min-Cost Flow

4 Natality

4.1 Preliminary work on the datasets

4.1.1 Road network

4.1.2 Birth dataset

4.1.3 Births and Network

4.1.4 Maternity dataset

4.2 Matching Homes and Maternities

4.3 Overall Perspective

5 Generic R Function(s)

5.1 Dataset Preparation

5.2 Application to Paris Region

5.3 Application to South-East France

5.4 Border Issue

5.4.1 In/Out Difference

5.4.2 Local Approaches

2.5 Using `igraph`

2.6 Using `gurobi`

3.1 Using `igraph`

3.2 Using `gurobi`