Abstract
This page is a (online) guide to illustrate the article (available from Arvix), that presents a set of tools for the modeling of a spatial allocation problem in a large geographic market and gives examples of applications. In our settings, the market is described by a network that maps the cost of travel between each pair of adjacent locations. Two types of agents are located at the nodes of this network. The buyers choose the most competitive sellers depending on their prices and the cost to reach them. Their utility is assumed additive in both these quantities. Each seller, taking as given other sellers prices, sets her own price to have a demand equal to the one we observed. We give a linear programming formulation for the equilibrium conditions. After formally introducing our model we apply it on two examples: prices offered by petrol stations and quality of services provided by maternity wards (only the later is described here for privacy issues). These examples illustrate the applicability of our model to aggregate demand, rank prices and estimate cost structure over the network. We insist on the possibility of applications to large scale data sets using modern linear programming solvers such as Gurobi.Contact : Arthur Charpentier charpentier.arthur@uqam.ca
We will use some some network packages (mainly to illustrate)
as well as some packages for spatial analysis, and visualization
library(geosphere)
library(rgdal)
library(leaflet)
library(scales)
library(lwgeom)
library(shapefiles)
library(sp)
library(rgeos)
library(osmdata)
library(sf)
library(ggmap)
library(maptools)
For technical reasons, we need additional packages
library(Matrix)
library(tripack)
library(RColorBrewer)
library(units)
library(pals)
library(classInt)
In the applications, we will use gurobi. It is a commercial optimization solver for linear programming, quadratic programming (among many other), that is available for free in academia. Installation can be a little bit complicated (see the vignette online),
We will consider three applications in this guide
The geometry of the geographic market is described by a connected graph \(G=\left(V,E\right)\) where \(V\) is a set of vertices and \(E\) is a set of directed edges. Vertices are the different locations; edges are road segments linking two adjacent locations.
Agents are located at the vertices of our graph (if not, as we will see, we will match them with the closest vertice).We make the following assumptions on their consumption preferences:
A path \(\pi_{vw}\) between two nodes \(v,w\in V\) is a sequence of connected edges starting in \(v\) and ending in \(w\). We note \(\Pi_{vw}\) the set of paths between \(v\) and \(w\).
For each edge \(e\in E\), we define \(c_{e}\) the cost of travel along this arc. A seller located in \(w\in V\) sells at an endogenous price \(p_{w}\). Therefore a consumer located in \(v\) chooses to buy from the seller located in \(w\) that minimizes his total cost \(p_{v}\) \[ p_{v}=\underset{w\in V\text{, }\pi_{vw}\in\Pi_{vw}}{\min}\left[p_{w}+\underset{e\in\pi_{vw}}{\sum}c_{e}\right] \]
Let’s denote\(u_{v}\) the reservation utility of agents located in \(v\) and \(p_{v}\) the price for these agents (including the cost of transportation). Therefore, if \(p_{v}>u_{v}\) a consumer in \(v\) his outside option, else he buys from the most competitive seller depending on his location in the market.
This outside option can be modeled by a fictitious node \(0\), where the price of the commodity is set to \(p_{0}=0\) and the cost of travel is \(c_{v0}=u_{v}\).
Hence the excess demand \(z_{v}\) at a node \(v\in V\) does not depend on the price. We are solving a matching problem over a network between sellers and consumers.
At equilibrium each consumer chooses the most competitive seller depending on his location; furthermore the demand received by each seller must equal his supply. If we define \(\mu_{e}\) the number of agents traveling along edge \(e\), these equilibrium conditions can be written: \[ \left\{ \begin{array}{l} \forall vw\in E\text{, }p_{w}-p_{v}\leq c_{vw}\\ \forall vw\in E\text{ such that }\mu_{vw}>0\text{, }p_{w}-p_{v}=c_{vw}\\ \forall v\in V\text{, }\underset{w\in V}{\sum}\mu_{vw}-\underset{u\in V}{\sum}\mu_{uv}=z_{v} \end{array}\right. \]
These equations are the complementary slackness conditions for a minimum cost flow problem over network \(G\). Hence, if the graph is connected and if there isn’t any loop of negative cost in \(G\), there exists an equilibrium which is the solution of the min-cost flow problem \[ \begin{array}{cl} \mu=\text{argmin}\underset{e\in E}{\sum}c_{e}\mu_{e}\\ & \mu\text{ s.t. }\forall v\in V\text{, }\underset{w\in V}{\sum}\mu_{vw}-\underset{u\in V}{\sum}\mu_{uv}=z_{v} \end{array} \]
Models used in trade theory usually use iceberg transport costs. These costs are linear in distance and paid in the transported good once arrived - hence the metaphor of the iceberg that melts as you transport it.
Our model can incorporate these costs applying a log-transformation to the equilibrium conditions. Let’s define for all \(vw\in E\) iceberg costs \(\tau_{vw}\). An equilibrium \(\left(\mu,p\right)\) should satisfy \[ \left\{ \begin{array}{l} \forall vw\in E\text{, }\left(1-\tau_{vw}\right)p_{w}\leq p_{v}\\ \forall vw\in E\text{ such that }\mu_{vw}>0\text{, }\left(1-\tau_{vw}\right)p_{w}=p_{v}\\ \forall v\in V\text{, }\underset{w\in V}{\sum}\mu_{vw}-\underset{u\in V}{\sum}\mu_{uv}=z_{v} \end{array}\right. \] which becomes \[ \left\{ \begin{array}{l} \forall vw\in E\text{, }\log p_{w}-\log p_{v}\leq-\log\left(1-\tau_{vw}\right)\\ \forall vw\in E\text{ such that }\mu_{vw}>0\text{, }\log p_{w}-\log p_{v}=-\log\left(1-\tau_{vw}\right)\\ \forall v\in V\text{, }\underset{w\in V}{\sum}\mu_{vw}-\underset{u\in V}{\sum}\mu_{uv}=z_{v} \end{array}\right. \]
Therefore this problem is equivalent to the one presented earlier with \(\forall vw\in E\), \(c_{vw}=-\log\left(1-\tau_{vw}\right)\), and replacing \(p\) by its logarithm.
Consider a shapefile of roads, or train lines : each road is a collection of segments, or connected nodes. Our shapefile is, so fact, only in a dataframe dd
Here are the (given) knots and the four roads
par(mfrow=c(1,2))
plot(dd$X,dd$Y,col="white",xlab="",ylab="")
for(i in unique(dd[,1])) lines(dd$X[dd$Id==i],dd$Y[dd$Id==i])
points(dd$X,dd$Y,pch=19,col="red")
library(RColorBrewer)
darkcols = brewer.pal(8, "Dark2")
plot(dd$X,dd$Y,xlab="",ylab="")
for(i in unique(dd[,"Id"])) lines(dd$X[dd$Id==i],dd$Y[dd$Id==i],col=darkcols[i],lwd=2)
Let us create a shapefile from those segments
Show code for the extend_shp() functionIntersections of roads are now added, as knots of the network
par(mfrow=c(1,2))
plot(dd$X,dd$Y,col="white",xlab="",ylab="")
for(i in unique(dd[,1])) lines(dd$X[dd$Id==i],dd$Y[dd$Id==i])
points(dd2[["knots"]]$X,dd2[["knots"]]$Y,pch=19,col="blue")
points(dd$X,dd$Y,pch=19,col="red")
plot(dd2[["shp"]]$X,dd2[["shp"]]$Y,col="white",xlab="",ylab="",axes=FALSE,xlim=c(2.6,8.4))
text(dd2[["knots"]]$X,dd2[["knots"]]$Y,dd2[["knots"]]$NO)
points(dd2[["knots"]]$X,dd2[["knots"]]$Y,cex=3)
We can also visualize all knots of the network
library(RColorBrewer)
darkcols = brewer.pal(8, "Dark2")
plot(dd2[["shp"]]$X,dd2[["shp"]]$Y,col="white",xlab="",ylab="",axes=FALSE)
for(i in 1:(nrow(dd2$connected)/2)) segments(dd2[["knots"]]$X[dd2[["connected"]]$knot1[i]],dd2[["knots"]]$Y[dd2[["connected"]]$knot1[i]],
dd2[["knots"]]$X[dd2[["connected"]]$knot2[i]],dd2[["knots"]]$Y[dd2[["connected"]]$knot2[i]],
col=darkcols[dd2[["connected"]]$Id[i]])
points(dd2[["knots"]]$X,dd2[["knots"]]$Y,pch=19,col="black",cex=3)
text(dd2[["knots"]]$X,dd2[["knots"]]$Y,dd2[["knots"]]$NO,col="white")
We can now create agents, buyers and sellers
agents=data.frame(X=c(5.5,3.5,6.5,6.5,4.5),
Y=c(4,9,8,5.5,9),
Type=as.factor(c("Demand","Demand","Supply","Supply","Supply")))
forme=c(15,19,0,1)
plot(dd2[["shp"]]$X,dd2[["shp"]]$Y,col="white",xlab="",ylab="",axes=FALSE)
for(i in 1:(nrow(dd2$connected)/2)) segments(dd2[["knots"]]$X[dd2[["connected"]]$knot1[i]],dd2[["knots"]]$Y[dd2[["connected"]]$knot1[i]],
dd2[["knots"]]$X[dd2[["connected"]]$knot2[i]],dd2[["knots"]]$Y[dd2[["connected"]]$knot2[i]])
points(dd2[["knots"]]$X,dd2[["knots"]]$Y,pch=19,col="white",cex=3)
points(dd2[["knots"]]$X,dd2[["knots"]]$Y,pch=1,col="black",cex=3)
text(dd2[["knots"]]$X,dd2[["knots"]]$Y,dd2[["knots"]]$NO,col="black")
points(agents$X,agents$Y,pch=forme[agents$Type],col=darkcols[agents$Type],cex=3)
Here, we want to connect the buyers (green squares) and the sellers (orange bullets). Let use visualize, as a starting point, euclidean distances
plot(dd2[["shp"]]$X,dd2[["shp"]]$Y,col="white",xlab="",ylab="",axes=FALSE)
for(i in which(agents$Type=="Demand")) segments(agents[i,"X"],agents[i,"Y"],agents[agents$Type=="Supply","X"],agents[agents$Type=="Supply","Y"])
points(agents$X,agents$Y,pch=forme[agents$Type],col=darkcols[agents$Type],cex=3)
We use the closest distance to match demand and supply
close_supply <- function(coord,knots=agents[agents$Type=="Supply",]){
knots[which.min(distHaversine( coord[,c("X","Y")] , knots[,c("X","Y")] , r =6378.137)),]
}
list_demand <- which(agents$Type=="Demand")
close_supply(agents[list_demand[1],])
FALSE X Y Type
FALSE 4 6.5 5.5 Supply
plot(dd2[["shp"]]$X,dd2[["shp"]]$Y,col="white",xlab="",ylab="",axes=FALSE)
for(i in which(agents$Type=="Demand")) segments(agents[i,"X"],agents[i,"Y"],agents[agents$Type=="Supply","X"],agents[agents$Type=="Supply","Y"])
for(i in list_demand){
supply=unlist(close_supply(agents[i,]))
segments(agents[i,"X"],agents[i,"Y"],supply["X"],supply["Y"],col=darkcols[3],lwd=6)
}
points(agents$X,agents$Y,pch=forme[agents$Type],col=darkcols[agents$Type],cex=3.2)
text(agents$X,agents$Y,1:nrow(agents),pch=forme[agents$Type],col="white")
or with a more classical bipartite graph with buyers on the left (demand), and sellers on the right (supply)
Show code for the bipartite graphFrom now on, let us forget about that shortest Euclidean distance, and let us consider the distance on the network (using roads).
The first step is to match the agents with a knot of the network (we use here the Euclidean distance)
close_knot=function(coord,knots=dd2$knots){
knots[which.min(distHaversine( coord[,c("X","Y")] , knots[,c("X","Y")] , r =6378.137)),]
}
shift_agents=agents
shift_agents$Xk=shift_agents$X
shift_agents$Yk=shift_agents$Y
shift_agents$NO=0
for(i in 1:nrow(shift_agents)) shift_agents[i,c("Xk","Yk","NO")]=close_knot(agents[i,])
shift_agents
FALSE X Y Type Xk Yk NO
FALSE 1 5.5 4.0 Demand 5.000000 4.000000 7
FALSE 2 3.5 9.0 Demand 3.000000 9.000000 1
FALSE 3 6.5 8.0 Supply 6.000000 8.000000 13
FALSE 4 6.5 5.5 Supply 6.263158 5.894737 6
FALSE 5 4.5 9.0 Supply 5.000000 9.000000 11
plot(dd2[["shp"]]$X,dd2[["shp"]]$Y,col="white",xlab="",ylab="",axes=FALSE)
for(i in 1:(nrow(dd2$connected)/2)) segments(dd2[["knots"]]$X[dd2[["connected"]]$knot1[i]],dd2[["knots"]]$Y[dd2[["connected"]]$knot1[i]],
dd2[["knots"]]$X[dd2[["connected"]]$knot2[i]],dd2[["knots"]]$Y[dd2[["connected"]]$knot2[i]])
points(dd2[["knots"]]$X,dd2[["knots"]]$Y,pch=19,col="white",cex=3)
points(dd2[["knots"]]$X,dd2[["knots"]]$Y,pch=1,col="black",cex=3)
points(agents$Xk,agents$Yk,pch=forme[agents$Type],col=darkcols[agents$Type],cex=3)
text(dd2[["knots"]]$X,dd2[["knots"]]$Y,dd2[["knots"]]$NO,col="black")
points(shift_agents$Xk,shift_agents$Yk,pch=forme[shift_agents$Type],col=darkcols[shift_agents$Type],cex=3.2)
points(shift_agents$X,shift_agents$Y,pch=forme[2+as.numeric(shift_agents$Type)],col=darkcols[as.numeric(shift_agents$Type)],cex=3)
Then we compute the distance between nodes, manually here (we will use igraph
and gurobi
in the next section)
plot(0:1,0:1,col="white",xlab="",ylab="",axes=FALSE)
text(.125,.5,"DEMAND",srt=90,col=darkcols[1])
text(.875,.5,"SUPPLY",srt=270,col=darkcols[2])
i=which(agents$Type=="Demand")
y1=rev(seq(1/(2*length(i)),1-1/(2*length(i)),by=1/(2*length(i)))[seq(1,2*length(i),by=2)])
i=which(agents$Type=="Supply")
y2=rev(seq(1/(2*length(i)),1-1/(2*length(i)),by=1/(2*length(i)))[seq(1,2*length(i),by=2)])
id=which(agents$Type=="Demand")
for(i in 1:sum(agents$Type=="Demand")){
d=distHaversine( agents[id[i],c("X","Y")] , agents[agents$Type=="Supply",c("X","Y")] , r =6378.137)
segments(.25,y1[i],.75,y2,col=c("black",darkcols[3])[1+(d==min(d))],lwd=c(1,6)[1+(d==min(d))])
text(.5,(y2+y1[i])/2,round(d),pos=1,col=c("black",darkcols[3])[1+(d==min(d))])
}
i=which(agents$Type=="Demand")
points(rep(.25,length(i)),y1,
pch=forme[1],col=darkcols[1],cex=3.2)
text(rep(.25,length(i)),y1,i,col="white")
i=which(agents$Type=="Supply")
points(rep(.75,length(i)),y2,
pch=forme[2],col=darkcols[2],cex=3.2)
text(rep(.75,length(i)),y2,i,col="white")
igraph
One can use library(igraph)
(see http://igraph.org) to find the closest supply location
Demand = shift_agents[shift_agents$Type=="Demand",c("NO","X","Y")]
Supply = shift_agents[shift_agents$Type=="Supply",c("NO","X","Y")]
Nodes = dd2$knots[,c("NO","X","Y")]
Arcs = dd2$connected
library(igraph)
iarcs <- make_graph(edges = as.vector(rbind(Arcs[,1],Arcs[,2])),directed=TRUE)
plot(iarcs,layout=as.matrix(dd2$knots[,c("X","Y")]),edge.arrow.size=.2)
iarcs = set_edge_attr(iarcs,"distance", value=Arcs[,"distance"])
cat("The graph is connex :", is.connected(iarcs,mode="strong"),"\n")
FALSE The graph is connex : TRUE
If the graph is not connex, we need to keep the largest strongly connected subgraph
if(!is.connected(iarcs,mode="strong")){
nodestokeep=which(clusters(iarcs,mode="strong")[[1]]==which(clusters(iarcs,mode="strong")[[2]]==max(clusters(iarcs,mode="strong")[[2]])))
nodestokeep
Arcs = Arcs[(Arcs$knot1 %in% nodestokeep)&(Arcs$knot2 %in% nodestokeep),]
Nodes = Nodes[Nodes$NO %in% nodestokeep, ]
iarcs = graph.edgelist(as.matrix(Arcs[,c("knot1","knot2")]), directed=TRUE)
iarcs = set_edge_attr(iarcs,"distance", value=Arcs[,"distance"])
cat("The graph is connex :", is.connected(iarcs,mode="strong"),"\n")
plot(iarcs,layout=as.matrix(dd2$knots[dd2$knots$NO%in%nodestokeep,c("X","Y")]),edge.arrow.size=.2)}
We need to keep the largest strongly connected subgraph (this technical assumption will be important later on):
AP=all_shortest_paths(iarcs,
from=Demand[1,"NO"],
to=Supply[3,"NO"])
L=AP$res[[1]]
n=length(L)
V(iarcs)$color="yellow"
V(iarcs)$color[L[2:(n-1)]]="light blue"
V(iarcs)$color[L[c(1,n)]]="blue"
plot(iarcs,layout=as.matrix(dd2$knots[,c("X","Y")]),edge.arrow.size=.2)
dist_graph = function(i,j) dd2$connected[(dd2$connected$knot1==i)&(dd2$connected$knot2==j),"distance"][1]
D=function(dem,sup){
AP=all_shortest_paths(iarcs,
from=Demand[dem,"NO"],
to=Supply[sup,"NO"],weights=dd2$connected$distance)
AP_path = AP$res[[1]]
d=0
for(u in 1:(length(AP_path)-1)) d=d+dist_graph(AP_path[u],AP_path[u+1])
d}
DF=expand.grid(dem=1:nrow(Demand),sup=1:nrow(Supply))
M=matrix(Vectorize(function(i) D(DF[i,"dem"],DF[i,"sup"]))(1:(nrow(Demand)*nrow(Supply))),nrow(Demand),nrow(Supply))
M
FALSE [,1] [,2] [,3]
FALSE [1,] 542.9799 253.2018 772.7032
FALSE [2,] 425.6273 519.5013 246.6023
idx=apply(M,1,which.min)
for(i in 1:length(idx)){
AP=all_shortest_paths(iarcs,
from=Demand[i,"NO"],
to=Supply[idx[i],"NO"])
L=AP$res[[1]]
n=length(L)
V(iarcs)$color="yellow"
V(iarcs)$color[L[2:(n-1)]]="light blue"
V(iarcs)$color[L[c(1,n)]]="blue"
plot(iarcs,layout=as.matrix(dd2$knots[,c("X","Y")]),edge.arrow.size=.2,
main=paste("Shortest Path, demand ",Demand[i,"NO"],", supply ",Supply[idx[i],"NO"],sep=""))
}
gurobi
One can also use library(gurobi)
(see http://gurobi.com/) to solve a linear programing problem. Very generally, the gurobi interface enables to solve problems of the form \(\mathbf{x}^{\text{T}}Q\mathbf{x}+c^{\text{T}}\mathbf{x}+\alpha\) subject to bound constraints \(\mathbf{\ell}\leq\mathbf{x}\leq \text{u}\), some quadratic constraints \(\mathbf{x}^{\text{T}}Qc\mathbf{x}+q^{\text{T}}\mathbf{x}\leq\beta\), some integrality constraints, some ordered set constraints, etc.
Here,the syntax will be the following (for gurobi functions)
Nabla
- of size the number of edges (raws) times the number of nodes (columns). Each raw contains zeros, except for the starting node (+1) and the arrival node (-1). Note that \(|\nabla^{\text{T}}|\) is the incidence matrix of the network,Phi
- of size the number of edges. Here, we use a linear cost function, so \(\Phi\) is the vector of length (in km) of edges,n
- of size the number of nodesmodelsense
: here we want to maximize the flow.result = gurobi ( list(A=t(Nabla), obj=Phi, modelsense='max', rhs=n, sense='=', start=matrix(0,nbArcs,1)), params=NULL)
The outputs of the gurobi
call will be
pi = result$x
library(gurobi)
arcs = dd2$connected
nbArcs=nrow(arcs)
Nabla = sparseMatrix(i=1:nbArcs,j=arcs[,"knot1"],x=-1) +
sparseMatrix(i=1:nbArcs,j=arcs[,"knot2"],x=1)
Phi = -arcs[,"distance"]
n=rep(0,nrow(dd2$knots))
n[Supply$NO]=+nrow(Demand)
n[Demand$NO]=-nrow(Supply)
sum(n)
FALSE [1] 0
FALSE Optimize a model with 13 rows, 26 columns and 52 nonzeros
FALSE Coefficient statistics:
FALSE Matrix range [1e+00, 1e+00]
FALSE Objective range [6e+01, 6e+02]
FALSE Bounds range [0e+00, 0e+00]
FALSE RHS range [2e+00, 3e+00]
FALSE Presolve removed 10 rows and 20 columns
FALSE Presolve time: 0.01s
FALSE Presolved: 3 rows, 6 columns, 12 nonzeros
FALSE
FALSE Iteration Objective Primal Inf. Dual Inf. Time
FALSE 0 -1.6954446e+03 4.000000e+00 0.000000e+00 0s
FALSE 2 -1.9682155e+03 0.000000e+00 0.000000e+00 0s
FALSE
FALSE Solved in 2 iterations and 0.01 seconds
FALSE Optimal objective -1.968215514e+03
FALSE $status
FALSE [1] "OPTIMAL"
FALSE
FALSE $runtime
FALSE [1] 0.01120687
FALSE
FALSE $itercount
FALSE [1] 2
FALSE
FALSE $baritercount
FALSE [1] 0
FALSE
FALSE $nodecount
FALSE [1] 0
FALSE
FALSE $objval
FALSE [1] -1968.216
FALSE
FALSE $x
FALSE [1] 3 1 0 1 0 0 3 0 0 0 2 0 2 0 0 0 0 1 0 0 0 0 0 0 0 0
FALSE
FALSE $slack
FALSE [1] 0 0 0 0 0 0 0 0 0 0 0 0 0
FALSE
FALSE $rc
FALSE [1] 0.0000 0.0000 -383.6504 0.0000 -383.6521 0.0000 0.0000
FALSE [8] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 -246.6023
FALSE [15] -246.8586 -912.8559 -161.8896 0.0000 -750.9680 -506.4036 -295.1692
FALSE [22] -703.7146 -246.8586 -246.6023 -117.6091 -195.9041
FALSE
FALSE $pi
FALSE [1] 246.7304715 123.4293099 0.0000000 -264.6027722 -80.9448213
FALSE [6] 110.8812242 364.0830417 -36.7033854 -388.5606944 246.8586198
FALSE [11] 0.1281483 -22.1402962 -178.8968596
FALSE
FALSE $vbasis
FALSE [1] 0 0 -1 0 -1 0 0 0 0 0 0 0 0 -1 -1 -1 -1 0 -1 -1 -1 -1 -1
FALSE [24] -1 -1 -1
FALSE
FALSE $cbasis
FALSE [1] -1 -1 0 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
FALSE
FALSE $objbound
FALSE [1] -1968.216
library(pals)
COL = rev(brewer.rdbu(13))
V(iarcs)$color=COL[rank(prices)]
plot(iarcs,layout=as.matrix(dd2$knots[,c("X","Y")]),edge.arrow.size=.1)
The legend for those prices is here
par(mfrow=c(1,1))
plot(1:13,rep(1,13),col=COL,pch=15,cex=9,ylim=c(.98,1.02),xlab="",ylab="",axes=FALSE)
text(1:13,rep(1,13),round(sort(prices)),cex=.9,col=c(rep("white",3),rep("black",7),rep("white",3)))
FALSE X Y Type Xk Yk NO price
FALSE 1 5.5 4.0 Demand 5.000000 4.000000 7 364.0830417
FALSE 2 3.5 9.0 Demand 3.000000 9.000000 1 246.7304715
FALSE 3 6.5 8.0 Supply 6.000000 8.000000 13 -178.8968596
FALSE 4 6.5 5.5 Supply 6.263158 5.894737 6 110.8812242
FALSE 5 4.5 9.0 Supply 5.000000 9.000000 11 0.1281483
For this application, we will use network data of Paris’ Metro with a dataset of edges/arcs and a dataset with locations of nodes,
FALSE [1] 868 3
FALSE [1] 366 6
subarcs=arcs[(arcs$V1<303)&(arcs$V2<303),]
plot(nodes[1:302,c("V3","V4")],pch=19,xlab="",ylab="",axes=FALSE)
for(i in 1:nrow(arcs)) segments(nodes[subarcs$V1[i],"V3"],nodes[subarcs$V1[i],"V4"],
nodes[subarcs$V2[i],"V3"],nodes[subarcs$V2[i],"V4"])
The following code can be used to visualize our network, with background information obtained from OpenStreetMap, as described in caRtosm’s github page.
#library(OpenStreetMap)
#library(devtools)
#install_github("ropensci/osmdata")
library(osmdata)
library(sf)
library(units)
library(scales)
#map=openmap(c(lat=48.9,lon=2.26),c(lat=48.8,lon=2.45))
#map=openproj(map,projection="+init=epsg:4326")
#plot(map)
# see https://rcarto.github.io/caRtosm/index.html
q0 = opq(bbox = c(2.2247, 48.8188, 2.4611, 48.9019))
q1 = add_osm_feature(opq = q0, key = 'name', value = "Paris")
res1 = osmdata_sf(q1)
The following function use use to create a background to visualize the metro network
back_paris=function(a=1,background=FALSE){
plot(paris, col=alpha("#D9D0C9",a), border = NA, add=background)
plot(parc, col=alpha("#CDEBB2",a), border = NA, add=TRUE)
plot(seine, col=alpha("#AAD3DF",a), add=TRUE, lwd = 4)
plot(st_geometry(quartier), col = NA,lty = 2, lwd = 0.2, add=TRUE)
}
back_paris()
points(nodes[1:302,c("V3","V4")],pch=19)
for(i in 1:nrow(arcs)) segments(nodes[subarcs$V1[i],"V3"],nodes[subarcs$V1[i],"V4"],
nodes[subarcs$V2[i],"V3"],nodes[subarcs$V2[i],"V4"])
Consider for instance locations of some hospitals in Paris (one could actually use the geocode
function of library(ggmap)
, from the adresses of hospitals, but here, we have locations, whith longitudes - lon
- and latitudes - lat
)
library(ggmap)
adresses <- c("47 boulevard de l'hopital Paris","20 rue leblanc Paris","2 rue ambroise pare paris","1 place parvis notre dame paris","149 rue de sevres paris","1 avenue claude vellafaux paris","184 faubourg saint antoine paris","46 rue henri huchard paris")
# hospital <- lapply(adresses,geocode)
hospital = list(c(lon=2.3610258,lat=48.8401821),
c(lon=2.2719068,lat=48.838736),
c(lon=2.3507367,lat=48.8824963),
c(lon=2.346142,lat=48.8539541),
c(lon=2.3125916,lat=48.8455934),
c(lon=2.3676277,lat=48.873488),
c(lon=2.3806986,lat=48.8500029),
c(lon=2.3300414,lat=48.8982232))
and consider someone sick, located Place Denfert-Rochereau
adresses <- c("Place Denfert-Rochereau, Paris, France")
# sick <- lapply(adresses,geocode)
sick = list(c(lat=48.8342694,lon=2.3300964))
We can plot them on the map
back_paris()
points(nodes[1:302,c("V3","V4")],pch=19)
for(i in 1:nrow(arcs)) segments(nodes[subarcs$V1[i],"V3"],nodes[subarcs$V1[i],"V4"],
nodes[subarcs$V2[i],"V3"],nodes[subarcs$V2[i],"V4"],col="grey")
points(unlist(hospital)[names(unlist(hospital))=="lon"],unlist(hospital)[names(unlist(hospital))=="lat"],pch=forme[2],col=darkcols[2],cex=2)
points(unlist(sick)[names(unlist(sick))=="lon"],unlist(sick)[names(unlist(sick))=="lat"],
pch=forme[1],col=darkcols[1],cex=2)
Let us move all those locations to the closest subsway station (the nodes of our network)
Show codeback_paris()
points(nodes[1:302,c("V3","V4")],pch=19)
for(i in 1:nrow(arcs)) segments(nodes[subarcs$V1[i],"V3"],nodes[subarcs$V1[i],"V4"],
nodes[subarcs$V2[i],"V3"],nodes[subarcs$V2[i],"V4"],col="grey")
points(newhospital[,"X"],newhospital[,"Y"],pch=forme[2],col=darkcols[2],cex=2)
points(newsick[,"X"],newsick[,"Y"],pch=forme[1],col=darkcols[1],cex=2)
igraph
Using library(igraph)
, it is possible to compute distances, for instance to get to the first hospital in our list (from the location of the sick person)
library(igraph)
g <- make_graph(edges = as.vector(rbind(arcs[,1],arcs[,2])),directed=FALSE)
sp <- shortest_paths(g, newsick[1,"NO"], newhospital[1,"NO"], weights= arcs[,3])
pth <- as.numeric(sp$vpath[[1]])
back_paris()
points(nodes[,c("V3","V4")],pch=19)
for(i in 1:nrow(arcs)) segments(nodes[arcs$V1[i],"V3"],nodes[arcs$V1[i],"V4"],
nodes[arcs$V2[i],"V3"],nodes[arcs$V2[i],"V4"],col="grey")
points(nodes[pth,c("V3","V4")],col=darkcols[3],pch=19)
for(i in 1:(length(pth)-1)) segments(nodes[pth[i],"V3"],nodes[pth[i],"V4"],
nodes[pth[i+1],"V3"],nodes[pth[i+1],"V4"],col=darkcols[3],lwd=5)
PCH=rep(1,nrow(newhospital))
PCH[1]=forme[2]
points(newhospital[,"X"],newhospital[,"Y"],pch=PCH,col=darkcols[2],cex=2,lwd=2)
points(newsick[,"X"],newsick[,"Y"],pch=forme[1],col=darkcols[1],cex=2)
or the third one
library(igraph)
g <- make_graph(edges = as.vector(rbind(arcs[,1],arcs[,2])),directed=FALSE)
sp <- shortest_paths(g, newsick[1,"NO"], newhospital[3,"NO"], weights= arcs[,3])
pth <- as.numeric(sp$vpath[[1]])
back_paris()
points(nodes[,c("V3","V4")],pch=19)
for(i in 1:nrow(arcs)) segments(nodes[arcs$V1[i],"V3"],nodes[arcs$V1[i],"V4"],
nodes[arcs$V2[i],"V3"],nodes[arcs$V2[i],"V4"],col="grey")
points(nodes[pth,c("V3","V4")],col=darkcols[3],pch=19)
for(i in 1:(length(pth)-1)) segments(nodes[pth[i],"V3"],nodes[pth[i],"V4"],
nodes[pth[i+1],"V3"],nodes[pth[i+1],"V4"],col=darkcols[3],lwd=5)
PCH=rep(1,nrow(newhospital))
PCH[3]=forme[2]
points(newhospital[,"X"],newhospital[,"Y"],pch=PCH,col=darkcols[2],cex=2,lwd=2)
points(newsick[,"X"],newsick[,"Y"],pch=forme[1],col=darkcols[1],cex=2)
Distances are respectively
FALSE [,1]
FALSE [1,] 2872.404
FALSE [,1]
FALSE [1,] 6029.571
On a graph, we get
plot(0:1,0:1,col="white",xlab="",ylab="",axes=FALSE)
text(.125,.5,"DEMAND",srt=90,col=darkcols[1])
text(.875,.5,"SUPPLY",srt=270,col=darkcols[2])
agents=data.frame(X=c(newhospital$X,newsick$X),Y=c(newhospital$Y,newsick$Y),
Type=c(rep("Supply",nrow(newhospital)),c(rep("Demand",nrow(newsick)))))
i=which(agents$Type=="Demand")
y1=rev(seq(1/(2*length(i)),1-1/(2*length(i)),by=1/(2*length(i)))[seq(1,2*length(i),by=2)])
i=which(agents$Type=="Supply")
y2=rev(seq(1/(2*length(i)),1-1/(2*length(i)),by=1/(2*length(i)))[seq(1,2*length(i),by=2)])
id=which(agents$Type=="Demand")
for(i in 1:sum(agents$Type=="Demand")){
d=rep(NA,sum(agents$Type=="Supply"))
for(j in 1:sum(agents$Type=="Supply")){
d[j]=distances(g, newsick[i,"NO"], newhospital[j,"NO"], weights= arcs[,3])
}
segments(.25,y1[i],.75,y2,col=c("black",darkcols[3])[1+(d==min(d))],lwd=c(1,6)[1+(d==min(d))])
text(.5,(y2+y1[i])/2,round(d),pos=1,col=c("black",darkcols[3])[1+(d==min(d))])
}
i=which(agents$Type=="Demand")
points(rep(.25,length(i)),y1,
pch=forme[1],col=darkcols[1],cex=3.2)
text(rep(.25,length(i)),y1,i,col="white")
i=which(agents$Type=="Supply")
points(rep(.75,length(i)),y2,
pch=forme[2],col=darkcols[2],cex=3.2)
text(rep(.75,length(i)),y2,i,col="white")
Here is the closest hopital (here the th one), for our sick agent
FALSE [1] 5
library(igraph)
g <- make_graph(edges = as.vector(rbind(arcs[,1],arcs[,2])),directed=FALSE)
sp <- shortest_paths(g, newsick[1,"NO"], newhospital[which.min(d),"NO"], weights= arcs[,3])
pth <- as.numeric(sp$vpath[[1]])
back_paris()
points(nodes[,c("V3","V4")],pch=19)
for(i in 1:nrow(arcs)) segments(nodes[arcs$V1[i],"V3"],nodes[arcs$V1[i],"V4"],
nodes[arcs$V2[i],"V3"],nodes[arcs$V2[i],"V4"],col="grey")
points(nodes[pth,c("V3","V4")],col=darkcols[3],pch=19)
for(i in 1:(length(pth)-1)) segments(nodes[pth[i],"V3"],nodes[pth[i],"V4"],
nodes[pth[i+1],"V3"],nodes[pth[i+1],"V4"],col=darkcols[3],lwd=5)
PCH=rep(1,nrow(newhospital))
PCH[which(d==min(d))]=forme[2]
points(newhospital[,"X"],newhospital[,"Y"],pch=PCH,col=darkcols[2],cex=2,lwd=2)
points(newsick[,"X"],newsick[,"Y"],pch=forme[1],col=darkcols[1],cex=2)
gurobi
As discussed in the previous example, it is necessary to reshape the data to fit with inputs of the gurobi
function. Very generally, the gurobi interface enables to solve problems of the form \(\mathbf{x}^{\text{T}}Q\mathbf{x}+c^{\text{T}}\mathbf{x}+\alpha\) subject to bound constraints \(\mathbf{\ell}\leq\mathbf{x}\leq \text{u}\), some quadratic constraints \(\mathbf{x}^{\text{T}}Qc\mathbf{x}+q^{\text{T}}\mathbf{x}\leq\beta\), some integrality constraints, some ordered set constraints, etc. We have here
Nabla
- of size the number of edges (raws) times the number of nodes (columns). Each raw contains zeros, except for the starting node (+1) and the arrival node (-1). Note that \(|\nabla^{\text{T}}|\) is the incidence matrix of the network,Phi
- of size the number of edges. Here, we use a linear cost function, so \(\Phi\) is the vector of length (in km) of edges,n
- of size the number of nodesmodelsense
: here we want to maximize the flow.result = gurobi ( list(A=t(Nabla), obj=Phi, modelsense='max', rhs=n, sense='=', start=matrix(0,nbArcs,1)), params=NULL)
The outputs of the gurobi
call will be
pi = result$x
originNode = newsick[1,"NO"]
destinationNode = newhospital[3,"NO"]
nbNodes = max(arcs[,"V1"]) #dim(nodes)[1]
nbArcs = dim(arcs)[1]
n = rep(0,nbNodes)
n[c(originNode,destinationNode)]=c(-1,1)
Nabla = sparseMatrix(i=1:nbArcs,j=arcs[,"V1"],x=-1) +
sparseMatrix(i=1:nbArcs,j=arcs[,"V2"],x=1)
Phi = -arcs[,3]
result = gurobi ( list(A=t(Nabla),obj=Phi,modelsense='max',rhs=n,sense='=',start=matrix(0,nbArcs,1)), params=NULL)
FALSE Optimize a model with 366 rows, 868 columns and 1736 nonzeros
FALSE Coefficient statistics:
FALSE Matrix range [1e+00, 1e+00]
FALSE Objective range [6e+00, 6e+03]
FALSE Bounds range [0e+00, 0e+00]
FALSE RHS range [1e+00, 1e+00]
FALSE Presolve removed 300 rows and 600 columns
FALSE Presolve time: 0.00s
FALSE Presolved: 66 rows, 268 columns, 536 nonzeros
FALSE
FALSE Iteration Objective Primal Inf. Dual Inf. Time
FALSE 0 -0.0000000e+00 2.000000e+00 0.000000e+00 0s
FALSE 31 -6.0295714e+03 0.000000e+00 0.000000e+00 0s
FALSE
FALSE Solved in 31 iterations and 0.00 seconds
FALSE Optimal objective -6.029571414e+03
FALSE [1] "OPTIMAL"
FALSE [1] 6029.571
FALSE [1] 231 276 763 852 854 856 859 862
cont = TRUE
i = originNode
eqpath = which(pi>0)
rank = 0
nodespath=c(0)
nodespath[rank+1]=i
while(cont)
{
rank = rank+1
leavingi = which(Nabla[,i]==-1)
a = intersect(eqpath,leavingi)[1]
j = which(Nabla[a,]==1)[1]
i = j
nodespath[rank+1]=i
if(j==destinationNode) {cont<-FALSE}
}
nodespath
FALSE [1] 78 359 360 361 362 316 363 93 37
back_paris()
points(nodes[,c("V3","V4")],pch=19)
for(i in 1:nrow(arcs)) segments(nodes[arcs$V1[i],"V3"],nodes[arcs$V1[i],"V4"],
nodes[arcs$V2[i],"V3"],nodes[arcs$V2[i],"V4"],col="grey")
points(nodes[nodespath,c("V3","V4")],col=darkcols[3],pch=19)
for(i in 1:(length(nodespath)-1)) segments(nodes[nodespath[i],"V3"],nodes[nodespath[i],"V4"], nodes[nodespath[i+1],"V3"],nodes[nodespath[i+1],"V4"],col=darkcols[3],lwd=5)
PCH=rep(1,nrow(newhospital))
PCH[3]=forme[2]
points(newhospital[,"X"],newhospital[,"Y"],pch=PCH,col=darkcols[2],cex=2,lwd=2)
points(newsick[,"X"],newsick[,"Y"],pch=forme[1],col=darkcols[1],cex=2)
To conclude, notice that it is also possible to create some sort of Voronoi diagrams. The classical one is based on euclidean distances,
back_paris()
points(newhospital[,"X"],newhospital[,"Y"],pch=forme[2],col=darkcols[2],cex=2)
library(tripack)
loc=agents[agents$Type=="Supply",c("X","Y")]
names(loc)=c("x","y")
m=voronoi.mosaic(loc)
plot(m,add=TRUE,col="red",lwd=2)
affect1=function(x,y){
which.min(distHaversine( c(x,y) , loc[,c("x","y")] , r =6378.137))
}
vx=seq(2.214296,2.479586,length=251)
vy=seq(48.786979,48.930752,length=251)
vz1=matrix(NA,251,251)
for(i in 1:251){
for(j in 1:251){
vz1[i,j]=affect1(vx[i],vy[j])
}
}
lightcols <- brewer.pal(8, "Pastel2")
image(vx,vy,vz1,xlim=c(2.214296,2.479586),ylim=c(48.786979,48.930752),xlab="",ylab="",axes=FALSE,col=lightcols)
points(newhospital[,"X"],newhospital[,"Y"],pch=forme[2],col=darkcols[2],cex=2)
back_paris(a=.2,background = TRUE)
Now, if we use subway-based distances
Show codeAssume there is one sick person in each metro station. Suppose also that hospitals admit the same number of sick people.
nbNodes = max(arcs[,"V1"])
nbArcs = dim(arcs)[1]
n = rep(-1,nbNodes)
n[newhospital[,"NO"]]=-(sum(n)+length(newhospital[,"NO"]))/length(newhospital[,"NO"])
n[30:120]
FALSE [1] -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 44.75 -1.00 -1.00 -1.00
FALSE [12] -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00
FALSE [23] -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00
FALSE [34] -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00
FALSE [45] -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00
FALSE [56] -1.00 -1.00 44.75 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00
FALSE [67] -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00
FALSE [78] -1.00 -1.00 -1.00 -1.00 44.75 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00
FALSE [89] -1.00 -1.00 -1.00
FALSE [1] 0
Then we use a min-cost flow algorithm
Nabla = sparseMatrix(i=1:nbArcs,j=arcs[,"V1"],x=-1) +
sparseMatrix(i=1:nbArcs,j=arcs[,"V2"],x=1)
Phi = -arcs[,3]
result = gurobi ( list(A=t(Nabla),obj=Phi,modelsense='max',rhs=n,sense='=',start=matrix(0,nbArcs,1)), params=NULL)
FALSE Optimize a model with 366 rows, 868 columns and 1736 nonzeros
FALSE Coefficient statistics:
FALSE Matrix range [1e+00, 1e+00]
FALSE Objective range [6e+00, 6e+03]
FALSE Bounds range [0e+00, 0e+00]
FALSE RHS range [1e+00, 4e+01]
FALSE Presolve removed 147 rows and 294 columns
FALSE Presolve time: 0.00s
FALSE Presolved: 219 rows, 574 columns, 1148 nonzeros
FALSE
FALSE Iteration Objective Primal Inf. Dual Inf. Time
FALSE 0 -1.0169517e+06 7.030000e+02 0.000000e+00 0s
FALSE 317 -1.7186206e+06 0.000000e+00 0.000000e+00 0s
FALSE
FALSE Solved in 317 iterations and 0.00 seconds
FALSE Optimal objective -1.718620640e+06
pi = result$x
distance = -result$objval
prices = result$pi
COL=brewer.pal(n = 11, name = "RdBu")
B=quantile(prices,(0:11)/11)
B[1]=B[1]-1
rang_prices = cut(prices,breaks=B,labels=COL)
The black points below are the hospitals… colors here are related to the price, from (dark) red to (dark) blue
back_paris()
points(newhospital[,"X"],newhospital[,"Y"],pch=forme[2],cex=2)
points(nodes[,c("V3","V4")],pch=19,col=as.character(COL))
The legend for those prices is here
par(mfrow=c(1,1))
COL=brewer.pal(n = 11, name = "RdBu")
plot(1:11,rep(1,11),col=COL,pch=15,cex=9,ylim=c(.98,1.02),xlab="",ylab="",axes=FALSE)
lbl = paste("[",round(B[1:11]),";",round(B[1+(1:11)]),"]",sep="")
text(1:11,rep(1,11),lbl,cex=.85)
text(1:11,rep(1,11),lbl,cex=.85,col=c(rep("white",3),rep("black",7),rep("white",3)))
In this section, we plan to match pregnent women (demand) and maternities (supply), in France.
arcs.csv
and nodes.csv
were created.PERINAT_2016r.csv
with birth counts, ID_2016r.csv
with information about maternities, and etalab_stock_et_20181231.csv
with spatial locations of hospitals,naissance_commune.csv
and naissance_arrondissement.csv
For visualisation, we will use various shapefiles
The birth
file contains the number of births per city (ZIP code) in France. A first step can be to visualize those data,
We can get the number of birth per city (commune) even within Paris (per arrondissement)
Show codeFor the region ÃŽle-de-France (containing the city of Paris, north-central and the most populous one in France, with 18.2% of the population). Colors are here just based on raw number of births, per commune (red is law while blue is high)
plot(dp75,xlim=c(1.5,3.5),ylim=c(48,49.25),col=x75)
plot(dp77,add=TRUE,col=x77)
plot(dp78,add=TRUE,col=x78)
plot(dp91,add=TRUE,col=x91)
plot(dp92,add=TRUE,col=x92)
plot(dp93,add=TRUE,col=x93)
plot(dp94,add=TRUE,col=x94)
plot(dp95,add=TRUE,col=x95)
It is also possible to zoom in, on Paris (75), Hauts de Seine (92), Seine Saint-Denis (93) and https://en.wikipedia.org/wiki/Val-de-Marne (94), the three départements with a frontier with Paris
Show codeConsider the Hauts de Seine (92
) departement, we can visualize the 36 communes (the polygons) and all the node (road intersections)
dp92 <- spTransform(dp92, CRS("+init=epsg:2154"))
plot(dp92,col=x92)
idx=floor(as.numeric(as.character(nodesFrance$INSEE_COMM))/1000)==92
points(nodesFrance[idx,"POINT_X"],nodesFrance[idx,"POINT_Y"])
For each polygon, we have the number of birth, for a given year. It is then possible to allocate them to knots inside each polygon.
Information about maternities is downloaded from data.gouv
Show codeWe have a collection of 578 maternities, that can be ploted below
FALSE 'data.frame': 578 obs. of 10 variables:
FALSE $ FI : Factor w/ 664 levels "010000024","010000032",..: 1 2 3 4 6 7 8 9 10 11 ...
FALSE $ ACCUN : int 2112 413 566 801 1233 882 1089 599 600 574 ...
FALSE $ ACCMU : int 40 5 3 5 18 12 12 7 8 7 ...
FALSE $ dep : Factor w/ 101 levels "01","02","03",..: 1 1 1 1 2 2 2 2 2 2 ...
FALSE $ COMINSEE : Factor w/ 1794 levels "01004","01034",..: 3 2 11 1 32 28 33 23 22 32 ...
FALSE $ coordx : num 871098 908660 902284 882724 719368 ...
FALSE $ coordy : num 6569398 6521504 6578385 6542588 6973708 ...
FALSE $ birth : int 2152 418 569 806 1251 894 1101 606 608 581 ...
FALSE $ NEAR_FID : int 151065 194305 206357 90300 39307 90743 42038 15097 158438 193937 ...
FALSE $ ID_RTE500: int 151065 194305 206357 90300 39307 90743 42038 15097 158438 193937 ...
FR = readRDS("FRA_adm0.rds")
FR = spTransform(FR, CRS("+init=epsg:2154"))
P1 = FR@polygons[[1]]@Polygons[[355]]@coords
plot(P1,type="n",xlab="",ylab="",axes=FALSE,asp=1)
polygon(P1)
dpt=NULL
for(i in 1:nrow(maternitesFrance)){
noeud = maternitesFrance$NEAR_FID[i]
idx = which(nodesFrance$ID_RTE500 == noeud)
points(nodesFrance[idx,"POINT_X"],nodesFrance[idx,"POINT_Y"])
dpt=c(dpt,floor(as.numeric(as.character(nodesFrance$INSEE_COMM[idx]))/1000))
}
Per département, the total number of maternities is the following
FALSE dpt
FALSE 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 21 22 23 24 25 26
FALSE 4 6 5 2 2 11 4 3 2 3 4 5 17 8 4 4 7 4 4 5 6 1 4 4 6
FALSE 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51
FALSE 7 5 9 6 8 2 12 8 7 3 7 10 3 3 4 8 2 7 5 3 4 1 5 6 7
FALSE 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76
FALSE 3 3 6 3 7 8 3 22 6 4 12 4 7 4 3 8 8 16 1 5 3 5 7 19 11
FALSE 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95
FALSE 10 11 3 4 4 3 9 7 7 3 4 5 3 1 13 13 14 9 11
The first step is a transformation of the edge file (some roads are one way, while others are two way - we duplicate the two way road)
data_edge = arcsFrance
data_node = nodesFrance
data_birth = birthsFrance
data_mater = maternitesFrance
n=nrow(data_edge)
arcs=data_edge[,c("ID_RTE500","ID_RTE500_","ID_RTE500_","LONGUEUR","SENS")]
names(arcs)[2:3]=c("START","END")
arcs[,"END"]=c(arcs[2:n,"END"],NA)
arcs$index = c(arcs[2:n,"ID_RTE500"]==arcs[1:(n-1),"ID_RTE500"],FALSE)
arcs[,"SENS"] = (data_edge[,"SENS"]=="Double sens")*1 +
(data_edge[,"SENS"]=="Sens inverse")*2 +
(data_edge[,"SENS"]=="Sens unique")*3
arcs = arcs[arcs$index,1:5]
Then transformation of the node file (names of nodes)
nodes=data_node[,c("ID_RTE500","POINT_X","POINT_Y","INSEE_COMM","POPULATION","SUPERFICIE")]
nodes[,"INSEE_COMM"]=as.numeric(as.character(nodes[,"INSEE_COMM"]))
nodes=nodes[!is.na(nodes[,"INSEE_COMM"]),]
# pb with corse e.g. "2A259"
A transformation of the birth (demand) file (again, the names of variables)
names(data_birth)[which(names(data_birth)=="CODGEO")]="INSEE_COMM"
data_birth[,"INSEE_COMM"]=as.numeric(as.character(data_birth[,"INSEE_COMM"]))
data_birth=data_birth[!is.na(data_birth[,"INSEE_COMM"]),]
nodes=merge(nodes,data_birth[,c("INSEE_COMM","NAISD14")],all.x=TRUE)
And a transformation of the maternity (supply) file
names(data_mater)[which(names(data_mater)=="NEAR_FID")]="ID_RTE500"
data_mater=data_mater[order(data_mater$ID_RTE500),]
data_mater$index = c(data_mater[2:nrow(data_mater),"ID_RTE500"]!=data_mater[1:(nrow(data_mater)-1),"ID_RTE500"],TRUE)
data_mater=data_mater[data_mater$index,]
data_mater=data_mater[,-9]
#nodes=merge(nodes,data_mater[,c("ID_RTE500","birth","V5","V6","V7","V8","V9","V10","V11","NEAR_DIST")],all.x=TRUE)
nodes=merge(nodes,data_mater,all.x=TRUE)
We do not have homes of pregnent mothers, only their city - code postal. Sometimes, nodes can be on two cities. For technical reasons, it is necessary to keep nodes only once.
numNodes=nrow(nodes)
idx=which(nodes[numNodes:2,"ID_RTE500"]==nodes[(numNodes-1):1,"ID_RTE500"])
length(idx)
FALSE [1] 980
FALSE [1] 320763 5
idx=which((arcs[,"START"] %in% nodes[,"ID_RTE500"])&(arcs[,"END"] %in% nodes[,"ID_RTE500"]))
length(idx)
FALSE [1] 310742
arcs=arcs[idx,]
arcs3=arcs[which(arcs[,"SENS"]==3),]
arcs2=arcs[which(arcs[,"SENS"]==2),c(1,3,2,4,5)]
names(arcs2)[2:3]=names(arcs3)[2:3]
arcs1=arcs[which(arcs[,"SENS"]==1),c(1,3,2,4,5)]
names(arcs1)[2:3]=names(arcs3)[2:3]
arcs=rbind(arcs3,arcs2,arcs1,arcs[which(arcs[,"SENS"]==1),])
dim(arcs)
FALSE [1] 611002 5
Let us check the connexity of the graph
iarcs = graph.edgelist(as.matrix(arcs[,c("START","END")]), directed=TRUE)
iarcs = set_vertex_attr(iarcs,"x", value=nodes[,"POINT_X"])
iarcs = set_vertex_attr(iarcs,"y", value=nodes[,"POINT_Y"])
iarcs = set_edge_attr(iarcs,"distance", value=arcs[,"LONGUEUR"])
cat("The graph is connex :", is.connected(iarcs,mode="strong"),"\n")
FALSE The graph is connex : FALSE
Unfortunately, the graph is not connected. We will keep the largest strongly connected subgraph,
Show codeWe have now 564 maternities.
Now, we need to balance the flow : the number of pregnent mothers should be equal to the number of deliveries in maternies, again for technical reasons. But the first step is to allocate births within a city among all the nodes. We will do that uniformely
T=table(nodes$INSEE_COMM)
nodes_insee = data.frame(INSEE_COMM=as.numeric(as.character(names(T))),
NBNODES = T)[,c(1,3)]
nodes=merge(nodes,nodes_insee,all.x=TRUE)
nodes$NAISSANCE = nodes$NAISD14 / nodes$NBNODES.Freq
nodes[is.na(nodes[,"NAISSANCE"]),"NAISSANCE"]=0
For instance, in the Hauts de Seine region, we have
dp92 <- spTransform(dp92, CRS("+init=epsg:2154"))
plot(dp92,border="grey",lwd=2)
idx=floor(as.numeric(as.character(nodes$INSEE_COMM))/1000)==92
class <- classIntervals(nodes[idx,"NAISSANCE"], 7, style="quantile")
colcode <- findColours(class, colours)
points(nodes[idx,"POINT_X"],nodes[idx,"POINT_Y"],col=colcode)
Observe that the number of births per node is not necessarily an integer, but it is not (technically at least) a major issue here.
Let us now balance to insure that the in-coming flow equals the out-coming flow
numBirths <- sum(nodes$NAISSANCE)
numBirthsInMat <- sum(data_birth$NAISD14[!is.na(data_birth$NAISD14)])
Number of births in the demand dataset (number of pregnent mothers) is 7.3893310^{5}, while the number of births according to the maternity dataset should be 806341. The difference is here significant (9.1%), but since the goal here is to illustrate the technique, let us no discuss further this issue here.
This will be our balanced in-coming flows
In order to allocate mothers to hospital, define the following shortest path function shortestPath(arcs, nodes, originNode, destinationNode)
, based on gurobi
.
For instance, consider two random nodes, 41
as the starting one, and 50115
as the destination one. Here is the shortest road,
FR = readRDS("FRA_adm0.rds")
FR = spTransform(FR, CRS("+init=epsg:2154"))
P1 = FR@polygons[[1]]@Polygons[[355]]@coords
plot(P1,type="n",xlab="",ylab="",axes=FALSE)
polygon(P1)
nodespath=shortestPath(arcs,nodes,41,50115,viz=TRUE)
FALSE Optimize a model with 209105 rows, 605318 columns and 1210636 nonzeros
FALSE Coefficient statistics:
FALSE Matrix range [1e+00, 1e+00]
FALSE Objective range [1e-02, 4e+01]
FALSE Bounds range [0e+00, 0e+00]
FALSE RHS range [1e+00, 1e+00]
FALSE
FALSE Concurrent LP optimizer: dual simplex and barrier
FALSE Showing barrier log only...
FALSE
FALSE Presolve removed 46809 rows and 93850 columns
FALSE Presolve time: 2.19s
FALSE Presolved: 162296 rows, 511468 columns, 1022930 nonzeros
FALSE
FALSE Ordering time: 0.00s
FALSE
FALSE Barrier statistics:
FALSE AA' NZ : 2.519e+05
FALSE Factor NZ : 2.634e+06 (roughly 300 MBytes of memory)
FALSE Factor Ops : 9.147e+07 (less than 1 second per iteration)
FALSE Threads : 1
FALSE
FALSE Objective Residual
FALSE Iter Primal Dual Primal Dual Compl Time
FALSE 0 -6.07058023e+05 1.27199149e+00 4.37e+00 2.60e+00 1.51e+01 3s
FALSE 1 -4.55634553e+05 -2.60680691e+01 5.72e-01 4.99e+00 4.08e+00 3s
FALSE 2 -2.14918787e+05 -4.83183251e+01 2.04e-02 2.84e-01 5.55e-01 3s
FALSE 3 -3.53126137e+04 -1.04241845e+02 2.02e-03 5.19e-02 7.74e-02 4s
FALSE 4 -1.42523713e+04 -2.11013242e+02 7.07e-04 2.36e-02 2.94e-02 4s
FALSE 5 -5.25404704e+03 -3.04486913e+02 2.10e-04 1.42e-02 1.02e-02 4s
FALSE 6 -2.94902504e+03 -4.90216105e+02 9.32e-05 1.09e-02 4.98e-03 4s
FALSE 7 -2.00978381e+03 -6.19302115e+02 5.03e-05 5.12e-03 2.78e-03 5s
FALSE 8 -1.50040656e+03 -6.49211353e+02 2.84e-05 3.34e-03 1.70e-03 5s
FALSE 9 -1.31510556e+03 -6.66366203e+02 2.09e-05 2.23e-03 1.29e-03 5s
FALSE 10 -1.19957750e+03 -6.87874872e+02 1.64e-05 1.98e-03 1.02e-03 5s
FALSE 11 -1.09070090e+03 -6.92678533e+02 1.24e-05 3.36e-03 7.89e-04 5s
FALSE 12 -9.27352798e+02 -7.06689238e+02 6.60e-06 1.51e-03 4.36e-04 6s
FALSE 13 -7.48638047e+02 -7.09681943e+02 9.05e-07 8.38e-04 7.69e-05 6s
FALSE 14 -7.14444353e+02 -7.13293883e+02 8.96e-09 8.45e-06 2.25e-06 6s
FALSE 15 -7.13341458e+02 -7.13338658e+02 1.18e-11 1.10e-13 5.48e-09 6s
FALSE 16 -7.13339997e+02 -7.13339997e+02 2.78e-11 1.10e-13 1.25e-14 7s
FALSE
FALSE Barrier solved model in 16 iterations and 6.54 seconds
FALSE Optimal objective -7.13339997e+02
FALSE
FALSE Crossover log...
FALSE
FALSE 161975 DPushes remaining with DInf 0.0000000e+00 7s
FALSE 0 DPushes remaining with DInf 6.2022124e-12 9s
FALSE
FALSE 1 PPushes remaining with PInf 0.0000000e+00 9s
FALSE 0 PPushes remaining with PInf 0.0000000e+00 9s
FALSE
FALSE Push phase complete: Pinf 0.0000000e+00, Dinf 6.2022124e-12 9s
FALSE
FALSE Iteration Objective Primal Inf. Dual Inf. Time
FALSE 161979 -7.1334000e+02 0.000000e+00 0.000000e+00 9s
FALSE
FALSE Solved with barrier
FALSE Solved in 161979 iterations and 9.24 seconds
FALSE Optimal objective -7.133399975e+02
Here, maternities are distributed as follows
plot(P1,type="n",xlab="",ylab="",axes=FALSE, asp=1)
polygon(P1)
points(nodes[!is.na(nodes$birth),"POINT_X"],nodes[!is.na(nodes$birth),"POINT_Y"],pch=19,col=rgb(0,0,1,.4))
For the min-cost flow analysis, use the following code
nbNodes = max(arcs[,"START"])
nbArcs = dim(arcs)[1]
n = nodes$NAISSANCE
n[!is.na(nodes$birth)] = n[!is.na(nodes$birth)]-sum(nodes$NAISSANCE)/sum(!is.na(nodes$birth))
sum(n)
FALSE [1] 2.388758e-10
The in-coming and out-coming flows are here equal so we can use gurobi()
Nabla = sparseMatrix(i=1:nbArcs,j=arcs[,"START"],x=-1) +
sparseMatrix(i=1:nbArcs,j=arcs[,"END"],x=1)
Phi = -arcs[,"LONGUEUR"]
result = gurobi ( list(A=t(Nabla),obj=Phi,modelsense='max',rhs=n,sense='=',start=matrix(0,nbArcs,1)), params=NULL)
FALSE Optimize a model with 209105 rows, 605318 columns and 1210636 nonzeros
FALSE Coefficient statistics:
FALSE Matrix range [1e+00, 1e+00]
FALSE Objective range [1e-02, 4e+01]
FALSE Bounds range [0e+00, 0e+00]
FALSE RHS range [5e-02, 1e+03]
FALSE
FALSE Concurrent LP optimizer: dual simplex and barrier
FALSE Showing barrier log only...
FALSE
FALSE Presolve removed 12131 rows and 23783 columns
FALSE Presolve time: 1.78s
FALSE Presolved: 196974 rows, 581535 columns, 1163064 nonzeros
FALSE
FALSE Ordering time: 0.00s
FALSE
FALSE Barrier statistics:
FALSE AA' NZ : 2.890e+05
FALSE Factor NZ : 3.003e+06 (roughly 340 MBytes of memory)
FALSE Factor Ops : 9.776e+07 (less than 1 second per iteration)
FALSE Threads : 1
FALSE
FALSE Objective Residual
FALSE Iter Primal Dual Primal Dual Compl Time
FALSE 0 -1.47600095e+09 -1.18765362e+05 8.80e+03 1.77e+00 2.60e+04 3s
FALSE 1 -1.06983833e+09 -3.10921683e+06 1.52e+03 3.34e+00 6.88e+03 3s
FALSE 2 -4.72815511e+08 -5.82705536e+06 1.15e+02 1.49e-01 9.65e+02 4s
FALSE 3 -1.04226535e+08 -1.20454452e+07 1.31e+01 1.28e-02 1.66e+02 4s
FALSE 4 -6.10772196e+07 -1.69639690e+07 5.62e+00 5.50e-03 7.78e+01 4s
FALSE 5 -4.50557406e+07 -2.08870328e+07 2.98e+00 5.16e-03 4.22e+01 4s
FALSE 6 -3.85653932e+07 -2.22695869e+07 1.91e+00 2.92e-03 2.84e+01 5s
FALSE 7 -3.73820196e+07 -2.31777180e+07 1.72e+00 2.58e-03 2.47e+01 5s
FALSE 8 -3.25649020e+07 -2.38170487e+07 9.47e-01 1.92e-03 1.52e+01 5s
FALSE 9 -2.93672014e+07 -2.49830366e+07 4.48e-01 9.39e-04 7.60e+00 5s
FALSE 10 -2.77523198e+07 -2.54155134e+07 1.99e-01 6.13e-04 4.05e+00 5s
FALSE 11 -2.72155369e+07 -2.58793383e+07 1.18e-01 2.86e-04 2.31e+00 6s
FALSE 12 -2.70252887e+07 -2.60649004e+07 8.97e-02 1.65e-04 1.66e+00 6s
FALSE 13 -2.66488543e+07 -2.61851594e+07 3.55e-02 8.88e-05 8.03e-01 6s
FALSE 14 -2.65325627e+07 -2.62911117e+07 1.74e-02 4.14e-05 4.18e-01 6s
FALSE 15 -2.64802350e+07 -2.63627024e+07 9.40e-03 1.13e-05 2.03e-01 7s
FALSE 16 -2.64480568e+07 -2.63829883e+07 5.07e-03 4.82e-06 1.12e-01 7s
FALSE 17 -2.64299368e+07 -2.63951109e+07 2.65e-03 1.73e-06 6.02e-02 7s
FALSE 18 -2.64170805e+07 -2.64031819e+07 9.52e-04 6.86e-10 2.40e-02 7s
FALSE 19 -2.64112650e+07 -2.64074925e+07 2.23e-04 1.72e-10 6.52e-03 8s
FALSE 20 -2.64097741e+07 -2.64086210e+07 5.88e-05 5.27e-11 1.99e-03 8s
FALSE 21 -2.64093544e+07 -2.64089704e+07 1.62e-05 1.83e-11 6.64e-04 8s
FALSE 22 -2.64092458e+07 -2.64091285e+07 5.75e-06 3.69e-12 2.02e-04 8s
FALSE 23 -2.64091952e+07 -2.64091670e+07 1.15e-06 8.53e-13 4.88e-05 9s
FALSE 24 -2.64091831e+07 -2.64091807e+07 1.10e-06 2.87e-10 4.21e-06 9s
FALSE 25 -2.64091817e+07 -2.64091814e+07 9.73e-07 5.49e-10 4.07e-07 9s
FALSE 26 -2.64091816e+07 -2.64091816e+07 3.00e-07 1.20e-10 1.02e-07 9s
FALSE 27 -2.64091816e+07 -2.64091816e+07 1.25e-07 3.02e-11 3.26e-08 9s
FALSE 28 -2.64091816e+07 -2.64091816e+07 4.09e-08 7.53e-12 9.87e-09 10s
FALSE
FALSE Barrier solved model in 28 iterations and 9.68 seconds
FALSE Optimal objective -2.64091816e+07
FALSE
FALSE Crossover log...
FALSE
FALSE 2406 DPushes remaining with DInf 0.0000000e+00 10s
FALSE 0 DPushes remaining with DInf 9.3240346e-12 11s
FALSE
FALSE 1477 PPushes remaining with PInf 0.0000000e+00 11s
FALSE 0 PPushes remaining with PInf 0.0000000e+00 11s
FALSE
FALSE Push phase complete: Pinf 0.0000000e+00, Dinf 9.2009386e-12 11s
FALSE
FALSE Iteration Objective Primal Inf. Dual Inf. Time
FALSE 3886 -2.6409182e+07 0.000000e+00 0.000000e+00 12s
FALSE
FALSE Solved with barrier
FALSE Solved in 3886 iterations and 11.85 seconds
FALSE Optimal objective -2.640918159e+07
plot(P1,type="n",xlab="",ylab="",axes=FALSE,xlim=c(100000,1100000),asp=1)
polygon(P1)
library(pals)
library(scales)
COL = alpha(rev(brewer.rdbu(100)),.5)
price_rank = rank(prices)*100/length(prices)
points(nodes[,"POINT_X"],nodes[,"POINT_Y"],pch=19,cex=.2,col=COL[1+floor(price_rank)])
We can actually zoom in, e.g. on South-East part of France
FR2=readRDS("FRA_adm2.rds")
FR2 = spTransform(FR2, CRS("+init=epsg:2154"))
plot(FR2,xlim=c(750000,1100000),ylim=c(6200000,6380000),col="grey",border="white",lwd=2)
points(nodes[,"POINT_X"],nodes[,"POINT_Y"],pch=19,cex=.35,col=COL[1+floor(price_rank)])
or on Paris
One can also obtain easily chloropleth maps, but it is also possible to smooth, to visualize prices
dpt_nodes = data.frame(dpt=floor(nodes$INSEE_COMM/1000),price=prices)
dpt_data = aggregate(dpt_nodes$price,by=list(dpt_nodes$dpt),FUN=mean)
names(dpt_data) = c("department","price")
FR2@data = data.frame(FR2@data,
dpt_data[match(as.numeric(as.character(FR2@data[,"CCA_2"])),
dpt_data[,"department"]),])
Let us define a grid on France
Show codewhere we keep only points within the French territory.
Let us use k
-nearest neighbors, on the k
closest nodes (here k=20
)
library(geosphere)
simbase2=as.data.frame(simbase)
names(df1)=c("long1","lat1")
df1$indice=indic
knn=function(i,k=20){
d=distHaversine(grille2[i,1:2],(simbase2[,c("long2","lat2")]), r=6378.137)
r=rank(d)
ind=which(r<=k)
mean(simbase2[ind,"Y"])
}
df1$y=Vectorize(knn)(1:nrow(df1))
df1$z=df1$y
df1$z[indic==0]=NA
mz=matrix(df1$z, taille+1, taille+1)
cols = rev(brewer.rdbu(40))
image(seq(min(simbase$long1),max(simbase$long1),length=taille+1),
seq(min(simbase$lat1),max(simbase$lat1),length=taille+1),mz,axes=FALSE,col=cols,xlab="",ylab="", asp=1)
polygon(P1,lwd=3)
Again, we can zoom in, for instance on the south-east of France, with maternities in black dots
image(seq(min(simbase$long1),max(simbase$long1),length=taille+1),
seq(min(simbase$lat1),max(simbase$lat1),length=taille+1),mz,axes=FALSE,col=cols,
xlim=c(750000,1100000),ylim=c(6200000,6380000),xlab="",ylab="", asp=1)
plot(FR2,border="white",lwd=2,add=TRUE)
points(nodes[!is.na(nodes$birth),"POINT_X"],nodes[!is.na(nodes$birth),"POINT_Y"],pch=1)
or on Paris
image(seq(min(simbase$long1),max(simbase$long1),length=taille+1),
seq(min(simbase$lat1),max(simbase$lat1),length=taille+1),mz,axes=FALSE,col=cols,
xlim=c(630000,660000),ylim=c(6840000,6880000),xlab="",ylab="", asp=1)
plot(FR2,border="white",lwd=2,add=TRUE)
points(nodes[!is.na(nodes$birth),"POINT_X"],nodes[!is.na(nodes$birth),"POINT_Y"],pch=1)
We need four datasets here as inputs
nodes
, with locations (latitude, longitude)arcs
, with (directed) connexions between nodessupply
, with supply locations (latitude, longitude)demand
, with demand locations (latitude, longitude)The tidy(arcs,nodes,supply,demand)
function that reshape datasets * to make sure that supply and demand location are located on one node * to make sure that the graph is connected * to balance supply and demand (or skew symmetry and conservation) : the total flow entering must equal the total flow leaving
It will return four clean datasets arcs
, nodes
, supply
and demand
.
borders = matrix(c(630000,680000,6840000,6880000),2,2)
plot(P1,type="n",xlab="",ylab="",axes=FALSE, asp=1)
polygon(P1,col="grey",border="white")
abline(v=borders[,1],col="red")
abline(h=borders[,2],col="red")
rect(borders[1,1],borders[1,2],borders[2,1],borders[2,2],
col="red",density=20,border=NA)
plot(FR2,xlim=c(630000,680000),ylim=c(6840000,6880000),col="grey",border="white",lwd=2)
abline(v=borders[,1],col="red")
abline(h=borders[,2],col="red")
rect(borders[1,1],borders[1,2],borders[2,1],borders[2,2],
col="red",density=20,border=NA)
Consider the sub-graph of nodes and edges, located in that rectangle,
index_sub_nodes = (nodes$POINT_X >= borders[1,1]) & (nodes$POINT_X <= borders[2,1]) &
(nodes$POINT_Y >= borders[1,2]) & (nodes$POINT_Y <= borders[2,2])
list_INSEE = unique(nodes[index_sub_nodes,"INSEE_COMM"])
There are 333 ‘cities’ remaining.
sub_nodes = nodes[nodes$INSEE_COMM %in% list_INSEE,]
list_ID = unique(sub_nodes$ID_RTE500)
sub_arcs = arcs[(arcs$START %in% list_ID)&(arcs$END %in% list_ID),]
Just to check, just look at
Show codesub_supply= maternitesFrance[maternitesFrance$NEAR_FID %in% list_ID,]
sub_demand = birthsFrance[birthsFrance$CODGEO %in% list_INSEE,]
plot(FR2,xlim=c(630000,660000),ylim=c(6840000,6880000),col="grey",border="white",lwd=2)
points(sub_nodes[,"POINT_X"],sub_nodes[,"POINT_Y"],pch=19,cex=.7,
col=COL[1+floor(price_rank)])
If we focus only on maternity (there are 24 maternities in this region), we obtain the following map,
list_mater=which((!is.na(sub_nodes$birth)))
plot(FR2,xlim=borders[,1],ylim=borders[,2],col="grey",border="white",lwd=2)
COL = alpha(rev(brewer.rdbu(length(list_mater))),.5)
points(sub_nodes[list_mater,"POINT_X"],sub_nodes[list_mater,"POINT_Y"],pch=19,cex=1.25,col=COL[rank(prices$prices[list_mater])])
borders = matrix(c(650000,1100000,6200000,6680000),2,2)
plot(P1,type="n",xlab="",ylab="",axes=FALSE, asp=1)
polygon(P1,col="grey",border="white")
abline(v=borders[,1],col="red")
abline(h=borders[,2],col="red")
rect(borders[1,1],borders[1,2],borders[2,1],borders[2,2],
col="red",density=20,border=NA)
plot(FR2,xlim=borders[,1],ylim=borders[,2],col="grey",border="white",lwd=2)
abline(v=borders[,1],col="red")
abline(h=borders[,2],col="red")
rect(borders[1,1],borders[1,2],borders[2,1],borders[2,2],
col="red",density=20,border=NA)
Consider the sub-graph of nodes and edges, located in that rectangle,
index_sub_nodes = (nodes$POINT_X >= borders[1,1]) & (nodes$POINT_X <= borders[2,1]) &
(nodes$POINT_Y >= borders[1,2]) & (nodes$POINT_Y <= borders[2,2])
list_INSEE = unique(nodes[index_sub_nodes,"INSEE_COMM"])
There are 8216 ‘cities’ remaining.
sub_nodes = nodes[nodes$INSEE_COMM %in% list_INSEE,]
list_ID = unique(sub_nodes$ID_RTE500)
sub_arcs = arcs[(arcs$START %in% list_ID)&(arcs$END %in% list_ID),]
Just to check, just look at
Show codesub_supply= maternitesFrance[maternitesFrance$NEAR_FID %in% list_ID,]
sub_demand = birthsFrance[birthsFrance$CODGEO %in% list_INSEE,]
plot(FR2,xlim=borders[,1],ylim=borders[,2],col="grey",border="white",lwd=2)
points(sub_nodes[,"POINT_X"],sub_nodes[,"POINT_Y"],pch=19,cex=.3,
col=COL[1+floor(price_rank)])
If we focus only on maternity (there are 163 maternities in this region - or 156 depending on the dataset we use…), we obtain the following map,
list_mater=which((!is.na(sub_nodes$birth)))
plot(FR2,xlim=borders[,1],ylim=borders[,2],col="grey",border="white",lwd=2)
COL = alpha(rev(brewer.rdbu(length(list_mater))),.5)
points(sub_nodes[list_mater,"POINT_X"],sub_nodes[list_mater,"POINT_Y"],pch=19,cex=1.25,col=COL[rank(prices$prices[list_mater])])
We did mention it previously, but it is necessary here that the sum of deliveries in (pregnant women, birth registered in France) equals the sum of deliveries out (birth in hospitals), \[ \sum_{\text{node } i:\text{ pregnant woman}}\overline{d}_i = \sum_{\text{node } j:\text{ maternity}}d_j \] Unfortunately, because we use two different sources, both not not equal. So it is necessay to use a correction \[ \tilde{d}_k=\frac{\displaystyle{\sum_{\text{node } j:\text{ maternity}}d_j}}{\displaystyle{\sum_{\text{node } i:\text{ pregnant woman}}\overline{d}_i}}\cdot\overline{d}_k \] Unfortunately, this approximation is uniform over all France. To overcome that issue, local pricing was considered.
Using the generic function mentioned previously, it is possible to compute prices on any territory. The strategy was to dived France in small territories, and to compute prices on each of them (the area below). In order to avoid border problems when we reconcile, we did actually use slighly larger regions.
pic_ani = function(x,y){
P0=P1[seq(1,nrow(P1),by=10),]
plot(P1,type="n",xlab="",ylab="",axes=FALSE,asp=1)
polygon(P0,col=rgb(1,0,0,.2))
VX=seq(min(P1[,1]),max(P1[,1]),length=13)
abline(v=VX,col="white")
VY=seq(min(P1[,2]),max(P1[,2]),length=13)
abline(h=VY,col="white")
abline(v=VX[c(1,5,9,13)])
abline(h=VY[c(1,5,9,13)])
i=x
j=y
rect(VX[i],VY[j],VX[i+4],VY[j+4],border=NA,col=rgb(0,0,1,.3))
rect(VX[max(i-1,1)],VY[max(j-1,1)],
VX[min(i+5,13)],VY[min(j+5,13)],border=NA,col=rgb(0,0,1,.3))
}