Clustered Configuration Model¶
The Clustered Configuration Model (CCM) extends the standard configuration model to generate networks with both a prescribed degree sequence and a target clustering coefficient \(\phi\).
Reference
The CCM algorithm is described in: Ritchie, M., Berthouze, L., & Kiss, I. Z. (2014). Higher-order structure and epidemic dynamics in clustered networks. Journal of Mathematical Biology, 72(3), 483–511.
The problem¶
The standard configuration model generates networks by randomly pairing "stubs" (half-edges). This produces networks with a given degree sequence but near-zero clustering — the probability that two neighbours of a node are themselves connected is vanishingly small for sparse networks.
Real networks (social, biological, infrastructure) exhibit significant clustering. The CCM addresses this by reserving a fraction of each node's stubs for participation in small, dense substructures (motifs) before pairing the remainder randomly.
Algorithm pipeline¶
The CCM proceeds in five steps:
Step 1: Stub allocation¶
Each node \(i\) receives \(k_i\) stubs (one per unit of degree), sampled from the target degree distribution.
Step 2: Partition into corners and singles¶
For each node, stubs are partitioned into two pools:
- Corners — stubs allocated to motif participation. The number of corners is drawn from \(\text{Binomial}(\lfloor k_i / c \rfloor, \phi)\), where \(c\) is the cardinality of the motif (the degree of each corner within the motif).
- Singles — remaining stubs, paired via the standard configuration model.
The parameter \(\phi\) controls the expected fraction of stubs allocated to motifs, and thereby the resulting clustering coefficient.
Step 3: Form motif instances¶
Corner stubs are grouped and connected according to the motif's adjacency structure. Three strategies handle collisions (when the same node appears twice in a group):
| Strategy | Behaviour | Trade-off |
|---|---|---|
| Repeated | Resample colliding groups individually | Fast, slightly biased |
| Refuse | Reject entire batch on any collision | Unbiased, expensive at high density |
| Erased | Ignore collisions, remove duplicates post-hoc | Fastest, loses edges |
Step 4: Pair single stubs¶
Remaining single stubs are paired using the standard configuration model, avoiding edges that were already created during motif formation.
Step 5: Assemble¶
Motif edges and single edges are combined, symmetrised, and cleaned (self-loops removed, multi-edges clipped) to produce the final adjacency matrix.
Degree-5 optimisation¶
For homogeneous degree-5 networks, the CCM uses an analytical allocation that mixes K4 cliques and triangles for higher clustering accuracy:
where \(p_3\) is the probability that a node participates in both a K4 corner and a triangle corner. This achieves higher clustering than the general triangle-only fallback.
Uniform cardinality constraint¶
The current implementation requires uniform cardinality — all corners
in the motif must have the same degree within the motif. This means
complete graphs (G2/triangle, G8/K4, G29/K5) and cycles (G5, G12, C6)
are supported, but mixed-cardinality motifs like G7 (diamond) and G14
(bowtie) raise MixedCardinalityError.
Usage¶
import numpy as np
from craeft import configuration_model, global_clustering_coefficient
rng = np.random.default_rng(42)
degrees = np.full(1000, 5)
# phi controls clustering: 0.0 = unclustered, higher = more clustered
for phi in [0.0, 0.1, 0.2, 0.3]:
adj = configuration_model(degrees, phi=phi, rng=rng)
cc = global_clustering_coefficient(adj)
print(f"phi={phi:.1f} -> clustering={cc:.3f}")