This issue is just a place for me to document my thought process from looking at the cats_simulation.jl file that Jose sent me. There's no action needed at the moment.
Reading the tea leaves
I ran with attributes=Dict("filter_function" => x -> get_base_voltage((get_from(get_arc(x)))) >= 115.0).
The logs are pretty interesting. Here's an example of one of the solves (it's the second one). They're all pretty similar.
Gurobi Optimizer version 13.0.0 build v13.0.0rc1 (mac64[arm] - Darwin 24.6.0 24G419)
CPU model: Apple M4
Thread count: 10 physical cores, 10 logical processors, using up to 10 threads
Non-default parameters:
MIPGap 0.01
Optimize a model with 261552 rows, 439242 columns and 36062616 nonzeros (Min)
Model fingerprint: 0xb4f75c65
Model has 198408 linear objective coefficients
Variable types: 369690 continuous, 69552 integer (69552 binary)
Coefficient statistics:
Matrix range [1e-07, 1e+01]
Objective range [2e+01, 2e+06]
Bounds range [1e-03, 1e+04]
RHS range [1e-03, 6e+00]
Good to here
User MIP start did not produce a new incumbent solution
MIP start from previous solve did not produce a new incumbent solution
MIP start from previous solve violates constraint R42864 by 1.000000000
Processed MIP start in 1.47 seconds (2.40 work units)
Okay: this part is interesting. We've have both a MIP start from start = in JuMP, and also from the previous solution.
Q: Is there an easy heuristic for how to repair a solution between solves? Can I fix the integers and repair the continuous variables?
Presolve removed 87360 rows and 99588 columns (presolve time = 5s)...
Presolve removed 106046 rows and 166974 columns (presolve time = 10s)...
Presolve removed 115869 rows and 175506 columns (presolve time = 15s)...
Presolve removed 116250 rows and 176454 columns (presolve time = 20s)...
Presolve removed 116538 rows and 176835 columns (presolve time = 25s)...
Presolve removed 106887 rows and 155577 columns
Presolve time: 29.83s
Presolved: 154665 rows, 283665 columns, 6549416 nonzeros
Variable types: 216118 continuous, 67547 integer (67547 binary)
We could dig into the details, but this is ~40% of total constraints and 35% of variables, but almost all integer variables. So I guess this is removing inactive line constraints, and substituting out any variables that are defined by equality constraints like y == f(x).
Answer: this is probably parameter substitution, and also x <= p parameter bound constraints
Deterministic concurrent LP optimizer: primal simplex, dual simplex, and barrier
Showing barrier log only...
Root barrier log...
Ordering time: 6.34s
Barrier statistics:
AA' NZ : 1.417e+07
Factor NZ : 1.112e+08 (roughly 1.0 GB of memory)
Factor Ops : 4.499e+11 (less than 1 second per iteration)
Threads : 7
Objective Residual
Iter Primal Dual Primal Dual Compl Time
0 7.58240757e+11 -1.19433616e+12 5.82e+03 1.57e+05 7.58e+07 50s
1 4.27403075e+11 -8.56155402e+11 6.90e+02 1.12e-08 1.02e+07 53s
2 6.56623234e+10 -3.76892984e+11 6.15e+01 1.41e-07 1.54e+06 56s
3 1.06082178e+10 -8.80903724e+10 5.49e+00 1.54e-07 2.61e+05 59s
4 2.19082762e+09 -2.13341428e+10 7.41e-01 4.62e-08 5.87e+04 62s
5 1.12351604e+09 -9.61108860e+09 3.59e-01 2.11e-08 2.66e+04 65s
6 2.66038895e+08 -4.44896434e+09 6.07e-02 1.04e-08 1.16e+04 69s
7 4.57265610e+07 -4.96035785e+08 6.71e-04 4.34e-09 1.33e+03 72s
8 1.78859716e+07 -8.20373796e+07 2.60e-05 3.41e-09 2.45e+02 75s
9 1.33463034e+07 -1.04879443e+07 2.68e-06 8.54e-08 5.84e+01 79s
10 1.09800056e+07 1.41637346e+06 5.90e-07 1.15e-07 2.34e+01 83s
11 1.01219326e+07 5.53019040e+06 3.45e-07 5.56e-08 1.12e+01 86s
12 8.91238212e+06 7.21179124e+06 8.90e-08 1.75e-07 4.17e+00 90s
13 8.61002350e+06 7.73626215e+06 4.52e-08 1.75e-07 2.14e+00 94s
14 8.47422224e+06 7.90220280e+06 2.74e-08 5.56e-08 1.40e+00 97s
15 8.38279774e+06 8.07440728e+06 1.60e-08 2.94e-07 7.55e-01 101s
16 8.32521656e+06 8.14595413e+06 9.06e-09 2.94e-07 4.39e-01 104s
17 8.29563238e+06 8.19013295e+06 5.54e-09 2.94e-07 2.58e-01 109s
18 8.28090955e+06 8.20776668e+06 3.85e-09 2.94e-07 1.79e-01 113s
19 8.26912915e+06 8.22751831e+06 2.89e-09 2.94e-07 1.02e-01 118s
20 8.26105912e+06 8.23524266e+06 2.33e-09 2.94e-07 6.32e-02 122s
21 8.25693207e+06 8.24124033e+06 1.93e-09 2.94e-07 3.84e-02 127s
22 8.25447566e+06 8.24390594e+06 1.77e-09 2.94e-07 2.59e-02 131s
23 8.25231141e+06 8.24515292e+06 1.30e-09 2.94e-07 1.75e-02 136s
24 8.25059416e+06 8.24595204e+06 1.02e-09 2.94e-07 1.14e-02 141s
25 8.24960189e+06 8.24630870e+06 7.34e-10 2.94e-07 8.07e-03 146s
26 8.24827959e+06 8.24705654e+06 6.71e-10 2.94e-07 3.00e-03 151s
27 8.24800857e+06 8.24771210e+06 1.04e-09 2.05e-07 7.26e-04 157s
28 8.24796755e+06 8.24784919e+06 2.74e-09 1.30e-07 2.90e-04 163s
29 8.24792433e+06 8.24788761e+06 2.48e-09 4.07e-08 8.99e-05 168s
30 8.24790600e+06 8.24789779e+06 2.80e-09 1.27e-08 2.01e-05 174s
31 8.24790205e+06 8.24790089e+06 8.34e-09 1.54e-09 2.84e-06 179s
32 8.24790150e+06 8.24790125e+06 1.13e-09 9.31e-10 6.11e-07 184s
33 8.24790139e+06 8.24790138e+06 6.51e-11 1.86e-09 3.45e-08 189s
34 8.24790138e+06 8.24790138e+06 1.16e-10 9.31e-10 7.01e-10 194s
Barrier solved model in 34 iterations and 194.02 seconds (386.92 work units)
Optimal objective 8.24790138e+06
I wonder if we should turn off concurrent and focus on barrier here.
Root crossover log...
36483 DPushes remaining with DInf 0.0000000e+00 194s
0 DPushes remaining with DInf 0.0000000e+00 195s
18584 PPushes remaining with PInf 0.0000000e+00 195s
7001 PPushes remaining with PInf 0.0000000e+00 195s
0 PPushes remaining with PInf 0.0000000e+00 199s
Push phase complete: Pinf 0.0000000e+00, Dinf 3.3049626e-08 199s
Root simplex log...
Iteration Objective Primal Inf. Dual Inf. Time
46919 8.2479014e+06 0.000000e+00 0.000000e+00 199s
Crossover time: 4.96 seconds (9.01 work units)
Concurrent spin time: 42.80s (can be avoided by choosing Method=3)
Solved with barrier
46919 8.2479014e+06 0.000000e+00 0.000000e+00 242s
Pretty much what this says
Use crossover to convert LP symmetric solution to basic solution...
Root crossover log...
4990 DPushes remaining with DInf 0.0000000e+00 242s
0 DPushes remaining with DInf 0.0000000e+00 242s
26911 PPushes remaining with PInf 0.0000000e+00 242s
0 PPushes remaining with PInf 0.0000000e+00 244s
Push phase complete: Pinf 0.0000000e+00, Dinf 3.6370663e-08 244s
Root simplex log...
Iteration Objective Primal Inf. Dual Inf. Time
78758 8.2479014e+06 0.000000e+00 0.000000e+00 244s
Crossover time: 2.06 seconds (6.19 work units)
78758 8.2479014e+06 0.000000e+00 0.000000e+00 244s
Root relaxation: objective 8.247901e+06, 78758 iterations, 210.47 seconds (321.48 work units)
Do we need a basic solution?
Nodes | Current Node | Objective Bounds | Work
Expl Unexpl | Obj Depth IntInf | Incumbent BestBd Gap | It/Node Time
0 0 8247901.38 0 2178 - 8247901.38 - - 251s
Phew! We solved the root.
Another try with MIP start
H 0 0 8253552.6429 8247901.38 0.07% - 489s
And then took another 200 seconds to run a heuristic to produce an incumbent.
Explored 1 nodes (110266 simplex iterations) in 489.59 seconds (559.39 work units)
Thread count was 10 (of 10 available processors)
Solution count 1: 8.25355e+06
Optimal solution found (tolerance 1.00e-02)
Best objective 8.253552642944e+06, best bound 8.247901381841e+06, gap 0.0685%
User-callback calls 67627, time in user-callback 0.07 sec
Do we need crossover?
Okay, after going through the log, I tried the obvious:
solver_gurobi = JuMP.optimizer_with_attributes(
Gurobi.Optimizer,
"MIPGap" => mip_gap,
"Method" => 2,
"Crossover" => 0,
"NodeMethod" => 2,
)
This did not work well. Here's the first solve (I didn't even make it to the second):
Root relaxation: objective 6.520571e+06, 0 iterations, 115.69 seconds (346.31 work units)
Another try with MIP start
Nodes | Current Node | Objective Bounds | Work
Expl Unexpl | Obj Depth IntInf | Incumbent BestBd Gap | It/Node Time
H 0 0 1.336987e+07 6520570.85 51.2% - 269s
0 0 6520570.85 0 45290 1.3370e+07 6520570.85 51.2% - 269s
H 0 0 9646126.3248 6520570.85 32.4% - 269s
0 0 6520570.85 0 40174 9646126.32 6520570.85 32.4% - 472s
0 0 6520570.85 0 40172 9646126.32 6520570.85 32.4% - 682s
0 0 6520570.85 0 31257 9646126.32 6520570.85 32.4% - 904s
0 0 6520570.85 0 31257 9646126.32 6520570.85 32.4% - 1105s
In comparison, re-enabling crossover gives:
Root relaxation: objective 6.520571e+06, 77665 iterations, 153.11 seconds (347.19 work units)
Nodes | Current Node | Objective Bounds | Work
Expl Unexpl | Obj Depth IntInf | Incumbent BestBd Gap | It/Node Time
0 0 6520570.85 0 1968 - 6520570.85 - - 198s
Another try with MIP start
H 0 0 1.338453e+07 6520570.85 51.3% - 458s
H 0 0 1.336887e+07 6520570.85 51.2% - 495s
0 0 6520570.85 0 44289 1.3369e+07 6520570.85 51.2% - 495s
H 0 0 9641206.1517 6520570.85 32.4% - 495s
H 0 0 6524427.7351 6520570.85 0.06% - 509s
Explored 1 nodes (182662 simplex iterations) in 509.86 seconds (1325.97 work units)
So for some reason, the basic LP solution gives Gurobi a better starting point for whatever heuristic it runs fourth, and that's what works.
Changing to MIPFocus=1 didn't help:
Root relaxation: objective 6.520571e+06, 77665 iterations, 138.67 seconds (347.19 work units)
Nodes | Current Node | Objective Bounds | Work
Expl Unexpl | Obj Depth IntInf | Incumbent BestBd Gap | It/Node Time
0 0 6520570.85 0 1968 - 6520570.85 - - 184s
Another try with MIP start
H 0 0 1.338453e+07 6520570.85 51.3% - 457s
H 0 0 1.336887e+07 6520570.85 51.2% - 495s
0 0 6520570.85 0 44289 1.3369e+07 6520570.85 51.2% - 495s
H 0 0 9641206.1517 6520570.85 32.4% - 495s
H 0 0 6524427.7351 6520570.85 0.06% - 510s
Explored 1 nodes (182662 simplex iterations) in 510.79 seconds (1325.97 work units)
Lazy constraints
When I make Lazy=3 for all non-equality constraints, we get:
Root relaxation: objective 6.520571e+06, 60051 iterations, 63.48 seconds (133.55 work units)
Nodes | Current Node | Objective Bounds | Work
Expl Unexpl | Obj Depth IntInf | Incumbent BestBd Gap | It/Node Time
0 0 6520570.85 0 150 - 6520570.85 - - 99s
0 0 postponed 0 - 6521300.23 - - 141s
0 0 6521300.23 0 40454 - 6521300.23 - - 142s
0 0 6521300.23 0 89 - 6521300.23 - - 151s
0 0 6521300.23 0 89 - 6521300.23 - - 247s
0 2 6521300.23 0 117 - 6521300.23 - - 295s
1 4 6522583.67 1 388 - 6521300.23 - 4888 318s
3 8 6522807.94 2 384 - 6521405.74 - 3225 629s
7 16 6524228.15 3 386 - 6521544.89 - 1384 638s
15 26 6525650.62 4 357 - 6521683.73 - 1156 678s
25 40 6527071.60 5 357 - 6521718.01 - 2159 714s
39 50 6527232.75 10 342 - 6521718.01 - 1427 749s
49 72 6527371.58 11 330 - 6521718.01 - 1348 781s
71 86 6527371.58 14 316 - 6521718.01 - 983 807s
85 96 6527371.58 15 319 - 6521718.01 - 875 832s
95 120 6527371.66 15 319 - 6521718.01 - 817 853s
119 146 6527371.58 16 319 - 6521718.01 - 671 867s
145 156 6527371.66 16 320 - 6521718.01 - 557 883s
155 176 6527371.58 17 319 - 6521718.01 - 525 896s
175 192 6527371.66 17 319 - 6521718.01 - 466 910s
191 206 6527371.58 18 318 - 6521718.01 - 437 925s
205 220 6527371.66 18 314 - 6521718.01 - 413 937s
219 238 6527371.58 19 318 - 6521718.01 - 387 947s
237 247 6527371.66 19 323 - 6521718.01 - 362 964s
246 264 6527371.58 20 314 - 6521718.01 - 351 977s
Much faster to solve the root, but we can't find an integer feasible solution.
Ditto with
using Gurobi
env = Gurobi.Env()
filename = "model_eee1b616.mps"
modelP = Ref{Ptr{Cvoid}}(C_NULL)
GRBreadmodel(env, filename, modelP)
model = modelP[]
GRBoptimize(model)
pInt = Ref{Cint}()
GRBgetintattr(model, "NumConstrs", pInt)
slack = zeros(Cdouble, pInt[])
GRBgetdblattrarray(model, "Slack", 0, pInt[], slack)
slack_constraints = findall(>(0), slack)
GRBreadmodel(env, filename, modelP)
model2 = modelP[]
for i in slack_constraints
GRBsetintattrelement(model2, "Lazy", i - 1, 3)
end
GRBoptimize(model2)
Root solves quickly, then we get
52082 -1.0765334e+08 0.000000e+00 0.000000e+00 104s
Root relaxation: objective -1.076533e+08, 52082 iterations, 72.39 seconds (139.63 work units)
Nodes | Current Node | Objective Bounds | Work
Expl Unexpl | Obj Depth IntInf | Incumbent BestBd Gap | It/Node Time
0 0 -1.077e+08 0 194 - -1.077e+08 - - 105s
The LP solution sucks
using JuMP
model = read_from_file(filename)
x = all_variables(model)
binaries = filter(is_binary, x)
undo = relax_integrality(model)
set_optimizer(model, () -> Gurobi.Optimizer(env))
optimize!(model)
bin_sol = value.(binaries)
set_binary.(binaries)
optimize!(model)
bin_sol2 = value.(binaries)
This takes longer to solve, because the LP relaxation gives a false start. There are many binaries where the LP solution is 0 and the optimal integer solution is 1, and vice versa.
Root simplex log...
Iteration Objective Primal Inf. Dual Inf. Time
78595 6.5205708e+06 0.000000e+00 0.000000e+00 226s
Crossover time: 2.06 seconds (6.12 work units)
78595 6.5205708e+06 0.000000e+00 0.000000e+00 226s
Root relaxation: objective 6.520571e+06, 78595 iterations, 191.81 seconds (403.93 work units)
Nodes | Current Node | Objective Bounds | Work
Expl Unexpl | Obj Depth IntInf | Incumbent BestBd Gap | It/Node Time
0 0 postponed 0 - 6520570.85 - - 226s
H 0 0 4.267482e+10 6520570.85 100% - 226s
0 0 6520570.85 0 43230 4.2675e+10 6520570.85 100% - 238s
0 0 6520570.85 0 43230 4.2675e+10 6520570.85 100% - 238s
0 0 6520570.85 0 2039 4.2675e+10 6520570.85 100% - 258s
H 0 0 6834382.8312 6520592.21 4.59% - 471s
H 0 0 6832661.1758 6520592.21 4.57% - 471s
0 2 6520592.21 0 134 6832661.18 6520592.21 4.57% - 488s
1 4 6528274.03 1 194 6832661.18 6520592.21 4.57% 15307 498s
H 3 8 6832581.1597 6520592.21 4.57% 5103 503s
H 4 7 6524196.7450 6520592.21 0.06% 3833 503s
Questions to discuss
- What is the model?
- Why is it hard to find integer feasible solutions?
- Why is the LP relaxation so close to optimal in objective space?
- Are there simpler heuristics we can use?
- How can we find what constraints to relax?
This issue is just a place for me to document my thought process from looking at the
cats_simulation.jlfile that Jose sent me. There's no action needed at the moment.Reading the tea leaves
I ran with
attributes=Dict("filter_function" => x -> get_base_voltage((get_from(get_arc(x)))) >= 115.0).The logs are pretty interesting. Here's an example of one of the solves (it's the second one). They're all pretty similar.
Good to here
Okay: this part is interesting. We've have both a MIP start from
start =in JuMP, and also from the previous solution.Q: Is there an easy heuristic for how to repair a solution between solves? Can I fix the integers and repair the continuous variables?
We could dig into the details, but this is ~40% of total constraints and 35% of variables, but almost all integer variables. So I guess this is removing inactive line constraints, and substituting out any variables that are defined by equality constraints like
y == f(x).Answer: this is probably parameter substitution, and also
x <= pparameter bound constraintsI wonder if we should turn off concurrent and focus on barrier here.
Pretty much what this says
Do we need a basic solution?
Phew! We solved the root.
And then took another 200 seconds to run a heuristic to produce an incumbent.
Do we need crossover?
Okay, after going through the log, I tried the obvious:
This did not work well. Here's the first solve (I didn't even make it to the second):
In comparison, re-enabling crossover gives:
So for some reason, the basic LP solution gives Gurobi a better starting point for whatever heuristic it runs fourth, and that's what works.
Changing to
MIPFocus=1didn't help:Lazy constraints
When I make Lazy=3 for all non-equality constraints, we get:
Much faster to solve the root, but we can't find an integer feasible solution.
Ditto with
Root solves quickly, then we get
The LP solution sucks
This takes longer to solve, because the LP relaxation gives a false start. There are many binaries where the LP solution is 0 and the optimal integer solution is 1, and vice versa.
Questions to discuss