Simple Use Case

Consider the generalized rosenbrock as our objective function

\[f(x) = \sum_{i=1}^{N-1} \left[100(x_{i+1}^2 - x_i^2)^2 + (1 - x_i)^2\right].\]

We manually translate the formula above into Julia shown below.

julia> function rosen(x)
           N = lastindex(x)
           100sum((x[i + 1] - x[i]^2)^2 for i = 1:N-1) + sum((x[i] - 1)^2 for i = 1:N-1)
       end

Differentiating the objective function and performing a Julia translation below.

julia> function ∇rosen!(g, x)
           N = lastindex(x)
           g[1] = -2 * (1 - x[1]) - 400x[1] * (-x[1]^2 + x[2])

           for i in 2:N-1
               g[i] = -2 * (1 - x[i]) + 200 * (-x[i - 1]^2 + x[i]) - 400x[i] * (-x[i]^2 + x[1 + i])
           end
           
           g[N] = 200 * (x[N] - x[N-1]^2)    
           return g
       end
∇rosen! (generic function with 1 method)

The optimize routine will attempt to find $x$ meeting the first-order neccessary condition for being a local minimum of $f$, i.e.

\[|| \nabla f(x) || \leq \epsilon.\]

Consider dimension $n=100,$ and we randomly assign x₀ to be a point in the $100$-dimensional hypercube.

julia> x₀ = randn(100);

The entry point of scheme $7.1$ occurs below.

julia> optimize(rosen, ∇rosen!, x₀)
SUCCESS 8.070336658758971e-7 ≤ 1.0e-5 in 528 steps
--------------------------------------
  Minimum f:      5.311172726630893e-16
  Minimum ||∇f||: 8.070336658758971e-7
  Minimum Δ:      0.005844719845773086
  Minimum Step:   6.030122388810338e-8

  Model: 
  -------------------
    objective:         rosen
    gradient:          ∇rosen!
    initial iterate:   [0.132719, ..., 0.554169, -0.861590, -0.025498]
    dimension:         100
    directory:         /Users/daniel/.julia/dev/BlockOpt/docs/Missing
    objective formula: missing
  Driver:
  -------------------
    S_update:  S_update_c
    QN_update: SR1
    pflag:     false
    Options:
      samples:        6
      Δ_max:          100.0
      δ_tol:          1.0e-12
      ϵ_tol:          1.0e-5
      max_iterations: 2000
  Trace:
  -------------------
    Weaver:
      f_vals:   [37156.548946, ..., 0.000000, 0.000000, 0.000000]
      ∇f_norms: [15116.003941, ..., 0.000293, 0.000016, 0.000001]
      Δ_vals:   [5.984993, ..., 0.187031, 0.187031, 0.187031]
      p_norms:  [5.172082, ..., 0.000045, 0.000002, 0.000000]
      ρ_vals:   [0.931152, ..., 0.985198, 0.983784, 0.986492]
    Profile:
      trs_counter: 528
      trs_timer:   0.07330679893493652
      ghs_counter: 430
      ghs_timer:   0.07087516784667969

Here, the output shows a Simulation instance in a terminal state, which happens to be a success!

Constructing a Model

We extend our simple use case to define a model, which creates a directory to keep track of various simulations throughout the model's life.

julia> m = Model("Rosenbrock")
  Model: 
  -------------------
    objective:         missing
    gradient:          missing
    initial iterate:   missing
    dimension:         missing
    directory:         /Users/daniel/.julia/dev/BlockOpt/docs/Rosenbrock
    objective formula: missing

Above is an empty model, which is incrementally loaded.

julia> objective!(m, rosen)
rosen (generic function with 1 method)

julia> gradient!(m, ∇rosen!)
∇rosen! (generic function with 1 method)


julia> initial_iterate!(m, x₀)

julia> m
  Model: Rosenbrock
  -------------------
    objective:         rosen
    gradient:          ∇rosen!
    initial iterate:   [-1.177973, ..., -0.535732, 1.304151, 1.194971]
    dimension:         100
    directory:         /Users/daniel/.julia/dev/BlockOpt/Rosenbrock
    objective formula: missing

The model is now loaded to a final state, implying that the objective and gradient function may no longer be modified.

We optionally chose to set a formula for the objective function of model m. The formula will be displayed when plotting simulations of the m.

julia> rosen_formula = "\$\\sum_{i=1}^{N-1} \\left[100(x_{i+1}^2 - x_i^2)^2 + (1 - x_i)^2\\right]\$";

julia> formula!(m, rosen_formula)
"\$\\sum_{i=1}^{N-1} \\left[100(x_{i+1}^2 - x_i^2)^2 + (1 - x_i)^2\\right]\$"

julia> m
  Model: Rosenbrock
  -------------------
    objective:         rosen
    gradient:          ∇rosen!
    initial iterate:   [-1.177973, ..., -0.535732, 1.304151, 1.194971]
    dimension:         100
    directory:         /Users/daniel/.julia/dev/BlockOpt/Rosenbrock
    objective formula: loaded

Our model m is entirely constructed. Observe the directory path above; it was created in the current working directory using the name given to the model constructor.

See the Model section of Manual for more information.

Constructing a Driver

The goal of creating a model is to record simulation information over multiple trials, where each trial uniquely incorporates the second-order hessian information into a steps QN update as dictated by the simulation driver.

A default driver is constructed with an empty argument list.

julia> default_driver = Driver() # default parameters and options
  Driver:
  -------------------
    S_update:  S_update_c
    QN_update: SR1
    pflag:     false
    Options:
      samples:        6
      Δ_max:          100.0
      δ_tol:          1.0e-12
      ϵ_tol:          1.0e-5
      max_iterations: 2000

The default_driver is created in all optimize calls that don't specify a Driver.

Below we pass PSB as argument for the QN_update keyword.

julia> psb_driver = Driver(QN_update = PSB)
  Driver:
  -------------------
    S_update:  S_update_c
    QN_update: PSB
    pflag:     false
    Options:
      samples:        6
      Δ_max:          100.0
      δ_tol:          1.0e-12
      ϵ_tol:          1.0e-5
      max_iterations: 2000

See pflag, S_update, and Options manual section for information on the three other keyword arguments.

Configurations

By passing a model to optimize with several unique drivers, we gain insight into the effects of each driver configuration. We saw above how m performed with the default driver configuration; now, we run it with the PSB QN update

julia> optimize(m, psb_driver)
FAIL 3.566862431214296 ≰ 1.0e-5 in 2000 steps
--------------------------------------
  Minimum f:      28.577172626527943
  Minimum ||∇f||: 1.1239260593910936
  Minimum Δ:      0.0055605929272336966
  Minimum Step:   0.005560592927233694

  Model: Rosenbrock
  -------------------
    objective:         rosen
    gradient:          ∇rosen!
    initial iterate:   [-1.177973, ..., -0.535732, 1.304151, 1.194971]
    dimension:         100
    directory:         /Users/daniel/.julia/dev/BlockOpt/Rosenbrock
    objective formula: $\sum_{i=1}^{N-1} \left[100(x_{i+1}^2 - x_i^2)^2 + (1 - x_i)^2\right]$
  Driver:
  -------------------
    S_update:  S_update_c
    QN_update: PSB
    pflag:     false
    Options:
      samples:        6
      Δ_max:          100.0
      δ_tol:          1.0e-12
      ϵ_tol:          1.0e-5
      max_iterations: 2000
  Trace:
  -------------------
    Weaver:
      f_vals:   [35819.088644, ..., 28.644942, 28.610707, 28.577173]
      ∇f_norms: [14013.886569, ..., 3.567714, 3.387636, 3.566862]
      Δ_vals:   [5.694047, ..., 0.044485, 0.044485, 0.044485]
      p_norms:  [4.372943, ..., 0.014306, 0.014309, 0.014314]
      ρ_vals:   [1.128439, ..., 1.664845, 1.670837, 1.665141]
    Profile:
      trs_counter: 2000
      trs_timer:   3.959573268890381
      ghs_counter: 1974
      ghs_timer:   0.4726881980895996

Here, the simulation failed to reach a successful terminal state and instead reached the maximum allowed iterations. We expect that the PSB update requires more iterations than the default SR1 update. Letting this guide us, we increase the max_iterations of the psb_driver to a larger value.

julia> max_iterations!(psb_driver, 10000)
10000

julia> optimize(m, psb_driver)
SUCCESS 9.93303363395617e-6 ≤ 1.0e-5 in 5578 steps
--------------------------------------
  Minimum f:      3.986623854354119
  Minimum ||∇f||: 9.93303363395617e-6
  Minimum Δ:      0.004905284751681128
  Minimum Step:   6.309128627600796e-8

  Model: Rosenbrock
  -------------------
    objective:         rosen
    gradient:          ∇rosen!
    initial iterate:   [-1.177973, ..., -0.535732, 1.304151, 1.194971]
    dimension:         100
    directory:         /Users/daniel/.julia/dev/BlockOpt/Rosenbrock
    objective formula: $\sum_{i=1}^{N-1} \left[100(x_{i+1}^2 - x_i^2)^2 + (1 - x_i)^2\right]$
  Driver:
  -------------------
    S_update:  S_update_c
    QN_update: PSB
    pflag:     false
    Options:
      samples:        6
      Δ_max:          100.0
      δ_tol:          1.0e-12
      ϵ_tol:          1.0e-5
      max_iterations: 10000
  Trace:
  -------------------
    Weaver:
      f_vals:   [35819.088644, ..., 3.986624, 3.986624, 3.986624]
      ∇f_norms: [14013.886569, ..., 0.000010, 0.000011, 0.000010]
      Δ_vals:   [5.023012, ..., 0.039242, 0.039242, 0.039242]
      p_norms:  [4.096592, ..., 0.000000, 0.000000, 0.000000]
      ρ_vals:   [1.160497, ..., 1.756820, 1.760699, 1.763252]
    Profile:
      trs_counter: 5578
      trs_timer:   0.891240119934082
      ghs_counter: 5552
      ghs_timer:   0.9388515949249268

Allowing for $10,000$ iterations is more then enough to reach convergence within the the tolerance given by ϵ_tol(psb_driver) ($1.0e-5$).

See the Options and Driver sections of the Manual for more information on Driver options and configurations.

Simulation Recipes

Let us assign the variables s1 and s2 as returned terminal simulations driven by drivers d1 and d2 as defined below.

julia> d1 = Driver(S_update=S_update_a)
julia> d2 = Driver(S_update=S_update_d)
julia> s1 = optimize(m, d1);
julia> s2 = optimize(m, d2);

We have configured drivers d1 and d2 to use updates (6.1.a) and (6.1.d).

Next, we create the objective function trace plot with the Julia Plots package. The plotted data may be accessed using f_vals(s1) and f_vals(s2).

julia> using Plots;     # to install: using Pkg; Pkg.add("Plots")
julia> objtrace(s1, s2)

The normed gradient data at each successful iterate may be accessed through ∇f_norms(s1) and ∇f_norms(s2).

julia> gradtrace(s1, s2)

See: radiustrace, steptrace, rhotrace and corresponding ! versions.

Weave

Building upon the last example, we can skip generating the individual plots and weave one Weave.jl trace report.

julia> Using Plots
julia> weave(s1, s2);

The model directory of s1 now contains a trace.html report which can be viewed in the browser.

Here is another report comparing the generalized Rosenbrock simulations driven by the $6$ different sample direction update configurations.

Here is a block size comparison of using $4$ to $20$ different sample directions with the SR1 update and S_update=(6.1.d). We are still using model m.