Author

Josh Day

Published

May 21, 2021

Performance Tips

Introduction

The Julia Language documentation offers comprehensive performance guidance, though it targets computer scientists rather than data professionals. This article translates those concepts for practical application.

The recommended development workflow emphasizes starting with functional code, then systematically addressing performance bottlenecks through targeted optimization.


Execute Computationally Intensive Tasks Inside Functions

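Put performance-critical code inside functions. Julia compiles each function into code specialized for the types of its arguments, whereas code run at the top level of a script works with untyped globals and is optimized far less effectively. Break your problem into small, modular functions rather than one long procedural script.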
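As a minimal sketch of this structure, the heavy loop below lives in a function and the top-level script only prepares inputs and calls it. The function name sum_of_squares and the sample data are hypothetical, chosen only for illustration.

```julia
# Hypothetical example: the hot loop lives inside a function, so Julia can
# compile it for the concrete type of `v`.
function sum_of_squares(v)
    total = zero(eltype(v))  # accumulator with the same element type as `v`
    for value in v
        total += value^2
    end
    return total
end

# The top-level script only builds inputs and calls the function.
samples = collect(1.0:4.0)
result = sum_of_squares(samples)
println(result)
```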

Minimize Non-Constant Global Variables

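Non-constant globals (variables defined outside functions) degrade performance because their value and type can change at any time, which prevents the compiler from specializing code that uses them: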

# Suboptimal approach
x = 1
f(y) = x + y

# Preferred approach
f(x, y) = x + y

For invariant values, use const:

const x = 1
f(y) = x + y
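One way to see the difference is to ask the compiler what return type it infers in each case. The sketch below uses Base.return_types, an introspection helper rather than something you would call in production code. The names offset_global, OFFSET_CONST, add_global, and add_const are hypothetical.

```julia
offset_global = 1        # non-constant global: its type may change later
const OFFSET_CONST = 1   # constant global: its type is fixed

add_global(y) = offset_global + y
add_const(y) = OFFSET_CONST + y

# Base.return_types reports what the compiler infers for the given argument types.
inferred_global = only(Base.return_types(add_global, (Int,)))
inferred_const = only(Base.return_types(add_const, (Int,)))

println("non-constant global: ", inferred_global)
println("constant global:     ", inferred_const)
```

With the non-constant global, inference cannot narrow the result beyond Any. With the constant, it resolves to a concrete integer type.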

Maintain Type Stability

A function is type-stable when its return type depends only on the types of its inputs, not on their values. For an unstable function such as the one below, the compiler must handle several possible return types:

function f(x)
    x < 10 ? "string" : 1
end
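Type instability can be checked programmatically. The sketch below uses the @inferred macro from the Test standard library, which throws an error when the actual return type differs from the inferred one. The names unstable and stable are hypothetical, and the stable rewrite is just one possible fix.

```julia
using Test  # standard library; provides @inferred

# Unstable: returns a String or an Int depending on the *value* of x.
unstable(x) = x < 10 ? "string" : 1

# One possible stable rewrite (hypothetical): always return an Int.
stable(x) = x < 10 ? 0 : 1

# Passes silently: the return type is Int for every Int input.
@inferred stable(5)

# Throws: the inferred type is a Union of Int and String, not a single type.
instability_detected = try
    @inferred unstable(5)
    false
catch
    true
end

println("instability detected: ", instability_detected)
```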

Avoid reassigning variables to different types:

x = 1
x = x/2  # Changes type from Int to Float64
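Inside a function, the same pattern makes the variable's type depend on how many times a loop runs. The following sketch (function names are hypothetical) contrasts an accumulator that starts as an Int with one that starts as a Float64:

```julia
# Starts as an Int and becomes a Float64 after the first division,
# so the result type depends on the value of n.
function halve_unstable(n)
    x = 1
    for _ in 1:n
        x = x / 2
    end
    return x
end

# Starts as a Float64 and stays one.
function halve_stable(n)
    x = 1.0
    for _ in 1:n
        x = x / 2
    end
    return x
end

println(typeof(halve_unstable(0)), " vs ", typeof(halve_unstable(3)))
println(typeof(halve_stable(0)), " vs ", typeof(halve_stable(3)))
```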

Avoid Abstract Types in Containers and Structures

Abstract types force Julia to reason about type sets rather than specific types.

Initialize arrays with known types:

x = []         # eltype(x) == Any
x = Float64[]  # eltype(x) == Float64
Float64[]
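To illustrate, the sketch below fills an untyped and a typed vector with the same values and then converts the untyped one in a single step. The variable names are hypothetical.

```julia
untyped = []        # Vector{Any}
typed = Float64[]   # Vector{Float64}

for value in (1.0, 2.5, 4.0)
    push!(untyped, value)
    push!(typed, value)
end

# Same contents, different element types.
same_contents = untyped == typed

# Data that already exists as a Vector{Any} can be converted once up front.
converted = convert(Vector{Float64}, untyped)

println(eltype(untyped), " / ", eltype(typed), " / ", eltype(converted))
```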

Create parametric structs:

# Ambiguous
struct A
    thing::Real
end

# Concrete
struct B{T <: Real}
    thing::T
end

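Use isconcretetype to check whether a type is concrete. Abstract types represent conceptual categories rather than instantiable objects. Note that a struct with an abstractly typed field is still a concrete type itself; it is the field type that is abstract.

println(isconcretetype(A))           # true: A itself is concrete, although its field type Real is abstract
println(isconcretetype(B{Float64}))  # true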
true
true
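Because a struct can be concrete while its fields are not, it can help to inspect the field types directly. The sketch below restates the two structs so that it runs on its own, and uses fieldtype from Base to look up the declared type of each field.

```julia
struct A
    thing::Real      # abstract field type
end

struct B{T <: Real}
    thing::T         # concrete once T is fixed
end

# Both struct types are concrete...
println(isconcretetype(A), " ", isconcretetype(B{Float64}))

# ...but only B{Float64} has a concretely typed field.
field_a = fieldtype(A, :thing)
field_b = fieldtype(B{Float64}, :thing)
println(field_a, " is concrete: ", isconcretetype(field_a))
println(field_b, " is concrete: ", isconcretetype(field_b))
```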

Monitor Performance with @time

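Temporary arrays are allocated on the heap and must later be garbage collected, which adds overhead. The @time macro reports elapsed time together with the number of allocations, the memory allocated, and the share of time spent in garbage collection.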

Tip

The initial @time invocation includes JIT compilation overhead. Run twice for accuracy, or use BenchmarkTools and its @btime macro.
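The warm-up effect can be seen with nothing but Base. The sketch below times the same call twice with @elapsed. The function work is a hypothetical stand-in for your own code, and the exact timings will vary by machine.

```julia
# Hypothetical stand-in for a function you want to time.
work(v) = sum(abs2, v)

v = rand(10^5)

first_call = @elapsed work(v)    # includes JIT compilation of work for Vector{Float64}
second_call = @elapsed work(v)   # compiled code is reused

println("first call:  ", first_call, " s")
println("second call: ", second_call, " s")
```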

Example: Inefficient Loop

function add100_slow!(x)
    for i in 1:100
        x[:] = x .+ 1
    end
    return x
end

data = randn(10^6);
@time add100_slow!(data);
  0.654274 seconds (239.40 k allocations: 774.791 MiB, 68.93% gc time, 11.43% compilation time)

Problems identified:

  • Many allocations indicate temporary vector creation
  • High memory consumption
  • Significant GC time

Optimized Version

function add100_fast!(x)
    x .+= 100
end

@time add100_fast!(data);
  0.023019 seconds (109.85 k allocations: 5.465 MiB, 98.53% compilation time)
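To compare the two versions without compilation noise, one option is to call each function once as a warm-up and then measure allocated bytes with @allocated. This is a sketch under those assumptions. It restates both functions so that it runs on its own and uses a smaller vector to keep it quick.

```julia
function add100_slow!(x)
    for i in 1:100
        x[:] = x .+ 1   # `x .+ 1` allocates a new temporary vector on every pass
    end
    return x
end

function add100_fast!(x)
    x .+= 100           # fused in-place update
    return x
end

data = randn(10^4)

# Warm-up calls so that compilation is not part of the measurement.
add100_slow!(copy(data))
add100_fast!(copy(data))

slow_bytes = @allocated add100_slow!(data)
fast_bytes = @allocated add100_fast!(data)

println("slow: ", slow_bytes, " bytes, fast: ", fast_bytes, " bytes")
```

The slow version allocates at least one full copy of the vector per iteration, while the fast version should allocate little or nothing.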

Performance Improvement Strategies

Leverage Broadcasting

Combine element-wise operations without intermediate arrays using dot syntax:

x = randn(100)

# Broadcast fusion — no intermediate vectors
y = sin.(abs.(x)) .+ 1 ./ (1:100)
100-element Vector{Float64}:
 1.0436711396434621
 1.488860696088759
 1.1588279842851374
 0.3875656829581255
 1.1200733800462561
 0.6642365649674812
 0.9406042888385351
 0.1271254894680726
 0.7682280508499082
 0.6340598983852705
 ⋮
 0.4065065225512069
 0.11267645134518546
 0.2642917691404014
 0.6361712152732573
 0.03908343583974342
 0.08888201524535419
 0.355530049583456
 0.44170429962147967
 0.9450970769817278
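The @. macro adds the dots for you, and assigning with .= writes the fused result into an existing array instead of allocating a new one. The sketch below assumes a preallocated output vector named out (a hypothetical name). The range is bound to a variable first because @. would otherwise also dot the range construction.

```julia
x = randn(100)
idx = 1:100

# Explicit dots, as in the example above.
y_dots = sin.(abs.(x)) .+ 1 ./ idx

# Same expression with @., which dots every call and operator.
y_macro = @. sin(abs(x)) + 1 / idx

# Write into a preallocated array: no new vector is created for the result.
out = similar(x)
@. out = sin(abs(x)) + 1 / idx
```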

Use Mutating Functions

By convention, functions whose names end in ! modify one or more of their arguments in place:

x = [3, 1, 2]
sort!(x)  # In-place sorting
3-element Vector{Int64}:
 1
 2
 3

x = [3, 1, 2]
sort(x)   # Returns sorted copy (x unchanged)
3-element Vector{Int64}:
 1
 2
 3
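You can follow the same convention in your own code. In the sketch below, center! and center are hypothetical helpers that mirror the sort! and sort pairing: the mutating version does the work and the non-mutating version calls it on a copy.

```julia
# Hypothetical mutating helper: subtracts the mean from `x` in place.
function center!(x)
    x .-= sum(x) / length(x)
    return x
end

# Non-mutating counterpart, mirroring the sort / sort! pairing.
center(x) = center!(copy(x))

v = [3.0, 1.0, 2.0]
w = center(v)    # v is left unchanged
println(v, " -> ", w)

center!(v)       # v is overwritten
println(v)
```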

Optimize Array Operations

Use Views Instead of Copies:

Slicing creates copies; views reference existing data:

x = randn(3, 3)

x[:, 1:2]          # Creates copy

view(x, :, 1:2)    # View — no copy
3×2 view(::Matrix{Float64}, :, 1:2) with eltype Float64:
  0.390288  -0.255508
 -0.331239  -1.73403
  1.71882   -2.07083
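The difference becomes visible when you write to the result. In the sketch below, with hypothetical variable names, a write to the copy leaves x alone and a write to the view changes x. The @views macro converts every slice in an expression into a view.

```julia
x = collect(reshape(1.0:9.0, 3, 3))

copied = x[:, 1:2]        # independent copy
viewed = view(x, :, 1:2)  # shares memory with x

copied[1, 1] = -1.0       # does not touch x
viewed[2, 1] = -2.0       # writes through to x

# @views turns x[:, 1] into a view, so no temporary column is allocated.
col_sum = @views sum(x[:, 1])

println(x[1, 1], " ", x[2, 1], " ", col_sum)
```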

Access Elements Sequentially:

Julia stores arrays in column-major order, so the elements of a column are adjacent in memory. Put the row index in the innermost loop so that consecutive iterations touch adjacent memory:

x = rand(3,3)

for j in 1:3      # column j
    for i in 1:3  # row i
        x_ij = x[i, j]
        perform_calculation_on_element(x_ij)
    end
end
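A runnable version of this loop might look like the sketch below. The function names are hypothetical. The first function uses the explicit column-then-row nesting, and the second uses eachindex, which lets Julia choose an efficient iteration order for the array type.

```julia
# Inner loop over rows: walks memory in storage order.
function sum_column_major(m)
    total = zero(eltype(m))
    for j in axes(m, 2)      # column j
        for i in axes(m, 1)  # row i
            total += m[i, j]
        end
    end
    return total
end

# eachindex iterates in an order suited to the array's memory layout.
function sum_eachindex(m)
    total = zero(eltype(m))
    for k in eachindex(m)
        total += m[k]
    end
    return total
end

m = collect(reshape(1.0:12.0, 3, 4))
println(sum_column_major(m), " ", sum_eachindex(m))
```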

Summary

These foundational techniques can yield significant performance gains in Julia. Performance is a priority for the Julia community, and the official documentation covers these topics in more depth.

Resources