Building blocks

In Preliminaries we saw that the Plotly.newPlot JavaScript function expects to receive an array of trace objects and, optionally, a layout object.

In this section we will learn how to build the trace and layout objects in Julia that make up the core elements of a plot.

Traces

A Plot instance holds either a single trace or a vector of traces. Each trace must be an instance of a subtype of AbstractTrace.

PlotlyBase.jl provides one such general-purpose subtype, GenericTrace, defined as

mutable struct GenericTrace{T <: AbstractDict{Symbol,Any}} <: AbstractTrace
    fields::T
end

Here fields is an AbstractDict that maps a trace's attributes to their values. GenericTrace lets us describe any trace generically: point locations, marker shape and size, text annotations and more. Wrapping the Dict in a GenericTrace, rather than using the Dict directly, gives us the convenient syntax described below.

Let's consider an example.

Note

The next example can be used as a guide to translating examples using the plotly.js JavaScript library to their equivalent Julia versions.

Suppose we would like to build a Plot to include a scatter-type trace as described here using JSON:

{
  "type": "scatter",
  "x": [1, 2, 3, 4, 5],
  "y": [1, 6, 3, 6, 1],
  "mode": "markers+text",
  "name": "Team A",
  "text": ["A-1", "A-2", "A-3", "A-4", "A-5"],
  "textposition": "top center",
  "textfont": {
    "family":  "Raleway, sans-serif"
  },
  "marker": { "size": 12 }
}

One way to do this in Julia is to create an equivalent dictionary:

fields = Dict{Symbol,Any}(:type => "scatter",
                          :x => [1, 2, 3, 4, 5],
                          :y => [1, 6, 3, 6, 1],
                          :mode => "markers+text",
                          :name => "Team A",
                          :text => ["A-1", "A-2", "A-3", "A-4", "A-5"],
                          :textposition => "top center",
                          :textfont => Dict(:family =>  "Raleway, sans-serif"),
                          :marker => Dict(:size => 12))
GenericTrace("scatter", fields)

scatter with fields marker, mode, name, text, textfont, textposition, type, x, and y

A more convenient approach uses the syntax of the scatter function:

t1 = scatter(;x=[1, 2, 3, 4, 5],
              y=[1, 6, 3, 6, 1],
              mode="markers+text",
              name="Team A",
              text=["A-1", "A-2", "A-3", "A-4", "A-5"],
              textposition="top center",
              textfont_family="Raleway, sans-serif",
              marker_size=12)

scatter with fields marker, mode, name, text, textfont, textposition, type, x, and y

Notice a few things:

The trace type became the function name. There is a similar function for every plotly.js trace type.
All other trace attributes were set using keyword arguments. This allows us to avoid typing out the symbol prefix (:) and the arrows (=>) that were necessary when constructing the Dict.
We can set nested attributes using underscores. Notice that the JSON "marker": { "size": 12 } was written marker_size=12.
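As a small illustration of the points above, the sketch below builds a bar trace both with the keyword syntax and from a hand-built dictionary, then checks that the nested marker attribute ends up in the same place. The trace type, the data and the variable names (b, manual) are made up for this example.

```julia
using PlotlyBase

# Keyword syntax: the trace type is the function name, and the underscore
# in `marker_color` creates the nested attribute marker.color.
b = bar(; x=["a", "b", "c"], y=[3, 1, 2], marker_color="steelblue")

# The same trace built by hand from a dictionary (illustrative only).
manual = GenericTrace("bar", Dict{Symbol,Any}(:x => ["a", "b", "c"],
                                              :y => [3, 1, 2],
                                              :marker => Dict(:color => "steelblue")))

# Compare the two traces attribute by attribute.
same_type = b[:type] == manual[:type]
same_color = b[:marker][:color] == manual[:marker][:color]
println((same_type, same_color))
```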

We can verify that t1 serializes to equivalent JSON by printing it. The order of the attributes is different, but the content is identical:

import JSON

print(JSON.json(t1, 2))

{
  "textfont": {
    "family": "Raleway, sans-serif"
  },
  "mode": "markers+text",
  "x": [
    1,
    2,
    3,
    4,
    5
  ],
  "textposition": "top center",
  "y": [
    1,
    6,
    3,
    6,
    1
  ],
  "type": "scatter",
  "name": "Team A",
  "text": [
    "A-1",
    "A-2",
    "A-3",
    "A-4",
    "A-5"
  ],
  "marker": {
    "size": 12
  }
}
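Because the attribute order can differ, comparing printed JSON by eye is error-prone. A minimal sketch of an order-independent check, assuming the JSON.jl package imported above, is to parse the serialized trace back into plain Julia containers and compare values. The trace t and the tuple checks are illustrative names.

```julia
using PlotlyBase
import JSON

t = scatter(; x=[1, 2, 3], y=[1, 6, 3], mode="markers+text", marker_size=12)

# Round-trip through JSON text and back into plain Julia containers.
parsed = JSON.parse(JSON.json(t))

# Compare values rather than strings, so key order does not matter.
checks = (parsed["type"] == "scatter",
          parsed["mode"] == "markers+text",
          parsed["x"] == [1, 2, 3],
          parsed["marker"]["size"] == 12)
println(all(checks))
```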

Accessing attributes

To extract a particular attribute, use getindex(t1, :attrname) or, more directly, t1[:attrname]. Both symbols and strings can be used as keys:

julia> t1["marker"]
Dict{Any, Any} with 1 entry:
  :size => 12
julia> t1[:marker]
Dict{Any, Any} with 1 entry:
  :size => 12

To access a nested property, use a string of the form parent.child:

julia> t1["textfont.family"]
"Raleway, sans-serif"

or index into the nested dictionaries:

julia> t1[:textfont][:family]
"Raleway, sans-serif"

Warning

Indexing a nested dictionary with a missing symbol key throws an error, whereas indexing the trace with an unrecognised or unassigned string key returns an empty dictionary. For example,

julia> t1[:textfont][:color]
ERROR: KeyError: key :color not found

throws an error, while

julia> t1["textfont.color"]
Dict{Any, Any}()

returns an empty Dict.
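One way to sidestep both behaviours is to fetch the nested dictionary first and then look the key up with a default. The sketch below relies only on Base's get and on the nested attribute being returned as a dictionary, as shown above. The trace t and the fallback value nothing are choices made for this example.

```julia
using PlotlyBase

t = scatter(; x=[1, 2, 3], y=[1, 6, 3], textfont_family="Raleway, sans-serif")

font = t[:textfont]                   # nested attributes, returned as a dictionary
family = get(font, :family, nothing)  # key is present
color = get(font, :color, nothing)    # key is absent: falls back to `nothing`, no KeyError

println(family)
println(color === nothing)
```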

Setting additional attributes

We can also set additional attributes. Suppose we wanted to set marker.color to be red. We can do this with a call to setindex!(t1, "red", :marker_color), or equivalently t1["marker_color"] = "red":

julia> t1["marker_color"] = "red"
"red"
julia> println(JSON.json(t1, 2))
{
  "textfont": {
    "family": "Raleway, sans-serif"
  },
  "mode": "markers+text",
  "x": [
    1,
    2,
    3,
    4,
    5
  ],
  "textposition": "top center",
  "y": [
    1,
    6,
    3,
    6,
    1
  ],
  "type": "scatter",
  "name": "Team A",
  "text": [
    "A-1",
    "A-2",
    "A-3",
    "A-4",
    "A-5"
  ],
  "marker": {
    "color": "red",
    "size": 12
  }
}

Notice how the color attribute was correctly added within the existing marker attribute (alongside size), instead of replacing the marker attribute.

You can also use this syntax to add completely new nested attributes:

julia> t1["line_width"] = 5
5
julia> println(JSON.json(t1, 2))
{
  "textfont": {
    "family": "Raleway, sans-serif"
  },
  "mode": "markers+text",
  "line": {
    "width": 5
  },
  "x": [
    1,
    2,
    3,
    4,
    5
  ],
  "textposition": "top center",
  "y": [
    1,
    6,
    3,
    6,
    1
  ],
  "type": "scatter",
  "name": "Team A",
  "text": [
    "A-1",
    "A-2",
    "A-3",
    "A-4",
    "A-5"
  ],
  "marker": {
    "color": "red",
    "size": 12
  }
}
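The same behaviour can be checked without printing JSON. The following sketch, on a smaller made-up trace t, sets one attribute inside an existing parent and one inside a new parent, then reads all three values back.

```julia
using PlotlyBase

t = scatter(; x=[1, 2, 3], y=[1, 6, 3], marker_size=12)

setindex!(t, "red", :marker_color)  # function form, adds to the existing `marker`
t["line_width"] = 5                 # indexing form, creates the new `line` attribute

# `size` is still present next to the newly added `color`.
println(t[:marker][:size])
println(t[:marker][:color])
println(t[:line][:width])
```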

Layouts

The Layout type is defined as

mutable struct Layout{T <: AbstractDict{Symbol,Any}} <: AbstractLayout
    fields::T
    subplots::Subplots
end

You can construct a layout using the same convenient keyword argument syntax that we used for traces:

julia> l = Layout(;title="Penguins",
                   xaxis_range=[0, 42.0],
                   xaxis_title="Fish Count",
                   yaxis_title="Weight",
                   xaxis_showgrid=true,
                   yaxis_showgrid=true,
                   legend_x=0.7, legend_y=1.15,)
layout with fields legend, margin, template, title, xaxis, and yaxis

Here we set different attributes for determining the non-data layout of the plot such as the range and title of the horizontal (xaxis) and vertical (yaxis) axes of the plot, whether the grid lines are drawn and the position of the legend.

Note

A layout is a general term for how non-data elements are displayed on a plot. There is only one layout object used for any given plot (while we may have multiple traces). For a complete list of layout attributes see the layout reference documentation.
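Layout attributes can be read back in the same way as trace attributes. This is a hedged sketch: it assumes that Layout supports the indexing syntax described for traces above, and the layout l below is a cut-down version of the example.

```julia
using PlotlyBase

l = Layout(; title="Penguins",
             xaxis_range=[0, 42.0],
             xaxis_title="Fish Count",
             legend_x=0.7)

println(l[:xaxis][:title])  # nested dictionaries
println(l["xaxis.range"])   # parent.child string
println(l[:legend_x])       # underscore symbol
```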

The `attr` function

There is a special function named attr that applies the same underscore keyword handling we saw in the trace and layout functions, but lets you collect several attributes that share a parent into a single nested argument.

Let's revisit the previous example, but use attr to build up our xaxis and legend attributes in a way that groups things together:

julia> l2 = Layout(;title="Penguins",
                    xaxis=attr(range=[0, 42.0], title="Fish Count", showgrid=true),
                    yaxis_title="Weight", yaxis_showgrid=true,
                    legend=attr(x=0.7, y=1.15))
layout with fields legend, margin, template, title, xaxis, and yaxis

Notice that we obtain exactly the same layout as before, but without building a Dict by hand or prefixing multiple arguments with xaxis_ or legend_. Notice also that we can mix the different approaches in a single object.
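The equivalence of the two styles can be checked attribute by attribute. This sketch builds a reduced layout both ways and compares a few nested values. The names flat and grouped, and the list of keys, are chosen for illustration.

```julia
using PlotlyBase

# Underscore style: one keyword per nested attribute.
flat = Layout(; xaxis_title="Fish Count", xaxis_showgrid=true,
                legend_x=0.7, legend_y=1.15)

# attr style: attributes grouped under their parent.
grouped = Layout(; xaxis=attr(title="Fish Count", showgrid=true),
                   legend=attr(x=0.7, y=1.15))

keys_to_check = ("xaxis.title", "xaxis.showgrid", "legend.x", "legend.y")
same = all(flat[k] == grouped[k] for k in keys_to_check)
println(same)
```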

Using `DataFrame`s

Note

DataFrame support was added in version 0.6.0.

You can also construct traces using the columns of any subtype of AbstractDataFrame, such as the DataFrame type from the DataFrames.jl package.

To demonstrate this functionality let's load the well-known "iris" data set:

julia> using DataFrames
julia> import RDatasets
julia> iris = RDatasets.dataset("datasets", "iris");
julia> first(iris, 10)
10×5 DataFrame
 Row │ SepalLength  SepalWidth  PetalLength  PetalWidth  Species 
     │ Float64      Float64     Float64      Float64     Cat…    
─────┼───────────────────────────────────────────────────────────
   1 │         5.1         3.5          1.4         0.2  setosa
   2 │         4.9         3.0          1.4         0.2  setosa
   3 │         4.7         3.2          1.3         0.2  setosa
   4 │         4.6         3.1          1.5         0.2  setosa
   5 │         5.0         3.6          1.4         0.2  setosa
   6 │         5.4         3.9          1.7         0.4  setosa
   7 │         4.6         3.4          1.4         0.3  setosa
   8 │         5.0         3.4          1.5         0.2  setosa
   9 │         4.4         2.9          1.4         0.2  setosa
  10 │         4.9         3.1          1.5         0.1  setosa

Suppose that we wanted to construct a scatter trace with the SepalLength column as the x variable and the SepalWidth column as the y variable. We do this by calling scatter() with a DataFrame as the first argument:

julia> my_trace = scatter(iris, x=:SepalLength, y=:SepalWidth, marker_color=:red)
scatter with fields marker, type, x, and y

How does this work? The basic rule is that if the value of any keyword argument is a Julia Symbol (i.e. starting with :, such as :one), then the function creating the trace checks if that symbol is one of the column names in the DataFrame. If so, it extracts the column from the DataFrame and sets that as the value for the keyword argument. Otherwise it passes the symbol directly through.

In the above example, when we constructed my_trace the value of the keyword argument x was set to the Symbol :SepalLength. This did match a column name from iris so that column was extracted and replaced :SepalLength as the value for the x argument. The same holds for y and SepalWidth.

However, when setting marker_color=:red we found that :red is not one of the column names, so the value for the marker_color keyword argument remained :red.

We can access and inspect the values of the resulting trace object:

julia> [my_trace[:x][1:5] my_trace[:y][1:5]]
5×2 Matrix{Float64}:
 5.1  3.5
 4.9  3.0
 4.7  3.2
 4.6  3.1
 5.0  3.6
julia> my_trace[:marker_color]
:red
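The symbol rule can also be tried without downloading a data set. Here is a minimal sketch on a hand-made DataFrame, where the column names height and weight and the values are invented for this example.

```julia
using PlotlyBase, DataFrames

df = DataFrame(height=[1.0, 2.0, 3.0], weight=[10.0, 20.0, 30.0])

# :height and :weight are column names, so the columns are substituted;
# :red is not a column name, so it is passed through unchanged.
tr = scatter(df, x=:height, y=:weight, marker_color=:red)

println(tr[:x] == df.height)
println(tr[:y] == df.weight)
println(tr[:marker_color])
```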

The DataFrame interface becomes more useful when constructing whole plots. See the convenience methods section of the documentation for more information.

Groups

Note

New in version 0.9.0:

You can construct groups of traces using the DataFrame interface through the group keyword. This is best understood by example, so let's see it in action:

julia> iris = RDatasets.dataset("datasets", "iris");
julia> unique(iris[:,:Species])
3-element CategoricalArrays.CategoricalArray{String,1,UInt8}:
 "setosa"
 "versicolor"
 "virginica"
julia> traces = scatter(
           iris, group=:Species, x=:SepalLength, y=:SepalWidth, mode="markers", marker_size=8
       )
3-element Vector{GenericTrace}:
 GenericTrace{Dict{Symbol, Any}}(Dict{Symbol, Any}(:x => [5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9  …  5.0, 4.5, 4.4, 5.0, 5.1, 4.8, 5.1, 4.6, 5.3, 5.0], :mode => "markers", :y => [3.5, 3.0, 3.2, 3.1, 3.6, 3.9, 3.4, 3.4, 2.9, 3.1  …  3.5, 2.3, 3.2, 3.5, 3.8, 3.0, 3.8, 3.2, 3.7, 3.3], :type => "scatter", :name => CategoricalValue(CategoricalArrays.CategoricalPool{String, UInt8}(["setosa", "versicolor", "virginica"]), 1), :legendgroup => CategoricalValue(CategoricalArrays.CategoricalPool{String, UInt8}(["setosa", "versicolor", "virginica"]), 1), :marker => Dict{Any, Any}(:size => 8)))
 GenericTrace{Dict{Symbol, Any}}(Dict{Symbol, Any}(:x => [7.0, 6.4, 6.9, 5.5, 6.5, 5.7, 6.3, 4.9, 6.6, 5.2  …  5.5, 6.1, 5.8, 5.0, 5.6, 5.7, 5.7, 6.2, 5.1, 5.7], :mode => "markers", :y => [3.2, 3.2, 3.1, 2.3, 2.8, 2.8, 3.3, 2.4, 2.9, 2.7  …  2.6, 3.0, 2.6, 2.3, 2.7, 3.0, 2.9, 2.9, 2.5, 2.8], :type => "scatter", :name => CategoricalValue(CategoricalArrays.CategoricalPool{String, UInt8}(["setosa", "versicolor", "virginica"]), 2), :legendgroup => CategoricalValue(CategoricalArrays.CategoricalPool{String, UInt8}(["setosa", "versicolor", "virginica"]), 2), :marker => Dict{Any, Any}(:size => 8)))
 GenericTrace{Dict{Symbol, Any}}(Dict{Symbol, Any}(:x => [6.3, 5.8, 7.1, 6.3, 6.5, 7.6, 4.9, 7.3, 6.7, 7.2  …  6.7, 6.9, 5.8, 6.8, 6.7, 6.7, 6.3, 6.5, 6.2, 5.9], :mode => "markers", :y => [3.3, 2.7, 3.0, 2.9, 3.0, 3.0, 2.5, 2.9, 2.5, 3.6  …  3.1, 3.1, 2.7, 3.2, 3.3, 3.0, 2.5, 3.0, 3.4, 3.0], :type => "scatter", :name => CategoricalValue(CategoricalArrays.CategoricalPool{String, UInt8}(["setosa", "versicolor", "virginica"]), 3), :legendgroup => CategoricalValue(CategoricalArrays.CategoricalPool{String, UInt8}(["setosa", "versicolor", "virginica"]), 3), :marker => Dict{Any, Any}(:size => 8)))
julia> [t[:name] for t in traces]
3-element CategoricalArrays.CategoricalArray{String,1,UInt8}:
 "setosa"
 "versicolor"
 "virginica"

Notice how there are three Species in the iris DataFrame, and by passing group=:Species to scatter we obtained three traces.

We can pass a Vector{Symbol} with the group keyword, to split the data according to the values of more than one column.

Here we split data by day of the week and time:

julia> tips = RDatasets.dataset("reshape2", "tips");
julia> unique(tips[:,:Sex])
2-element CategoricalArrays.CategoricalArray{String,1,UInt8}:
 "Female"
 "Male"
julia> unique(tips[:,:Day])
4-element CategoricalArrays.CategoricalArray{String,1,UInt8}:
 "Sun"
 "Sat"
 "Thur"
 "Fri"
julia> traces = violin(tips, group=[:Day, :Time], x=:TotalBill, orientation="h")
6-element Vector{GenericTrace}:
 GenericTrace{Dict{Symbol, Any}}(Dict{Symbol, Any}(:type => "violin", :name => "Fri, Dinner", :legendgroup => "Fri, Dinner", :orientation => "h", :x => [28.97, 22.49, 5.75, 16.32, 22.75, 40.17, 27.28, 12.03, 21.01, 12.46, 11.35, 15.38]))
 GenericTrace{Dict{Symbol, Any}}(Dict{Symbol, Any}(:type => "violin", :name => "Fri, Lunch", :legendgroup => "Fri, Lunch", :orientation => "h", :x => [12.16, 13.42, 8.58, 15.98, 13.42, 16.27, 10.09]))
 GenericTrace{Dict{Symbol, Any}}(Dict{Symbol, Any}(:type => "violin", :name => "Sat, Dinner", :legendgroup => "Sat, Dinner", :orientation => "h", :x => [20.65, 17.92, 20.29, 15.77, 39.42, 19.82, 17.81, 13.37, 12.69, 21.7  …  10.77, 15.53, 10.07, 12.6, 32.83, 35.83, 29.03, 27.18, 22.67, 17.82]))
 GenericTrace{Dict{Symbol, Any}}(Dict{Symbol, Any}(:type => "violin", :name => "Sun, Dinner", :legendgroup => "Sun, Dinner", :orientation => "h", :x => [16.99, 10.34, 21.01, 23.68, 24.59, 25.29, 8.77, 26.88, 15.04, 14.78  …  23.33, 45.35, 23.17, 40.55, 20.69, 20.9, 30.46, 18.15, 23.1, 15.69]))
 GenericTrace{Dict{Symbol, Any}}(Dict{Symbol, Any}(:type => "violin", :name => "Thur, Dinner", :legendgroup => "Thur, Dinner", :orientation => "h", :x => [18.78]))
 GenericTrace{Dict{Symbol, Any}}(Dict{Symbol, Any}(:type => "violin", :name => "Thur, Lunch", :legendgroup => "Thur, Lunch", :orientation => "h", :x => [27.2, 22.76, 17.29, 19.44, 16.66, 10.07, 32.68, 15.98, 34.83, 13.03  …  10.34, 43.11, 13.0, 13.51, 18.71, 12.74, 13.0, 16.4, 20.53, 16.47]))
julia> [t[:name] for t in traces]
6-element Vector{String}:
 "Fri, Dinner"
 "Fri, Lunch"
 "Sat, Dinner"
 "Sun, Dinner"
 "Thur, Dinner"
 "Thur, Lunch"
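To see the grouping rule on data small enough to count by hand, consider this sketch with an invented five-row DataFrame (the columns species and site are hypothetical). Grouping by one column should give one trace per distinct value, and grouping by two columns one trace per distinct combination that occurs in the data.

```julia
using PlotlyBase, DataFrames

df = DataFrame(species=["a", "a", "b", "b", "c"],
               site=["north", "south", "north", "north", "south"],
               x=[1, 2, 3, 4, 5],
               y=[5, 4, 3, 2, 1])

# Three distinct species: a, b, c.
by_species = scatter(df, group=:species, x=:x, y=:y, mode="markers")

# Four combinations occur: (a, north), (a, south), (b, north), (c, south).
by_both = scatter(df, group=[:species, :site], x=:x, y=:y, mode="markers")

println(length(by_species))
println(length(by_both))

# Every row lands in exactly one trace.
println(sum(length(t[:x]) for t in by_species))
```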

Functions

When using the DataFrame interface you may pass a function as the value for a keyword argument. When each trace is constructed, the value will be replaced by the result of calling the function on whatever DataFrame is being used. When used in conjunction with the group argument, this allows you to compute group-specific trace attributes on the fly, such as dynamically annotating a plot based on the data. For example, you might want to show the sample length with the text attribute:

text=(df) -> "Sample length $(size(df, 1))"

See the docstring for GenericTrace and the violin_side_by_side example on the Violin example page for more details.
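A minimal sketch of this, combining the text function above with group on an invented three-row DataFrame (the column name species is hypothetical), could look as follows. Each group's sub-DataFrame is passed to the function, so each trace gets its own label.

```julia
using PlotlyBase, DataFrames

df = DataFrame(species=["a", "a", "b"], x=[1, 2, 3], y=[4, 5, 6])

# The function receives the rows belonging to one group at a time.
traces = scatter(df, group=:species, x=:x, y=:y, mode="markers+text",
                 text=(d) -> "Sample length $(size(d, 1))")

labels = Set(t[:text] for t in traces)
println(labels)
```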

Facets

Note

New in PlotlyBase version 0.6.5 / PlotlyJS version 0.16.4:

A facet is another name for a plot displaying a subset of a larger dataset.

When plotting a DataFrame (let's call it df), the keyword arguments facet_row and facet_col allow you to create a matrix of subplots.

The rows of this matrix correspond to the array unique(df[:, :facet_row]), where :facet_row is a placeholder for the actual symbol passed as the facet_row argument. Similarly, the columns of the matrix of subplots come from unique(df[:, :facet_col]).

Each subplot will have the same structure, as defined by the keyword arguments passed to plot, but will only show data for a single value of facet_row and facet_col at a time.

Below is an example of how this works. We have a distinction of male/female between rows and a distinction of smoker/non-smoker between columns, creating a two-by-two matrix of four plots:

using PlotlyJS
import CSV
using DataFrames

df = PlotlyJS.dataset(DataFrame, "tips")

plot(
    df, x=:total_bill, y=:tip, xbingroup="x", ybingroup="y", kind="histogram2d",
    facet_row=:sex, facet_col=:smoker, colorbar_showticklabels=false
)