Using Observable

Author

Arvind V.

This post explores Observable JavaScript (OJS) for data visualization, particularly with the Observable Plot library. We will use the penguins dataset from the palmerpenguins package.

Setting up OJS Packages

Code
// Import `arquero`, a `dplyr` equivalent for Observable:
import {aq, op} from "@uwdata/arquero"

Reading the Data

Let’s import and view the penguins dataset:

Code
penguins = FileAttachment("palmer-penguins.csv").csv({ typed: true })
penguins = Array(344) [Object, Object, Object, Object, Object, Object, Object, Object, Object, Object, Object, Object, Object, Object, Object, Object, Object, Object, Object, Object, …]
Code
//df_penguins = aq.from(penguins)
//Inputs.table(df_penguins)
//
Inputs.table(penguins)
species  island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex     year
Adelie   Torgersen  39.1            18.7           181                3,750        male    2,007
Adelie   Torgersen  39.5            17.4           186                3,800        female  2,007
Adelie   Torgersen  40.3            18             195                3,250        female  2,007
Adelie   Torgersen  NA              NA             NA                 NA           NA      2,007
Adelie   Torgersen  36.7            19.3           193                3,450        female  2,007
Adelie   Torgersen  39.3            20.6           190                3,650        male    2,007
Adelie   Torgersen  38.9            17.8           181                3,625        female  2,007
Adelie   Torgersen  39.2            19.6           195                4,675        male    2,007
Adelie   Torgersen  34.1            18.1           193                3,475        NA      2,007
Adelie   Torgersen  42              20.2           190                4,250        NA      2,007
Adelie   Torgersen  37.8            17.1           186                3,300        NA      2,007
Adelie   Torgersen  37.8            17.3           180                3,700        NA      2,007
Adelie   Torgersen  41.1            17.6           182                3,200        female  2,007
Adelie   Torgersen  38.6            21.2           191                3,800        male    2,007
Adelie   Torgersen  34.6            21.1           198                4,400        male    2,007
Adelie   Torgersen  36.6            17.8           185                3,700        female  2,007
Adelie   Torgersen  38.7            19             195                3,450        female  2,007
Adelie   Torgersen  42.5            20.7           197                4,500        male    2,007
Adelie   Torgersen  34.4            18.4           184                3,325        female  2,007
Adelie   Torgersen  46              21.5           194                4,200        male    2,007
Adelie   Biscoe     37.8            18.3           174                3,400        female  2,007
Adelie   Biscoe     37.7            18.7           180                3,600        male    2,007
Adelie   Biscoe     35.9            19.2           189                3,800        female  2,007

Discovering the Plot library in OJS

Let’s create a slider and a set of checkboxes to drive interactive, faceted histograms:

Code
viewof bill_length_min = Inputs.range(
  [32, 50], 
  {value: 35, step: 1, label: "Bill length (min):"}
)
viewof islands = Inputs.checkbox(
  ["Torgersen", "Biscoe", "Dream"], 
  { value: ["Torgersen"], 
    label: "Islands:"
  }
)
bill_length_min = 35
islands = Array(1) ["Torgersen"]

Put an interactive filter on the data:

Code
filtered = penguins.filter(function(data) {
  return bill_length_min < data.bill_length_mm &&
         islands.includes(data.island);
})
filtered = Array(46) [Object, Object, Object, Object, Object, Object, Object, Object, Object, Object, Object, Object, Object, Object, Object, Object, Object, Object, Object, Object, …]
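
The same filter logic can be tried outside the notebook. What follows is a minimal sketch in plain JavaScript, not the notebook cell itself: `filterPenguins` is a hypothetical helper name, the fixed arguments stand in for the reactive `bill_length_min` and `islands` inputs, and the few rows are typed in by hand from the table above. It also assumes that missing values arrive as the string "NA"; under that assumption such rows drop out on their own, because comparing a number with "NA" is always false.

```javascript
// Hand-typed sample rows (an assumption standing in for the full CSV).
// "NA" is kept as a string to mimic an untreated missing value.
const sampleRows = [
  {species: "Adelie", island: "Torgersen", bill_length_mm: 39.1},
  {species: "Adelie", island: "Torgersen", bill_length_mm: "NA"},
  {species: "Adelie", island: "Torgersen", bill_length_mm: 34.1},
  {species: "Adelie", island: "Biscoe", bill_length_mm: 37.8},
];

// Keep rows whose bill length exceeds the minimum AND whose island is selected.
function filterPenguins(rows, billLengthMin, islands) {
  return rows.filter(
    (d) => billLengthMin < d.bill_length_mm && islands.includes(d.island)
  );
}

const onlyTorgersen = filterPenguins(sampleRows, 35, ["Torgersen"]);
const twoIslands = filterPenguins(sampleRows, 35, ["Torgersen", "Biscoe"]);

console.log(onlyTorgersen.length); // 1 (only the 39.1 mm Torgersen row)
console.log(twoIslands.length);    // 2 (the Biscoe row now passes as well)
```

In the notebook, the two arguments are replaced by the values of the slider and the checkboxes, so the filtered array is recomputed whenever either input changes.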
Code
Plot.rectY(filtered, 
  Plot.binX(
    {y: "count"}, 
    {x: "body_mass_g", fill: "species", thresholds: 20}
  ))
  .plot({
    title: "Facetted Histogram",
    caption: "Figure 1. A chart with a title, subtitle, and caption.",
    facet: {
      data: filtered,
      x: "sex",
      y: "species",
      marginRight: 80
    },
    marks: [
      Plot.frame(),
    ]
  }
)

Facetted Histogram

[Figure: histogram of body_mass_g (x) against Frequency (y), faceted by sex in columns (NA, female, male) and by species in rows (Adelie).]
Figure 1. A chart with a title, subtitle, and caption.
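
To make the bin-and-count step behind the histogram above more concrete, here is a rough sketch in plain JavaScript. It is a simplification under stated assumptions: `binCounts` is a hypothetical helper, the bin edges are supplied by hand as equal-width intervals (Plot chooses its own thresholds), and the sample masses are copied from the first rows of the data table.

```javascript
// Count values into nBins equal-width bins covering [lo, hi).
function binCounts(values, lo, hi, nBins) {
  const width = (hi - lo) / nBins;
  const counts = new Array(nBins).fill(0);
  for (const v of values) {
    // Skip "NA" strings and anything else that is not a real number.
    if (typeof v !== "number" || Number.isNaN(v)) continue;
    // Ignore values that fall outside the binned range.
    if (v < lo || v >= hi) continue;
    counts[Math.floor((v - lo) / width)] += 1;
  }
  return counts;
}

// Body masses (g) from the first eight rows of the table, including one "NA".
const masses = [3750, 3800, 3250, "NA", 3450, 3650, 3625, 4675];

// Four bins of 500 g each: [3000, 3500), [3500, 4000), [4000, 4500), [4500, 5000).
const massCounts = binCounts(masses, 3000, 5000, 4);
console.log(massCounts); // [2, 4, 0, 1]
```

Each count becomes the height of one rectangle; faceting simply repeats this counting for each subset of the rows.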
Code
Plot.plot({
  grid: true,
  inset: 10,
  //aspectRatio: 0.05,
  color: {legend: true},
  marks: [
    Plot.frame(),
    Plot.rectY(df_penguins, 
      // binX takes (outputs, options): channels such as fill belong in the single options object
      Plot.binX({y: "count"}, {x: "body_mass_g", fill: "species"})
    ),
    Plot.ruleY([0])
  ],
  title: "TITLE: Histogram of Body Mass of Penguins",
  subtitle: "SUBTITLE: Using OJS",
  caption: "Figure 1. A chart with a title, subtitle, and caption.",
})

TITLE: Histogram of Body Mass of Penguins

SUBTITLE: Using OJS

[Figure: histogram of body_mass_g (x, 2,500 to 6,500) against Frequency (y, 0 to 90).]
Figure 1. A chart with a title, subtitle, and caption.
Figure 1 is a histogram.

Trying to master the syntax, step by step:

Code
Plot.boxX(
  // Data
  df_penguins,
  // Aesthetics (channels)
  {x: "body_mass_g", y: "species", fill: "species", sort: {y: "x"}}
)
  // Plot options, chained onto the mark
  .plot({
    title: "Box Plot in Observable",
    subtitle: "Getting hold of the Syntax",
    caption: "Box Plot on Log scale",
    height: 400,
    marginLeft: 100,
    // Scales
    x: {type: "log"}
  })

Box Plot in Observable

Getting hold of the Syntax

[Figure: box plots of body_mass_g (x, log scale, 3k to 6k) for each species (y): Adelie, Chinstrap, Gentoo.]
Box Plot on Log scale

Let’s try the same plot with faceting by island. (The colour palette is not changed in this code; it stays at Plot’s default.)

Code
Plot.boxX(
  // Data
  df_penguins,
  // Aesthetics (channels); fy facets the plot vertically by island
  {
    x: "body_mass_g", y: "species", fy: "island",
    fill: "species",
    sort: {y: "x"}
  }
)
  // Plot options, chained onto the mark
  .plot({
    title: "Box Plot in Observable",
    subtitle: "Getting hold of the Syntax",
    caption: "Box Plot on Log scale",
    // Scales
    x: {type: "log"}
  })

Box Plot in Observable

Getting hold of the Syntax

[Figure: box plots of body_mass_g (x, log scale, 3k to 6k) by species (y), faceted by island in rows: Biscoe, Dream, Torgersen.]
Box Plot on Log scale
Code
Plot.plot({

// Marks to overlay go inside "marks". Pass data to each mark; nothing is inherited.
  marks: [
    Plot.ruleY([0]), // x-axis intercept!!
    Plot.areaY(aapl, {x: "Date", y: "Close", fillOpacity: 0.2}),
    Plot.lineY(aapl, {x: "Date", y: "Close"})
  ],
  y: {
    type: "log",
    domain: [50, 300],
    grid: true
  }
})
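OJS Runtime Error: Failed to fetch

This cell did not render. aapl is not defined anywhere in this post, so it presumably had to be fetched as a sample dataset, and that request failed.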

Scatter Plot

Code
Plot.plot({
  grid: true,
  inset: 10,
  color: {legend: true},
  x: {labelAnchor: "center"},
  y: {labelAnchor: "center"},
  //title: "Scatter Plot for Penguins",
  marginLeft: 40,
  marks: [
    Plot.frame(),
    Plot.dot(penguins, 
      {x: "bill_length_mm", y: "bill_depth_mm", 
      stroke: "black", fill: "species", strokeWidth: 0.2})
  ]
})
[Figure: scatter plot of bill_depth_mm (14 to 21) against bill_length_mm (35 to 55), with a colour legend for species: Adelie, Chinstrap, Gentoo.]

Grouping and Aggregation within Plot

Code
Plot.plot({
  marginLeft: 80,
  marginRight: 80,
  aspectRatio: 0.05,
  x: {labelAnchor: "center", domain: [0, 200]},
  marks: [
    Plot.barX(penguins, Plot.groupY({x: "count"}, {y: "species"})),
    Plot.ruleX([0])
  ]
})
[Figure: horizontal bar chart of Frequency (x, 0 to 200) for each species (y): Adelie, Chinstrap, Gentoo.]
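
The group-and-count step used for the bar chart above can be written out by hand. This is only a sketch in plain JavaScript: `countBy` is a hypothetical helper and the five rows are made-up toy data, not the real dataset.

```javascript
// Count how many rows share each value of the given key.
function countBy(rows, key) {
  const counts = new Map();
  for (const row of rows) {
    counts.set(row[key], (counts.get(row[key]) ?? 0) + 1);
  }
  return counts;
}

// Toy rows (made up for illustration).
const toyRows = [
  {species: "Adelie"},
  {species: "Gentoo"},
  {species: "Adelie"},
  {species: "Chinstrap"},
  {species: "Adelie"},
];

const speciesCounts = countBy(toyRows, "species");
console.log(Object.fromEntries(speciesCounts)); // { Adelie: 3, Gentoo: 1, Chinstrap: 1 }
```

Each entry of the resulting map corresponds to one bar: the key gives the y position and the count gives the bar length.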

Trying Counts and Summaries with arquero

Code
df_penguins = aq.from(penguins)
df_penguins.groupby("species").count().view()
df_penguins = Table: 8 cols x 344 rows {_names: Array(8), _data: Object, _total: 344, _nrows: 344, _mask: null, _group: null, _order: null, _params: undefined, _index: null, _partitions: null}
species    count
Adelie     152
Gentoo     124
Chinstrap  68
Code
df_penguins
  .groupby("species")
  .rollup({ mean: (d) => op.mean(d.body_mass_g) })
  .view()
species    mean
Adelie     NaN
Gentoo     NaN
Chinstrap  3733.088235
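
The NaN means for Adelie and Gentoo are most likely caused by missing values. The data table earlier shows body_mass_g as NA for one Adelie row, and presumably Gentoo has a similar row. If the CSV reader leaves "NA" as a string rather than a number, a single such value is enough to turn an arithmetic mean into NaN. The sketch below, in plain JavaScript with a hypothetical `meanBy` helper and a few hand-typed rows, shows both the failure and one way around it: skip anything that is not a finite number before averaging.

```javascript
// Mean of valueKey per group, ignoring values that are not finite numbers.
function meanBy(rows, groupKey, valueKey) {
  const sums = new Map(); // group -> {total, n}
  for (const row of rows) {
    const v = row[valueKey];
    if (typeof v !== "number" || !Number.isFinite(v)) continue; // skips "NA"
    const acc = sums.get(row[groupKey]) ?? {total: 0, n: 0};
    acc.total += v;
    acc.n += 1;
    sums.set(row[groupKey], acc);
  }
  return new Map([...sums].map(([group, {total, n}]) => [group, total / n]));
}

// Hand-typed sample: three Adelie masses from the table plus one "NA" string.
const massRows = [
  {species: "Adelie", body_mass_g: 3750},
  {species: "Adelie", body_mass_g: 3800},
  {species: "Adelie", body_mass_g: "NA"},
  {species: "Adelie", body_mass_g: 3250},
];

// Naive mean: adding the string "NA" turns the running sum into a string,
// and dividing that string by the row count gives NaN.
const naiveMean =
  massRows.reduce((sum, d) => sum + d.body_mass_g, 0) / massRows.length;
console.log(naiveMean); // NaN

// Mean with the missing value skipped.
const cleanMeans = meanBy(massRows, "species", "body_mass_g");
console.log(cleanMeans.get("Adelie")); // 3600
```

In the notebook, the same idea would mean removing or converting the NA rows before the rollup, so that only numeric body masses reach the mean.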