The Figure Plugin

Standard Markdown provides only ![alt](path) for inserting images, which lacks features commonly needed in technical and academic documents: captions, side-by-side layout, size control, and text wrap. This page documents the implementation of a custom markdown-it plugin that provides all of these.

End Result

After the plugin is installed, you can write:

Single image with a caption:

::: figure
![Description of figure|600px](./image.png)

**Figure 1:** Caption text. The figure number appears in bold.
:::

Multiple images side by side:

::: figure
![Left figure|45%](./img1.png)
![Right figure|45%](./img2.png)

**Figure 2:** Side-by-side comparison. On narrow screens (e.g. a phone in portrait) the images stack vertically automatically.
:::

Floating image with text wrap:

::: figure
![Thumbnail|200px|right](./thumbnail.png)

**Figure 3:** The caption wraps to the image width. (optional)
:::

Subsequent text flows to the left of the image…

Syntax	Meaning
`|300`	300px (unit omitted; Obsidian-compatible)
`|300px`	300px
`|50%`	50% of the parent element's width
`|right`	Float right (subsequent text wraps to the left)
`|left`	Float left (subsequent text wraps to the right)

The caption is optional. Whether or not a caption is present, the image is centered (except when floating).

Implementation

package.json

The plugin uses markdown-it-container to parse the :::figure container syntax. Add it to devDependencies.

package.json

{
  "devDependencies": {
    "markdown-it-container": "^3.0.0",
    "vitepress": "next",
    ...
  }
}

After changing package.json, rebuild the Docker container so that the new dependency is installed:

$ docker compose down
$ docker volume rm vitepress_node_modules
$ docker compose build --no-cache
$ docker compose up -d

Plugin Code

Create .vitepress/plugins/figure-plugin.js with the following content:

.vitepress/plugins/figure-plugin.js

import container from 'markdown-it-container'

// Parse |size and |float directives from raw alt text.
// Returns { altText, size, floatDir }.
//   altText  — text before the first '|' (rendered as alt attribute)
//   size     — CSS length string (e.g. "300px", "50%") or null
//   floatDir — 'left', 'right', or null
function parseAltDirectives(rawAlt) {
  const pipeIdx = rawAlt.indexOf('|')
  if (pipeIdx === -1) return { altText: rawAlt, size: null, floatDir: null }

  const altText = rawAlt.slice(0, pipeIdx)
  const suffixes = rawAlt.slice(pipeIdx + 1).split('|')

  let size = null
  let floatDir = null

  for (const part of suffixes) {
    const p = part.trim()
    if (/^(left|right)$/i.test(p)) {
      floatDir = p.toLowerCase()
    } else if (/^\d+(?:\.\d+)?(?:px|%|em|rem|vw|vh)?$/.test(p)) {
      // Unit-less numbers → px (Obsidian compatible)
      size = /^\d+(?:\.\d+)?$/.test(p) ? p + 'px' : p
    }
  }

  return { altText, size, floatDir }
}

function figurePlugin(md) {
  // ── 1. Image directives via alt text ────────────────────────────────────
  // Syntax: ![alt|300px](src)       — width only
  //         ![alt|300px|right](src) — width + float
  //         ![alt|right|300px](src) — same, order-independent
  //
  // When float is present, width is applied to the <figure> element (by the
  // container renderer below); the <img> fills it via CSS (width: 100%).
  // When there is no float, width is applied as an inline style on <img>.
  const defaultImageRenderer = md.renderer.rules.image
  md.renderer.rules.image = function (tokens, idx, options, env, self) {
    const token = tokens[idx]
    const rawAlt = self.renderInlineAsText(token.children, options, env)

    const { altText, size, floatDir } = parseAltDirectives(rawAlt)

    if (size !== null || floatDir !== null) {
      // Apply width to <img> only when not floating — floated figures own
      // the width via an inline style on <figure>.
      if (size && !floatDir) {
        const existing = token.attrGet('style') || ''
        token.attrSet('style', (existing ? existing + ' ' : '') + `width:${size};`)
      }

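      // Strip all |directives from the rendered alt attribute. The alt is
      // rebuilt from token.children, so collapse them into a single text
      // child; overwriting only the last child would duplicate earlier text
      // when the alt contains inline markup (e.g. ![*a*|300](src)).
      const textChild = token.children && token.children.find(c => c.type === 'text')
      if (textChild) {
        textChild.content = altText
        token.children = [textChild]
      }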
    }

    if (defaultImageRenderer) {
      return defaultImageRenderer(tokens, idx, options, env, self)
    }
    token.attrSet('alt', self.renderInlineAsText(token.children, options, env))
    return self.renderToken(tokens, idx, options)
  }

  // ── 2. :::figure container ───────────────────────────────────────────────
  // Wraps content in <figure class="md-figure">.
  // The last <p> inside the figure is treated as the caption via CSS
  // (.md-figure > p:last-child:not(:only-child)).
  //
  // When any image inside the figure carries a |left or |right directive,
  // the <figure> element receives the md-float--{dir} class and a width
  // inline style. The image's own width is then controlled by CSS
  // (.md-float--right img { width: 100% }) rather than an inline style.
  md.use(container, 'figure', {
    render (tokens, idx) {
      if (tokens[idx].nesting === 1) {
        // Scan ahead in the token stream for a float directive. The image
        // renderer has not yet run at this point, so token.children still
        // hold the unmodified alt text.
        let floatDir = null
        let floatSize = null

        outer: for (let i = idx + 1; i < tokens.length; i++) {
          const t = tokens[i]
          if (t.type === 'container_figure_close') break
          if (t.type !== 'inline' || !t.children) continue
          for (const child of t.children) {
            if (child.type !== 'image') continue
            const rawAlt = child.children
              ? child.children.map(c => c.content).join('')
              : ''
            const parsed = parseAltDirectives(rawAlt)
            if (parsed.floatDir) {
              floatDir = parsed.floatDir
              floatSize = parsed.size
              break outer
            }
          }
        }

        if (floatDir) {
          const styleAttr = floatSize ? ` style="width:${floatSize};"` : ''
          return `<figure class="md-figure md-float--${floatDir}"${styleAttr}>\n`
        }
        return '<figure class="md-figure">\n'
      } else {
        return '</figure>\n'
      }
    },
  })
}

export default figurePlugin
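
The helper can be exercised on its own, without markdown-it. The snippet below is a small sketch that repeats the logic of parseAltDirectives() so that it runs standalone; the sample alt strings are made up for illustration.

```javascript
// Same logic as parseAltDirectives() in figure-plugin.js, repeated here
// so that the snippet is self-contained.
function parseAltDirectives(rawAlt) {
  const pipeIdx = rawAlt.indexOf('|')
  if (pipeIdx === -1) return { altText: rawAlt, size: null, floatDir: null }

  const altText = rawAlt.slice(0, pipeIdx)
  let size = null
  let floatDir = null

  for (const part of rawAlt.slice(pipeIdx + 1).split('|')) {
    const p = part.trim()
    if (/^(left|right)$/i.test(p)) {
      floatDir = p.toLowerCase()
    } else if (/^\d+(?:\.\d+)?(?:px|%|em|rem|vw|vh)?$/.test(p)) {
      // A bare number is treated as pixels.
      size = /^\d+(?:\.\d+)?$/.test(p) ? p + 'px' : p
    }
  }
  return { altText, size, floatDir }
}

// Hypothetical sample inputs (illustrative only).
const samples = [
  'Thumbnail|200px|right', // size, then float
  'Thumbnail|RIGHT|200',   // float first, upper-case keyword, bare number
  'Chart|wide',            // unrecognized segment is ignored
  'Plain alt text',        // no directives at all
]
for (const alt of samples) {
  console.log(alt, '=>', JSON.stringify(parseAltDirectives(alt)))
}
```

The first two samples produce the same result, which is what makes the directive order irrelevant. For an input such as Chart|wide the helper still reports altText as Chart, but the image rule above rewrites the alt attribute only when a size or float was recognized, so the rendered alt would stay Chart|wide.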

The plugin consists of three parts:

1. parseAltDirectives() helper. Splits the alt text on | and classifies each segment: a numeric-plus-unit pattern becomes size (a CSS length string), left or right becomes floatDir, and the segment before the first | becomes altText. Because each segment is classified independently, |300px|right and |right|300px produce identical results.
2. Image directive handling (overriding md.renderer.rules.image). Calls parseAltDirectives() and, when there is no float, writes the width as an inline style on <img>. When a float is present, nothing is added to <img>; the width belongs to <figure> instead, as described in the next item. In both cases the | directives are stripped from the rendered alt attribute.
3. Figure container (using markdown-it-container). Wraps the :::figure … ::: block in a <figure class="md-figure"> tag. When generating the opening tag, it scans ahead in the token stream to detect any float directive and, if found, adds the md-float--{dir} class and a style="width:..." attribute to <figure>.
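
The scan-ahead step can be tried out on a hand-built token list. The following is only a sketch: the token objects are simplified stand-ins for markdown-it tokens, figureOpenTag() is a hypothetical name for the logic inside the container's render function, and the parser is a condensed version of parseAltDirectives().

```javascript
// Condensed stand-in for parseAltDirectives(), enough for this demo.
function parseAltDirectives(rawAlt) {
  const [altText, ...parts] = rawAlt.split('|')
  let size = null
  let floatDir = null
  for (const part of parts) {
    const p = part.trim()
    if (/^(left|right)$/i.test(p)) {
      floatDir = p.toLowerCase()
    } else if (/^\d+(?:\.\d+)?(?:px|%|em|rem|vw|vh)?$/.test(p)) {
      size = /^\d+(?:\.\d+)?$/.test(p) ? p + 'px' : p
    }
  }
  return { altText, size, floatDir }
}

// Simplified mock tokens: only the fields the scan reads.
const image = (alt) => ({ type: 'image', children: [{ type: 'text', content: alt }] })
const inline = (...children) => ({ type: 'inline', children })

// Hypothetical helper: builds the opening tag for the figure that starts
// at tokens[idx], looking ahead until the matching close token.
function figureOpenTag(tokens, idx) {
  for (let i = idx + 1; i < tokens.length; i++) {
    const t = tokens[i]
    if (t.type === 'container_figure_close') break
    if (t.type !== 'inline' || !t.children) continue
    for (const child of t.children) {
      if (child.type !== 'image') continue
      const rawAlt = (child.children || []).map(c => c.content).join('')
      const { floatDir, size } = parseAltDirectives(rawAlt)
      if (floatDir) {
        // The first floating image decides both the class and the width.
        const style = size ? ` style="width:${size};"` : ''
        return `<figure class="md-figure md-float--${floatDir}"${style}>`
      }
    }
  }
  return '<figure class="md-figure">'
}

const floating = [
  { type: 'container_figure_open' },
  inline(image('Thumbnail|200px|right')),
  { type: 'container_figure_close' },
]
const sideBySide = [
  { type: 'container_figure_open' },
  inline(image('Left figure|45%'), image('Right figure|45%')),
  { type: 'container_figure_close' },
]

console.log(figureOpenTag(floating, 0))
// <figure class="md-figure md-float--right" style="width:200px;">
console.log(figureOpenTag(sideBySide, 0))
// <figure class="md-figure">
```

In the second case no float is found, so the sizes stay with the individual <img> elements and the figure gets no inline width.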

Why {width=...} syntax cannot be used

An alternative like ![alt](src){width=600px} might seem natural, but VitePress processes {...} in Markdown as Vue template syntax, so by the time markdown-it parses the tokens, those blocks have already been stripped. The |size approach embeds the size inside the alt text, which is unaffected by VitePress's template processing and therefore works reliably. It also matches the ![alt|width](src) resize syntax that Obsidian natively supports.

Registering in config.mts

Add the import and plugin usage to config.mts:

.vitepress/config.mts

import { defineConfig } from 'vitepress'
import figurePlugin from './plugins/figure-plugin.js'

export default defineConfig({
  markdown: {
    config: (md) => {
      md.use(figurePlugin)
    }
  }
})
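
For readers unfamiliar with markdown-it plugins, the sketch below illustrates what happens at registration time. It is an assumption-laden mock: createFakeMd() is a stand-in object rather than markdown-it itself, the tokens carry a simplified alt field, and upperCaseAltPlugin is a hypothetical plugin that follows the same save-then-override pattern the figure plugin uses for md.renderer.rules.image.

```javascript
// Stand-in for a markdown-it instance: only the members this demo touches.
function createFakeMd() {
  return {
    renderer: {
      rules: {
        // Simplified default image rule (real tokens keep alt in children).
        image: (tokens, idx) => `<img alt="${tokens[idx].alt}">`,
      },
    },
    // Like markdown-it's use(): call the plugin with the instance and
    // return the instance so that calls can be chained.
    use(plugin, ...params) {
      plugin(this, ...params)
      return this
    },
  }
}

// Hypothetical plugin: keep a reference to the existing rule, install a
// replacement, and delegate to the saved rule after adjusting the token.
function upperCaseAltPlugin(md) {
  const defaultImageRenderer = md.renderer.rules.image
  md.renderer.rules.image = (tokens, idx, ...rest) => {
    tokens[idx].alt = tokens[idx].alt.toUpperCase()
    return defaultImageRenderer(tokens, idx, ...rest)
  }
}

const md = createFakeMd().use(upperCaseAltPlugin)
console.log(md.renderer.rules.image([{ alt: 'chart' }], 0))
// <img alt="CHART">
```

Because the replacement rule delegates to whatever rule was installed before it, a plugin written this way can coexist with other plugins that also wrap the image rule.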

CSS

Style the figure tag, caption, multi-image row layout, and float behavior. Append the following to .vitepress/theme/custom.css:

.vitepress/theme/custom.css

/* ── Figure ─────────────────────────────────────────────────────── */

.vp-doc .md-figure {
  display: flex;
  flex-direction: column;
  align-items: center;
  margin: 2rem auto;
  text-indent: 0;
}

/* Images paragraph: flex row, wraps on narrow screens */
.vp-doc .md-figure > p:not(:last-child) {
  display: flex;
  flex-wrap: wrap;
  justify-content: center;
  align-items: flex-start;
  gap: 0.75rem;
  width: 100%;
  margin: 0;
  text-indent: 0;
}

/* Hide <br> inserted by breaks:true between images in the same paragraph */
.vp-doc .md-figure > p:not(:last-child) br {
  display: none;
}

.vp-doc .md-figure img {
  display: block;
  max-width: 100%;
  height: auto;
}

/* Center block image when it is the only child (no caption, non-float) */
.vp-doc .md-figure:not(.md-float--right):not(.md-float--left) > p:only-child img {
  margin-left: auto;
  margin-right: auto;
}

/* Last paragraph → caption (only when there is more than one child) */
.vp-doc .md-figure > p:last-child:not(:only-child) {
  margin-top: 0.6rem;
  font-size: 0.88em;
  line-height: 1.5;
  color: var(--vp-c-text-2);
  text-align: center;
  text-indent: 0;
}

/* Figure number (the **bold** part) stays bold; rest is normal weight */
.vp-doc .md-figure > p:last-child:not(:only-child) strong {
  font-weight: 600;
  color: var(--vp-c-text-1);
}

/* ── Float figure ────────────────────────────────────────────────── */

.vp-doc .md-figure.md-float--right {
  float: right;
  display: block;
  margin: 0 0 1rem 1.5rem;
}

.vp-doc .md-figure.md-float--left {
  float: left;
  display: block;
  margin: 0 1.5rem 1rem 0;
}

/* All paragraphs inside a float figure use block layout */
.vp-doc .md-figure.md-float--right > p,
.vp-doc .md-figure.md-float--left > p {
  display: block;
  width: 100%;
  margin: 0;
  text-indent: 0;
}

/* Hide <br> in float figures too */
.vp-doc .md-figure.md-float--right > p br,
.vp-doc .md-figure.md-float--left > p br {
  display: none;
}

/* Image fills the float figure width */
.vp-doc .md-figure.md-float--right img,
.vp-doc .md-figure.md-float--left img {
  width: 100%;
  margin: 0;
}

/* Caption inside float figure: restore top margin overridden by the p rule */
.vp-doc .md-figure.md-float--right > p:last-child:not(:only-child),
.vp-doc .md-figure.md-float--left > p:last-child:not(:only-child) {
  margin-top: 0.4rem;
}

A few design notes:

- Multiple images inside :::figure are arranged side by side. display: flex is applied to the image paragraphs (every paragraph but the last), and flex-wrap: wrap lets the images reflow into a vertical stack when the screen is too narrow for a single row, so the layout also works on phones in portrait orientation.
- The caption is treated as the last <p> element. A <figcaption> tag would be semantically more correct, but the plugin lets markdown-it convert the container's Markdown to HTML as usual and relies on CSS to style the last paragraph as a caption. Using :last-child:not(:only-child) as the caption selector ensures that when there is no caption, and the image paragraph is therefore the only child, caption-specific styles such as font-size: 0.88em do not apply to it.
- Images are centered even without a caption. When a caption is present, the image paragraph matches :not(:last-child) and is centered by display: flex; justify-content: center. When there is no caption, the image paragraph is the only child, so that rule does not match; the p:only-child img rule (margin-left: auto; margin-right: auto) covers this case.
- Float applies to <figure>, not <img>. In a captioned figure the image paragraph is a flex container, which makes each <img> a flex item, and per the CSS Flexbox specification float has no effect on flex items. Floating the <img> there would therefore do nothing. Instead, the <figure> itself is given display: block; float: right/left, which takes it out of normal flow. The width is set as an inline style on <figure>, and the <img> inside fills it via width: 100%.
- Bold figure number vs. normal caption text. Writing **Figure 1:** in the caption produces a <strong> tag. By applying font-weight: 600 and a darker text color only to that element, the figure label appears bold while the rest of the caption stays in normal weight.
- No conflict with max-width: 100%. width and max-width are separate properties, so an inline style such as style="width:50%" is applied as written and max-width: 100% only caps the result at the container width. Inline styles also take precedence over width declarations in external style sheets (unless those use !important).
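
The paragraph selectors described in these notes can be modeled in a few lines. This is a sketch only: paragraphRoles() is a hypothetical helper, and it assumes that every direct child of the figure is a <p> element.

```javascript
// Returns the role each <p> inside a figure plays, mirroring the selectors:
//   p:not(:last-child)             -> 'image-row'  (flex row of images)
//   p:last-child:not(:only-child)  -> 'caption'
//   p:only-child                   -> 'image-only' (no caption present)
function paragraphRoles(count) {
  return Array.from({ length: count }, (_, i) => {
    const isLast = i === count - 1
    const isOnly = count === 1
    if (!isLast) return 'image-row'
    return isOnly ? 'image-only' : 'caption'
  })
}

console.log(paragraphRoles(1)) // [ 'image-only' ]
console.log(paragraphRoles(2)) // [ 'image-row', 'caption' ]
console.log(paragraphRoles(3)) // [ 'image-row', 'image-row', 'caption' ]
```

The three-paragraph case shows a consequence of the approach: images separated by a blank line end up in separate paragraphs and therefore in separate rows, while the last paragraph is always styled as the caption.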
