Block transforms: a primer

The Block Editor ships with a powerful API for converting raw text and/or blocks into other blocks. Almost any kind of content can be automatically upgraded to the Block equivalent using a feature called transforms. These transforms will take in HTML or string content and map it to the appropriate Block equivalent, using custom logic that we write for said transformations.

Transforms are used in two situations: when converting a Classic Editor block into individual blocks, and when converting one block type to another. Put simply, transforms are the official Gutenberg API for changing legacy content into 21st-century, shiny Block Editor content.

There are 6 types of transforms available:

  1. block
  2. enter
  3. files
  4. prefix
  5. shortcode
  6. raw

Each type of transformation takes in different values, but all except shortcode should return a BlockInstance, created using WordPress’ createBlock function.

Each block transformation is written in a JS object format, and there are several common keys available for all transforms:

  • type: This is the type of transform to perform, one of the list above.
  • priority: A numerical value indicating when this transform should run in relation to the other transforms. This behaves the same as priorities on WordPress hooks: lower numbers run first, and the default is 10.

Additionally, each type of transform has its own unique keys and functionality.
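
To make the priority rule concrete, here is a minimal sketch of how a list of transforms could be ordered. The TransformSketch shape, its label field, and the orderByPriority helper are made up for illustration; they are not part of the Gutenberg API.

```typescript
// Hypothetical, pared-down transform shape: only the two common keys.
interface TransformSketch {
  type: 'block' | 'enter' | 'files' | 'prefix' | 'shortcode' | 'raw';
  priority?: number;
  label: string; // Illustration only; not a real transform key.
}

// Same default as WordPress hooks.
const DEFAULT_PRIORITY = 10;

// Lower numbers run first. The sort is stable, so ties keep their original order.
function orderByPriority(transforms: TransformSketch[]): TransformSketch[] {
  return [...transforms].sort(
    (a, b) => (a.priority ?? DEFAULT_PRIORITY) - (b.priority ?? DEFAULT_PRIORITY)
  );
}

const ordered = orderByPriority([
  { type: 'raw', label: 'no priority set' },
  { type: 'raw', priority: 20, label: 'late' },
  { type: 'raw', priority: 1, label: 'early' },
]);

console.log(ordered.map((transform) => transform.label));
```

A transform with no priority is treated as 10, so it lands between the other two here.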

Writing Transforms

Transforms are passed into the configuration object for a block, alongside save, supports, etc., as a transforms object with from and to keys. The configuration will end up looking something like the following:

(note that all code samples in this post are using TypeScript)

registerBlockType( 
  'hm/block-name', // Block names must be lowercase: namespace/block-name.
  {
    ...,
    transforms: {
      from: [
        {
          type: 'block',
          blocks: ['core/shortcode'],
          isMatch: (attributes: KeyedObject): boolean => /\s*\[(?:fin_calc|monthly_repayment)/.test(attributes?.text ?? ''),
          transform(): BlockInstance {
            return createBlock(name);
          },
        },
        {
          type: 'shortcode',
          tag: 'monthly_repayment',
          attributes: {
            type: {
              type: 'string',
              shortcode: (): string => 'monthly-repayment',
            },
            monthlyRepaymentOptions: {
              type: 'object',
              shortcode: (attributes: ShortcodeAttributes): object => ({
                title: attributes?.named?.title,
                heading: attributes?.named?.heading,
                btnText: attributes?.named?.buttontext,
                btnLink: attributes?.named?.buttonlink,
                btnUseNewTab: attributes?.named?.isblank !== 'no',
              }),
            },
          },
        },
      ],
    },
    ...
  }
);

Transforms placed in a to key allow your block to be converted into another type of block and can only be of the block type. For example, you might have a “Wrapper” block which applies styles to inner content, and you want to allow a user to convert this into a generic Group block.
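
As a rough sketch of the “Wrapper” to Group case: the createBlock function below is a local stand-in for the one exported by @wordpress/blocks so the example runs on its own, and WrapperAttributes is an assumed shape for the imagined Wrapper block.

```typescript
// Local stand-in for BlockInstance and createBlock from @wordpress/blocks.
interface SketchBlock {
  name: string;
  attributes: Record<string, unknown>;
  innerBlocks: SketchBlock[];
}

function createBlock(
  blockName: string,
  attributes: Record<string, unknown> = {},
  innerBlocks: SketchBlock[] = []
): SketchBlock {
  return { name: blockName, attributes, innerBlocks };
}

// Assumed attributes for the imagined "Wrapper" block.
interface WrapperAttributes {
  className?: string;
}

const toGroup = {
  type: 'block' as const,
  blocks: ['core/group'],
  // Block transforms receive the source block's attributes and its inner blocks.
  transform(attributes: WrapperAttributes, innerBlocks: SketchBlock[]): SketchBlock {
    return createBlock('core/group', { className: attributes?.className }, innerBlocks);
  },
};

// This is what would be passed as `transforms` in the block configuration.
const wrapperTransforms = { to: [toGroup] };

const paragraph = createBlock('core/paragraph', { content: 'Hello' });
const group = toGroup.transform({ className: 'is-style-card' }, [paragraph]);
console.log(group.name, group.innerBlocks.length);
```

Passing innerBlocks straight through means the content inside the wrapper survives the conversion untouched.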

Transforms placed in the from key are the bulk of what we’ll be writing for our clients. These handle the logic of moving from a shortcode string, HTML, or other block type into the block we’re handling here.

Now, let’s explore the types of transforms available to us.

Block

The block transform allows for transforms from an existing Editor Block to your block.

Blocks can be transformed from one type into another using the block controls which hover above every block on the page.

This will most commonly be used when converting a Shortcode Editor Block into a custom block, based on the text inside the Shortcode Block.

This transform has several unique keys available to it:

  • blocks: an array of blocks from which the editor can transform into your block.
  • isMatch: a function which returns a boolean, determining whether or not this transform matches the block instance passed to it.
  • transform: A function that maps the source block’s attributes (for a Shortcode block, its text) into attributes for your block and returns a BlockInstance, created using createBlock.

Examples of block transforms can be found here.
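
Here is a small runnable sketch of how isMatch gates a block transform from the core Shortcode block. createBlock is a local stand-in for the @wordpress/blocks function, and the hm/calculator block name and the decision to keep the raw text in a _shortcodeContent attribute are assumptions for illustration.

```typescript
// Local stand-in for BlockInstance and createBlock from @wordpress/blocks.
interface SketchBlock {
  name: string;
  attributes: Record<string, unknown>;
  innerBlocks: SketchBlock[];
}

function createBlock(
  blockName: string,
  attributes: Record<string, unknown> = {},
  innerBlocks: SketchBlock[] = []
): SketchBlock {
  return { name: blockName, attributes, innerBlocks };
}

// The core Shortcode block stores what the user typed in a `text` attribute.
interface ShortcodeBlockAttributes {
  text?: string;
}

const SHORTCODE_PATTERN = /^\s*\[(?:fin_calc|monthly_repayment)\b/;

const fromShortcodeBlock = {
  type: 'block' as const,
  blocks: ['core/shortcode'],
  // Only offer this transform when the Shortcode block holds one of our shortcodes.
  isMatch: (attributes: ShortcodeBlockAttributes): boolean =>
    SHORTCODE_PATTERN.test(attributes?.text ?? ''),
  // 'hm/calculator' is a hypothetical block name.
  transform: (attributes: ShortcodeBlockAttributes): SketchBlock =>
    createBlock('hm/calculator', { _shortcodeContent: attributes?.text ?? '' }),
};

console.log(fromShortcodeBlock.isMatch({ text: '[fin_calc rate="5"]' }));
console.log(fromShortcodeBlock.isMatch({ text: '[gallery ids="1,2"]' }));
```

Using test() rather than match() keeps the return value a real boolean, and the empty-string fallback covers a Shortcode block with no text at all.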

Enter

The enter transform handles content that a user has typed on a new line before hitting the Enter key. The difference between the enter transform and the prefix transform is that enter handles content only after a user has hit the Enter key, while prefix will take content before a space.

This transform registration uses the unique regExp key, a regular expression which determines whether it accepts the text or not.

For example, the core Separator block will convert --- into a full Separator instance.
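
A sketch of what such an enter transform could look like. The pattern is our approximation of the Separator rule rather than a copy of core, createBlock is a local stand-in for the @wordpress/blocks function, and applyEnterTransform is a hypothetical helper that imitates the editor’s check.

```typescript
// Local stand-in for BlockInstance and createBlock from @wordpress/blocks.
interface SketchBlock {
  name: string;
  attributes: Record<string, unknown>;
  innerBlocks: SketchBlock[];
}

function createBlock(
  blockName: string,
  attributes: Record<string, unknown> = {},
  innerBlocks: SketchBlock[] = []
): SketchBlock {
  return { name: blockName, attributes, innerBlocks };
}

const separatorOnEnter = {
  type: 'enter' as const,
  // Three or more dashes and nothing else on the line.
  regExp: /^-{3,}$/,
  transform: (): SketchBlock => createBlock('core/separator'),
};

// Hypothetical helper: roughly what happens when the user presses Enter.
function applyEnterTransform(typedLine: string): SketchBlock | null {
  return separatorOnEnter.regExp.test(typedLine) ? separatorOnEnter.transform() : null;
}

console.log(applyEnterTransform('---'));
console.log(applyEnterTransform('some text ---'));
```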

Files

The files transform handles files that have been dragged into the Editor. This one is surprisingly straightforward and uses the isMatch key to determine if the transform should accept the dropped files or not.

You can see an example of this in the core Image block.
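
For a custom block, a files transform might be sketched like this. FileLike mirrors only the two File properties we read, createBlock is a local stand-in for the @wordpress/blocks function, and the hm/pdf-viewer block with its fileName attribute is hypothetical.

```typescript
// Local stand-in for BlockInstance and createBlock from @wordpress/blocks.
interface SketchBlock {
  name: string;
  attributes: Record<string, unknown>;
  innerBlocks: SketchBlock[];
}

function createBlock(
  blockName: string,
  attributes: Record<string, unknown> = {},
  innerBlocks: SketchBlock[] = []
): SketchBlock {
  return { name: blockName, attributes, innerBlocks };
}

// Only the parts of the browser's File object that this sketch reads.
interface FileLike {
  name: string;
  type: string;
}

const pdfFiles = {
  type: 'files' as const,
  // Accept the drop only when every file is a PDF.
  isMatch: (files: FileLike[]): boolean =>
    files.length > 0 && files.every((file) => file.type === 'application/pdf'),
  // One hypothetical viewer block per dropped file.
  transform: (files: FileLike[]): SketchBlock[] =>
    files.map((file) => createBlock('hm/pdf-viewer', { fileName: file.name })),
};

const dropped: FileLike[] = [{ name: 'rates.pdf', type: 'application/pdf' }];
console.log(pdfFiles.isMatch(dropped), pdfFiles.transform(dropped));
```

A real transform would usually also upload the file or create a temporary URL for it; that part is left out here.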

Prefix

Prefix transforms are used for transforming Markdown-style shortcuts into full blocks and are the least-used kind of block transform.

The prefix transform has a prefix key available to it, which takes a Markdown-style prefix (>, ##, etc.) identifying that this text should be transformed into a given custom block.

An example of a prefix transform can be found in the core quote block.
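
A rough sketch of a prefix transform in that spirit. It is not the core Quote implementation: createBlock is a local stand-in for the @wordpress/blocks function, and applyPrefix is a hypothetical helper that approximates the editor reacting to the prefix followed by a space.

```typescript
// Local stand-in for BlockInstance and createBlock from @wordpress/blocks.
interface SketchBlock {
  name: string;
  attributes: Record<string, unknown>;
  innerBlocks: SketchBlock[];
}

function createBlock(
  blockName: string,
  attributes: Record<string, unknown> = {},
  innerBlocks: SketchBlock[] = []
): SketchBlock {
  return { name: blockName, attributes, innerBlocks };
}

const quotePrefix = {
  type: 'prefix' as const,
  prefix: '>',
  // The remaining text of the line is handed to transform as a string.
  transform: (content: string): SketchBlock =>
    createBlock('core/quote', {}, [createBlock('core/paragraph', { content })]),
};

// Hypothetical helper: fire the transform once the prefix is followed by a space.
function applyPrefix(typed: string): SketchBlock | null {
  const trigger = `${quotePrefix.prefix} `;
  return typed.startsWith(trigger) ? quotePrefix.transform(typed.slice(trigger.length)) : null;
}

console.log(applyPrefix('> To be or not to be'));
console.log(applyPrefix('No prefix here'));
```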

Shortcode

Shortcode transforms are used when converting custom shortcode text into an Editor Block equivalent.

This transform takes several unique keys:

  • tag: The exact shortcode name to look for. For the shortcode [content_feeds num_posts="5" …], the tag used here would be content_feeds.
  • attributes: An object of block attributes, each mapped from its shortcode attribute equivalent.

Let’s say that we have a shortcode of product. This shortcode has just one attribute of product_id. We want to convert this into a “Product” Editor Block which has an attribute name of productId. Our attributes transform would look like the following:

attributes: {
  productId: {
    type: 'string',
    shortcode: (attributes: ShortcodeAttributes): string => attributes?.named?.product_id,
  },
}

You can see that we’ve mapped a string value from the product_id naming into productId, which is more conventional casing for JavaScript (and more in line with the Block Editor handbook examples).
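
To see the mapping in action outside the editor, here is a simplified illustration. ShortcodeAttributes follows the named/numeric shape used in this post, and resolveAttributes is a hypothetical helper that only imitates, in spirit, how each shortcode function gets evaluated.

```typescript
// Parsed shortcode attributes: named (key="value") and numeric (positional).
interface ShortcodeAttributes {
  named: Record<string, string | undefined>;
  numeric: string[];
}

interface AttributeMapping {
  type: string;
  shortcode: (attributes: ShortcodeAttributes) => unknown;
}

const productAttributes: Record<string, AttributeMapping> = {
  productId: {
    type: 'string',
    // Fall back to an empty string when the author left product_id out.
    shortcode: (attributes: ShortcodeAttributes): string => attributes?.named?.product_id ?? '',
  },
};

// Hypothetical helper: run every mapping against one parsed shortcode.
function resolveAttributes(
  mappings: Record<string, AttributeMapping>,
  parsed: ShortcodeAttributes
): Record<string, unknown> {
  const resolved: Record<string, unknown> = {};
  Object.keys(mappings).forEach((key) => {
    const mapping = mappings[key];
    if (mapping) {
      resolved[key] = mapping.shortcode(parsed);
    }
  });
  return resolved;
}

// Equivalent of [product product_id="42"].
console.log(resolveAttributes(productAttributes, { named: { product_id: '42' }, numeric: [] }));
```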

This is a really simple example where we map one piece of data to another, but we can do almost anything with the shortcode data that we need to achieve our goals.

Below, you can see an example of taking in a shortcode parameter and performing a lookup/morphing on it before returning.

iconProperName: {
  type: 'string',
  shortcode: (attributes: ShortcodeAttributes): string => {
    const properName = findIconBySlug(attributes?.named?.icon || '');
    if (!properName) {
      return '';
    }

    return properName.label;
  },
},

A more complete example of shortcode transforms can be found in the core gallery block.

The biggest weakness of shortcode transforms is that they currently do not support wrapped content at all. There is currently a Gutenberg issue tracking this deficiency, but no timeline for when it will be added. This means that any sort of shortcode which [example]wraps other content[/example] must be handled using a raw transform, or a more complex process where inner content is temporarily stored as a string attribute on the generated block for later injection as InnerBlocks.
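
The “store it as a string attribute” approach could be sketched as follows. The pattern is a deliberately simple stand-in (no attribute parsing, no nested shortcodes of the same tag, and the tag is assumed to contain only word characters), createBlock is a local stand-in for the @wordpress/blocks function, and hm/example is a hypothetical block name.

```typescript
// Local stand-in for BlockInstance and createBlock from @wordpress/blocks.
interface SketchBlock {
  name: string;
  attributes: Record<string, unknown>;
  innerBlocks: SketchBlock[];
}

function createBlock(
  blockName: string,
  attributes: Record<string, unknown> = {},
  innerBlocks: SketchBlock[] = []
): SketchBlock {
  return { name: blockName, attributes, innerBlocks };
}

// Matches [tag ...]inner[/tag] and captures the inner content in group 2.
function wrappedShortcodePattern(tag: string): RegExp {
  return new RegExp(`\\[${tag}(\\s[^\\]]*)?\\]([\\s\\S]*?)\\[\\/${tag}\\]`);
}

// Keep the wrapped content as a string so a later step can turn it into InnerBlocks.
function blockFromWrappedShortcode(text: string, tag: string, blockName: string): SketchBlock | null {
  const match = wrappedShortcodePattern(tag).exec(text);
  if (!match) {
    return null;
  }
  return createBlock(blockName, { _shortcodeContent: (match[2] ?? '').trim() });
}

console.log(
  blockFromWrappedShortcode('[example]wraps <b>other</b> content[/example]', 'example', 'hm/example')
);
```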

Raw

Raw transforms allow conversions from raw HTML and text into an Editor Block. These are used when converting HTML markup into a custom block. This transform takes in a series of Element and Text nodes and parses the “raw” markup to get the data necessary to build a BlockInstance.

Raw transforms are the most involved, and potentially the most lossy of all the transforms. They are also the most conceptually powerful.

A raw transform takes several custom keys:

  • isMatch: This is used as a function which takes in a Node, and returns a boolean, either accepting or rejecting the Node. This node is not necessarily an Element.
  • schema: The schema is an object, identifying which types of nodes are accepted and what attributes from those nodes are accepted. If you need a type of attribute (data, class, id, etc.) you must declare it in this key.
  • transform: The transform is a function which takes in the given Node that we parsed for isMatch, and should return a BlockInstance, created using createBlock.

Let’s walk through an example of a raw transform:

{
  type: 'raw',
  priority: 1,
  isMatch: (node: Element): boolean => node?.classList?.contains('u-grid'),
  schema: {
    div: {
      attributes: ['class'],
      children: {
        div: {
          classes: [/^u-\d+\/\d(?:@\w)?$/, /^u-grid__col--align\w*?$/],
        },
      },
    },
    ul: {
      attributes: ['class'],
      children: {
        li: {
          classes: [/^u-\d+\/\d(?:@\w)?$/, /^u-grid__col--align\w*?$/],
        },
      },
    },
  },
  transform(node: Element): BlockInstance {
    const innerBlocks: BlockInstance[] = [];

    node.childNodes.forEach((childNode: Node): void => {
      // We will only be evaluating Nodes which are Elements.
      const currentnode = childNode as Element;

      // If there is a child that isn't a grid column, we throw it out.
      if (currentnode.nodeType !== 1 || !currentnode.classList.contains('u-grid__col')) {
        return;
      }

      /**
       * Organize attributes based on classes.
       *
       * Examples:
       * width:       u-1/2
       * largeWidth:  u-1/2@l
       * mediumWidth: u-1/5@m
       * smallWidth:  u-1/2@s
       * align:       u-grid__col--alignMiddle
       */

      // Setup an initial attributes object.
      const attributes: ColKeys = {
        width: 'u-1/3',
      };

      // Get width attributes out of the HTML classes.
      currentnode.classList.forEach((className: string): void => {
        if (className.includes('@s')) {
          attributes.smallWidth = className;
        } else if (className.includes('@m')) {
          attributes.mediumWidth = className;
        } else if (className.includes('@l')) {
          attributes.largeWidth = className;
        } else if (className.includes('u-grid__col--align')) {
          attributes.align = className.replace('u-grid__col--', '');
        } else if (/^u-\d+\/\d+$/.test(className)) { // Plain width only, so `u-grid__col` itself is never stored as a width.
          attributes.width = className;
        }
      });

      // Filter down to only our elements.
      let childElements: Element[] = [...currentnode.childNodes].filter((grandchildNode: Node): boolean => grandchildNode.nodeType === 1);

      // Remove the content-wrapping `.gridExample` div if exists.
      childElements = removeWrappingElement(childElements, 'gridExample');

      const childHTML = convertElementListToString(childElements);

      // Create a block instance for each inner grid-column block.
      const blocks = createBlock(gridColumnName, attributes, rawHandler({ HTML: childHTML }));
      innerBlocks.push(blocks);
    });

    // The wrapper block only keeps a copy of the original markup; the columns become inner blocks.
    return createBlock(name, { _shortcodeContent: node.innerHTML }, innerBlocks);
  },
},

The first part of this is relatively standard: type tells Gutenberg what type of transform we want. From there, it becomes quite unique.

priority tells Gutenberg when we want to run this transform, relative to the other transforms. This works just like hook priorities, where lower numbers run earlier, so a priority of 1 tells Gutenberg that we want to run before other transforms. This is because we’re converting uls, and the List block will grab these if we don’t run first.

isMatch looks at the top-level node passed to all transforms so that we can decide if we’d like to transform this chunk. It’s important to note here that we could be handed either an Element node or a Text node, depending on the schema below.

It’s also useful to note here that we will be passed any top-level Nodes passed into a parser instance. Initially this means any top-level items in a post, but this can also be a nested call to rawHandler or pasteHandler within another block.

schema appears to be used to determine which top-level nodes will get passed into isMatch, based on the type of elements you specify in the schema. Additionally (and more importantly), it determines which attributes get passed in along with the node for evaluation and transformation.

transform is where the real magic happens. We receive a full node as an argument to this function and use the child elements and parent element to determine how to generate one or more blocks from this Node. In this function, you’ll use DOM parsing APIs from JS to run through your node or Nodes and retrieve information that is used to build the attributes for a given block. You can also use string parsing tools on the inner HTML of any given node to detect shortcodes within the tree or handle other edge cases.

In this case, the block that’s doing the transformation has no attributes of its own beyond a copy of the original markup. Instead, we run through all child elements and use them to build several inner blocks. We use class names to derive attributes for each block, but in other instances I’ve taken text in a given pattern, pulled out DOM nodes entirely, moved nodes around, etc.
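
The class-name step is easy to pull out and test on its own. Here is a sketch of it as a pure function; the ColKeys shape is assumed from how the attributes are used in the example, and the plain-width check uses a pattern so that the u-grid__col class itself is never mistaken for a width.

```typescript
// Assumed shape of the grid column attributes.
interface ColKeys {
  width: string;
  smallWidth?: string;
  mediumWidth?: string;
  largeWidth?: string;
  align?: string;
}

// Derive column attributes from a list of class names.
function columnAttributesFromClasses(classNames: string[]): ColKeys {
  // Start from the same default width as the transform.
  const attributes: ColKeys = { width: 'u-1/3' };

  classNames.forEach((className: string): void => {
    if (className.includes('@s')) {
      attributes.smallWidth = className;
    } else if (className.includes('@m')) {
      attributes.mediumWidth = className;
    } else if (className.includes('@l')) {
      attributes.largeWidth = className;
    } else if (className.includes('u-grid__col--align')) {
      attributes.align = className.replace('u-grid__col--', '');
    } else if (/^u-\d+\/\d+$/.test(className)) {
      attributes.width = className;
    }
  });

  return attributes;
}

console.log(
  columnAttributesFromClasses(['u-1/2', 'u-1/5@m', 'u-grid__col--alignMiddle', 'u-grid__col'])
);
```

Keeping this logic free of DOM calls means it can be unit tested without an editor or a browser.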

The only thing you need to do is to run createBlock at the end of this function to actually generate your block from the Node.

Tips for writing Transforms:

Be Extremely Defensive

Gutenberg is quite finicky and the data you expect to be passed in won’t always be there. For example, you might want/need specific attributes from a shortcode string, but since shortcodes are completely arbitrary, the original author may have forgotten one. If you hit this and haven’t accounted for it, the transform will just freeze with no error message, or error the whole editor. There is no in between.

On past projects, we used optional chaining (?.) for all data passed to us, and it fixed almost all of these issues.
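
A minimal sketch of that defensive style, reusing the attribute names from the monthly_repayment example earlier in this post. The buttonOptions helper and its fallback values are assumptions for illustration.

```typescript
// Everything is optional: we cannot trust what the shortcode author wrote.
interface ShortcodeAttributes {
  named?: Record<string, string | undefined>;
}

// Hypothetical helper: read button settings without ever throwing.
function buttonOptions(attributes?: ShortcodeAttributes) {
  return {
    // Optional chaining guards each step; ?? supplies a fallback value.
    btnText: attributes?.named?.buttontext ?? 'Learn more',
    btnLink: attributes?.named?.buttonlink ?? '',
    btnUseNewTab: attributes?.named?.isblank !== 'no',
  };
}

// Nothing passed in at all: still a usable object rather than a frozen editor.
console.log(buttonOptions(undefined));
console.log(buttonOptions({ named: { buttontext: 'Apply now', isblank: 'no' } }));
```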

Place Transforms in a Separate File

We had good luck in placing transforms into their own transforms.js file as opposed to within the block’s main index.js. Transforms are typically large enough that they make sense in their own file; you can then import the transforms into your index.js (or whatever registration file) and use them directly.

Shortcode Transforms are Flaky

The parsing for shortcode transformations within Gutenberg is lossy, at best. Their parsing requires a newline character or for the shortcode to be wrapped in <p> tags, which is very rarely the case in reality. This often leads to quite a few shortcodes being missed when converting to blocks, particularly in shortcodes with inner content.

Because of this, using raw transforms and parsing them as text using the @wordpress/shortcode package’s regex-based methods can be more reliable and easier to work with.
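
To illustrate why text-based matching catches more, here is a sketch that finds shortcodes sitting in the middle of a line. It uses a hand-written, simplified pattern instead of the @wordpress/shortcode package so that it runs on its own; findShortcodes is a hypothetical helper, and it assumes self-closing shortcodes and a tag made only of word characters.

```typescript
// Find every [tag ...] occurrence, wherever it sits in the text.
function findShortcodes(text: string, tag: string): string[] {
  const pattern = new RegExp(`\\[${tag}(?:\\s[^\\]]*)?\\]`, 'g');
  return text.match(pattern) ?? [];
}

// Neither shortcode is on its own line or in its own <p> tag.
const markup = '<div>Intro [product product_id="1"] and [product product_id="2"]</div>';
console.log(findShortcodes(markup, 'product'));
```

Inside a raw transform, the same idea can be applied to the inner HTML or text content of the node being evaluated.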

Transforms Aren’t Always 1:1

Often, you’ll be writing transforms for one shortcode to one block, one block to another block, etc. But you’ll likely find situations where you want to convert a shortcode or node into multiple blocks, a style of a block, a group block with inner blocks, a text format, etc. The block editor’s API allows for all of these variations and we can be quite creative with these transforms.
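
As one example of a transform that isn’t 1:1, a block transform can return an array of blocks. In this sketch, createBlock is a local stand-in for the @wordpress/blocks function, and the callout block with its title and body attributes is hypothetical.

```typescript
// Local stand-in for BlockInstance and createBlock from @wordpress/blocks.
interface SketchBlock {
  name: string;
  attributes: Record<string, unknown>;
  innerBlocks: SketchBlock[];
}

function createBlock(
  blockName: string,
  attributes: Record<string, unknown> = {},
  innerBlocks: SketchBlock[] = []
): SketchBlock {
  return { name: blockName, attributes, innerBlocks };
}

// Assumed attributes for an imagined "Callout" block.
interface CalloutAttributes {
  title?: string;
  body?: string;
}

const calloutToHeading = {
  type: 'block' as const,
  blocks: ['core/heading'],
  // One source block becomes two blocks: a heading followed by a paragraph.
  transform: (attributes: CalloutAttributes): SketchBlock[] => [
    createBlock('core/heading', { content: attributes?.title ?? '', level: 2 }),
    createBlock('core/paragraph', { content: attributes?.body ?? '' }),
  ],
};

const split = calloutToHeading.transform({ title: 'Rates', body: 'From 3.9% APR.' });
console.log(split.map((block) => block.name));
```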


We hope these tips help and were interesting. There is work going on within the Gutenberg project to significantly expand and document how transforms work, but they can still be a lot to get your head around. They’re quite powerful, though, and could be a valuable tool for all of us on migration projects. Please leave comments below or discuss in the #guild-gutenberg Slack channel if you have any questions or would like to see more examples!