
Authoring templates

A template is a YAML file that declares the entities crease should extract from an .xlsx workbook, where each entity lives, and what its fields mean. The templates under test_cases/ in the repository double as worked examples for each layout pattern.

Placeholder page

This guide is being filled out. For now, the README and test_cases/ are the canonical reference.

Skeleton

template_id: orders
description: Order export from acme.

entities:
  - name: order
    cardinality: many        # one | many
    locate:
      tab: Orders
      orientation: flat      # flat | property_sheet | anchored
      header_row: 0
    fields:
      - name: order_id
        source_column: order_id
        type: string
        pattern: ^ORD-\d{4}$

See the Reference > Template page for the full schema with every field documented.
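
The pattern on order_id in the skeleton can be tried on its own before it goes into a template. The snippet below is a minimal sketch that assumes the pattern follows Python's re semantics, which this page does not state; the helper name matches_order_id and the sample values are hypothetical.

```python
import re

# Pattern copied from the skeleton template: "ORD-" followed by exactly four digits.
ORDER_ID_PATTERN = re.compile(r"^ORD-\d{4}$")


def matches_order_id(value: str) -> bool:
    """Return True when value matches the skeleton's order_id pattern."""
    return ORDER_ID_PATTERN.match(value) is not None


# Hypothetical sample values, not taken from any real export.
samples = ["ORD-0001", "ORD-123", "ord-1234", "ORD-12345", "XORD-1234"]
results = {sample: matches_order_id(sample) for sample in samples}

for sample, ok in results.items():
    print(f"{sample}: {'match' if ok else 'no match'}")
```

Only ORD-0001 matches here: the anchors reject extra leading or trailing characters, and the match is case-sensitive. How crease reports a value that fails the pattern is not covered on this page.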

Templates that pin the read backend

Crease reads spreadsheets through two interchangeable backends: calamine (the default; reads .xlsx, .xls, .xlsb, .ods) and openpyxl (.xlsx only, but exposes cell metadata that calamine does not).

One template feature forces openpyxl: locate.skip_hidden_rows: true. Calamine doesn't surface the row-hidden flag, so a template that needs to drop hidden rows is auto-dispatched to openpyxl. The side effect is that such templates can't read .xls / .xlsb / .ods — those formats live on the calamine path only.

entities:
  - name: order
    locate:
      tab: Orders
      orientation: flat
      skip_hidden_rows: true   # → openpyxl backend; .xlsx only

If you'd rather have multi-format support and accept that hidden-row detection won't fire, override at call time:

import crease

# With engine="calamine", skip_hidden_rows is silently ignored.
crease.extract("orders.xls", template, engine="calamine")
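
The dispatch rule described in this section can be summarized as a small decision function. This is an illustrative sketch, not crease's actual code: pick_engine and the SUPPORTED table are hypothetical names, and raising ValueError for an unsupported format is an assumption about behavior this page does not specify.

```python
from pathlib import Path
from typing import Optional

# File formats per backend, as listed in this section.
SUPPORTED = {
    "calamine": {".xlsx", ".xls", ".xlsb", ".ods"},
    "openpyxl": {".xlsx"},
}


def pick_engine(path: str, skip_hidden_rows: bool, engine: Optional[str] = None) -> str:
    """Hypothetical helper: choose a backend the way this section describes."""
    if engine is None:
        # No explicit override: skip_hidden_rows forces openpyxl, otherwise the default.
        engine = "openpyxl" if skip_hidden_rows else "calamine"
    if engine not in SUPPORTED:
        raise ValueError(f"unknown engine: {engine!r}")
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED[engine]:
        # Assumption: an unsupported format is reported as an error.
        raise ValueError(f"{engine} cannot read {suffix} files")
    return engine
```

Under these assumptions, pick_engine("orders.xls", True) raises because the template pins openpyxl, while pick_engine("orders.xls", True, engine="calamine") returns "calamine", at the cost of hidden rows no longer being dropped.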