Data Team
Payrails Data Catalog
A searchable directory of every dataset, table, and field across Payrails — who owns it, what it means, and how it connects to the rest of our data.
Catalog is live and up to date
Updated on every production release
Maintained by the Payrails Data Team
FAQ
Common Questions
Everything you need to navigate and understand our data.
The Data Catalog is a searchable directory of every dataset, table, and field
that the Payrails Data Team produces and maintains. Think of it like a map of
our data warehouse — it tells you what data exists, what each field means,
and how datasets relate to each other.
It exists so that anyone — from a data analyst to a product manager — can understand our data without needing to ask an engineer. Instead of guessing what a column called
transaction_status means, you can look it up here.
Once you open the catalog, use the search bar at the top — you can search
by table name, column name, or keyword (e.g. "payment", "merchant", "refund").
In the left sidebar, you can also browse by category or data source (e.g. Checkout, Adyen, Braintree). Tables in the Core and Curated sections are the most commonly used starting points for business reporting.
Sources are raw data that arrive directly from external systems — for example,
transaction records sent by a payment processor like Checkout.com or Adyen.
This data is unprocessed and may be incomplete or need cleaning.
Models are datasets that the Data Team has built by transforming that raw data — cleaning, joining, calculating, and restructuring it to make it accurate and easy to use. When in doubt, use models (not sources) for analysis.
Our data goes through several stages before it is ready for reporting:
• dl_ (Data Layer / Sources) — Raw ingested data from external providers (Adyen, Checkout, Braintree, etc.). Do not use these directly unless you know what you are doing.
• Intermediate — Partially cleaned and joined data, used as building blocks. Not intended for direct consumption.
• Core — Clean, standardized, production-ready datasets. This is usually the right place to start for reporting and analysis.
• Curated — Purpose-built datasets assembled for specific reporting needs (e.g. financial reconciliation, merchant dashboards).
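The layer guidance above can be written down as a small lookup. This is only an illustrative sketch: the helper names are made up for this example, and it assumes raw source tables carry the dl_ prefix in their table name, which you should confirm in the catalog.

```python
# Guidance per layer, in the order the stages are listed above.
LAYER_GUIDANCE = {
    "dl_": "raw ingested data; avoid unless you know it well",
    "Intermediate": "building blocks; not intended for direct consumption",
    "Core": "usual starting point for reporting and analysis",
    "Curated": "purpose-built for specific reporting needs",
}

# Layers described above as suitable for reporting.
REPORTING_LAYERS = {"Core", "Curated"}


def is_reporting_ready(layer):
    """True if the layer is one of the reporting-ready layers."""
    return layer in REPORTING_LAYERS


def looks_like_raw_source(table_name):
    """Assumption: raw source tables are named with a dl_ prefix."""
    return table_name.lower().startswith("dl_")


for layer, guidance in LAYER_GUIDANCE.items():
    marker = "use" if is_reporting_ready(layer) else "avoid"
    print(f"{layer:<13} {marker:<6} {guidance}")
```

For example, a hypothetical table named dl_adyen_transactions would be flagged by looks_like_raw_source, which is a cue to look for a Core or Curated equivalent first.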
In the catalog, each table shows whether automated data quality tests are
passing or failing. A green status means the table has been validated and its
data quality checks are passing.
As a general rule:
• Tables in the Core and Curated layers are actively maintained and production-grade.
• Tables in dl_ (raw sources) and Intermediate layers are internal building blocks — they may not be complete or documented.
If you are unsure whether a table is safe to use for a report or decision, ask the Data Team.
Click on any table name in the catalog to open its detail page. You will see a
full list of columns with their name, data type, and description.
If a column description is missing, it means the Data Team has not yet documented that field. Please flag it to us in #data-help on Slack — we appreciate the feedback and will prioritize adding it.
Most production datasets in the Core and Curated layers are refreshed
at least once daily. Some high-priority tables are refreshed more frequently.
The Data Catalog itself (this directory) is rebuilt and redeployed automatically each time a new production release goes out. This means the structure and documentation always reflect the latest version of our data models.
If you suspect the data in a dashboard or report is stale, check the
updated_at or dbt_updated_at field on the relevant table.
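If you are comfortable with a little code, a freshness check can look roughly like the sketch below. It uses an in-memory SQLite database as a stand-in for the warehouse, and the table name core_payments, the helper names, and the one-day threshold are all assumptions for illustration. Swap in your real table and use dbt_updated_at if that is the column the table exposes.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def latest_update(conn, table, column="updated_at"):
    """Return the newest timestamp in `column`, or None if the table is empty."""
    # Table and column names cannot be bound as SQL parameters, so only pass
    # trusted identifiers here (never raw user input).
    row = conn.execute(f"SELECT MAX({column}) FROM {table}").fetchone()
    return datetime.fromisoformat(row[0]) if row[0] else None


def is_stale(last_update, now, max_age=timedelta(days=1)):
    """True if the table is empty or its newest row is older than max_age."""
    return last_update is None or now - last_update > max_age


# Demo on an in-memory database; `core_payments` is a made-up table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE core_payments (id INTEGER, updated_at TEXT)")
conn.executemany(
    "INSERT INTO core_payments VALUES (?, ?)",
    [(1, "2024-05-01T06:00:00+00:00"), (2, "2024-05-02T06:00:00+00:00")],
)

# A fixed "now" keeps the demo reproducible; use datetime.now(timezone.utc) in practice.
now = datetime(2024, 5, 4, 12, 0, tzinfo=timezone.utc)
newest = latest_update(conn, "core_payments")
print(newest.isoformat(), "stale" if is_stale(newest, now) else "fresh")
```

In this demo the newest row is more than two days older than the fixed "now", so the table is reported as stale. If you see something similar on a real Core or Curated table, let the Data Team know.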
Each table in the catalog can have automated data quality tests — checks
that verify things like: "no duplicate IDs", "this column is never null",
"values are within an expected range."
• Pass — All checks for this table are succeeding. The data meets its quality contract.
• Fail — One or more checks have failed. The Data Team is usually already aware and investigating.
• No tests — This table has not yet been covered by automated tests. Treat with extra caution.
Failing tests do not always mean the data is wrong — sometimes they reflect a known upstream issue or a test threshold that needs adjustment. If you are concerned about a failure, reach out to the Data Team.
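To make the three example checks and the three statuses concrete, here is a minimal sketch in plain Python. It is not the Data Team's actual test framework; the function names, sample rows, and the 0 to 10000 range are invented for illustration.

```python
def check_unique(rows, column):
    """No duplicate values in `column`."""
    values = [row[column] for row in rows]
    return len(values) == len(set(values))


def check_not_null(rows, column):
    """`column` is never null (None)."""
    return all(row[column] is not None for row in rows)


def check_in_range(rows, column, low, high):
    """Non-null values in `column` fall within [low, high]."""
    return all(low <= row[column] <= high
               for row in rows if row[column] is not None)


def table_status(results):
    """Map a list of check outcomes to a status label."""
    if not results:
        return "No tests"
    return "Pass" if all(results) else "Fail"


# Hypothetical sample rows: one duplicated ID and one missing amount.
rows = [
    {"payment_id": "p1", "amount": 10.0},
    {"payment_id": "p2", "amount": 25.5},
    {"payment_id": "p2", "amount": None},
]

results = [
    check_unique(rows, "payment_id"),
    check_not_null(rows, "amount"),
    check_in_range(rows, "amount", 0, 10000),
]
print(results, table_status(results))
```

Here the uniqueness and not-null checks fail while the range check passes, so the table's status is Fail. A table with no checks at all would show No tests.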
The best way to request data work is to reach out to the Data Team directly:
• Slack — Post in #data-help with a description of what you need and why.
• Jira / Ticket — If you have access to our project board, create a ticket with your use case.
When making a request, it helps to include: what business question you are trying to answer, which teams or metrics are involved, and how urgently it is needed. The more context you provide, the faster we can scope and prioritize the work.
The catalog is maintained by the Payrails Data Team. Individual tables are
owned by the person or team listed in the Owner field on each table's detail page.
For general questions, data requests, or to report something that looks wrong, reach out on Slack in #data-help. We are happy to help you find and understand the data you need.