
CountRecords

The CountRecords gem allows you to count the number of rows in a dataset in different ways. You can count all rows, count non-null values in selected columns, or count distinct non-null values in selected columns.

Input and Output

The CountRecords gem accepts the following input and output.

| Port | Description |
| --- | --- |
| in0 | Input dataset with the columns to count. |
| out | Output dataset with the resulting count(s). Output has one row with the selected count(s). |

Parameters

Configure the CountRecords gem using the following parameters.

| Parameter | Description |
| --- | --- |
| Count option | Choose how the data should be counted. See Count options below. |
| Select columns to count | One or more columns to count. Required for counting non-null records or distinct records. |

Count options

Choose one of the following strategies for counting records.

| Strategy | Description |
| --- | --- |
| Count number of total records | Returns the total number of rows in the input dataset, including rows that contain null values. |
| Count non-null records in selected column(s) | Returns the number of non-null values in each selected column. |
| Count distinct records in selected column(s) | Returns the number of distinct, non-null values in each selected column. |
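
The three strategies can be illustrated with a minimal plain-Python sketch. This is not the code the gem generates. The function names, the sample rows, and the shape of the returned dictionaries are all hypothetical, chosen only to show how null values affect each count.

```python
from typing import Any

Row = dict[str, Any]


def count_total(rows: list[Row]) -> int:
    """Count every row, whether or not it contains nulls."""
    return len(rows)


def count_non_null(rows: list[Row], columns: list[str]) -> dict[str, int]:
    """For each selected column, count the rows whose value is not None."""
    return {c: sum(1 for r in rows if r.get(c) is not None) for c in columns}


def count_distinct(rows: list[Row], columns: list[str]) -> dict[str, int]:
    """For each selected column, count the distinct values, ignoring None."""
    return {c: len({r[c] for r in rows if r.get(c) is not None}) for c in columns}


# Hypothetical sample data: None stands in for NULL.
rows = [
    {"city": "Oslo", "score": 1},
    {"city": None, "score": 1},
    {"city": "Oslo", "score": None},
]

print(count_total(rows))                       # 3
print(count_non_null(rows, ["city", "score"])) # {'city': 2, 'score': 2}
print(count_distinct(rows, ["city", "score"])) # {'city': 1, 'score': 1}
```

Only the total count is unaffected by nulls. The other two strategies skip null values independently for each selected column.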

Example

Given a table of patient visits:

| PatientID | VisitDate | Department | Diagnosis |
| --- | --- | --- | --- |
| 1 | 2024-01-01 | Cardiology | Flu |
| 2 | 2024-01-02 | Oncology | Cancer |
| 3 | 2024-01-03 | Cardiology | Flu |
| 4 | 2024-01-04 | NULL | Cold |

If you choose:

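  • Count distinct records on Department: the result is 2 (Cardiology and Oncology; the NULL value is ignored).
  • Count non-null records on Department: the result is 3 (the row with a NULL Department is excluded).
  • Count number of total records: the result is 4 (every row is counted, including the one with a NULL Department).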
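
These three results match the behavior of the standard SQL aggregates `COUNT(*)`, `COUNT(column)`, and `COUNT(DISTINCT column)`. The sketch below reproduces the example with Python's built-in `sqlite3` module. It is an analogy for checking the numbers, not the gem's actual implementation, and the table name `visits` is an assumption made for illustration.

```python
import sqlite3

# In-memory database holding the patient visits table from the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits ("
    "PatientID INTEGER, VisitDate TEXT, Department TEXT, Diagnosis TEXT)"
)
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?, ?)",
    [
        (1, "2024-01-01", "Cardiology", "Flu"),
        (2, "2024-01-02", "Oncology", "Cancer"),
        (3, "2024-01-03", "Cardiology", "Flu"),
        (4, "2024-01-04", None, "Cold"),  # None is stored as NULL
    ],
)

# COUNT(*) counts all rows; COUNT(col) and COUNT(DISTINCT col) skip NULLs.
total, non_null, distinct = conn.execute(
    "SELECT COUNT(*), COUNT(Department), COUNT(DISTINCT Department) FROM visits"
).fetchone()
conn.close()

print(total, non_null, distinct)  # 4 3 2
```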