Skip to main content

TextToColumns

When working with certain tables, you might encounter text columns that contain multiple values separated by specific characters such as commas or semicolons.

Use the TextToColumns gem to parse this text and simplify further analysis and processing.

Parameters

ParameterDescription
Select Column to SplitThe column that contains the text you would like to split.
DelimiterThe character that delimits the separate values.
Select Split Strategy
  • Split to columns: Values will be split into individual rows in one or more columns.
  • Split to rows: Values will be split into individual rows in one column.

Example

Let's say you have the following table that includes bank account information. Note that some values in the Beneficiaries column contain multiple names separated by semicolons.

Account_NumberAccount_TypeBalanceBeneficiaries
123456789Checking75000.50Amina Yusuf; Juan Pérez
987654321Savings12000.00Chen Wei
456789123Checking3400.50Yuki Tanaka; Leila Haddad; Ivan Petrov
789123456Checking800.00NULL
321654987Savings2200.95Mei Lin; Noah Schmidt

You can use the TextToColumns gem to automatically split these values into separate rows or columns, making your data cleaner and easier to work with.

  1. Open the TextToColumns gem.
  2. Select the Beneficiaries column to split.
  3. Under Delimiter, input ; as the delimiting character.
  4. Select the split strategy.

Let's explore the outputs that result from the different split strategies.

Split to columns

When you use the Split to columns strategy, each name should be split into its own column.

  1. Select Split to columns.
  2. Under Number of columns, type 2.
  3. Keep the default Extra Characters setting to Leave extra in last column.
  4. Keep or change the default column prefix and suffix.

The resulting table will have two new columns. The output table has the same number of rows as the input table.

Account_NumberAccount_TypeBalanceBeneficiariesroot_1_generatedroot_2_generated
123456789Checking75000.50Amina Yusuf; Juan PérezAmina YusufJuan Pérez
987654321Savings12000.00Chen WeiChen WeiNULL
456789123Checking3400.50Yuki Tanaka; Leila Haddad; Ivan PetrovYuki TanakaLeila Haddad; Ivan Petrov
789123456Checking800.00NULLNULLNULL
321654987Savings2200.95Mei Lin; Noah SchmidtMei LinNoah Schmidt

Notice that one the of cells still has a semicolon: Leila Haddad; Ivan Petrov.

This is because the gem was configured to generate two new columns. Extra characters are kept in the last column. If the gem was configured to generate three columns instead, each beneficiary would have their own column.

Split to rows

The Split to rows strategy creates a separate row for each value in the selected column. All other column values are copied to each new row. Use this strategy when:

  • The number of items in a cell (such as beneficiaries) varies between rows.
  • You don’t know the maximum number of items in the column.

To apply this strategy:

  1. Select Split to rows.
  2. Keep or change the default generated column name.

In the output table, each beneficiary has their own row. The output table has more rows than the input table.

Account_NumberAccount_TypeBalanceBeneficiariesgenerated_column
123456789Checking75000.5Amina Yusuf; Juan PérezAmina Yusuf
123456789Checking75000.5Amina Yusuf; Juan PérezJuan Pérez
987654321Savings12000Chen WeiChen Wei
456789123Checking3400.5Yuki Tanaka; Leila Haddad; Ivan PetrovYuki Tanaka
456789123Checking3400.5Yuki Tanaka; Leila Haddad; Ivan PetrovLeila Haddad
456789123Checking3400.5Yuki Tanaka; Leila Haddad; Ivan PetrovIvan Petrov
789123456Checking800NULLNULL
321654987Savings2200.95Mei Lin; Noah SchmidtMei Lin
321654987Savings2200.95Mei Lin; Noah SchmidtNoah Schmidt