DataCleansing
Use the DataCleansing gem to standardize data formats and address missing or null values in the data.
Parameters
Parameter | Description |
---|---|
Remove nulls from entire dataset | Removes any rows that contain null values. This applies to all columns, not just those you selected to clean. |
Select columns to clean | Specifies the columns to apply data cleansing transformations to. |
Replace null values in column | Replaces null values in selected columns with a specified default. Example: 0 for numeric columns, an empty string for text columns. |
Remove unwanted characters | Removes specified characters from all values in the selected columns. Example: remove whitespace or punctuation. |
Modify case | Converts text in selected columns to a specified case format. Example: lowercase, UPPERCASE, Title Case |
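
To make the difference in scope between these parameters concrete, here is a minimal plain-Python sketch. It is not the gem's implementation: the helper names `drop_rows_with_nulls` and `remove_characters`, the sample rows, and the use of `None` for NULL are all assumptions made for illustration.

```python
import string

def drop_rows_with_nulls(rows):
    """Keep only rows where every column, selected or not, is non-null."""
    return [row for row in rows if all(v is not None for v in row.values())]

def remove_characters(rows, columns, unwanted):
    """Strip each character in `unwanted` from string values in the selected columns."""
    table = str.maketrans("", "", unwanted)
    return [
        {
            key: (value.translate(table) if key in columns and isinstance(value, str) else value)
            for key, value in row.items()
        }
        for row in rows
    ]

# Hypothetical sample data; None stands in for NULL.
sample = [
    {"id": 1, "city": "New York!", "note": None},
    {"id": 2, "city": "St. Louis", "note": "ok"},
]

# Row 1 is dropped because `note` is null, even though `note` was never selected.
no_nulls = drop_rows_with_nulls(sample)

# Only `city` is cleaned; other columns, including null values, pass through unchanged.
no_punct = remove_characters(sample, ["city"], string.punctuation)
```

The first helper inspects every column in a row, while the second touches only the columns passed to it. That mirrors the distinction the table draws between the dataset-wide option and the per-column options.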
Example
Assume you have a dataset that includes all entries from a feedback survey.
Name | Date | Rating | Feedback |
---|---|---|---|
Ada | 2025-04-18 | 5 | I really enjoy the product |
scott | 2025-04-18 | 5 | NULL |
emma | 2025-04-17 | 2 | The product is confusing |
NULL | 2025-04-17 | 3 | NULL |
The following is one way to configure a DataCleansing gem for this table:
- Select columns to clean: Name
- Replace null values in column: Not provided
- Modify case: Title Case
Result
After the transformation, the table looks like this. Because only Name was selected for cleaning, the null values in Feedback are unchanged.
Name | Date | Rating | Feedback |
---|---|---|---|
Ada | 2025-04-18 | 5 | I really enjoy the product |
Scott | 2025-04-18 | 5 | NULL |
Emma | 2025-04-17 | 2 | The product is confusing |
Not provided | 2025-04-17 | 3 | NULL |
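
The same transformation can be approximated outside the gem with a short plain-Python sketch. The function `clean_columns` and the list-of-dicts representation are hypothetical, and `None` stands in for NULL. The default value is inserted as-is rather than title-cased, which matches the result table above; whether the gem orders its steps this way internally is an assumption.

```python
# Hypothetical stand-in for the survey table; None represents NULL.
rows = [
    {"Name": "Ada", "Date": "2025-04-18", "Rating": 5, "Feedback": "I really enjoy the product"},
    {"Name": "scott", "Date": "2025-04-18", "Rating": 5, "Feedback": None},
    {"Name": "emma", "Date": "2025-04-17", "Rating": 2, "Feedback": "The product is confusing"},
    {"Name": None, "Date": "2025-04-17", "Rating": 3, "Feedback": None},
]

def clean_columns(rows, columns, null_default, case_fn):
    """Convert the case of non-null values and fill nulls, in the selected columns only."""
    cleaned = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        for col in columns:
            value = new_row[col]
            if value is None:
                # Assumption: the default is used verbatim, not case-converted.
                new_row[col] = null_default
            else:
                new_row[col] = case_fn(value)
        cleaned.append(new_row)
    return cleaned

# Mirrors the configuration above: clean Name, fill nulls, apply Title Case.
result = clean_columns(rows, ["Name"], "Not provided", str.title)

for row in result:
    print(row["Name"], "|", row["Date"], "|", row["Rating"], "|", row["Feedback"])
```

Feedback is never passed to `clean_columns`, so its null values survive, just as they do in the result table.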