From our tabular data, we need to pick out some columns that describe what we know about the customer, the product, or the context. These columns are used as input and are commonly called input features. We then need to pick one column for the model to predict; that column is the label feature in the training data set.
Which columns the tabular data should contain depends on what data is accessible and on trial and error: try different variations on the Peltarion platform and see what works. It is also useful to be creative when thinking about which features could be used.
As an example, suppose the tabular data consists of one row per customer. The features could be divided into those about the customer, those about how the customer uses the product, and miscellaneous ones. Customer features could be what kind of user the customer is (novice or advanced), how experienced the person is in another domain, or where the person is located. Product-usage features could be how many days have passed since the customer signed up, how often the customer has interacted with the product during the last month, how many interactions there were during the last week, which parts of the product the customer has used, whether the company has had any other interactions with the customer, and more. The miscellaneous features are the hardest to find, but could be things such as the date, whether it is summer, or whether it is a Friday evening.
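The three feature groups above can be sketched in a few lines of Python. This is only an illustration: the function `build_feature_row`, every field name, the definition of summer (June to August), and the definition of evening (from 18:00) are assumptions made for the example, not something prescribed by the Peltarion platform.

```python
from datetime import datetime, timedelta


def build_feature_row(customer, interactions, now):
    """Turn raw customer data into one row of input features.

    All field names are hypothetical and chosen for illustration only.
    """
    # Interactions that happened during the last 30 days, and the last 7 days.
    last_month = [t for t in interactions
                  if timedelta(0) <= now - t <= timedelta(days=30)]
    last_week = [t for t in last_month if now - t <= timedelta(days=7)]

    return {
        # Customer features
        "user_type": customer["user_type"],
        "location": customer["location"],
        # Product-usage features
        "days_since_signup": (now - customer["signup"]).days,
        "interactions_last_month": len(last_month),
        "interactions_last_week": len(last_week),
        # Miscellaneous (context) features
        "is_summer": now.month in (6, 7, 8),  # assumption: June to August
        "is_friday_evening": now.weekday() == 4 and now.hour >= 18,  # assumption: 18:00 onwards
    }


now = datetime(2024, 7, 12, 19, 30)  # a Friday evening in July
customer = {
    "user_type": "novice",
    "location": "Stockholm",
    "signup": datetime(2024, 5, 1),
}
interactions = [
    datetime(2024, 7, 10, 9, 0),
    datetime(2024, 6, 20, 14, 0),
    datetime(2024, 5, 2, 8, 0),
]

row = build_feature_row(customer, interactions, now)
print(row)
```

Each customer processed this way yields one row, and each dictionary key becomes one column in the tabular data.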
One column in the tabular data should be the one to predict. A good label for this case would be a column that says yes or no depending on whether the user said yes to an upgrade within one month. This is the column the model then learns to predict from the other features.
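A minimal sketch of how such a label column could be derived and then separated from the input features follows. The column name `upgraded_within_month`, the helper functions, the 30-day window, and the idea of measuring the month from a snapshot date are all assumptions made for illustration; the right reference date depends on how your data was collected.

```python
from datetime import date, timedelta

LABEL_COLUMN = "upgraded_within_month"  # hypothetical column name


def upgrade_label(snapshot, upgrade_date, window_days=30):
    """Return "yes" if the upgrade happened within the window after the snapshot."""
    if upgrade_date is None:  # the customer never upgraded
        return "no"
    window_end = snapshot + timedelta(days=window_days)
    return "yes" if snapshot <= upgrade_date <= window_end else "no"


def split_row(row):
    """Separate one row into its input features and its label."""
    inputs = {name: value for name, value in row.items() if name != LABEL_COLUMN}
    return inputs, row[LABEL_COLUMN]


snapshot = date(2024, 7, 1)  # assumption: the day the features were computed
rows = [
    {"user_type": "novice", "days_since_signup": 72,
     LABEL_COLUMN: upgrade_label(snapshot, date(2024, 7, 20))},
    {"user_type": "advanced", "days_since_signup": 400,
     LABEL_COLUMN: upgrade_label(snapshot, None)},
    {"user_type": "novice", "days_since_signup": 10,
     LABEL_COLUMN: upgrade_label(snapshot, date(2024, 9, 1))},
]

labels = [split_row(r)[1] for r in rows]
print(labels)
```

Here the first customer upgraded 19 days after the snapshot and is labeled yes, while the customer who never upgraded and the one who upgraded two months later are both labeled no.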