Numeric encoding

One of the feature encodings available on the platform is numeric encoding. You can use integers, floats, or NumPy arrays as numeric features.

Integer:

  • An integer is a whole number, that is, a number without a fractional part, for example …, -100, …, -2, -1, 0, 1, 2, …, 100, …

Float:

  • A floating-point number (float) is a number with a decimal point, such as -1.3, 0.25, 1.45, or 107.3.

NumPy array:

  • NumPy (Numerical Python) is a Python library commonly used in machine learning.

  • A NumPy array is a grid of values, all of the same type, indexed by a tuple of nonnegative integers.

  • The number of dimensions is the rank of the array; the shape of an array is a tuple of integers giving the size of the array along each dimension.

  • If the NumPy array has the shape (1000, 20, 10, 3), the platform will treat it as 1000 examples of a tensor feature with shape (20, 10, 3); see the sketch below.
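
For illustration, here is a minimal NumPy sketch of shape and rank; the array contents are made up:

```python
import numpy as np

a = np.array([1, 2, 3, 4])                # 1D array, shape (4,), rank 1
b = np.array([[1, 2, 3], [4, 5, 6]])      # 2D array, shape (2, 3), rank 2

print(a.shape, a.ndim)   # (4,) 1
print(b.shape, b.ndim)   # (2, 3) 2

# An array with shape (1000, 20, 10, 3) would be read by the platform as
# 1000 examples of a tensor feature with shape (20, 10, 3).
batch = np.zeros((1000, 20, 10, 3))
print(batch.shape[0], batch.shape[1:])    # 1000 (20, 10, 3)
```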

Example:

Figure 1. 1D NumPy array with shape (4,), 2D NumPy array with shape (2,3), and 3D NumPy array with shape (4,3,2).

Example:

Figure 2. Look for the Cali House dataset in the Predict real estate prices tutorial. This dataset consists of map images and float numeric values such as median income.

Normalization

For numeric features, you can normalize your data before you start training your network.

Normalization is the process of rescaling values (e.g., the values of an input feature) from their actual numeric range into a standard range.

Normalization makes the values across features more consistent with each other, so that no feature dominates simply because of its scale; you can think of it as giving all features equal importance. It also helps speed up the training of the network.

There are two options available for normalization:

  • Standardization rescales a set of raw input data to have zero mean and unit standard deviation.

  • Min-max normalization transforms input data to lie in the range between 0 and 1.

If the numeric data is already on an appropriate scale, you don’t have to use normalization and you can choose the None option in the platform.

Example: Consider a dataset containing two features, age and population. While age ranges from 0 to about 100, population can range from 0 to millions. These two features are on very different scales and need to be rescaled to a common range to make training of your neural network faster and more accurate. Both options are sketched below.
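
As a rough illustration, here is a minimal NumPy sketch of both normalization options applied to the age/population example; the column values are made up for demonstration:

```python
import numpy as np

# Made-up example data: one column of ages, one column of populations.
data = np.array([
    [23, 1_200_000],
    [45,   350_000],
    [67, 8_900_000],
    [31,    54_000],
], dtype=np.float64)

# Standardization: zero mean, unit standard deviation per feature (column).
standardized = (data - data.mean(axis=0)) / data.std(axis=0)

# Min-max normalization: rescale each feature to the range [0, 1].
min_max = (data - data.min(axis=0)) / (data.max(axis=0) - data.min(axis=0))

print(standardized.mean(axis=0))                  # approximately [0, 0]
print(min_max.min(axis=0), min_max.max(axis=0))   # [0, 0] and [1, 1]
```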

Common building blocks for numerical data

When working with any kind of numerical data, the most common building block used is the Dense block. Numerical inputs can be connected to Dense blocks or concatenated with other numeric inputs. The Concatenate block merges 2 to 5 inputs into a single output that can be connected to other blocks. You can use a Concatenate block to combine numeric data and then connect it to a Dense block, as in the sketch below.
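
As a rough analogue outside the platform, the same pattern (concatenate numeric inputs, then Dense layers) could be sketched with the Keras functional API; the feature names and layer sizes here are made-up assumptions, not part of the platform:

```python
import tensorflow as tf

# Two hypothetical numeric inputs: a scalar feature and a small vector feature.
median_income = tf.keras.Input(shape=(1,), name="median_income")
location = tf.keras.Input(shape=(2,), name="location")

# Merge the numeric inputs into a single tensor (analogue of a Concatenate block).
merged = tf.keras.layers.Concatenate()([median_income, location])

# Feed the merged features through Dense layers (analogue of Dense blocks).
hidden = tf.keras.layers.Dense(32, activation="relu")(merged)
output = tf.keras.layers.Dense(1)(hidden)  # e.g., a predicted price

model = tf.keras.Model(inputs=[median_income, location], outputs=output)
model.summary()
```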
