⚽ Newly released Peltarion Platform Sidekick Beta project at your service!
Great news from the Peltarion Data Science/Machine Learning team warriors: we’ve provided you with an open-source public library under the Apache 2.0 license to make your life easier when working on the end-to-end tasks of your AI project on the Platform. Sidekick helps you with two main tasks:
Preparing the data in a suitable format for Platform ingestion.
Running predictions through the Platform REST API once you’ve deployed your trained model. Feel free to pull, use and contribute. Follow the guidelines in the README to get started quickly!
📚 Yay! The Knowledge center is alive and kicking. Damn good looking but there’s more under the hood:
Focus on user experience. We want to help all our users become AI superheroes.
Findability. All articles are search engine optimized, and we’ve added a search capability.
Future-ready. Knowledge center will keep improving, always focusing on enabling all our users to do great stuff on our Platform.
🖇 The aggregation method for precision and recall for single-label multi-class problems has been changed from micro-averaging to macro-averaging.
For this type of problem, micro-averaging results in both precision and recall being exactly the same as accuracy, which does not provide any additional information about the model’s performance. Macro-averaged precision and recall provide a complementary metric to the overall accuracy, since they will be low for models that only perform well on the common classes while performing poorly on the rare classes.
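To make the difference concrete, here is a minimal sketch in plain Python. It is not the Platform's metrics code: the function name `averaged_metrics`, the toy labels, and the convention of counting an undefined 0/0 ratio as 0 are all assumptions made for illustration.

```python
from collections import Counter

def averaged_metrics(y_true, y_pred):
    """Accuracy plus micro- and macro-averaged precision/recall (single-label)."""
    classes = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # class p was predicted, but wrongly
            fn[t] += 1  # class t was the truth, but was missed

    def ratio(num, den):
        # Assumed convention: an undefined 0/0 ratio counts as 0.
        return num / den if den else 0.0

    total_tp = sum(tp.values())
    n_classes = len(classes)
    return {
        "accuracy": total_tp / len(y_true),
        # Micro: pool the counts over all classes, then divide once.
        "micro_precision": ratio(total_tp, total_tp + sum(fp.values())),
        "micro_recall": ratio(total_tp, total_tp + sum(fn.values())),
        # Macro: compute the ratio per class, then take the unweighted mean.
        "macro_precision": sum(ratio(tp[c], tp[c] + fp[c]) for c in classes) / n_classes,
        "macro_recall": sum(ratio(tp[c], tp[c] + fn[c]) for c in classes) / n_classes,
    }

# Imbalanced toy data: a model that always predicts the common class.
y_true = ["common"] * 8 + ["rare"] * 2
y_pred = ["common"] * 10

scores = averaged_metrics(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```

In this toy case the micro-averaged values coincide with the accuracy, while the macro-averaged values drop because the rare class is never predicted correctly.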
Support for multi-class classification models with higher dimensionality targets now on the platform.
Previously, each row in the dataset had to correspond to exactly one class. We now allow targets of higher dimensionality, e.g. a target that is a vector of different classes, or a target that is an image with one class per pixel. This unlocks use cases such as multi-class semantic segmentation of images.
To train a multi-class target model, the target data can be represented by a NumPy array, where the last axis is interpreted as the class label and needs to be one-hot encoded before importing into the platform.
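As a rough illustration of the one-hot encoding step, the sketch below turns an integer class mask into an array whose last axis holds the classes. The helper name `one_hot_last_axis`, the tiny 2x3 mask, and the class count are assumptions; how the resulting array is packaged for import is not covered here.

```python
import numpy as np

def one_hot_last_axis(labels, num_classes):
    """Turn integer class labels into one-hot vectors along a new last axis."""
    labels = np.asarray(labels)
    # Indexing an identity matrix with the label array appends a class axis.
    return np.eye(num_classes, dtype=np.uint8)[labels]

# Hypothetical 2x3 segmentation mask with 4 classes (one class per pixel).
mask = np.array([[0, 1, 3],
                 [2, 2, 0]])

target = one_hot_last_axis(mask, num_classes=4)
print(target.shape)  # (2, 3, 4): height, width, classes
```

Taking `target.argmax(axis=-1)` recovers the original integer mask, which is a quick way to check the encoding before importing.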
Visualizations for higher dimensionality targets on the Evaluation page now available.
Previously, the metrics under Model evaluation were only computed when the target corresponded to exactly one class or to exactly one numeric value. We now also provide the graphs for multi-dimensional targets, for example a vector of numeric values or a target image with one class per pixel.
In the case of a classification problem with a multi-dimensional target, the confusion matrix is sampled to a maximum of 500 000 values. For a multi-dimensional target, each value in the confusion matrix corresponds to a vector element or to a pixel in the target image. This means that the total number of values in the confusion matrix will be much larger than the number of samples in the dataset.
In the case of a regression problem with a multi-dimensional target, each dot in the scatter plot represents an element in the target vector or a pixel value in the target image. For visibility reasons, the scatter plot is sampled to show a maximum of 500 data points. The error distribution plot is based on 5000 sampled values.
Minimap is dead, RIP minimap!
Instead, say hello to zooming capabilities on the Modeling canvas. We know you’ve been longing for this, so we’ve introduced zooming to the model builder and added some basic key commands, like Cmd/Ctrl+A, Cmd/Ctrl+C and Cmd/Ctrl+V, for quick and easy block selection and copy/paste. Remember that Option/Alt+Click, Hold and Drag still works to help you pan around the whole model canvas.
Blocks and Settings tabs have gotten a face lift
Both dataset and runtime settings are now defined on the Settings tab in the Inspector. When selecting a block on the Modeling canvas, you can adjust the block parameters in the Blocks tab. When you Shift+Click to select more than one block, you can change their common settings together!
All error and warning messages are brought to the Information-center-popup
The Information-center-popup is located in the lower left corner of the Modeling canvas and clicking on the error message will guide you directly to the problematic area, to help solve issues with just a click or two!
Quick overview of the Running jobs queue on your organization Projects page
Anyone in your team can now get a better overview of who’s training which model and where the GPU hours are spent. If all the GPUs are busy, newly initiated jobs will appear in Queued status, and recently completed or paused experiments will be listed as Trained experiments.
Note that if you select a specific project in the Projects list, you will get to see the actual GPU and storage usage as well as Running jobs queue for this specific project!
A notification reminder about a soon-to-expire quota plan is now provided a few days before the end of validity
Make sure to run your experiments in time and contact our firstname.lastname@example.org to extend a payment plan. After the quota plan has expired, you can still view, access and delete your data, experiments and deployments for 90 days.
Binary prediction labels are flipped for the confusion matrix on Evaluation view.
We now make sure that when you are solving a binary classification problem, our computation engine emits 1 for positive and 0 for negative predictions, which makes the confusion matrix intuitive to read when evaluating model performance.
Data for already trained experiments is not recalculated. However, when you duplicate or resume training of a model, the confusion matrix will show flipped values between the epochs from before and after the resume. If the model has epoch checkpoints saved from both before and after the change, the confusion matrix for epochs from before the change will have the 0 and 1 labels in switched places, while epochs from after the change will show the confusion matrix with correct labels.
This does not change how precision, recall, AUC and binary crossentropy are computed, so the experiment values are still comparable from before and after this change!
R2 computation improved for regression problems.
Previously we had to compute R2 separately for each batch. We’ve changed our metrics library to compute the total sum of squares and residual sum of squares independently. This means the resulting R2 will more closely resemble the value you would get when computing it once for the entire dataset. This change does not affect other regression metrics like MSE, RMSE, MAE and MAPE, nor does it affect classification metrics.
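The effect can be sketched in a few lines of plain Python. This is only an illustration under assumptions, not the Platform's metrics library: both function names are hypothetical, and accumulating the running sums of y and y² across batches is one possible way to obtain the total sum of squares without a second pass over the data.

```python
def r2_per_batch_mean(batches):
    """Per-batch estimate: compute R2 inside each batch, then average."""
    scores = []
    for y_true, y_pred in batches:
        mean = sum(y_true) / len(y_true)
        ss_tot = sum((y - mean) ** 2 for y in y_true)
        ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
        scores.append(1 - ss_res / ss_tot)
    return sum(scores) / len(scores)

def r2_accumulated(batches):
    """Accumulate both sums of squares over all batches, then divide once."""
    n = 0
    sum_y = sum_y2 = ss_res = 0.0
    for y_true, y_pred in batches:
        n += len(y_true)
        sum_y += sum(y_true)
        sum_y2 += sum(y * y for y in y_true)
        ss_res += sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    # Total sum of squares around the global mean: sum(y^2) - (sum(y))^2 / n
    ss_tot = sum_y2 - sum_y ** 2 / n
    return 1 - ss_res / ss_tot

# Two toy batches whose targets live in very different ranges.
batches = [
    ([1.0, 2.0], [1.5, 1.5]),
    ([10.0, 11.0], [10.5, 10.5]),
]

print(r2_per_batch_mean(batches))
print(r2_accumulated(batches))
```

In this toy data each batch has little internal variance, so the per-batch R2 is 0 even though the predictions track the overall trend well. The accumulated version matches what a single computation over all four values would give.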
The compiler option for “Data access seed” in the “Setup and run” dialog is now randomized for each experiment, both when creating a completely new experiment and when duplicating a previous experiment.
The data access seed controls the order in which the data is accessed during training. Randomizing the seed means that experiments are independent of each other, since the data access order will differ between them. This is the desired behavior when comparing performance between models and runs.
Note that in order to achieve deterministic training behavior, the user can still manually set the same seed between experiments.
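The role of the seed can be illustrated with Python's standard library. This is a simplified sketch, not how the Platform shuffles data internally; `access_order` is a hypothetical helper.

```python
import random

def access_order(num_rows, seed):
    """Return the order in which rows would be visited for a given seed."""
    rng = random.Random(seed)  # private generator, independent of global state
    order = list(range(num_rows))
    rng.shuffle(order)
    return order

# Reusing the same seed reproduces the same access order.
first_run = access_order(10, seed=42)
second_run = access_order(10, seed=42)
print(first_run == second_run)  # True

# A different seed will, in general, give a different order.
other_run = access_order(10, seed=7)
print(other_run)
```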
New set of deep neural network snippets available. We have created a handy list of well-known and well-performing networks in the Snippets panel to help you get started, including ResNet, DenseNet, Inception, Tiramisu and more. Check out each snippet tooltip, find the most suitable one for your problem type and input data, add it to the modeling canvas and start experimenting!
Beware that currently the snippets are not pre-trained and training a deep network on a large set of data may consume a significant amount of GPU power.
Parameters, blocks & settings panels on Dataset, Modeling and Evaluation views are now collapsible! When working with big datasets with many features and deep models with many layers, it’s helpful to have more space for exploring and building models. We’ve also added a toolbar above the working area for each view to make sure you always find the necessary buttons in the same place. Note that some of the buttons have shifted from their usual location to the upper right corner.
The calculation method for model performance metrics on the Evaluation view has been changed. Previously the metrics for Regression problems were calculated on normalized data, which has now been changed to calculation on denormalized data.
This affects metrics: MSE, MAE, MAPE, MSLE.
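Note that if you have paused experiments and resume training after this change, you will see a discontinuity in the metrics graph, since values from before the change were calculated on normalized data and values after it on denormalized data.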
If you have training processes running, the metrics for those will continue to be calculated on normalized data.
Historical metric values for completed experiments are not changed. This change does not affect the experiment Loss. Note that we’ve also added a few new metrics - RMSE and Gradient norm!
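To see why the reported numbers change scale, consider the sketch below. It assumes standardization (subtract the mean, divide by the standard deviation) as the normalization, and the mean, standard deviation and values are made up for illustration. The Platform's actual normalization may differ.

```python
def mse(y_true, y_pred):
    """Mean squared error over two equally long sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def denormalize(values, mean, std):
    """Undo standardization: map values back to the original units."""
    return [v * std + mean for v in values]

# Hypothetical statistics of the target feature in its original units.
mean, std = 200.0, 50.0

# Targets and predictions as the model sees them (normalized).
y_true_norm = [-1.0, 0.0, 1.0]
y_pred_norm = [-0.8, 0.1, 0.9]

mse_normalized = mse(y_true_norm, y_pred_norm)
mse_denormalized = mse(
    denormalize(y_true_norm, mean, std),
    denormalize(y_pred_norm, mean, std),
)

print(mse_normalized)    # error in normalized units
print(mse_denormalized)  # error in the target's original units
```

Under this assumption the denormalized MSE is the normalized MSE scaled by the squared standard deviation, so the metric is expressed in the units of the original target.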
More help and guidance for model serving through the deployment API is now available. The OpenAPI doc is directly downloadable from the Deployment view, and a link to the deployment API help page with code snippets in the Knowledge center has been added. Check out how to call the deployment API directly from the terminal or a Python notebook.
Projects list now has search and filter capability! Search for your own projects or other team members’ projects without scrolling through long lists!
Tagging capability now available for experiments lists. Add tags to your experiments to quickly identify, search and filter specific experiments. Tags are inherited when an experiment is duplicated.
Organization members list now available with purchased quota plan information and membership management capabilities for administrators. Invite co-workers to collaborate or remove accounts that are obsolete with a few clicks.
New deployment solution released with persistent deployment. No more 48-hour limitation! Create a new persistent deployment, choose a suitable experiment checkpoint and enable the deployment for API calls! Once obsolete, disable the deployment to save resources.
Major redesign of Evaluation view graphs with additional context-specific performance metrics (for classification problems) and more loss functions (for regression problems) published during training.
Easier graph settings with wall time capability now available.
Consistent search and filter of experiments by name, creator, loss, and experiment status now also on Evaluation view.
Quick navigation links between Modeling, Evaluation and Deployment views added for each experiment.
Improved user experience during statistics calculations. Histograms and other feature statistics are updated incrementally while being calculated.
Vastly improved the time from uploading a dataset to when it’s available for training, including faster statistics calculation and saving the dataset version.
Advanced optimizer parameters are now available in the experiment settings panel.
Dataset, training, and validation subsets pre-selection heuristics added for faster new experiment definition.
Dataset statistics for features are now available.
Groups and selections renamed to feature sets and subsets for clarity.
Copy with weights for transfer learning with selected blocks.
Updated tutorials for datasets and modeling.
Improved accuracy calculation for model evaluation.
New search and filter of experiments by name, creator, loss and experiment status.
Modeling view redesign.
Filtered list of experiments on Deployment view.
Go to experiment from Deployment view.