Import files and data sources to the Platform

The Peltarion Platform makes it easy to import data from a variety of sources, and to set it up to train your own models.

Upload data that is organized as many examples of one or more features, where each feature can be a numerical value, a category, an image, or text. You can select one or many features to use as input to your models, and set which feature is the target that your models will learn to predict.
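For instance, a toy tabular dataset might look like the CSV below, with price as the target feature and the other columns as inputs. Column names and values are invented for this sketch:

```python
import csv
import io

# Illustrative example: each row is one training example. Here "price" is
# the target feature a model could learn to predict; the others are inputs.
# Column names and values are made up for this sketch.
rows = [
    {"area": 72, "rooms": 3, "city": "Stockholm", "price": 4500000},
    {"area": 45, "rooms": 2, "city": "Malmo", "price": 2100000},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["area", "rooms", "city", "price"])
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue())
```

Each row is one example; each column is one feature. On the Platform you would mark price as the target and the rest as inputs.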

There are two main ways to import datasets into your project: using our Data library, or by uploading your own files.

Data library: ready-made datasets

The Data library contains datasets that are ready to be added to your project.

In the Datasets overview page, click the Data library button to open the Data library window. Use it to:

  • Quickly import the data required to follow one of the tutorials

  • Get started training and evaluating models using reference datasets

  • Prototype models on data similar to your own, and use transfer learning when you are ready to train on your own data

Please note that datasets, machine-learning models, weights, topologies, research papers, and other content, including open source software (collectively referred to as Content), provided and/or suggested by Peltarion for use in the Platform and otherwise, may be subject to separate third-party terms of use or license terms. You are solely responsible for complying with the applicable terms. Peltarion makes no representations or warranties about Content. You expressly relieve us from any and all liability, loss, or risk arising (directly or indirectly) from your use of any third-party content.

The datasets in the Data library may come from third-party sources and are provided for convenience. Read each dataset's license to know its particular terms.

BigQuery: import from your data warehouse

BigQuery import lets you retrieve data stored in a Google BigQuery table and turn it into a dataset inside your Platform project.
This makes it easy to train models on large datasets available to you or your company.

To get started, click the BigQuery Import button and follow these three steps:

  1. Authentication
    Select the Google Account you want to use, and click Allow when asked to let peltarion.com view your data in Google BigQuery.
    If you are not already signed in to this account, Google will ask you to identify yourself.

    • If you have several Google Accounts, select the one that has access to the BigQuery table you want to use.

    • Peltarion does not store the access permission.
      This means that Peltarion loses all access to your BigQuery information and data as soon as you close the import dialog. This also means that you need to authenticate every time you open the import dialog.

  2. Source
    Google BigQuery charges a fee when you retrieve data from its storage. Select the Billing project that should be charged for the data transfer from BigQuery to the Peltarion Platform. The account you authenticate with needs the bigquery.jobs.create permission for the selected Billing project.
    Then find your table by selecting the Project ID, Dataset, and Table from the ones visible to your account.
    The Platform creates one dataset from a single, entire BigQuery table. You might need to create a new table in your BigQuery project before importing it to the Platform, in particular:

    • If you need data spread across several BigQuery tables

    • If an existing table has more data (columns or rows) than you need

  3. Preview
    The preview lets you check that the table contains the data you expect before importing it (there is no transfer cost incurred from showing the preview).

When everything looks good, click Create, and a new dataset will be created from the BigQuery table, ready to be used in your project’s experiments!
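If you only need a subset of an existing table (fewer columns or rows), one approach is to create a new table from a query in BigQuery before importing it. The sketch below only builds a SQL string; the project, dataset, table, and column names are hypothetical placeholders, not part of the Platform:

```python
# Hypothetical identifiers -- replace with your own project, dataset,
# and table names.
source = "my-project.sales_data.transactions"
target = "my-project.sales_data.transactions_for_ml"

# CREATE TABLE ... AS SELECT keeps only the columns and rows you need,
# so the Platform imports a smaller table.
query = f"""
CREATE TABLE `{target}` AS
SELECT customer_age, product_category, amount
FROM `{source}`
WHERE amount IS NOT NULL
"""
print(query)
```

You could run such a statement in the BigQuery console, then select the new table in the Source step.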

File import

Import files that you have prepared with the data you want to use.
You can add files in several ways:

  • Click Choose files to upload files directly from your local computer. You can also drag and drop them into the dotted area.
    Limit this to files of 5 GB or smaller, to reduce the risk of connection issues during upload.

  • You can also use the Data API to let a script (or program) upload the files you want into a new dataset.

  • Import data from a URL, if your files are hosted online. This is recommended for large files, as it usually provides a better and faster connection.
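As a small illustration of the size guidance above (the 5 GB threshold comes from the list; the helper function name is our own invention):

```python
# Illustrative helper: suggest how to get a file onto the Platform,
# based on the 5 GB guidance for direct browser uploads.
DIRECT_UPLOAD_LIMIT = 5 * 1024**3  # 5 GB in bytes

def suggest_upload_method(size_bytes: int) -> str:
    """Return a suggested import method for a file of the given size."""
    if size_bytes <= DIRECT_UPLOAD_LIMIT:
        return "direct upload (Choose files / drag and drop)"
    return "URL import or Data API (better for large files)"

print(suggest_upload_method(800 * 1024**2))  # an 800 MB file
print(suggest_upload_method(12 * 1024**3))   # a 12 GB file
```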

File formats supported

You can upload three types of files into a dataset:

  • csv: A comma-separated values text file, the easiest way to upload data with heterogeneous features

  • npy: A saved NumPy array file, where different examples are listed along the first array dimension

  • zip: A compressed zip file, containing one or more csv, npy, or image files

Each file type has its own specifications and requirements.
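As an illustration of the npy layout described above, the sketch below saves an array whose first dimension indexes examples. The array shape and file name are our own assumptions:

```python
import os
import tempfile

import numpy as np

# 100 examples of a 28x28 single-channel image feature: the first
# array dimension indexes examples, as described above.
features = np.zeros((100, 28, 28), dtype=np.float32)

path = os.path.join(tempfile.mkdtemp(), "images.npy")
np.save(path, features)

loaded = np.load(path)
print(loaded.shape)  # (100, 28, 28)
```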

To work with images, use either the

  • jpg

  • png

file format, and upload the images inside a zip file.
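A minimal sketch of building such a zip with Python's standard library (the file names and byte contents are dummies; in practice you would add your real image files):

```python
import io
import zipfile

# Build a zip archive in memory containing a few dummy png entries.
# Replace the dummy bytes with real image files in practice.
archive = io.BytesIO()
with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
    for name in ["cat_001.png", "cat_002.png", "dog_001.png"]:
        zf.writestr(name, b"fake image bytes")

archive.seek(0)
with zipfile.ZipFile(archive) as zf:
    print(zf.namelist())  # ['cat_001.png', 'cat_002.png', 'dog_001.png']
```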

Uploading several files

You can upload several files into the same dataset.
In that case, additional files are assumed to contain additional features (columns) for the same examples, not additional examples (rows) of the same features.

This can be useful if your training examples have several features which are saved in different files.
If you want to upload multiple files, keep in mind that:

  • All the files must be uploaded when the dataset is created

  • All the files must contain the same number of examples (the number of examples is determined from the first file you upload)

  • All the files must contain the examples in the same order
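Before uploading, it can be worth checking that your feature files line up row for row. A sketch with illustrative file contents:

```python
import csv
import io

# Two files with different features for the same three examples,
# listed in the same order (contents invented for this sketch).
file_a = "id,area\n1,72\n2,45\n3,88\n"
file_b = "id,price\n1,4500000\n2,2100000\n3,5300000\n"

def row_count(text: str) -> int:
    """Count data rows (excluding the header) in a CSV string."""
    return sum(1 for _ in csv.reader(io.StringIO(text))) - 1

# Both files must contain the same number of examples.
assert row_count(file_a) == row_count(file_b)
print(row_count(file_a))  # 3
```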