
The first time I tried coding, I built an AI model

February 2, 2020 / 7 min read

Ok, calm down. It’s not what you think. This is not one of those look-what-I-did-I-must-be-a-superhuman types of articles. Instead, it’s a few thoughts on how far the no-code (or “low-code”) movement has come. We can now extend the ability to “explore by doing” to people outside the coding community, even with AI. To build my AI model, the only coding I did was use the Twitter API to get the data I wanted; I then plugged the data into the Peltarion Platform and built my model from there. I think we can finally conclude that the no-code movement has reached AI!

In this article, I’m going to show all you non-coders out there how you can build an AI model that can tell the difference between Twitter users by identifying patterns in previous tweets from their accounts. When first getting started, I thought it would be interesting to see how easily the model could determine which U.S. presidential candidate had written a particular tweet, and whether it picked up on general themes in their politics or their writing style. I even toyed with the idea of creating an app where people could find out which presidential candidate they tweet like. But I then decided that the uselessness of having such an app outweighed its brief entertainment value, and that it would be more interesting to just explore the model itself. So, here we are instead: an article to help you design your own model and wow your computer scientist friends the next time you’re at a dinner party!

When I tried the model using Peltarion’s Text Classifier app, which lets you paste or write some text and see what the model predicts, it did pretty well. Bear in mind, doing random one-offs is no way to test a real model that you’re actually going to use (to do this you would at least want to have a labeled dataset that the model has never seen before and perform an accuracy test), but for our purposes of having some fun and learning about AI, it does the trick.  
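A proper accuracy test, as mentioned above, boils down to comparing the model’s predictions on a held-out labeled dataset against the true labels. The metric itself is simple. Here is a minimal sketch, where `predictions` would come from your deployed model and `labels` from a labeled dataset the model has never seen (both names are placeholders of mine, for illustration):

```python
def accuracy(predictions, labels):
    """Fraction of predicted candidates that match the tweets' true authors."""
    if not labels:
        raise ValueError("need at least one labeled example")
    correct = sum(p == t for p, t in zip(predictions, labels))
    return correct / len(labels)
```

For example, `accuracy(["Biden", "Trump", "Warren"], ["Biden", "Trump", "Sanders"])` returns roughly 0.67, since two of the three predictions match.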

Here are the candidates the model associates with some different test statements

What are some potential use cases? 

All right, so first let’s start with some idea generation here. What could this model be used for? Well, if you watched the U.S. Democratic debate the other day, perhaps you’d be keen to make a model that could identify whether someone was defending the #NeverWarren hashtag or trying to get people to stop using it based on whether they had previously been for or against candidate Elizabeth Warren (assuming that the Twitter users you were looking at wouldn’t have changed their minds about Warren based on the debate or the hashtag). 

To keep it simple, let’s choose four presidential candidates (Joe Biden, Donald Trump, Bernie Sanders and Elizabeth Warren) and see if we can design a model that can tell which one has written a particular tweet. 

If you’re new to coding (like I was), an easy way to get started is to simply go to Google Drive and create a Google Colab notebook, which will save you the hassle of having to learn how to use the terminal on your computer. You still have to do the coding in Python to use it though (as it doesn’t support any other programming languages), so make sure you’re comfortable with this. 

If this is the first time you’re using the Twitter API you’ll need to start by creating a developer account with them. If you haven’t used Twitter much before, this may take a bit of time and you may need to answer a few questions about how you are going to use it. Once this has been approved, you can then use the guides on Twitter to learn how to set it up and get your API key. 

Let’s start coding 

You can use the code in this Colab notebook to retrieve as many tweets as the API allows and save them to a CSV file. Do make sure you’ve finished the developer account step first so you have your own keys with which to retrieve the Twitter data.
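If you’d like to see roughly what that retrieval step involves, here is a minimal sketch using the Tweepy library. The function names, the output filename `tweets.csv`, and the column names are my own choices for illustration, not necessarily what the notebook uses:

```python
import csv

def fetch_tweets(api, handle, limit=3200):
    """Yield (text, candidate) rows from one account's recent tweets."""
    import tweepy  # imported here so the CSV helper below also works without it
    cursor = tweepy.Cursor(api.user_timeline,
                           screen_name=handle,
                           tweet_mode="extended")  # "extended" avoids truncated text
    for tweet in cursor.items(limit):  # the API caps a timeline at ~3,200 tweets
        yield tweet.full_text.replace("\n", " "), handle

def write_csv(rows, path="tweets.csv"):
    """Save (text, candidate) rows with the column names used later on the platform."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["text", "Candidate"])
        writer.writerows(rows)
```

In the notebook you would build the `api` object from your own keys with `tweepy.OAuthHandler(...)` and `tweepy.API(auth, wait_on_rate_limit=True)`, call `fetch_tweets` once per candidate handle, and pass all the rows to `write_csv`.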

Once you’ve downloaded your CSV file the coding work is done! At this point in the project, I was ready to do a little happy dance. To experienced coders, the above step is a piece of cake, but for me, it took more time than I’d care to admit so I think the happy dance was warranted.

Getting started with the platform

Now we’re ready to start using the Peltarion Platform to design our model. The platform has a free community version which is pretty generous and gives you 50 GPU hours per month to play around with for free. It also has Google’s pre-trained BERT (Bidirectional Encoder Representations from Transformers) model for training models based on text data – which is perfect for our purposes. 

After creating a project on the platform, we’re ready to upload the data by dropping the CSV file with the Twitter data into the platform. Be sure to check that the data looks all right.
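A quick way to sanity-check the file before uploading is to count how many tweets you have per candidate and see how long the longest one is. Here is a small sketch using only the standard library; it assumes the CSV has the `text` and `Candidate` columns from the previous step:

```python
import csv
from collections import Counter

def summarize(path):
    """Count tweets per candidate and measure the longest tweet in words."""
    counts = Counter()
    longest = 0
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            counts[row["Candidate"]] += 1
            longest = max(longest, len(row["text"].split()))
    return counts, longest
```

If one candidate has far fewer tweets than the others, the model can learn to favor the majority classes, and the longest-tweet figure is handy when you set the sequence length in the dataset view.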

The only thing we need to do in the dataset view is verify that the sequence length of the text is set to a number that will let the model read the full tweet. Setting it to 100 or 120 will make sure it looks at the whole tweet.

The dataset

AI modeling 

Now we’re ready to move on to the modeling view (the second from the right on the top bar). After creating a new experiment, we can start building our model. From the menu bar on the right, choose the model named BERT (uncased). Once clicked, it will ask if we want the weights to be trainable. Check the box to enable training, and the model will then be imported into our modeling page.

Let's start modeling

Two boxes will be marked in red. This means something needs to change before we can train the model. Let’s start with the input block. This is where we need to tell the model which data to use in order to make its predictions. In this case, we want it to use the text from the tweets to make predictions about who wrote them, so the input feature can be set to “text,” and the target feature should be set to “Candidate.”

When we set this up, a little error message will appear informing us that there is a problem with the previous layer: it only outputs two categories, but we need four (one for each candidate). To solve this, press the “Dense” box, click the “Drop weights” button and change the number of nodes to four. Et voilà! The model is now ready to start training.
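To see what that four-node change means numerically: the Dense block maps BERT’s pooled output vector to one score per candidate, and a softmax turns those scores into probabilities. Here is an illustrative sketch with NumPy, where `W` and `b` stand in for the block’s trainable weights (the names and shapes are mine, for illustration only):

```python
import numpy as np

def predict_candidate(pooled, W, b, candidates):
    """Turn a BERT pooled vector into a probability for each candidate."""
    logits = pooled @ W + b                # W has one column (node) per candidate
    exp = np.exp(logits - logits.max())    # subtract the max for numerical stability
    probs = exp / exp.sum()                # softmax: the probabilities sum to 1
    return dict(zip(candidates, probs))
```

With only two nodes the layer could only ever score two classes, which is why the error appears until the node count matches the four candidates.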

Training the model 

Now, before you hit the run button, I’m going to suggest a shortcut here. Normally, you would run the experiment with your best bet for how to get a good result first, and then tweak things and run more experiments afterward. But in the interest of time and word count, I’m going to leave you to read up on that more on your own. Now, let’s just make sure to get a pretty decent result. 

Click the “BERT” block in your model, followed by “Settings,” and then set them as follows. Since BERT is a very well-trained model, based on huge amounts of text, we only want it to change a little with the new data we are putting in. Otherwise, we would be undoing a whole lot of excellent training that has already been put into the model.

BERT block

Now you can hit “Run.” It will take one or two hours for the model to train so perhaps use this time to go outside and enjoy the fresh air. Alternatively, if you want to live in suspense you can go to the “Evaluation” view to watch the graphs while the model trains. No judgment either way. I won’t say which option I’m going for. 

...All right, we’re back (or, maybe, we were here all along...)! Let’s take a look at our creation. 

Evaluation view

In the evaluation view, you’ll see how the model’s accuracy for the training data compares with the accuracy of the validation data. You can also compare different models that you try out to see which changes seem to make a big difference to your experiment. Our model currently has an accuracy of 85%. At this point, real data scientists would use their knowledge from lots of different research papers they’ve read to tweak and improve the model until it starts making really good predictions. But as first-time coders, I think we can be relatively pleased with a result like this for now.  

If you’ve stayed with me for this long you’ve maybe also just completed your own model and started to understand the technology a bit better. Pretty exciting, right? 

It is hardly hyperbole to say that AI will affect most aspects of our lives, and many people today understand that we need thinkers from a range of backgrounds to start approaching the questions that come with it. Operational AI platforms, like Peltarion’s, mean that we can finally encourage people without a technical background to begin to better understand AI and get involved.

  • Anna Gross

    Business Developer

    Anna Gross works with business development at Peltarion, aiming to make deep learning accessible to people from a wide range of industries. Before joining Peltarion, she set up the startup non-profit Project Access. She holds a bachelor’s degree in History from the University of Oxford and has also spent two years at Peking University in China studying Mandarin.
