This blog post was written by WeCloudData's AI instructor Rhys Williams.
In this two-parter I'll move from the conception of an idea to its manifestation in the real world, and end with some insight into choosing and building your first neural network to begin bringing ideas like these to life. Ultimately I hope to inspire you to dream up projects wherever you go by giving you a window into my thought process during development.
While Deep Learning isn't the only tool in machine learning's belt, it's nonetheless a powerhouse, and it's fuelling a healthy debate about functional versus biologically inspired models of intelligence. Theorists like Jeff Hawkins are offering some truly fascinating retorts to current ideas, while people like Pedro Domingos argue with real optimism that a master algorithm could indeed be in sight. This time around, however, we're going to stay firmly in the connectionist school, due to the sheer amount of resources available to us, not to mention its widespread adoption by industry and creative endeavour alike.
Alright, so where am I headed with all of this? I recently challenged myself to build something cool with the materials you will be provided during WeCloudData's Deep Learning Capstone. I figured this would be a great opportunity to demonstrate how accessible all of this has become. You can will the previously impossible into existence within a short space of time, and I'd love to help you do the same. But before I continue, let's quickly delve into what led me to this point – last post I mentioned that a single-board computer became a robot, but there's actually a prequel to that story.
Epoch Hopping Mice in Sunny Aotearoa
Auckland, New Zealand – the land of the long white cloud, and home to my favourite giant plate of kimchi pork with an absurd moat of tofu. I was challenged by my wife Christine, an incredible scenographer, to write and perform a small script with her. It seemed like an epic way to end our time in the southern hemisphere, and I fell straight down the rabbit hole. Being new to the craft, I excitedly wore all my influences on my sleeve. I had one of Jóhann Jóhannsson's beautiful scores in mind to do the emotional legwork. I'd often find myself drifting off into entire worlds while listening to IBM 1401: A User's Manual or Fordlandia, and wanted to pay homage to that power. My science-fiction influences were apparent, and a little mouse found itself drifting through a first act of fables, into a nightmare of dichotomies, and into another age. In the ultimate act of Grant Morrison fanboyism I also set out to write a script whose words were those of a dormant consciousness: the author and the designer compelled to input simple narrative into its architecture in a desperate attempt to complete the same arc as the little time-travelling protagonist, only able to usher it towards its own agency.
During the process I researched evolutionary algorithms and reinforcement learning, watched DeepDream's uncanny hallucinations, and realised that this is just what people in the machine learning world were trying to do: to tell machines our story in the hope that they can help to reshape it with us. Machines are a manifestation of our will to do so.
While I came at all of this from a philosophical perspective, I had a realisation along the way that as lofty as the goal of artificial intelligence is, it is as much a data science problem as anything. Data is a giant slice of the representation of our understanding of the world, and I quickly found myself orbiting around people who were excited about wielding it – about wrangling big swathes of it to reinforce narratives or to birth new ones.
At the same time as this realisation, our little mouse traveller found itself outside the confines of its script and inside a robot body named Mr-B, after Clone High's Mr. Butlertron. At the time he was just a cute little breadboard on wheels, guided by a simple ultrasonic range sensor and an Adafruit motor HAT.
It quickly became apparent that my new little buddy was the vessel that I could channel all of my learning through, a reflection of a part of myself. I honestly believe there is no better way to understand data science than with a robot, and no better way to robot than to embrace data science.
Fast-forward: Mr-B is a big monster with TFmini Plus LiDAR-based navigation, along with the original Neural Compute Stick on his side, and I'm here talking to you about my next creation. With this project I had an opportunity to refine my design methodology. Mr-B had always been a matter of getting it done. Expanding his tracks meant grinding down a coffee pot holder; custom circuitry meant learning to solder on a smaller scale. The carpenter in me certainly felt comfortable amongst the chaos of it all.
Mr-B eventually got too big for his own boots and I simply transferred the shell of what he was onto the powerful Lynxmotion tri-track chassis. Here, however, I had the chance to create something efficient, a little self-sufficient kernel, the smallest manifestation I could make of a truly offline inference machine. A Mini-B.
Get your Flynn On
As always, I started this idea top down, asking myself what I wanted it to do in its ultimate form. To revisit the importance of data science, and in particular data engineering, my questions for this project were the following:
- What if I could make my data collection mobile?
- What if I could pipe it all into the training process and feed that model back into that vessel for verification?
- What if I could do all of this at the push of a few buttons and have the whole process narrated and error-checked by a moving, grooving little character?
From there I distilled my excitement down to a set of ideas, technologies and libraries that I'm familiar with, and began searching for ones that might work otherwise. In this case I've had experience with the first Neural Compute Stick and its Myriad 2 VPU, and I'm excited to stay ahead of the curve by shifting to the OpenVINO Toolkit and the Neural Compute Stick 2 with its Myriad X VPU. Sounds like sci-fi to me. I'm in.
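To give a sense of what inference on the stick looks like in practice, here's a minimal sketch using OpenVINO's Python Inference Engine API, targeting the Myriad X over USB. The model and image paths are placeholders rather than my actual files, and the exact calls may differ slightly between OpenVINO releases.

```python
# Minimal sketch: running an IR model (converted with the Model Optimizer)
# on the Neural Compute Stick 2. Paths and shapes are placeholders.
import cv2
import numpy as np
from openvino.inference_engine import IECore

ie = IECore()
net = ie.read_network(model="model.xml", weights="model.bin")
input_blob = next(iter(net.input_info))

# "MYRIAD" targets the Myriad X VPU inside the NCS 2
exec_net = ie.load_network(network=net, device_name="MYRIAD")

frame = cv2.imread("frame.jpg")
_, _, h, w = net.input_info[input_blob].input_data.shape
blob = cv2.resize(frame, (w, h)).transpose(2, 0, 1)[np.newaxis, ...]  # HWC -> NCHW

results = exec_net.infer(inputs={input_blob: blob})
print(results)
```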
During the beginning of a project I always like to lock myself away and surround myself with stimulus. Progressive rock blasting, a Miami Heat game on one screen, and multiple tabs firing on and off my screen for research, with a pot of coffee on the hob. Something about the chaos of those initial phases anchors me into an idea, as though I'm drawing inspiration from the madness of it all and depositing it in the idiosyncrasies of a song. There's no greater feeling than finding that track that takes you by the hand and pulls you along into the moment you were looking for. I think we forget sometimes that musicians are intellectuals too, and to me the greats push themselves hard to find that balance between the technical and the emotional. While I was dreaming up Mini-B I had the pleasure of seeing Animals As Leaders live in Glasgow. The members of AAL aren't just virtuosos at their craft, they're thinkers in their field.
Check out the video: Tosin Abasi on process
Witnessing them put their ideas into practice in such an awe-inspiring fashion, I found myself rushing home with their energy instilled in me, ready to push past concept and into the testing phase. What I mean to say is: find your rhythm, but most importantly surround yourself with what you love, with coffee, and somewhere along the way ideas will travel from their realm into the corporeal, through your fingertips and into a circuit board. To quote Jeff Bridges' character Kevin Flynn in Tron: Legacy – "Bio-digital jazz, man."
Down to the Nuts and Bolts
Now if there's one thing I've learnt from my time in the arts, it's the beauty of retrofitting something to your needs. In this particular scenario I knew from research that the Zumo chassis is cute and small, built for compact micro metal gear motors, but in reality designed for the Arduino platform. Enter a measuring tape, a power drill, TinkerCAD, and a 3D design brought to life with the help of the fantastic Sorenzo Studios. The result is a custom platform that houses an RPi 4 and an Explorer HAT Pro on top of my NCS 2, with plenty of room to breathe for both devices.
As you can see, I've added slits to cable-tie the NCS 2 and a lip to secure an amplified speaker board. Intel aren't the only people to offer this kind of power commercially – Nvidia are also pushing things forward with the Jetson Nano. In this instance, however, I chose to stick with the Neural Compute Stick 2 in pursuit of a more modular build. What this essentially means is that I get to really give a Raspberry Pi 4 4GB edition a good test drive. The Raspberry Pi has always had a good community and the accessories available are abundant. As a result I was able to plonk an Explorer HAT Pro right on top and benefit from a built-in motor driver. Combine that with the NCS 2's minimal 1W draw and you've got a compact learning machine running off just one power bank. The beauty of all this power is that I can stay on the edge, minimising the resources I need to outsource to. A little background on Intel's VPUs: they are powerful 16-SHAVE-core, low-power offline inference engines, and the Neural Compute Stick form factor makes them modular and operable in parallel, bolstering the inference ability of edge devices – from object detection to pose estimation. Very cool.
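Just to illustrate how little glue code that built-in motor driver needs, here's a rough sketch using Pimoroni's explorerhat library to nudge the Zumo's motors around. The speeds and timings are placeholders I'd tune by trial and error, not anything lifted from my actual drive code.

```python
# Rough sketch: driving the Zumo's micro metal gear motors via the Explorer HAT Pro.
# Speeds (0-100) and durations are illustrative placeholders.
import time
import explorerhat

def nudge_forward(speed=60, duration=0.5):
    explorerhat.motor.one.forward(speed)
    explorerhat.motor.two.forward(speed)
    time.sleep(duration)
    explorerhat.motor.stop()

def spin_left(speed=50, duration=0.3):
    explorerhat.motor.one.backward(speed)
    explorerhat.motor.two.forward(speed)
    time.sleep(duration)
    explorerhat.motor.stop()

if __name__ == "__main__":
    nudge_forward()
    spin_left()
```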
Check out the video: Neural Compute Stick intro video
In the interest of trying to think one step ahead I went with the power bank here to keep the battery holder built into the Zumo chassis free to power micro servos in a future upgrade. I’m also using a 40mm hex standoff as a placeholder for Mini-B’s neck. This way a simple redesign of the custom mounting plate can provide articulated neck support and even little arms. All in good time!
Of course, all of this doesn't mean I can skip being ambitious when it comes to outsourcing to a server powerful enough to train a neural network. I'm lucky enough to have a custom desktop with an MSI Nvidia GTX 1060 6GB GPU at my fingertips, and I'd love to delve into a multi-GPU situation shortly to gain that extra boost too! This machine has become a true manifestation of my aim to keep myself in as many camps as I can – a triple-boot dream. Should I power up headless, the rig defaults to an Ubuntu 16.04 installation dedicated to deep learning, accessible anywhere in my apartment. Attach a screen and I can boot into that server, into the latest version of Ubuntu to stay ahead of the curve and check for stability, or into Windows 10 should I need to work within a situation that calls for it.
...And You Get a Robot…And You Get a Robot…
Here however is where more questions arise:
- Where’s the joy in falling back on routine?
- Who is this project for?
I talked earlier about how accessible this kind of technology is, and I believe access needs to be considered in every aspect of a project like this. In that sense I wanted to account for different scales of access to resources.
Enter Tkinter, Python's GUI library, and Google Cloud Platform support, along with a short notebook to mediate a deep learning pipeline with Google Colaboratory support – results fed back to a bucket and then pulled down to update Mini-B once I relinquish manual control and give her autonomy. This way one can outsource to a local server, to a virtual machine instance on a relatively small budget, or use Colab's generous built-in Tesla K80.
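To make that concrete, here's a stripped-down sketch of the sort of Tkinter control panel and bucket upload I'm describing. The bucket name, folder layout and session logic are hypothetical stand-ins, not my actual pipeline.

```python
# Stripped-down sketch: a Tkinter control panel that starts a capture session
# and pushes the collected frames to a Google Cloud Storage bucket.
# Bucket name and folder layout are hypothetical placeholders.
import pathlib
import tkinter as tk
from google.cloud import storage

BUCKET_NAME = "mini-b-training-data"  # placeholder

def upload_session(local_dir="captures", prefix="sessions/latest"):
    client = storage.Client()
    bucket = client.bucket(BUCKET_NAME)
    for image_path in pathlib.Path(local_dir).glob("*.jpg"):
        bucket.blob(f"{prefix}/{image_path.name}").upload_from_filename(str(image_path))
    status.set("Upload complete")

def start_session():
    status.set("Capturing... drive Mini-B around!")
    # the actual capture loop (camera + drive controls) would hook in here

root = tk.Tk()
root.title("Mini-B control panel")
status = tk.StringVar(value="Idle")

tk.Button(root, text="Start capture session", command=start_session).pack(fill="x")
tk.Button(root, text="Upload to bucket", command=upload_session).pack(fill="x")
tk.Label(root, textvariable=status).pack()

root.mainloop()
```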
The beauty of all of this is that I can collect my data wherever I am, regardless of the resources available to me, and train in parallel whilst continuing to collect data and subsequently verify in a real-world scenario. An end-to-end robot eating its own tail in a loop with its user.
I can basically ask Mini-B to begin a multi-shot or single-shot session; she'll give me a countdown and I get to drive her around in the context of what I want to teach her, or keep her static at a location – say, maintaining a clear path to provide the positive class for a binary image classifier, or driving to find a specific object I want her to learn. While I apply a bunch of augmentation techniques (scaling, rotation, resizing) programmatically along the way, this also lets me compare and contrast models with regard to external variables beforehand. A simple user interface also allows me to choose whether I'm piping the resulting photos into a 'blocked' or 'clear' folder for avoidance, or into a new or existing detection project ready to be annotated before training. My target dataset is quite similar to the COCO dataset in that there are certainly plants, people and fruit knocking around my apartment, but in this instance a pretrained model didn't quite cut it in terms of accuracy and detection rate.
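As a flavour of that augmentation step, here's a minimal OpenCV sketch of the scaling, rotation and resizing I mentioned, writing results into the kind of 'clear'/'blocked' class folders described above. The angles, scale factor and target size are illustrative placeholders.

```python
# Minimal augmentation sketch: rotate, scale and resize each captured frame
# before it lands in its class folder ('clear' or 'blocked').
# Angles, scale factor and the 224x224 target size are placeholders.
import pathlib
import cv2

def augment(image, angle=15, scale=1.1, size=(224, 224)):
    h, w = image.shape[:2]
    matrix = cv2.getRotationMatrix2D((w / 2, h / 2), angle, scale)
    rotated = cv2.warpAffine(image, matrix, (w, h))
    return cv2.resize(rotated, size)

def augment_folder(src="captures/clear", dst="dataset/clear"):
    out_dir = pathlib.Path(dst)
    out_dir.mkdir(parents=True, exist_ok=True)
    for path in pathlib.Path(src).glob("*.jpg"):
        image = cv2.imread(str(path))
        for i, angle in enumerate((-15, 0, 15)):
            cv2.imwrite(str(out_dir / f"{path.stem}_{i}.jpg"), augment(image, angle=angle))

if __name__ == "__main__":
    augment_folder()
```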
Check out the video: Through the eyes of a robot
A little fine-tuning and I have a model accurate enough to cater to my needs, namely informing Mini-B's movement and speech, while also providing a platform – using TensorFlow's Object Detection API – to teach Mini-B specific objects that branch out from, but are related to, a given source dataset.
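The detection work itself goes through TensorFlow's Object Detection API and its pipeline configs, which are a bit too sprawling to reproduce here, but the fine-tuning idea underneath is easy to illustrate with a small Keras classification sketch: reuse a pretrained backbone, freeze it, and train a new head on your own classes. The class count, dataset path and hyperparameters below are placeholders, not my actual training setup.

```python
# Illustrative transfer learning sketch (Keras): a frozen ImageNet backbone
# with a small new head trained on Mini-B's own classes.
# Dataset path, class count and hyperparameters are placeholders.
import tensorflow as tf

base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # keep the pretrained features frozen to start with

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1),  # scale pixels to [-1, 1]
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(2, activation="softmax"),  # e.g. 'blocked' vs 'clear'
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

train_ds = tf.keras.utils.image_dataset_from_directory(
    "dataset", image_size=(224, 224), batch_size=32)
model.fit(train_ds, epochs=5)
```

Once a head like that converges you can unfreeze the top of the backbone and continue at a lower learning rate, which is roughly the spirit of what the Object Detection API does when you point it at a fine-tune checkpoint.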
When diving down the transfer learning path I sometimes like to keep an object class related to the dataset the original pretrained model was trained on, as a kind of Inception-style totem, while I fall through the process of creating my own custom model. For me it's an orange or a mandarin – it features in both the COCO and ImageNet datasets, and I can roll it and hold it to verify a robot's mobility and check for overfitting early on. I used it during my first experiments with colour filtering and OpenCV at the beginning of my robot endeavours, and oranges have just stuck around as a reminder of the epic journey it has been so far.
I am of course just experimenting here; a more efficient use of transfer learning could perhaps involve localisation of an array of fruit choices, or even types of citrus fruit. Next post, however, a tiny little robot friend will say hello and provide Mini-B with another object class to detect and interact with, so watch this space.
Of course, I need not limit myself to object detection and obstacle avoidance. I'd love to be able to pick up Mini-B and have her help me decide whether my plants need to be watered – an idea that, while cute in this context, can be and already is being scaled to industry. Perhaps I teach both Mr-B and Mini-B to compose music with generative modelling, based on the people they are exposed to who are willing to take the time to sing them a song. What a beautiful idea, to give a robot a unique fingerprint with the gift of music. My mind reels with what else we can achieve if we all put our minds to it!
In my next post I'll delve a little further under the hood, get a bit more technical, and discuss some of my considerations in trying to breathe life into this character with deep learning. We'll walk through the thought process behind choosing the right convnet for the job, and discuss the merits of transfer learning while taking a look at some initial results using TensorBoard and Seaborn.
For now here’s a sneak peek of what Mini-B can do with the mandatory appearance of an orange! See you next time!