Mycroft is the world's first open source voice assistant designed to provide users with the ability to customize their experience in order to create a personalized AI. Additionally, user data is kept private and all queries are deleted in real time by the software. The voice platform can be deployed anywhere or used in conjunction with Mycroft's "Mark" family of devices.
The following is an abridged transcript of a February 15, 2018 interview between Joshua Montgomery, CEO of Mycroft AI, Inc., and the element14 Community team.
Can you give a high level overview of Mycroft and talk about the inspiration behind developing this software and releasing it?
Mycroft is envisioned as an AI that runs anywhere and interacts exactly like a person. The idea being that you can have a real conversation with a smart speaker or an automobile or, for corporate America, a planned sales system—the same type of natural conversation that you would have with another person. It's a big vision—it's something that's going to take a long time to achieve—but it's clearly the vision of Big Tech in Silicon Valley: that eventually all of your devices will be able to hold a conversation and they'll—in Big Tech's case—those conversations will be with them. In our case, our vision is a future where these AIs that inhabit the devices that we use every day, they belong to the users, they represent the users, and they respect the users from the standpoint of privacy and data independence. That's the big vision here at Mycroft.
It seems like a lot of the product descriptions and the press releases really stress that issue of privacy. What is it about transparency and releasing your product as an open source technology that you think consumers find valuable, and why is it of such significance to Mycroft?
Privacy is something that we've largely given up in our day-to-day lives on the Internet. Companies track us and monitor us and monetize us every time we open a web browser. But, you have a choice as to whether or not you use the Internet. There are spaces in your life that are private and personal, that today are isolated and you can't be spied on. The issue with these technologies, as they become ubiquitous, is that they're moving into our homes. They're moving into our bedrooms. They're moving into our vehicles and into all of our personal spaces. And if the story that Silicon Valley's been telling for the last twenty years continues, we'll end up giving up all privacy to just a few companies in Silicon Valley. It's important that people have private space in their lives; it's an important part of being human. Right now, there's a danger that all of this technology will be locked up in the vaults of Silicon Valley and nobody will have access to it without sacrificing their privacy. Our vision around this technology is that the users control their experience, they control their data, they're able to customize it, and they're able to keep their private information private. It's become an issue on the Internet because people spend so much time there, and there's a danger in the future of all of our information being owned and monetized by these third-party companies. We're the bulwark against that. We're building a technology that's open, that can be audited, where it's a stated vision of the technology to respect people's privacy, so that when you install a Mycroft-powered microwave or a Mycroft-powered smart speaker in your home, you know that your activities and your day-to-day life are not being monitored and monetized by some random data analyst in a Big Tech company.
Along with privacy, it seems like another emphasis of the Mark II product is the element of customization. Between those two—privacy and customization—are there any other main selling points that you're really trying to hit on with this product?
Yeah, privacy and customization are both really important, and then, finally, we're doing a lot of innovative things around the user experience and around user agency. The technologies that are coming out of some of the other companies—I view it as hit it and quit it: you ask one question, it answers it or it doesn't, but there's no real opportunity to follow on. There's really no stated objective with those companies to be able to hold a conversation. It's a very utilitarian, in and out, sort of, interaction. Where, for us, we're looking to build something that's a little bit deeper. Our community is working on a variety of different tools. The one that has most recently become available, training.mycroft.ai, takes missed queries and makes them available to the community so the community can answer them. So, if I ask the technology, "What's your favorite color?" it really doesn't have an objective answer to that—that's a subjective query. So, the subjective query gets sent in to the community and the community manages the responses. And for the default Mycroft persona—the one that represents our brand—we're being very careful to make sure that it's very objective and unopinionated about a variety of subjects, because we really do need to provide a neutral playing field for people to stand on. But, people can take that personality and fork it, and they can do whatever they want. They can make Mycroft a fan of the Steelers and they can make Mycroft a fan of the Patriots. They can customize how it responds to various different missed queries: it could be sarcastic, it could be funny, it could be childlike. We're making all of those features available and building them into a conversation engine called Persona that will eventually allow people to have full conversations with the technology. 
I think that that's a very different goal than the goal that has been set for some of these other technology stacks: they're more about HMI and data access than they are about building what I'd call a strong AI. I think the differentiation of goals is very important because we're not building a new human-machine interface, we really are building an AI that interacts like a person. And that's a very different thing. One of the other concepts that we've really been giving a lot of thought to is the concept of user agency. User agency is the idea that the technology that you use represents you and not the company that developed it. We have to really examine, as we use technology, "Is this piece of technology I'm using looking out for my best interest, or is it looking out for the best interest of the company that made it?" And, for us, we want to build a technology stack that looks out for the users first.
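As a rough illustration of the ideas described in this answer (a default persona, forked personas with their own opinions, and missed queries collected for the community to answer), here is a minimal Python sketch. It is not Mycroft's code or API: the `Persona` and `Assistant` classes, the `fork` method, and the `missed_queries` list are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Persona:
    """Hypothetical persona: a name plus answers to subjective queries."""
    name: str
    answers: Dict[str, str] = field(default_factory=dict)

    def fork(self, name: str, overrides: Dict[str, str]) -> "Persona":
        """Copy this persona's answers, then apply the fork's own overrides."""
        return Persona(name, {**self.answers, **overrides})


@dataclass
class Assistant:
    """Hypothetical assistant that answers from a persona and logs misses."""
    persona: Persona
    missed_queries: List[str] = field(default_factory=list)

    def ask(self, query: str) -> str:
        key = query.strip().lower()
        if key in self.persona.answers:
            return self.persona.answers[key]
        # No answer yet: record the miss so people could supply one later.
        self.missed_queries.append(key)
        return "I don't know how to answer that yet."


# The default persona stays neutral; a fork is free to have opinions.
default = Persona("default", {"what's your favorite color?": "I don't have a favorite."})
fan = default.fork("steelers-fan", {"who's your team?": "The Steelers."})

assistant = Assistant(fan)
print(assistant.ask("Who's your team?"))
print(assistant.ask("Tell me a joke"))
print(assistant.missed_queries)
```

The fork starts from the default persona's answers and only overrides what it wants to change, so the neutral baseline stays intact for everyone else.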
I'm wondering if we can get into more of the product specs and talk to me a little bit about the hardware that goes into the Mark II?
When we made the Mark 1, we decided to base it on hacker and maker technologies that were easy to access and had great documentation, which is why we used Raspberry Pi. It was fantastic that, as we went through the development cycle, Raspberry Pi advanced from the Pi 2 to the Pi 3, because then we got Wi-Fi and Bluetooth and a bunch of other features while effectively paying the same price for the part. It turned out to be a really great decision: it made the technology extensible, and it also allowed us to use the same software on the Mark 1 that you can bring down as a Raspberry Pi image and just image to any Pi anywhere. The idea behind the Mark 1 was to empower makers and hackers and developers to help to advance the state of the art of this technology by engaging in creative activities with it. The Mark II is being designed for the consumer market. The idea is that it works just as well for grandma as it does for a maker. The Mark II is also being developed as a completely open hardware technology, the same as the Mark 1, so that if you're a big brand and you want to ship a smart speaker, you can take a Mark II design, send it off to a contract manufacturer and make a million of them. We hope that you use our software stack, but if you want to build your own or use one from another company, you're welcome to. The idea there is that we will get the economies of scale that you would get from a big tech company, without having to build a retail brand that will sell hundreds of thousands of units. The Mark II is completely open source, like the Mark 1, and it has a quad-core processor. We added another gig of RAM to the Mark II, and we're working with Xilinx on the chip set, which is really, really exciting, because Xilinx has an FPGA fabric as part of that chip that allows us to effectively move software onto hardware. 
We'll be using about twenty percent of that fabric to do wakeword spotting and a few other things inside the software stack—beamforming and noise cancellation—but the rest of the fabric remains available for people to do all kinds of crazy, exciting things with it. We're really excited to be bringing a state of the art chip set with noise cancellation, beamforming, lots of processing power, lots of RAM—and then a removable SD, so you can add as much storage as you want—to market for the same price that Big Tech is bringing completely closed and proprietary speakers to market.
You talked a little bit about the importance of being able to scale this product. Going along that line, can you talk about your relationship with Avnet and how they fit into the picture?
As a small company that's really building more of a reference design than a product, it's important that we get the types of economies of scale that are available to companies that would be shipping vastly greater quantities. Between Avnet and Aaware, which is doing our noise cancellation, we're able to access a supply chain that small companies like us would never be able to build on our own, at pricing that's very competitive with what a big tech company would pay, meaning a company that makes hundreds of thousands or millions of units. Those partnerships are important from a logistics standpoint in making sure that we have access to all of the parts we need. They're important from a pricing standpoint, making sure that those prices are in line with our competition. And then they're important from a strategic standpoint, because Avnet and Aaware and our friends at Kickstarter are really great at building communities and helping to promote into the maker and the developer community, which is where we're hoping our software will live.
I'm wondering if you could take a moment to identify any key landmarks or milestones that you encountered in your own journey going from that initial vision to a physical prototype and then finally a product that you brought to market. Were there any really key moments that stood out to you that others might find helpful as they navigate their own journeys?
As a technical person, my inclination is to build a design and then go out and build a prototype, do a couple of iterations, build a production unit, and then take it to market. The issue with that is it's completely backwards. The first thing you should do is envision your product. The second thing you should do is take it to market. Make sure that somebody is actually willing to pay for this thing that you're building. That's where crowdfunding sites are very, very important. I think that the biggest wasted resource in our country is entrepreneurs who are chasing problems that nobody's willing to pay to have solved. Going out to market with your product early, finding out if people are willing to pay for it, doing something called a smoke test, where you go out and you presell a product that doesn't even exist. You actually take the customer's money to make sure that they're willing to pay, and then return it and say, "Hey, we're going to be back in a year with the product." That type of testing is really, really key to building a successful product, because then the next step becomes building a prototype, iterating, and making sure, in an ideal world, that you deliver a product that's significantly better than the one you originally sold. In Mycroft's case with the Mark 1, we originally had a Mark 1 Basic, where, in the back of the Mark 1, we were going to expose a single USB port and an Ethernet port. The USB port would have been used for a Wi-Fi dongle. The product we actually shipped had two RCA ports, 40 pins of IO, four USB ports, an Ethernet port, built-in Wi-Fi, and built-in Bluetooth, and really significantly exceeded the original specifications. The reason we were able to do that is because we had the support of this community, because we had gone out and presold the product and verified there was a market for it. 
It's really important for makers and hackers everywhere who are thinking of doing a commercial product to make sure that people want to buy it before they waste a lot of time and effort building something that nobody wants.
For more information about Mycroft AI, read our in-depth case study highlighting the Mark products and the evolution of the company.