People are losing it over AI. From analyst predictions, to industry events, to earnings calls, artificial intelligence and machine learning are all the rage. And dollars are flowing in response: the number of AI start-ups getting funding has spiked 4x from 2012 to 2016 (from 160 deals to 658 deals, according to CB Insights) and a recent Forrester study projects a 300% increase in corporate investments from 2016 to 2017. Yet, we’ve seen this hype before. Especially those of us who were at the front lines of AI back in the early days (or at least the mid-90’s!).
Yes, AI is super-exciting. Yes, machine learning (a subset of AI, but sometimes not) promises to revolutionize the way we monitor, model, pattern match and interpret all the big and small data that is swirling around all of us. But no, it’s not all going to happen overnight. And (especially) no, many businesses and consumers still don’t have a clue about the best ways to select, apply and monetize AI for practical, everyday stuff. The following rules should help in this regard – but mainly are envisioned as a (time-tested) sanity check for surviving the latest edition of the AI hype machine.
Rule #1: Scope Matters. Creating a general-purpose thinking machine is hard. Creating an intelligent agent (or bot) that automates a single task or a small set of everyday, repetitive, “standard” tasks is a lot more tractable. Just as the key to early AI – and the Knowledge Management movement that followed – was finding narrow but high value applications like streamlining problem resolution in call centers or processing loan applications, the same type of “think global, act local” approach applied to today’s AI is equally important. For the same reason, starting with small-ish data vs. super-large big data sets can make sense when applying analytical techniques to many non-scientific business applications (more on this next).
Rule #2: Machine Learning is Not Magic. And it’s not easy (for most) to get right the first time either. Experimentation with a tool like RStudio is key, and there are many algorithms to choose from (Bayes, decision trees, regressions, and once-again popular neural network models aka “deep learning”). Of course training deep learning models can be both an art and a science. However, the good news is that an excellent recent article by Andrew Beam shows that you don’t need Google-scale data to use deep learning. I saw this in some of my graduate work when I was training neural nets to do simple pattern recognition, and it’s great to see this type of small data approach continuing to stand the test of time!
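To make the small data point a bit more concrete, here is a minimal sketch of a single-neuron perceptron, the simplest ancestor of today's deep networks, learning a pattern from just four examples. This is not the approach from Beam's article, only a toy illustration; the function names and the tiny data set are made up for the example.

```python
def train_perceptron(samples, epochs=10, lr=1.0):
    """Train one artificial neuron on (features, label) pairs with 0/1 labels."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in samples:
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            pred = 1 if activation > 0 else 0
            err = y - pred  # 0 when correct, +1 or -1 when wrong
            # Classic perceptron rule: nudge the weights toward the right answer.
            weights = [w + lr * err * xi for w, xi in zip(weights, x)]
            bias += lr * err
    return weights, bias


def predict(weights, bias, x):
    """Return 1 if the neuron fires for input x, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


# Hypothetical "small data": four examples of the logical OR pattern.
OR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

weights, bias = train_perceptron(OR_DATA)
print([predict(weights, bias, x) for x, _ in OR_DATA])  # -> [0, 1, 1, 1]
```

A single neuron like this can only learn patterns that a straight line can separate, which is one reason scoping the problem narrowly (Rule #1) matters so much.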
Rule #3: Data is King. Getting close to customers, understanding their journey, tailoring their experience, and selecting just the right offer are all outcomes that are enabled by insights powered by big and small data. Generating these insights in a timeframe and cost that make them readily available to front line teams (and consumers themselves) is where advanced analytics and techniques like deep learning need to go. But as mentioned above, what you get out is very much a function of what (data) you put in. Where will your training data come from? How will you prepare it? Who will test the performance? These are questions as important as what tool or algorithm you’ll use.
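The three questions above (where the data comes from, how it is prepared, who tests performance) map onto a simple workflow. Here is a minimal sketch, assuming a made-up data set of (store visits, accepted offer) pairs; the helper names and the threshold "model" are hypothetical and stand in for whatever tool or algorithm you actually use.

```python
import random


def prepare(raw_rows):
    """Drop rows with missing values; real-world prep is usually far messier."""
    return [(x, y) for x, y in raw_rows if x is not None and y is not None]


def split_train_test(rows, test_fraction=0.25, seed=42):
    """Shuffle, then hold back a slice the model never sees during training."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_threshold(train_rows):
    """Toy 'model': a cut-off halfway between the two classes in the training data."""
    highest_no = max(x for x, y in train_rows if y == 0)
    lowest_yes = min(x for x, y in train_rows if y == 1)
    return (highest_no + lowest_yes) / 2


def accuracy(threshold, test_rows):
    """Share of held-out rows the threshold classifies correctly."""
    hits = sum(1 for x, y in test_rows if (1 if x > threshold else 0) == y)
    return hits / len(test_rows)


# Hypothetical raw data: (store visits, accepted offer?), with two incomplete rows.
RAW = [(0, 0), (1, 0), (None, 1), (2, 0), (3, 1),
       (4, 1), (5, None), (6, 1), (7, 1), (8, 1)]

clean = prepare(RAW)
train, test = split_train_test(clean)
threshold = fit_threshold(train)
score = accuracy(threshold, test)
print(f"{len(train)} training rows, {len(test)} test rows, accuracy {score:.2f}")
```

With a test set this small the accuracy figure swings a lot depending on which rows happen to be held out, which is itself a useful reminder to ask who is testing performance and on what data.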
Rule #4: People Matter. Even as AI systems become more skilled at complex decision-making, and take over some “back of house” functions previously performed by humans, we are a long way from creating virtual human beings, a point that Om Malik made in his excellent piece for The New Yorker last summer. Which is why some of the best, most impactful use cases will continue to augment rather than replace human workers, such as the AI voice analysis and feedback system from Boston-based Cogito that gives real-time guidance to employees as they engage customers on the phone, or the “davis” AI powered virtual assistant for IT ops from APM pioneer Dynatrace.
Rule #5: Consumers Don’t Care About Your Technology. Data nerds want to know what flavor of machine learning you are using. If you are selling to them or other techies, then skip to Rule #6. For everyone else, take note that it’s more important to focus on the “why” than the “how” when selling the value of your AI initiative to internal or external stakeholders. Why is the problem interesting? Why is it hard to solve with traditional (non-AI) approaches? Why is this a repeatable/scalable vs. one-off solution? Even more so, what unique value is AI providing to your initiative or app? And how will you show ROI going forward?
Rule #6: Embedding AI Drives Adoption. Back in the day, the old joke among AI researchers was that when something in AI became successful, it wasn’t called AI any more. Today, many successful AI powered apps and services have AI “in them” but the technology is not apparent to the end-user. And that’s the point really – embedding AI drives adoption. Fortunately there are a growing number of tools to add AI or machine learning or other intelligent capabilities. These include open source development frameworks and engines like Apache Lucene (NLP) and Mahout (ML), Eclipse BIRT (developed by Actuate – now part of OpenText) for embedded analytics and visualization, and RapidMiner for machine learning; embedded analytics specialists like Izenda and Sisense; developer platforms like IBM Watson APIs (conversation, speech, vision) and Microsoft Cognitive Services (decision rules, search, vision); and even custom hardware like Nvidia’s Jetson TX2 card.
Rule #7: Focus on Improving Everyday Work. Much of my research and writing over the past few years has focused on turning small and big data insights into everyday value. For marketers there are established use cases for data-driven marketing (see some of them in the piece I did for DMN a few years back), and there’s also a helpful framework for considering which marketing processes are most likely to be disrupted by AI from the folks at TopRight Partners. And for others looking at the bigger picture, there’s a very cool study (and poster!) on the overall potential of automation in the workplace – “Where machines could replace humans” – from McKinsey that is worth checking out.
As I approach 5 years of writing, speaking, applying and tracking the growth of small data, it seemed like a good time to compile an updated resource guide with the apps, research centers, agencies, influencers, and vendors who have advanced the state of the art and continue to create what is now a vibrant marketplace of tools and ideas. This exercise has provided an opportunity to revisit some early contributors to the small data community, and discover many new players who have embraced some/all of the small data philosophy.
It has also reminded me that there is a real need to continue evolving the notion of business intelligence, even as AI seems to have replaced big data at the top of the hype curve. And for companies to make better use of all the data (and content) they have already collected – not just to inform decision making, but also to share with customers and even monetize via new, derivative data assets.
Meanwhile, the race among brands to provide more personalized, omni-channel, and ultimately immersive consumer experiences will require new data-driven profile management and tracking approaches. At the same time, the expanding Internet of Things (IoT) and wearables adoption promise to shift the focus of analytics further toward embedded use cases and small data, and to create even more devices that both create and consume (more high fidelity) local information.
What remains constant is the diversity of views and opinions about what constitutes small data – where it comes from, who owns it, how much is “small” and why it matters. My approach all along was to recognize each new view as it comes along, but also attempt to mash them up into one workable definition:
Small data connects people with timely, meaningful insights (derived from big data and/or “local” sources), organized and packaged – often visually – to be accessible, understandable, and actionable for everyday tasks.
Of course this framing was actually created back in the summer of 2013 and unveiled in this post. Overall, I’d say the definition has stood the test of time, and works well alongside more recent functional definitions such as Martin Lindstrom’s for brand marketers (“seemingly insignificant behavioral observations…pointing towards an unmet customer need”).
In parallel, I’ve also spent some time over the years looking at how data impacts the customer journey, and how smart apps and products could influence our design thinking. Readers may want to see my talk from the 2016 SPARK Boston event, which has sections on both of these topics.
With this as background, the remainder of this post outlines the first Small Data 100 – 100 of the most noteworthy apps (10), research centers and agencies (10), influencers (40) and vendors/tools (40) that are fueling a new era of data-driven innovation. Of course there are many others who didn’t make the cut, so feel free to nominate additions to my list in the comments below. Thanks for reading!
The first article I published in 2012 (with Mark Fidelman) on the topic of small data focused on the idea of consumer-style and self-service “smart” apps that would bring the power of big data to the masses. And today, some of the best examples of data in action are consumer apps and websites that embrace one or more small data goals (simple, smart, responsive, social). Here are 10 that are especially noteworthy:
Amazon – sure, the site and the company’s mobile apps have fueled many case studies for using data to recommend products and the power of reviews, and there’s Alexa of course, but the move into physical stores is Amazon’s true “gateway to small data” says IMD Switzerland professor Howard Yu in a recent piece in Forbes.
Credit Karma – the freemium credit score app which now claims over 60M users, is a great example of starting with a “killer use case” (get your free credit score), and is evolving into a more full featured personal finance app while continuing to hit on 3 out of the 4 core principles of small data: be simple, smart and mobile.
Fitbit – early on the Nike Fuelband was my poster child for the power of small data, and now that it’s retired, the Fitbit has taken its place as a mass market example of how to deliver real-time, contextual (and personalized) insights, all in a highly intuitive package.
Gilt Groupe – now owned by Hudson’s Bay Company (Saks Fifth Avenue), Gilt was one of the early pioneers in retail flash sales, and has been a testbed for predictive modeling, social engagement and “best offer” style data-driven apps.
Groupon – the daily deal (once) high-flyer is continuing to reinvent itself via analytics acquisitions (Venuelabs) and has mastered how to provide an app experience users want. In fact it topped the latest user sentiment ranking of retail apps by the ARC at Applause.
Kayak – the travel site’s “When to Book” tool (also called price predictor) remains an excellent example of filtering down complex big data into highly consumable, visual small data, including a recommended action: wait or buy now.
Runkeeper – sitting at the intersection of big data (from a 50M strong community) and small data, this type of “self-quantification” app offers a unique glimpse at the future of engagement and ownership of personal data.
Shopkick – now owned by Korea’s SK Telecom, Shopkick is a widely used product discovery and rewards app that uses beacons (already installed in 14000 stores) to integrate local small data on shoppers’ activity with other insights from 15M users.
Tinder – the (in)famous dating app has been in the news (no not for that) as an example of using simple card-swiping interfaces to gather user-generated small data and set up “anticipatory computing” opportunities – see this piece in Medium.
Yelp – the review site is certainly a great model for tapping the power of user-generated content (TripAdvisor is another one), and has long used big and small data to improve their recommendations – see highlights of their approach here.
Research Centers and Programs
While a lot of small data innovation comes from the commercial tech community, several academic and non-profit centers have been critical to the growth of the field as well. Here are 5 of them, followed by 5 agencies that are on the front lines of small data:
Open Knowledge International – Co-founder Rufus Pollock was one of the first to promote the idea of small data as a way to “decentralize data wrangling” back in 2013.
The Wharton School – the school’s online business journal, Knowledge@Wharton, has been a frequent publisher of articles on moving beyond big data.
The Small Data Lab at Cornell Tech – founded by professor Deborah Estrin (of TEDMed fame), the group has expanded to a number of researchers, advisors and grad students working on new apps and infrastructure for all things small data.
MIT Living Lab – multi-disciplinary center exploring the technical and societal impacts of data, including development of a data management platform for collecting and integrating personal small data from smart phones, activity trackers and wearables, campus data, and external sources like social, weather and city data.
United Nations University Small Data Lab – initiatives include those looking at small data and real-time information tools (with Vodafone and Microsoft as corporate partners) to improve local decision making and sustainable development.
Agencies and Consultancies
Jack Morton – the brand agency jumped into the small data pool with a splash early last year, as it announced a partnership with Martin Lindstrom with the modest goal of “leading the small-data revolution” (they are creative guys after all) – see here.
SapientRazorfish – the newly merged firm has been doubling down on its data and AI chops and has some unique perspectives on the role of data in powering the buyer’s journey – see news of the firm’s Microsoft partnership here.
The small data movement has been shaped by a growing number of institutions and individuals. Representing several disciplines ranging from BI and analytics to product design, branding and customer experience strategy, here are 40 key influencers I’ve learned from and in some cases collaborated with over the past 5+ years. A number are fellow analysts or marketers, while others are technologists, editors or authors. All are worth following and getting to know:
Lisa Arthur @lisaarthur – CMO at e-discovery vendor Kcura, former CMO of Teradata Applications, and author of “Big Data Marketing”
Kirk Borne @KirkDBorne – Principal Data Scientist at BoozAllen and prolific contributor to the analytics Twittersphere
Joe Chernov @jchernov – content marketing maven (was VP of content at HubSpot) and now head of marketing at InsightSquared
Rob Ciampa @robciampa – digital strategy and analytics guy, former CMO at Pixability, and author of “YouTube Channels for Dummies”
Dorie Clark @dorieclark – Duke professor, contributor to Forbes and HBR, and author of “Stand Out”
Erica Dhawan @edhawan – former Harvard researcher, expert on collaboration and connectional intelligence, and author of “Get Big Things Done”
Boris Evelson @bevelson – Forrester VP and Principal Analyst who has been a key proponent of democratizing big data via his “Systems of Insight” model
Laura Fitton @pistachio – founder of oneforty.com, co-author of “Twitter for Dummies” and HubSpot’s inbound marketing evangelist
Daniel Gutierrez @AMULETAnalytics – Managing Editor at insideBIGDATA.com, lecturer and author of 4 books on databases and data science technology
Claudia Imhoff @Claudia_Imhoff – author and big data analyst, and founder of the Boulder BI Brain Trust
Esteban Kolsky @ekolsky – customer strategist, enterprise feedback pioneer, Gartner alum, and valued sounding board for my early small data perspectives
Suzanne LaBarre @suzannelabarre – FastCo.Design editor
Charlene Li @charleneli – Principal Analyst at Altimeter and co-author of “Groundswell”
Mitch Lieberman @mjayliebs – former CRM industry executive and now director of research at G2 Crowd, focusing on business modeling and software research
Scott Liewehr @sliewehr – head of Digital Clarity Group, long-time analyst and consultant, and collaborator for my small data research while I worked with DCG
Martin Lindstrom @MartinLindstrom – branding consultant, speaker, and author of the bestselling “Small Data: The Tiny Clues That Uncover Huge Trends”
Maribel Lopez @MaribelLopez – forward-looking tech analyst (formerly at Forrester and IDC) and author of “Right-Time Experiences: Driving Revenue with Mobile and Big Data”
Loie Maxwell @loiemaxwell – former creative executive at CVS and Starbucks, and now VP of Creative at Cartoon Network; Loie is co-author of the work I’ve done around “Designed Serendipity” (see our Forbes piece here)
Ron Miller @ron_miller – TechCrunch enterprise reporter, Contributing Editor at EContent Magazine, and contributor to SocMediaNews
Debbie Millman @debbiemillman – brand consultant, host of Brand Thinking, and author of “Look Both Ways”
Margaret Molloy @MargaretMolloy – branding expert and CMO at Siegel+Gale
Mark Morley @MarkMorley – marketing technologist/IoT guy at OpenText
Don Norman @jnd1er – noted design thinker at UCSD, former VP at Apple, and author of “The Design of Everyday Things”
Annie Pettit @LoveStats – market researcher and guru of questionnaire design
Gregory Piatetsky @kdnuggets – AI, data mining and big data veteran (alum of GTE where I worked back in my AI days) who manages KDnuggets.com
Augie Ray @augieray – well-known customer experience and loyalty researcher at Gartner and former head of social media at USAA
Jenny Rooney @jenny_rooney – Editor of the CMO Network at Forbes
Lisa Joy Rosner @lisajoyrosner – former CMO at Neustar and before then NetBase, where she launched the company’s Brand Passion Index
Nate Silver @NateSilver538 – besides pushing statistics into the mainstream, Nate is all about using big (and small) data to tell stories; author, “The Signal and the Noise”
Aaron Strout @AaronStrout – CMO at agency W2O Group, long time social media guy, and author of “Location-Based Marketing for Dummies”
Edward Tufte @EdwardTufte – statistician and information design pioneer; his books are a must read for those looking to visualize any type of data
Tim Walters @tim_walters – GDPR expert, ex-Forrester, and former colleague who is one of my go-to guys RE digital experiences + disruption
Ray Wang @rwang0 – prolific tweeter, futurist and CEO of Constellation Research
When selecting the 40 vendors to include in this resource guide, I started with some of the tools that I first identified when I launched this blog in 2012 as well as those I covered over the years, and then broadened my research to look at new entrants since that time, along with established SaaS and infrastructure providers who have signaled alignment with the small data movement (either explicitly or via support for one of the key tenets).
Drawing on my definition above and the taxonomy we created for my 2013 study with Digital Clarity Group, the tools I considered work with social, transactional or other online data, are focused on “everyday” business users and tasks, and generally support one or more of the following functions:
- They make multi-source data and content more accessible by collecting and processing it (e.g. social monitoring, mobile analytics, DIY survey tools, data blending tools),
- They allow users to explore, generate and share insights, so data is more understandable by more people (e.g. add-on reports and simple dashboards, visualizers and profiling tools), and
- They make insights more actionable for everyday tasks – or trigger actions automatically (e.g. DIY workflow/applet builders, campaign tools, usability testing solutions)
Adobe – an early supplier/advocate of data-driven marketing, via its Marketing and Analytics Cloud (also sponsored my 2013 study on the topic).
Applause – the original crowdsourced software testing pioneer, Applause also offers usability studies, and a growing range of competitive and product research services; the company closed a $35M Series F round in September 2016.
Attivio – a leader in “cognitive search” for bringing structured and unstructured data together and making insights more accessible to business users (one of my 2013 Vendors Worth Watching); the company raised $31M in March 2016.
Bison Analytics – great example of a (small data) add-on to a broadly used platform, in this case tailored analytics and reporting for QuickBooks.
Brand Networks – social advertising platform using triggers based on real-world (small data) inputs like weather, foot traffic, media etc plus predictive measurement.
Brandwatch – leading social intelligence and market research company with some impressive real-time reputation monitoring.
Cision – Media intelligence platform that acquired Visible Technologies in Sept 2014 (social monitoring), which was one of my small data companies to watch in 2013; the company went public in March 2017.
ClearStory Data – Apache Spark-based next-gen BI player with some slick data discovery, prep, blending and contextual “storyboards” to deliver self-serve insights; the company raised another $10.5M in the summer of 2016.
comScore – the media measurement and analytics provider has long been versed in making complex, cross-platform data accessible and actionable by marketing users.
Ensighten – an omni-channel customer data platform, including solutions for data collection, profiling and data privacy – acquired Anametrix in 2014; the company raised a $53M Series C round in Oct 2015.
Geckoboard – one of the best ways to easily access and visualize KPIs from various transactional data sources (we were a happy user at Placester).
GoodData – one of the early movers bringing the power of big data to the masses, now focused on transforming business data into new revenue generating assets (on my list of the Vendors Worth Watching from Feb 2013).
Google Analytics – still one of the easiest/best ways to track website traffic and turn it into simple insights.
Grow – simple BI reporting and dashboard software for SMBs, the company launched at the end of 2014 to focus on small data and finding actionable insights; the company raised a $9M Series A round in July 2016.
Hortonworks (Onyara) – one of the leaders in open source big data, moved into the IoT and small data arena when it purchased Onyara in 2015.
HubSpot – the company known for Inbound Marketing is also obsessively data-driven, with tools like its free CRM that are consistently simple, smart and social (former client).
IFTTT – Web-based automation service that enables everyday users to create simple “recipes” (applets) that link 2 services together with an action (like automatically adding all tweets with a conference hashtag to a tracking Google spreadsheet).
InsightSquared – provider of business analytics and reporting tools for sales teams, and strong proponent of empowering SMBs to become more data-driven.
Izenda – next-gen BI provider focused on embedding analytics to turn everyday users into citizen data scientists.
Localytics – mobile analytics provider with new offerings for in-app marketing and optimization tools; the company raised another $10M in Sept 2016.
Microsoft – starting with Excel and now its Power BI freemium self-service cloud service, Microsoft has arguably been the first to bring analytics to the masses and continues to promote the value of “thinking small” – see here, here and here
Moz – an all in one SEO and local marketing platform, the company’s analytics tools feature highly actionable visuals and recommendations for everyday marketers; the company raised a $10M Series C round in Jan 2016.
NetBase – NLP-based social analytics tool, their Brand Passion Index/Report is a brilliant example of how to visualize brand relationships (former client).
Nimble – the first pure-play social CRM vendor, the company has always been about simple, smart, data-driven apps that users will actually want to use (one of my 2013 Vendors Worth Watching); the company raised a $9M Series A round in March 2017.
OpenText – global enterprise information management leader with a range of tools for both “big content” and small and big data analytics via its Actuate and Recommind acquisitions.
Optimizely – data-driven experimentation and personalization company for optimizing experiences across web, mobile and connected devices; the company first started promoting small data in 2014 in this post.
QlikTech – While the company has gone more mainstream, it’s still about powering simple, mobile, contextual apps (one of my 2013 Vendors Worth Watching); the company was acquired by Thoma Bravo in June 2016.
SurveyMonkey – leading freemium online survey platform, with goal of making the way people give and take feedback “accessible, easy and affordable.”
Tableau – mass market leader for storytelling with data via easy to use data blending, visualization and analytics.
Talech – a simple retail and restaurant POS system focused on the “small data opportunity” to provide merchants with analytics to run their business better; the company raised a $15M Series B round in Nov 2015.
TIBCO – the analytics and event processing company has made a number of acquisitions (Spotfire, Jaspersoft) that broaden its predictive/embedded capabilities and has been an early proponent of small data – see here.
Trueffect – first-party media and measurement platform focused on using small data to better target and engage customers.
Velocidi – multi-source marketing analytics firm focused on helping marketers bring their campaign data together and “organize and derive insights from it” – see coverage here from siliconANGLE; raised $12M Series A round in Nov 2016.
WordStream – paid search and social campaign platform, with a widely used free PPC tool – CEO Ralph Folz is another GTE alum.
Yesware – sales productivity and analytics tools for Outlook and Office 365.
Zapier – simple, event-based automation tool for connecting everyday web apps and streamlining repetitive tasks (a super-charged IFTTT)
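For readers curious how the trigger-and-action automation offered by tools like IFTTT and Zapier hangs together, here is a minimal sketch of the idea, using the conference hashtag example from the IFTTT entry above. It does not use either vendor's actual API: the `Applet` class, the event format, and the in-memory "spreadsheet" are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Event = Dict[str, str]  # hypothetical event format: service, user, text


@dataclass
class Applet:
    """One 'recipe': if the trigger matches an event, run the action."""
    name: str
    trigger: Callable[[Event], bool]
    action: Callable[[Event], None]

    def handle(self, event: Event) -> bool:
        if self.trigger(event):
            self.action(event)
            return True
        return False


tracking_sheet: List[List[str]] = []  # stand-in for a tracking spreadsheet


def tweet_has_hashtag(tag: str) -> Callable[[Event], bool]:
    """Build a trigger that fires on tweets containing the hashtag as a word."""
    def check(event: Event) -> bool:
        words = event.get("text", "").lower().split()
        return event.get("service") == "twitter" and tag.lower() in words
    return check


def append_row(event: Event) -> None:
    """Action: add the tweet to the tracking sheet."""
    tracking_sheet.append([event["user"], event["text"]])


applet = Applet("log conference tweets", tweet_has_hashtag("#smalldata"), append_row)

events = [
    {"service": "twitter", "user": "alice", "text": "Great keynote #SmallData"},
    {"service": "twitter", "user": "bob", "text": "Lunch time"},
    {"service": "email", "user": "carol", "text": "#smalldata agenda attached"},
]
fired = [applet.handle(e) for e in events]
print(fired)           # -> [True, False, False]
print(tracking_sheet)  # -> [['alice', 'Great keynote #SmallData']]
```

The appeal for everyday users is that the trigger and the action are independent pieces, so a new recipe is just a new pairing rather than new code. Note that this simple word match would miss a hashtag followed directly by punctuation.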
In a thought provoking interview with CNET published this past week, Fitbit designer Gadi Amit explores the use of wearables in everyday applications – and introduces the notion of “wise” devices that provide just the right information, when and where we need it.
Beyond the fact that Amit’s firm is designing wearables for unique markets like babies (well I guess really for their parents) and pets (!), what struck me in this piece was Amit’s perspective – certainly shaped by his role as president and lead designer at design firm NewDealDesign – on the state of wearables, their future, and our relationship with them. Specifically, in response to a question about how wearables will be integrated into our daily lives, he states:
The interesting thing is when I say that, people immediately jump to the conclusion that we will be cyborgs. My goal with designing this is that we won’t be cyborgs. We actually will become more human and more free from the technology. What we have now in the design business is two camps: there is the camp that wants to create a lot of data and wants to analyse a lot of data; and there is the other camp which I belong to that tries to create devices that are not smart, they are actually wise. They are more than smart, they are wise enough to understand you, to filter and allow you to go on with your life with all their data processing in the background giving you hints of what is essential when it is essential.
Having data processing in the background and focusing on what information is essential is of course very much in line with the small data “aesthetic” we’ve been promoting here and in a number of venues over the past 2 years, so it’s cool to hear validation from another corner. As a former AI/machine learning guy, I also like the idea of “wise” devices that understand context and personal preferences, and can make a case that small data will in fact be the new “OS” for these devices (more in a future post).
But even more so, if we think of the cyborg comment as a challenge to all of us, I think we need to consider the element of “humanness” as we create new apps and digital experiences. And perhaps provide better opportunities and incentives to untether/unplug (partially?) from our digital devices, even as consumers clamor for faster, more personal, more portable, and ultimately more satisfying data-enriched experiences.
Designing Data-driven Apps
Speaking of the new data consumer, I’ve been spending more time with developers and those thinking about the future of customer facing apps, and recently created a talk on design principles that builds on some of the work you’ve read about on this very blog. As always I believe that data-driven design is an art and a science, so it’s been fun to brush up on the science/tech part for sure.
Of course our first job is still to think about the end-consumer, and how we can inform, connect, and motivate them to get involved or take action. As an aside, if you’ve paid attention to how I’ve presented this last point, I’ve always used Nike Fuelband as my example, so with news that Nike is getting out of the fitness hardware business (good analysis in this Gigaom piece), it’s been interesting to see Fitbit and even Samsung step up their efforts ahead of the likely fall iWatch debut.
On the business side, beyond understanding the value of data along the customer journey and focusing on “last mile” functionality, having a scalable foundation that can potentially support millions of users and large data sets from many sources (before it is transformed into useful small data) is essential as we look to bring powerful, yet human-scale, smart (wise) apps to the masses. So is a community to drive innovation – like the 3.5 million BIRT developers, or 600K+ Drupal users and coders.
Many of these ideas (and some examples) were covered in the talk I did with SD Times recently. There’s a link to the replay and a summary by my colleague Fred Sandsmark on the Actuate blog – which you can read here.
I also presented a longer version focused on bringing the power of advanced analytics to “everyday tasks” at the CAMP IT big data event this past week (a well-produced event, by the way), and plan to post those slides to my slideshare shortly.
Finally, I will be moderating a very cool expert panel on “building the next big app” at a special event Actuate is hosting in San Jose on the evening of July 10. Scheduled to join me on stage will be Eclipse Foundation Executive Director Mike Milinkovich, plus industry watcher and enterprise apps futurist Esteban Kolsky, along with 1-2 other special guests. We’ll explore how consumer experiences are being (and will be) shaped by new devices and data, open source driven innovation, and next-generation design tools and practices.
Be sure to let me know if you’ll be in the area and want to join us, since I have a limited number of VIP passes to share.