(Music) Welcome to Module 2 of our course on building AI applications. In this module, we’ll explore Watson Discovery, one of the coolest services available on the IBM Cloud. You can think of Watson Discovery as an advanced insight engine. The key goal of Watson Discovery is to leverage Watson’s Artificial Intelligence capabilities to extract insight out of your data. Most companies have a wealth of data. The real challenge is extracting insight and discovering actionable information. This is why the data scientist role has become so prominent in recent years. Data is everywhere. The ability to use it to improve your business is where the real value lies. Okay, I hear you, this still feels somewhat abstract, so let’s consider a practical example. Watson Discovery allows you to load your own data, but it also comes with a convenient Watson Discovery News data set containing news articles from the past 60 days. This huge collection is automatically maintained and updated for you, allowing you to easily analyze the news. Now, imagine that you are a journalist, a writer, or anyone who needs to analyze the news. Having access to over ten million news articles is nothing short of amazing, but it’s also quite overwhelming. How do you go about querying such a huge dataset to extract insight out of it? That’s where the magic of Watson Discovery kicks in. Discovery has the ability to apply Artificial Intelligence algorithms to enrich the raw data that you imported (or in the case of the News collection, that was imported for you). Such enrichments include extracting keywords from the documents, tagging concepts, extracting specific entities such as countries, companies, and even people, as well as sentiment analysis. Was a news article about a particular company positive, neutral, or negative?
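To make the idea of enrichment a bit more concrete, here is a minimal sketch of what one enriched news document might look like once keywords, concepts, entities, and sentiment have been attached. The field names, the sample article, and the helper functions are illustrative assumptions, not the exact schema or API that Discovery uses.

```python
# Hypothetical, simplified shape of one enriched news document.
# Field names and values are made up for illustration; the real
# Discovery output is richer and may be structured differently.
enriched_doc = {
    "title": "Example article about an electric car maker",
    "enriched_text": {
        # Document-level sentiment: positive, neutral, or negative.
        "sentiment": {"document": {"label": "positive", "score": 0.72}},
        "keywords": [{"text": "electric vehicle"}, {"text": "battery range"}],
        "concepts": [{"text": "Electric car"}],
        # Entities carry a type, a name, and their own sentiment score.
        "entities": [
            {"type": "Company", "text": "Tesla", "sentiment": {"score": 0.81}},
            {"type": "Company", "text": "Mercedes-Benz", "sentiment": {"score": 0.10}},
            {"type": "Person", "text": "Elon Musk", "sentiment": {"score": 0.65}},
        ],
    },
}


def entities_of_type(doc, entity_type):
    """Return the names of all detected entities of the given type."""
    entities = doc["enriched_text"]["entities"]
    return [e["text"] for e in entities if e["type"] == entity_type]


def document_sentiment(doc):
    """Return the overall sentiment label attached to the document."""
    return doc["enriched_text"]["sentiment"]["document"]["label"]


companies = entities_of_type(enriched_doc, "Company")
label = document_sentiment(enriched_doc)
```

With a structure along these lines, questions such as "which companies does this article mention?" or "is the article positive?" become simple lookups rather than something you have to work out by reading the text.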
Discovery allows us to query this enriched data. We can build our own queries through natural language, a visual builder, or by using a specific syntax employed by Discovery, known as the Discovery Query Language. In the case of Watson Discovery News we even have some sample queries for immediate gratification and to show us how that syntax is written. What’s also great about these sample queries is that they give you an idea of the sort of insight that you can extract using Watson Discovery. For example, using this news collection, we can find out the top 10 companies in the news that received the most positive coverage, or AI companies that were recently acquired, or even the most mentioned people in the tech industry. Imagine trying to do this manually without a service like Discovery. These queries are possible because Watson Discovery enriches the documents. Take an article about Tesla, for example. Watson will read the article, determine whether it’s positive, negative, or neutral, detect relevant people (for example, Elon Musk) and companies mentioned within it (for example, Tesla and Mercedes-Benz), and so on. You’ll get more familiar with the Discovery Query Language in the labs for this module, but to give you an idea of what it looks like, let’s zoom in on this particular query for the most positively talked about companies in the media. We start off with enriched_text, which is the text of a news article, plus all the nice enrichments Watson added for us. From there, we select the entities within the document. Entities are special values that were detected by Watson. Think companies, people, cities, et cetera. We then filter for a specific type of entity. In our case, companies. But we only want the subset of mentions that are positive, so we use the sentiment analysis enrichment to ask for a high positive sentiment score.
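As a rough sketch, the kind of query being described can be assembled as a plain string. The small Python helper below is hypothetical and only builds the text of an aggregation; it does not call the service. The field paths under `enriched_text.entities`, the `nested`/`filter`/`term` structure, the 0.8 threshold, and the result count of 10 are assumptions about how such a query might be written, so check them against the sample queries in the Discovery tooling.

```python
def build_entity_aggregation(entity_type="Company", comparison=">=",
                             score=0.8, count=10):
    """Compose an aggregation string in the style of the Discovery Query Language.

    The field paths and the default threshold are assumptions made for
    illustration; this helper only produces text and never contacts Discovery.
    """
    if comparison not in {">", ">=", "<", "<="}:
        raise ValueError(f"unsupported comparison: {comparison!r}")

    entities = "enriched_text.entities"
    return (
        # Look inside the entities detected in each enriched document.
        f"nested({entities})"
        # Keep only entities of the requested type, for example Company.
        f".filter({entities}.type:{entity_type})"
        # Keep only mentions whose sentiment score passes the threshold.
        f".filter({entities}.sentiment.score{comparison}{score})"
        # Group the remaining mentions by entity name and return the top ones.
        f".term({entities}.text,count:{count})"
    )


positive_companies = build_entity_aggregation()
negative_people = build_entity_aggregation("Person", "<=", -0.8)
print(positive_companies)
print(negative_people)
```

Parameterizing the entity type and the score comparison keeps the overall shape of the query fixed, which makes it easy to see that only two small pieces change between different questions about the same collection.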
Finally, we specify that we want to aggregate the results by the name of the entity, for example, Netflix or Facebook. If we wanted to find out the people who received negative media coverage, we could simply change the entity type filter to be Person instead of Company, and adjust the sentiment score requirements. Sometimes you’ll find the same name in both the top positive and negative lists. This is not uncommon for very famous politicians as they tend to receive polarizing coverage in the press, so they might receive lots of positive mentions but also lots of criticism. Don’t worry too much about mastering the details of the query language right away. There are simpler ways to query Discovery using either natural language or visual mode. Nevertheless, it’s important that you know that the advanced option is there if you need it. And the more you use Discovery and play with it, the more familiar you’ll become with its powerful query language. Having access to news data is great, but most businesses are likely to be interested in being able to query their own data and documents. Discovery allows you to easily connect various data sources to create your own collection. There is integration with Salesforce, SharePoint, Box, and IBM Cloud Object Storage, for example. You can also get Discovery to crawl a website for information, or upload documents directly, including PDFs. Your business might have domain-specific entities, so Discovery even offers you the ability to create custom machine learning models for the identification and classification of entities that are pertinent to your specific business. This is a more advanced topic that will require integration with another Watson tool, namely Watson Knowledge Studio, which allows you to create and train models. As far as this course goes, we are not going to leverage such functionality, but it’s important that you know it’s there if you need it.
In fact, feel free to explore Discovery and what it has to offer beyond what we cover in this course, by perusing the official documentation. (Music)