Microsoft has a collection of publicly available artificial intelligence tools that it calls the Cognitive Services.
These services are provided via REST APIs and are available to use anywhere, including software running on other cloud providers, such as AWS or GCP.
Some of the services offered include:
- Computer Vision API: Interpret the objects in a photograph: what they are, what color they are, and even the context of the photograph.
- Face API: Locate and recognize faces in images.
- Translator Speech API: Real-time audio translation of spoken content.
- Speaker Recognition API: Determine who has uttered a given statement based on voice characteristics.
- Bing Autosuggest API: Provide suggested text for search boxes.
- Content Moderation: Detect inappropriate content in text, images, and videos.
There are far more services than these, but many of the Cognitive Services are still in public preview.
The best part of Cognitive Services is that there is a free tier, provided you can accept a rate limit of one request per second and a cap of one free service per subscription. Within those limits, you can put enterprise-grade artificial intelligence to work in your workflows. (There are also paid versions of these services, which are not rate-limited and allow multiple services per subscription.)
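If you later call the free tier from your own code rather than from the portal, you will need to stay under the one-request-per-second limit yourself. The following is a minimal Python sketch of a client-side throttle. The class name and its parameters are hypothetical and are not part of any Microsoft SDK; the injectable clock and sleep functions exist only to make the sketch easy to test.

```python
import time


class OneRequestPerSecond:
    """Hypothetical client-side throttle: spaces calls at least `min_interval` seconds apart."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # returns the current time in seconds
        self._sleep = sleep      # blocks for the given number of seconds
        self._last_call = None   # time at which the previous request was allowed

    def wait(self):
        """Block until the next request may be sent; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last_call is not None:
            delay = max(0.0, self.min_interval - (now - self._last_call))
        if delay > 0:
            self._sleep(delay)
        self._last_call = now + delay
        return delay
```

Call `wait()` immediately before each request. The first call returns at once; later calls sleep only for whatever remains of the one-second window.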
In this guide, we will create a free Cognitive Services Content Moderation service, retrieve its key, then use its API management portal to test some of the available image interpretation services.
Create a Cognitive Services Content Moderation service
To begin, log in to the Azure Portal, click the New button, then click AI + Cognitive Services. Scroll down until you see “Content Moderation”, and click that.
Click the Create button on the next pane.
- Give the service a name you would like to use. The name only needs to be unique within the resource group.
- Choose your subscription from the pull-down menu.
- Choose a location for the service. Note that the regional availability of Cognitive Services is limited.
- From the pull-down pricing tier menu choose Free F0. Note you can only have one free tier Cognitive Service per subscription.
- Either choose or create a resource group for this deployment.
- You must accept Microsoft’s terms for use of your content for its AI training, along with other requirements, to use Cognitive Services. Check the box to indicate your agreement.
- You may optionally check the box to pin a tile for this resource to your portal Dashboard.
- Click the Create button.
Obtain and copy API key
Close all tabs and wait for deployment to complete. Once done, open your resource group, and you should see that your Cognitive Services Content Moderation service is available.
Note that the Content Moderation Cognitive Service has versions for images, video, and text, as well as several methods to interpret content, including facial recognition, optical character recognition, and more.
- Click the link to the Cognitive Services account to open its details blades.
- In the left-hand blade, scroll down until you see Keys.
- Click the Keys link.
- Copy the value for Key 1. You will need it to run your queries.
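If you intend to use the key from a script later, it is safer to keep it out of your source code. Here is a small Python sketch that reads it from an environment variable. The variable name `CONTENT_MODERATOR_KEY` is an assumption made for this example; Azure does not set it for you.

```python
import os


def load_subscription_key(env_var="CONTENT_MODERATOR_KEY", environ=os.environ):
    """Return the API key stored in `env_var` (a hypothetical variable name).

    Raises RuntimeError if the variable is missing or blank, so a script
    fails early instead of sending requests without a key.
    """
    key = environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your Key 1 value")
    return key
```

Set the variable in your shell before running your script, then call `load_subscription_key()` wherever the key is needed.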
Open the Cognitive Services API portal
With your key in hand, you’re ready to use the API portal.
- Click the Quick start button in the left-hand blade.
- Click the Content Moderator API reference link in the right-hand blade.
- In the documentation page that comes up, click the Image Moderation API Reference link.
You should now be in the Cognitive Services API management portal. From here, you can run test queries on Content Moderation for images.
Evaluate a ‘racy’ photo
By default the Evaluate endpoint is selected. This API judges whether an image contains adult or suggestive content. We will begin with this endpoint.
Click the blue button that corresponds to your Cognitive Services account’s location. In this guide, my region is West Central US, but click whichever one matches the region where you created your account.
You are now in the appropriate API testing page for your region’s Cognitive Services endpoint.
- Leave the CacheImage value blank. You can cache images you send to the API to improve performance, but we are working with a low volume; therefore, we will see little benefit from caching.
- Leave Content-Type as application/json.
- For Ocp-Apim-Subscription-Key, enter the key value you copied from your Cognitive Services account earlier.
- Scroll down until you see the heading “Request Body,” with a text area that contains JSON. You will see a default HTTP path to an image in the Value field.
- Replace that path with this one, showing a group of people on a beach: http://www.freestockphotos.name/wallpaper-original/wallpapers/people-sun-beach-1744.jpg
Depending on your workplace, this image may be considered inappropriate. It depicts two women in bikinis and two men in board shorts, seen from the front. This image is never shown in the API management portal, so this guide is safe to perform at work; however, if you visit the image URL in a Web browser to see what the image looks like, it may violate your workplace rules.
- Scroll to the end of the page. You will see your request URL and the raw request you are going to send.
- Click Send.
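The same request can be assembled outside the portal. The sketch below builds it with Python's standard library and does not send anything. The host name pattern and the `/contentmoderator/moderate/v1.0/ProcessImage/Evaluate` path are assumptions based on the request URL the portal displays; compare them with the URL shown on your own testing page before relying on them. The function name is hypothetical.

```python
import json
import urllib.request


def build_evaluate_request(region, subscription_key, image_url, cache_image=None):
    """Build (but do not send) a POST request for the image Evaluate endpoint.

    Assumption: the endpoint follows the pattern below. Check it against the
    request URL shown at the bottom of the portal's testing page.
    """
    url = (
        f"https://{region}.api.cognitive.microsoft.com"
        "/contentmoderator/moderate/v1.0/ProcessImage/Evaluate"
    )
    # CacheImage is optional; omitting it matches leaving the field blank.
    if cache_image is not None:
        url += "?CacheImage=" + ("true" if cache_image else "false")

    # The body points the service at an image by URL.
    body = json.dumps({"DataRepresentation": "URL", "Value": image_url}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": subscription_key,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# To actually send it (requires network access and a valid key):
# with urllib.request.urlopen(build_evaluate_request("westcentralus", key, image_url)) as resp:
#     result = json.load(resp)
```

Keeping construction separate from sending lets you inspect the URL, headers, and body first, much as the portal shows you the raw request before you click Send.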
You should see new content appear beneath the Send button within a moment.
- Scroll to the bottom of the page.
- You should see a Response Status of 200 (HTTP OK).
- The Response Latency tells you how long, in milliseconds, it took from request to response. The first request to a Cognitive Services account triggers a “cold start” of the service and usually takes longer than follow-up requests made a few moments later.
- In the Response Content field, you will see the HTTP headers returned with the response, as well as a JSON-formatted body.
- AdultClassificationScore estimates how likely the image is to be adult content, on a scale from 0 to 1. A score of 1.0 indicates a hardcore pornographic image.
- IsImageAdultClassified is a Boolean you can use to flag an image’s content as adult, rather than having to interpret the score yourself. If the score is high, this value will be true.
- RacyClassificationScore is an alternative measure for the appropriateness of an image. A “racy” image may not be pornographic but may be provocative. Our photograph scores high in this classification but low on the adult classification. A score here of 1.0 indicates the image is provocative.
- IsImageRacyClassified is a Boolean you can use to flag an image as suggestive, rather than having to interpret the score yourself. If the score is high, this value will be true.
- Result will be true if either IsImageAdultClassified or IsImageRacyClassified is true.
- AdvancedInfo will always return an empty array.
- Status will describe the result of the API call.
  - A Code of 3000 indicates success.
  - The Description field will describe the error state of the request.
  - The Exception field will be an object describing the internal exception thrown by the API if one exists; otherwise it will be null.
- TrackingId is an internal reference to the request. You can use it for error tracing, or quote it when reporting a result to Microsoft (for example, if you think an image was mischaracterized as racy or adult).
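To show how these fields might be consumed in code, here is a Python sketch that parses an Evaluate response into a small object. The sample payload is made up for illustration: the field names follow the list above, but the scores and the tracking ID are not real output from the service. The class and function names are hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass
class EvaluateResult:
    adult_score: float
    is_adult: bool
    racy_score: float
    is_racy: bool
    flagged: bool       # the API's Result field
    status_code: int
    tracking_id: str

    @property
    def succeeded(self):
        # A Status Code of 3000 indicates success.
        return self.status_code == 3000


def parse_evaluate_response(raw_json):
    """Turn the JSON body of an Evaluate response into an EvaluateResult."""
    data = json.loads(raw_json)
    status = data.get("Status") or {}
    return EvaluateResult(
        adult_score=float(data["AdultClassificationScore"]),
        is_adult=bool(data["IsImageAdultClassified"]),
        racy_score=float(data["RacyClassificationScore"]),
        is_racy=bool(data["IsImageRacyClassified"]),
        flagged=bool(data["Result"]),
        status_code=int(status.get("Code", 0)),
        tracking_id=data.get("TrackingId", ""),
    )


# Illustrative payload only; these values are invented, not real API output.
SAMPLE_RESPONSE = """
{
  "AdultClassificationScore": 0.02,
  "IsImageAdultClassified": false,
  "RacyClassificationScore": 0.91,
  "IsImageRacyClassified": true,
  "AdvancedInfo": [],
  "Result": true,
  "Status": {"Code": 3000, "Description": "OK", "Exception": null},
  "TrackingId": "example-tracking-id"
}
"""

result = parse_evaluate_response(SAMPLE_RESPONSE)
```

With the invented sample above, `result.flagged` is true because the racy flag is set, even though the adult flag is not.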
Evaluate a ‘safe’ photo
Let’s change images and see the scoring for a “safe” photo.
- Scroll back up to the Request Body section of the page.
- Replace the Value of the JSON object with this URL of a mother, grandmother, and baby: https://cdn.pixabay.com/photo/2017/06/23/00/09/grandparents-2433019_960_720.jpg (Note: Clicking this link leads to a 301 redirect that blocks hotlinking. To see the image, copy the link, open a new tab, paste in the URL, and hit return. The API will be able to retrieve this image correctly.)
- Scroll back down to HTTP Request and click the Send button beneath the text area.
The new scores indicate that this photo is neither racy nor adult.
Detect and read text in an image
This API also supports optical character recognition, or OCR. That is, it can read words in a photo.
- On the left-hand menu, click OCR. You will be sent to the OCR page.
- Click the blue button that corresponds to the region where your Cognitive Services account is created.
- Leave Language as eng.
- Leave CacheImage as false.
- Leave enhanced as false. This setting performs a finer inspection of the shapes in the image, to locate additional possible word matches. This tends to create “false positives” and is unnecessary when using in-focus images with good tonal differences between text and background.
- Leave Content-Type as application/json.
- Provide your API key as the value for Ocp-Apim-Subscription-Key.
- We are going to use the API-provided sample image, so scroll to the bottom of the page and click Send.
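For reference, the OCR settings above map onto query-string parameters of the request. The Python sketch below builds such a request without sending it. The host name pattern and the `/ProcessImage/OCR` path are assumptions based on the request URL the portal displays, so verify them on your testing page. The function name is hypothetical, and you must supply the image URL yourself, since the portal's sample image URL is not reproduced here.

```python
import json
import urllib.parse
import urllib.request


def build_ocr_request(region, subscription_key, image_url,
                      language="eng", cache_image=False, enhanced=False):
    """Build (but do not send) a POST request for the image OCR endpoint.

    Assumption: the endpoint follows the pattern below. Check it against the
    request URL shown at the bottom of the portal's testing page.
    """
    query = urllib.parse.urlencode({
        "language": language,
        "CacheImage": str(cache_image).lower(),   # "true" or "false"
        "enhanced": str(enhanced).lower(),
    })
    url = (
        f"https://{region}.api.cognitive.microsoft.com"
        f"/contentmoderator/moderate/v1.0/ProcessImage/OCR?{query}"
    )
    body = json.dumps({"DataRepresentation": "URL", "Value": image_url}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": subscription_key,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

Passing `enhanced=True` corresponds to switching the enhanced setting on in the portal.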
You will see the response appear below the Send button after a few moments.
- The JSON object returned gives the dimensions of the image, its language, and the content of any text discovered in the image.
- Newlines (\n) and carriage returns (\r) indicate any line breaks detected.
- Note that the AI misses the last two characters of the last line of text (it returns “OURSELVE” instead of “OURSELVES.”). This is probably due to the darker background near those letters. Enhanced mode might improve interpretation; give it a try!
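If you process OCR results in code, the line-break characters make it easy to recover the individual lines. The following Python sketch assumes the detected text arrives in a field named `Text`; the payload is invented for illustration and is not the portal's sample output. The function name is hypothetical.

```python
import json


def extract_ocr_lines(raw_json):
    """Split the detected text on line breaks and drop empty lines."""
    data = json.loads(raw_json)
    text = data.get("Text") or ""
    # splitlines() handles \r\n, \n, and \r alike.
    return [line.strip() for line in text.splitlines() if line.strip()]


# Illustrative payload only; the text is made up, not real API output.
sample = json.dumps({"Language": "eng", "Text": "FIRST LINE \r\nSECOND LINE \r\n"})
lines = extract_ocr_lines(sample)
```

Here `lines` holds the two detected lines with surrounding whitespace removed.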
Sources / Resources
Cognitive Services documentation is at https://docs.microsoft.com/en-us/azure/cognitive-services/
The full list of available services, as well as samples and API references, is also located there.