DMI Google Image Scraper to Clarifai Tags
Tag images from Google queries using the Clarifai service.
- Uses the DMI Google Image Scraper
- Requires an API key from Clarifai
- Works as a bookmarklet
- Downloads a CSV
GENERATE BOOKMARKLET
Tagging algorithm (model) | Confidence threshold | tags as list | tags as columns | raw JSON |
---|---|---|---|---|
General purpose model. | | | | |
Recognizes fashion-related items. | | | | |
Celebrities resembling detected faces. | | | | |
Age, gender, ethnic group of found faces. | | | | |
Recognizes food items and dishes. | | | | |
Detects unwanted content: gore, nudity... | | | | |
Identifies nudity ("Not Safe For Work"). | | | | |
Recognizes common visual patterns. | | | | |
Travel and hospitality-related concepts. | | | | |
Wedding-related concepts. | | | | |
HOW TO USE
1. Browse to the DMI Google Image Scraper. This tool simplifies the querying of Google Images.
2. Enter your query in the field titled "Key words", or follow instructions.
3. Click the bookmarklet and WAIT. Your browser then connects to Clarifai to tag the images. This takes some time; a CSV file will download when the tagging is done. Open the JavaScript console for more detailed information during the process.
HELP
What is the purpose of the tool?
When you type a query in Google Images, you get a list of images. This tool lets you download that list for any number of queries. For each image on the list, it adds data from an image recognition service named Clarifai. Clarifai uses machine learning to identify elements of the picture: objects, but also faces and their demographic attributes.
The tool requires a setup process, but once it is done the data can be gathered in one click. The setup can be done entirely on this page. It requires you to get a Clarifai API key, a personal identifier that allows you to get data from the Clarifai service. At the end, it generates a bookmarklet: a mini script embedded in a bookmark. To use it, you just have to type your query in the DMI Google Image Scraper page and click on the bookmarklet to download the data.
How to get a Clarifai API Key
Sign up to Clarifai to get your API key. Just follow the instructions. It does not require any payment or card number, but beyond the first 5000 images per month it will stop working unless you pay for it.
Why an API key? Clarifai tags your images with machine learning techniques. You send it images via the web and it responds with tags. The "API" is the door to the service: it has both an address and a lock that requires a key. The key is personal, and Clarifai uses it to monitor your use, since the service is only free up to 5000 images a month. This tool knows where the address is, but you need to tell it your key. The resulting bookmarklet will only be visible to you, so no one else will get your API key. You can go to Clarifai to see how many of your monthly free queries have been used.
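To make the "address and key" idea concrete, here is a minimal sketch of what a tagging request could look like. It only builds the request and does not send it. The endpoint pattern, the `Authorization: Key ...` header and the request body are assumptions based on Clarifai's v2 REST API and should be checked against Clarifai's own documentation; `build_tag_request`, the model identifier and the image URL are hypothetical placeholders, and the bookmarklet itself may work differently.

```python
import json
import urllib.request

# Assumption: URL pattern of Clarifai's v2 REST API ("the address").
CLARIFAI_URL = "https://api.clarifai.com/v2/models/{model_id}/outputs"


def build_tag_request(api_key, model_id, image_url):
    """Build (but do not send) a request asking one model to tag one image."""
    # Assumed body shape: a list of inputs, each pointing at an image URL.
    body = json.dumps(
        {"inputs": [{"data": {"image": {"url": image_url}}}]}
    ).encode("utf-8")
    return urllib.request.Request(
        CLARIFAI_URL.format(model_id=model_id),
        data=body,
        headers={
            # "The key": identifies you so Clarifai can count your calls.
            "Authorization": f"Key {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values, for illustration only.
request = build_tag_request(
    "YOUR_API_KEY", "HYPOTHETICAL_MODEL_ID", "https://example.com/image.jpg"
)
print(request.full_url)
```

Because the key travels with every request, anyone who has it can spend your monthly quota, which is why it should stay private.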
How to use the settings?
Pick the tagging model(s) relevant to you. See list below. The bookmarklet you generate will use the specified Clarifai models and data formats.
Each model you use spends one API call per image tagged, so the more models you use, the faster you reach the limit of 5000 free API calls per month. Example: if you use 3 models and tag 100 images, each run of the bookmarklet uses 300 API calls.
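The arithmetic above can be written as a small helper to estimate how far the free quota goes. This is only a sketch: the function names are illustrative, and the quota value is the 5000 calls per month stated on this page.

```python
FREE_CALLS_PER_MONTH = 5000  # free quota stated on this page


def api_calls_per_run(n_models, n_images):
    """Each selected model spends one API call per image tagged."""
    return n_models * n_images


def runs_within_free_quota(n_models, n_images):
    """Number of complete bookmarklet runs that fit in the monthly quota."""
    per_run = api_calls_per_run(n_models, n_images)
    return FREE_CALLS_PER_MONTH // per_run if per_run else 0


print(api_calls_per_run(3, 100))       # 300
print(runs_within_free_quota(3, 100))  # 16
```

With 3 models and 100 images per query, the free quota covers 16 full runs in a month.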
Three data formats are available. Each corresponds to a different need, and you can pick multiple. Two of them require a threshold, a number between 0 and 1 that you can set.
1. Tags as list. The easiest data format. Concepts with a confidence score above the specified threshold will appear as a list of tags in a single column. Example:
Image | Concepts |
---|---|
A | fashion, business, leather |
B | leather, retro |
C | business, retro, coffee, shopping |
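As an illustration of how a threshold turns scored concepts into a tag list, here is a minimal sketch. It assumes concepts shaped like the `name`/`value` pairs shown in the raw JSON example on this page; `tags_as_list` is an illustrative name, not the bookmarklet's actual code.

```python
def tags_as_list(concepts, threshold):
    """Join the names of concepts scoring above the threshold."""
    return ", ".join(c["name"] for c in concepts if c["value"] > threshold)


# Scores taken from the raw JSON example for image A.
image_a = [
    {"name": "fashion", "value": 0.99863684},
    {"name": "business", "value": 0.9962599},
    {"name": "leather", "value": 0.97945905},
    {"name": "retro", "value": 0.27526324},
    {"name": "coffee", "value": 0.1743866},
    {"name": "unicorn", "value": 0.0054384},
]

print(tags_as_list(image_a, 0.9))  # fashion, business, leather
```

Lowering the threshold lets more concepts through: with 0.2, "retro" would also appear in the list.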
2. Tags as columns. Still easy, but richer. Concepts with a confidence score above the specified threshold will appear in multiple columns. Each tag has its own column. Example:
Image | Fashion | Business | Leather | Retro | Coffee | Shopping |
---|---|---|---|---|---|---|
A | 1 | 1 | 1 | | | |
B | | | 1 | 1 | | |
C | | 1 | | 1 | 1 | 1 |
Important note: "tags as columns" can count a tag several times for the same image. This happens with models that can recognize multiple items per image. For instance, the model "Demographics" can find multiple faces in a single image, and tag them all. If you have three feminine faces in an image, the column "feminine" will have the value "3" for that image.
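The counting behaviour described in the note can be sketched as follows. This assumes each detected item (for example each face) comes with its own list of scored concepts; the function name and the scores are made up for illustration and do not come from the tool.

```python
from collections import Counter


def tags_as_columns(regions, threshold):
    """Count, per tag, how many detected items score above the threshold."""
    counts = Counter()
    for concepts in regions:
        for concept in concepts:
            if concept["value"] > threshold:
                counts[concept["name"]] += 1
    return dict(counts)


# Hypothetical scores for three faces found in one image.
faces = [
    [{"name": "feminine", "value": 0.97}, {"name": "masculine", "value": 0.03}],
    [{"name": "feminine", "value": 0.88}, {"name": "masculine", "value": 0.12}],
    [{"name": "feminine", "value": 0.91}, {"name": "masculine", "value": 0.09}],
]

print(tags_as_columns(faces, 0.5))  # {'feminine': 3}
```

Each key of the returned dictionary corresponds to one column, and the value is what would appear in that column for the image.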
3. Raw JSON. This format is the hardest to use but the most complete. It simply stores the information returned by Clarifai in a single column. The data is formatted as JSON, which is structured as a tree. It is therefore very hard to use in a spreadsheet but easy to use in a script, and it is a good way to log the results for further use. Note: it also contains concepts under the confidence threshold. Example:
Image | Concepts |
---|---|
A | {"concepts": [ {"id": "ai_GC6FB0cQ", "name": "fashion", "value": 0.99863684}, {"id": "ai_fBH5DFMJ", "name": "business", "value": 0.9962599}, {"id": "ai_2KV5G1Fg", "name": "leather", "value": 0.97945905}, {"id": "ai_XN1QLhwp", "name": "retro", "value": 0.27526324}, {"id": "ai_KWmFf1fn", "name": "coffee", "value": 0.1743866}, {"id": "ai_GC6FB0cQ", "name": "unicorn", "value": 0.0054384} ]} |
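To show why raw JSON is easy to use in a script, here is a minimal sketch that reads a CSV with a JSON column and recovers every score, including those under the threshold. The column names `Image` and `Concepts` follow the example table above and are an assumption about the downloaded file; the sample data is a shortened version of the example row, built in memory so the sketch is self-contained.

```python
import csv
import io
import json

# Build a tiny in-memory CSV standing in for the downloaded file.
raw = json.dumps({"concepts": [
    {"id": "ai_GC6FB0cQ", "name": "fashion", "value": 0.99863684},
    {"id": "ai_XN1QLhwp", "name": "retro", "value": 0.27526324},
]})
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Image", "Concepts"])  # assumed column names
writer.writerow(["A", raw])
buffer.seek(0)


def concepts_by_image(csv_file, image_column="Image", json_column="Concepts"):
    """Map each image to a {concept name: confidence} dictionary."""
    result = {}
    for row in csv.DictReader(csv_file):
        payload = json.loads(row[json_column])
        result[row[image_column]] = {
            c["name"]: c["value"] for c in payload["concepts"]
        }
    return result


scores = concepts_by_image(buffer)
print(scores["A"]["retro"])  # 0.27526324
```

With a real file, `buffer` would be replaced by something like `open("your_file.csv", newline="", encoding="utf-8")`, where the file name is whatever the browser saved.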
What are the different algorithms available?
Clarifai offers multiple algorithms, or "models". Each is trained differently, and recognizes different concepts. Some are more specialized (NSFW only tells if an image is "safe for work" or not) than others (GENERAL recognizes over 11,000 concepts).
Refer to the Clarifai Model Gallery for complete information, or look at the summary below.
General.
General purpose model. Recognizes over 11,000 different concepts.
Examples of concepts:
Afternoon
Art
Beautiful
Bicycle
Happiness
Togetherness
Apparel.
Recognizes fashion-related items.
Examples of concepts:
Blouse
Bracelet
Casual Dress
Fleece Jacket
Loafers
Pant Suit
Celebrity.
Identifies celebrities resembling detected faces.
Examples of concepts:
Marilyn Monroe
Ice Cube
Jennifer Lopez
Angelina Jolie
Jake Gyllenhaal
Demographics.
Predicts the age, gender, and cultural appearance of detected faces.
Examples of concepts:
18
94
feminine
masculine
asian
black or african american
Food.
Recognizes food items and dishes.
Examples of concepts:
Apple
Avocado
Bread
Ice Cream
Sandwich
Steak
Moderation.
Recognizes unwanted content: gore, drugs, nudity.
Examples of concepts:
Gore
Drug
Explicit
Suggestive
Safe
NSFW (Not Safe For Work).
Identifies nudity: "safe for work" or "not safe for work".
It uses only two concepts:
NSFW (Not Safe For Work)
SFW (Safe For Work)
Textures and patterns.
Recognizes common visual patterns.
Examples of concepts:
feathers
woodgrain
petrified wood
glacial ice
veined
metallic
Travel.
Travel and hospitality-related concepts.
Examples of concepts:
Balcony
Beach
Breakfast Buffet
Casino
Kids Area
Restaurant
Wedding.
Wedding-related concepts.
Examples of concepts:
Bouquet
Bride
Cake
Ceremony
Flowers
Groom