This notebook was prepared by Algorithmia. Source and license info is on GitHub.
Reference: Algorithmia Documentation
Table of Contents:
- Face Detection
- Content Summarizer
- Latent Dirichlet Allocation
- Optical Character Recognition
In [ ]:
!pip install algorithmia==0.9.3
In [1]:
import Algorithmia
import pprint
pp = pprint.PrettyPrinter(indent=2)
In [2]:
API_KEY = 'YOUR_API_KEY'
# Create a client instance
client = Algorithmia.client(API_KEY)
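If you prefer not to hard-code credentials in the notebook, a minimal alternative sketch is to read the key from an environment variable instead (the variable name ALGORITHMIA_API_KEY below is an assumption, not something the client requires):
In [ ]:
import os
# Assumed environment variable name; export it in your shell before starting the notebook
API_KEY = os.environ.get('ALGORITHMIA_API_KEY', 'YOUR_API_KEY')
client = Algorithmia.client(API_KEY)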
Uses a pretrained model to detect faces in a given image.
Read more about Face Detection here
In [3]:
from IPython.display import Image
face_url = 'https://s3.amazonaws.com/algorithmia-assets/data-science-ipython-notebooks/face.jpg'
# Sample Face Image
Image(url=face_url)
Out[3]:
In [4]:
input = [face_url, "data://.algo/temp/face_result.jpg"]
algo = client.algo('opencv/FaceDetection/0.1.8')
algo.pipe(input)
# The result image is stored under another algorithm's name because FaceDetection calls ObjectDetectionWithModels
result_image_data_api_path = '.algo/opencv/ObjectDetectionWithModels/temp/face_result.jpg'
# Result Image with coordinates for the detected face region
result_coord_data_api_path = '.algo/opencv/ObjectDetectionWithModels/temp/face_result.jpgrects.txt'
result_file = client.file(result_image_data_api_path).getBytes()
result_coord = client.file(result_coord_data_api_path).getString()
# Show Result Image
Image(data=result_file)
Out[4]:
In [5]:
# Show detected face region coordinates
print('Detected face region coordinates: ' + result_coord)
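If you want the detected regions as numbers rather than a raw string, here is a minimal parsing sketch. It assumes the rects file lists four whitespace-separated integers (x, y, width, height) per face; check the printed string above before relying on that layout.
In [ ]:
# Parse the raw coordinate string into (x, y, width, height) tuples.
# Assumes four whitespace-separated integers per detected face; verify against the output above.
values = [int(v) for v in result_coord.split()]
face_rects = [tuple(values[i:i + 4]) for i in range(0, len(values) - 3, 4)]
print('Parsed face rectangles: ' + str(face_rects))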
In [6]:
# Get a Wikipedia article as content
wiki_article_name = 'Technological Singularity'
client = Algorithmia.client(API_KEY)
algo = client.algo('web/WikipediaParser/0.1.0')
wiki_page_content = algo.pipe(wiki_article_name)['content']
print('Wikipedia article length: ' + str(len(wiki_page_content)))
In [7]:
# Summarize the Wikipedia article
client = Algorithmia.client(API_KEY)
algo = client.algo('SummarAI/Summarizer/0.1.2')
summary = algo.pipe(wiki_page_content.encode('utf-8'))
print('Wikipedia generated summary length: ' + str(len(summary['summarized_data'])))
print(summary['summarized_data'])
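As a quick sanity check on the summarizer, you can compare the summary length against the source article; this sketch only reuses values already computed above.
In [ ]:
# Rough compression ratio: summary characters vs. original article characters
ratio = float(len(summary['summarized_data'])) / len(wiki_page_content)
print('Summary is {:.1%} of the original article length'.format(ratio))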
This algorithm takes a group of documents (anything made up of text) and returns a number of topics (each a set of words) most relevant to those documents.
Read more about Latent Dirichlet Allocation here
In [8]:
# Get up to 20 random Wikipedia articles
client = Algorithmia.client(API_KEY)
algo = client.algo('web/WikipediaParser/0.1.0')
random_wiki_article_names = algo.pipe({"random":20})
random_wiki_articles = []
for article_name in random_wiki_article_names:
    try:
        article_content = algo.pipe(article_name)['content']
        random_wiki_articles.append(article_content)
    except Exception:
        # Skip articles that fail to download or parse
        pass
print('Number of Wikipedia articles scraped: ' + str(len(random_wiki_articles)))
In [9]:
# Find topics from 20 random Wikipedia articles
algo = client.algo('nlp/LDA/0.1.0')
input = {"docsList": random_wiki_articles, "mode": "quality"}
topics = algo.pipe(input)
pp.pprint(topics)
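To eyeball the topics more compactly than the full pretty-printed response, the sketch below lists the highest-weighted words per topic. It assumes the response is a list with one word-to-weight mapping per topic; adjust if your version of the algorithm returns a different structure.
In [ ]:
# Print the top words in each topic.
# Assumes `topics` is a list of {word: weight} dicts; adjust for the structure your algorithm version returns.
for i, topic in enumerate(topics):
    top_words = sorted(topic, key=topic.get, reverse=True)[:5]
    print('Topic {}: {}'.format(i, ', '.join(top_words)))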
Recognize text in your images.
Read more about Optical Character Recognition here
In [10]:
from IPython.display import Image
businesscard_url = 'https://s3.amazonaws.com/algorithmia-assets/data-science-ipython-notebooks/businesscard.jpg'
# Sample Image
Image(url=businesscard_url)
Out[10]:
In [11]:
input = {"src": businesscard_url,
"hocr":{
"tessedit_create_hocr":1,
"tessedit_pageseg_mode":1,
"tessedit_char_whitelist":"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-@/.,:()"}}
algo = client.algo('tesseractocr/OCR/0.1.0')
pp.pprint(algo.pipe(input))
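The call above requests hOCR output, which is HTML markup. If you only want the recognized text, here is a minimal post-processing sketch; it assumes the hOCR string comes back under a 'result' key, so confirm that against the pretty-printed response before relying on it.
In [ ]:
import re

# Re-run the OCR call and keep the response for post-processing.
# Assumes the hOCR markup sits under a 'result' key; confirm against the pretty-printed output above.
ocr_response = algo.pipe(input)
hocr = ocr_response['result'] if isinstance(ocr_response, dict) else ocr_response
# Strip HTML/hOCR tags and collapse whitespace to recover the plain recognized text
plain_text = re.sub(r'<[^>]+>', ' ', hocr)
print(' '.join(plain_text.split()))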