In [1]:
from IPython.display import YouTubeVideo
YouTubeVideo('4m44aPkLY2k')


Out[1]:

A tutorial series on using and developing WeChat chatbots

A workshop to develop & use an intelligent and interactive chat-bot in WeChat

http://www.KudosData.com

by: Sam.Gu@KudosData.com

May 2017

Lesson 4: Natural Language Processing: Semantic and Sentiment Analysis

Lesson 4: Natural Language Processing 2

  • Named-entity detection in message text
  • Sentence-level sentiment analysis of message text
  • Document-level sentiment analysis of the whole message
  • Syntax / grammar analysis of sentences

Flag to indicate the environment to run this program:


In [1]:
# parm_runtime_env_GCP = True
parm_runtime_env_GCP = False

Using Google Cloud Platform's Machine Learning APIs

From the GCP API console, choose "Dashboard" in the left-hand menu, then "Enable API".

Enable the following APIs for your project (search for them) if they are not already enabled:

  1. Google Translate API
  2. Google Cloud Vision API
  3. Google Natural Language API
  4. Google Cloud Speech API

Finally, because we call the APIs from Python (clients are available in many other languages), install the Python client package, which is not installed by default on Datalab.


In [2]:
# Copyright 2016 Google Inc.
# Licensed under the Apache License, Version 2.0 (the "License"); 
# import subprocess
# retcode = subprocess.call(['pip', 'install', '-U', 'google-api-python-client'])
# retcode = subprocess.call(['pip', 'install', '-U', 'gTTS'])

# Below is for GCP only: install audio conversion tool
# retcode = subprocess.call(['apt-get', 'update', '-y'])
# retcode = subprocess.call(['apt-get', 'install', 'libav-tools', '-y'])
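The commented-out install step above can also be run through the interpreter that is executing the notebook, which avoids picking up a different `pip` from the PATH. The following is a minimal sketch under that assumption; the helper names `build_pip_command` and `install_packages` are hypothetical and not part of the workshop code.

```python
import subprocess
import sys


def build_pip_command(packages):
    """Return the argument list that upgrades the given packages with this interpreter's pip."""
    return [sys.executable, '-m', 'pip', 'install', '-U'] + list(packages)


def install_packages(packages):
    """Run pip and return its exit code (0 means success)."""
    return subprocess.call(build_pip_command(packages))


# Show the command without running it; call install_packages(...) to actually install.
command = build_pip_command(['google-api-python-client', 'gTTS'])
print(' '.join(command[1:]))
```

Calling `install_packages(['google-api-python-client', 'gTTS'])` would perform the same two installs as the commented-out lines in one pip invocation.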

Import the libraries we need:


In [3]:
import io, os, subprocess, sys, re, codecs, time, datetime, requests, itchat
from itchat.content import *
from googleapiclient.discovery import build


GCP Machine Learning API Key

First, visit the API console and choose "Credentials" in the left-hand menu. Choose "Create Credentials" and generate an API key for your application. You should normally restrict the key by IP address to prevent abuse; for this demo, leave that field blank and delete the API key once you have finished.

Copy-paste your API Key here:


In [4]:
# Here I read my own API key from a file that is not shared in the GitHub repository:
with io.open('../../API_KEY.txt') as fp: 
    for line in fp: APIKEY = line.strip()  # strip the trailing newline from the key

# Alternatively, uncomment the line below and replace the value with your own GCP API key:
# APIKEY='AIzaSyCvxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
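Keeping the key out of the notebook can also be done with an environment variable. The sketch below is one possible approach, not the workshop's own method: the function name `load_api_key` and the variable name `GCP_API_KEY` are assumptions chosen for illustration. Unlike the cell above, which keeps the last line of the file, it returns the first non-empty line.

```python
import io
import os


def load_api_key(path=None, env_var='GCP_API_KEY'):
    """Return an API key from the environment, falling back to the first non-empty line of a file."""
    key = os.environ.get(env_var, '').strip()
    if key:
        return key
    if path is not None and os.path.exists(path):
        with io.open(path, encoding='utf-8') as fp:
            for line in fp:
                if line.strip():
                    return line.strip()
    raise ValueError('No API key found in ${} or in file {!r}'.format(env_var, path))
```

With this helper, `APIKEY = load_api_key('../../API_KEY.txt')` would behave like the cell above when the environment variable is not set.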

In [5]:
# Below is for Google speech synthesis: text-to-voice API
# from gtts import gTTS

# Below is for Google speech recognition: voice-to-text API
# speech_service = build('speech', 'v1', developerKey=APIKEY)

# Below is for the Google Translation API
# service = build('translate', 'v2', developerKey=APIKEY)

# Below is for the Google Natural Language API
# nlp_service = build('language', 'v1', developerKey=APIKEY)
nlp_service = build('language', 'v1beta2', developerKey=APIKEY)

Base64 encoding of binary media (define media pre-processing functions)


In [6]:
# Import the base64 encoding library.
import base64
# Pass the image data to an encoding function.
def encode_image(image_file):
    with io.open(image_file, "rb") as fp:
        image_content = fp.read()
    if sys.version_info[0] < 3:
        # Python 2: b64encode already returns a str
        return base64.b64encode(image_content)
    else:
        # Python 3: b64encode returns bytes, so decode them into a str
        return base64.b64encode(image_content).decode('utf-8')

# Pass the audio data to an encoding function.
def encode_audio(audio_file):
    with io.open(audio_file, 'rb') as fp:
        audio_content = fp.read()
    if sys.version_info[0] < 3:
        # Python 2: b64encode already returns a str
        return base64.b64encode(audio_content)
    else:
        # Python 3: b64encode returns bytes, so decode them into a str
        return base64.b64encode(audio_content).decode('utf-8')
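The encoding functions above turn binary file content into Base64 text so that it can travel inside a JSON request. A small round-trip sketch, assuming Python 3 and using a few arbitrary bytes in place of a real file, shows that the text form decodes back to the original bytes. The helper names `encode_bytes` and `decode_text` are hypothetical.

```python
import base64


def encode_bytes(raw):
    """Base64-encode raw bytes and return a text string suitable for a JSON body."""
    return base64.b64encode(raw).decode('utf-8')


def decode_text(encoded):
    """Reverse encode_bytes: turn the Base64 text back into the original bytes."""
    return base64.b64decode(encoded)


sample = b'\x89PNG\r\n'  # a few arbitrary bytes standing in for file content
encoded = encode_bytes(sample)
print(encoded)
print(decode_text(encoded) == sample)
```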

Control parameters for the machine intelligence APIs


In [7]:
# API control parameter for Image API:
parm_image_maxResults = 10 # max objects or faces to be extracted from image analysis

# API control parameter for Language Translation API:
parm_translation_origin_language = 'zh' # original language of the text: to be overwritten by TEXT_DETECTION
parm_translation_target_language = 'zh' # target language for translation: Chinese


# API control parameter for speech synthesis (text to voice)
parm_speech_synthesis_language = 'zh' # speech synthesis API 'text to voice' language
# parm_speech_synthesis_language = 'zh-tw' # speech synthesis API 'text to voice' language
# parm_speech_synthesis_language = 'zh-yue' # speech synthesis API 'text to voice' language

# API control parameter for speech recognition (voice to text)
# parm_speech_recognition_language = 'en' # speech API 'voice to text' language
parm_speech_recognition_language = 'cmn-Hans-CN' # speech API 'voice to text' language

# API control parameters for natural language processing: semantic and sentiment analysis
parm_nlp_extractDocumentSentiment = True # Sentiment analysis
parm_nlp_extractEntities = True          # Named-entity detection
parm_nlp_extractEntitySentiment = False  # Only available in v1beta2; Chinese (zh) is not supported yet.
parm_nlp_extractSyntax = True            # Syntax / grammar analysis
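The four NLP flags map one-to-one onto the `features` object of an `annotateText` request. The sketch below assembles such a request body from boolean flags so the mapping can be inspected without calling the API. The function name `build_annotate_body` and its keyword names are hypothetical; the JSON field names are the ones this notebook sends.

```python
def build_annotate_body(text, document_sentiment=True, entities=True,
                        entity_sentiment=False, syntax=True):
    """Assemble the JSON body for an annotateText request from boolean flags."""
    return {
        'document': {'type': 'PLAIN_TEXT', 'content': text},
        'features': {
            'extractDocumentSentiment': document_sentiment,
            'extractEntities': entities,
            'extractEntitySentiment': entity_sentiment,  # v1beta2 only, per the note above
            'extractSyntax': syntax,
        },
        'encodingType': 'UTF8',
    }


body = build_annotate_body(u'hello world', syntax=False)
enabled = sorted(name for name, on in body['features'].items() if on)
print(enabled)
```

Printing the enabled feature names is a quick way to check which analyses a request will ask for before spending API quota.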

Define a small function that calls the natural language processing API:


In [8]:
# Call the Natural Language API (annotateText) on a piece of text
def KudosData_nlp(text, extractDocumentSentiment, extractEntities, extractEntitySentiment, extractSyntax): 
    request = nlp_service.documents().annotateText(body={
                "document":{
                    "type": "PLAIN_TEXT",
                    "content": text
                    },
                "features": {
                    "extractDocumentSentiment": extractDocumentSentiment,
                    "extractEntities": extractEntities,
                    "extractEntitySentiment": extractEntitySentiment, # only available in v1beta2
                    "extractSyntax": extractSyntax,
                    },
                "encodingType":"UTF8"
                })
    responses = request.execute(num_retries=3)        
    print('\nCompleted: NLP analysis API')
    return responses

< Start of interactive demo >


In [9]:
text4nlp = 'As a data science consultant and trainer with Kudos Data, Zhan GU (Sam) engages communities and schools ' \
           'to help organizations make sense of their data using advanced data science, machine learning and ' \
           'cloud computing technologies. Inspire the next generation of artificial intelligence lovers and leaders.'

In [10]:
text4nlp = '作为酷豆数据科学的顾问和培训师,Sam Gu (白黑) 善长联络社群和教育资源。' \
           '促进各大公司组织使用先进的数据科学、机器学习和云计算技术来获取数据洞见。激励下一代人工智能爱好者和领导者。'

In [11]:
responses = KudosData_nlp(text4nlp
                            , parm_nlp_extractDocumentSentiment
                            , parm_nlp_extractEntities
                            , parm_nlp_extractEntitySentiment
                            , parm_nlp_extractSyntax)


Completed: NLP analysis API

In [12]:
# print(responses)

* Named-entity detection in message text


In [13]:
# print(responses['entities'])

In [14]:
# Output labels: 实体 = entity, 实体类别 = entity type, 重要程度 = salience,
# 褒贬程度 = sentiment score, 语彩累积 = sentiment magnitude
for i, entity in enumerate(responses['entities'], 1):
    print('')
    print(u'[ 实体 {} : {} ]\n  实体类别 : {}\n  重要程度 : {}'.format(
          i
        , entity['name']
        , entity['type']
        , entity['salience']
    ))
    # Entity-level sentiment is printed only if the response includes it
    if 'sentiment' in entity:
        print(u'  褒贬程度 : {}\n  语彩累积 : {}'.format(
              entity['sentiment']['score']
            , entity['sentiment']['magnitude']
        ))
    # Print the Wikipedia link when the API returns one for this entity
    metadata = entity.get('metadata', {})
    if 'wikipedia_url' in metadata:
        print('  ' + metadata['wikipedia_url'])

[ 实体 1 : 酷豆数据科学 ]
  实体类别 : OTHER
  重要程度 : 0.21968429
  https://en.wikipedia.org/wiki/Data_science

[ 实体 2 : 顾问 ]
  实体类别 : PERSON
  重要程度 : 0.14425991

[ 实体 3 : Sam Gu ]
  实体类别 : PERSON
  重要程度 : 0.13641302

[ 实体 4 : 培训师 ]
  实体类别 : PERSON
  重要程度 : 0.118462436

[ 实体 5 : 白黑 ]
  实体类别 : OTHER
  重要程度 : 0.118462436

[ 实体 6 : 数据科学 ]
  实体类别 : OTHER
  重要程度 : 0.04529309

[ 实体 7 : 云计算技术 ]
  实体类别 : OTHER
  重要程度 : 0.039374225

[ 实体 8 : 社群 ]
  实体类别 : OTHER
  重要程度 : 0.035438944

[ 实体 9 : 教育资源 ]
  实体类别 : OTHER
  重要程度 : 0.035438944

[ 实体 10 : 机器学习 ]
  实体类别 : OTHER
  重要程度 : 0.03370535

[ 实体 11 : 公司组织 ]
  实体类别 : ORGANIZATION
  重要程度 : 0.030995347

[ 实体 12 : 人工智能爱好者 ]
  实体类别 : PERSON
  重要程度 : 0.021235999

[ 实体 13 : 领导者 ]
  实体类别 : PERSON
  重要程度 : 0.021235999
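Because each entity carries a type and a salience, the list printed above is easy to filter. The sketch below picks out entities of one type and orders them by salience; the function name `entities_by_type` is hypothetical, and the sample dictionary is a hand-copied subset of the entities listed above rather than a live API response.

```python
def entities_by_type(responses, entity_type):
    """Return the names of entities of the given type, most salient first."""
    matches = [e for e in responses.get('entities', []) if e.get('type') == entity_type]
    matches.sort(key=lambda e: e.get('salience', 0.0), reverse=True)
    return [e['name'] for e in matches]


# Hand-copied subset of the output above, in the same shape as the API response.
sample = {'entities': [
    {'name': u'Sam Gu', 'type': 'PERSON', 'salience': 0.13641302},
    {'name': u'顾问', 'type': 'PERSON', 'salience': 0.14425991},
    {'name': u'公司组织', 'type': 'ORGANIZATION', 'salience': 0.030995347},
]}

print(entities_by_type(sample, 'PERSON'))
```

In a chatbot reply, a filter like this could be used to mention only the people or organizations named in a message.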

* Sentence-level sentiment analysis of message text


In [15]:
# print(responses['sentences'])

In [16]:
# Output labels: 语句 = sentence, 褒贬程度 = sentiment score, 语彩累积 = sentiment magnitude
for i, sentence in enumerate(responses['sentences'], 1):
    print('')
    print(u'[ 语句 {} : {} ]\n( 褒贬程度 : {} | 语彩累积 : {} )'.format(
          i
        , sentence['text']['content']
        , sentence['sentiment']['score']
        , sentence['sentiment']['magnitude']
    ))


[ 语句 1 : 作为酷豆数据科学的顾问和培训师,Sam Gu (白黑) 善长联络社群和教育资源。 ]
( 褒贬程度 : 0.8 | 语彩累积 : 0.8 )

[ 语句 2 : 促进各大公司组织使用先进的数据科学、机器学习和云计算技术来获取数据洞见。 ]
( 褒贬程度 : 0.9 | 语彩累积 : 0.9 )

[ 语句 3 : 激励下一代人工智能爱好者和领导者。 ]
( 褒贬程度 : 0.2 | 语彩累积 : 0.2 )

https://cloud.google.com/natural-language/docs/basics

  • score (printed above as 褒贬程度) ranges between -1.0 (negative) and 1.0 (positive) and corresponds to the overall emotional leaning of the text.
  • magnitude (printed above as 语彩累积) indicates the overall strength of emotion (both positive and negative) within the given text, between 0.0 and +inf. Unlike score, magnitude is not normalized; each expression of emotion within the text (both positive and negative) contributes to the text's magnitude, so longer text blocks may have greater magnitudes.

Sentiment sample values:

  • Clearly positive: score 0.8, magnitude 3.0
  • Clearly negative: score -0.6, magnitude 4.0
  • Neutral: score 0.1, magnitude 0.0
  • Mixed: score 0.0, magnitude 4.0
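The sample values suggest a simple way to turn a (score, magnitude) pair into a label: a score far from zero is positive or negative, while a score near zero is mixed when the magnitude is high and neutral when it is low. The sketch below implements that rule. The function name `describe_sentiment` and both cutoff values are assumptions chosen only so that the four sample rows are labelled as listed; they are not thresholds defined by the API.

```python
def describe_sentiment(score, magnitude, score_cutoff=0.25, magnitude_cutoff=1.0):
    """Map a sentiment score and magnitude to a coarse label (cutoffs are illustrative)."""
    if score >= score_cutoff:
        return 'clearly positive'
    if score <= -score_cutoff:
        return 'clearly negative'
    # Score is near zero: strong emotion in both directions reads as mixed.
    if magnitude >= magnitude_cutoff:
        return 'mixed'
    return 'neutral'


for score, magnitude in [(0.8, 3.0), (-0.6, 4.0), (0.1, 0.0), (0.0, 4.0)]:
    print(score, magnitude, describe_sentiment(score, magnitude))
```

The cutoffs would need tuning for real messages, since magnitude grows with text length.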

* Document-level sentiment analysis of the whole message


In [17]:
# print(responses['documentSentiment'])

In [18]:
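# Output labels: 整篇消息 语种 = language of the whole message,
# 褒贬程度 = sentiment score, 语彩累积 = sentiment magnitude
print(u'[ 整篇消息 语种 : {} ]\n( 褒贬程度 : {} | 语彩累积 : {} )'.format(
            responses['language']
            , responses['documentSentiment']['score']
            , responses['documentSentiment']['magnitude']
    ))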


[ 整篇消息 语种 : zh ]
( 褒贬程度 : 0.6 | 语彩累积 : 2 )

* Syntax / grammar analysis of sentences


In [19]:
for token in responses['tokens']:
    print('')
    print(token['text']['content'])   # the token text
    print(token['partOfSpeech'])      # part-of-speech tag and morphology
    print(token['dependencyEdge'])    # head token index and dependency label
#     print(token['text'])
#     print(token['lemma'])


作为
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'VERB', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 16, u'label': u'VMOD'}

酷
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'NOUN', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 2, u'label': u'SUFF'}

豆
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'AFFIX', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 4, u'label': u'NN'}

数据
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'NOUN', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 4, u'label': u'NN'}

科学
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'NOUN', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 6, u'label': u'POSS'}

的
{u'case': u'GENITIVE', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'PRT', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 4, u'label': u'PS'}

顾问
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'NOUN', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 0, u'label': u'DOBJ'}

和
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'CONJ', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 6, u'label': u'CC'}

培训师
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'UNKNOWN', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 6, u'label': u'CONJ'}

,
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'PUNCT', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 16, u'label': u'P'}

Sam
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'X', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 15, u'label': u'NN'}

Gu
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'X', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 10, u'label': u'FOREIGN'}

(
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'PUNCT', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 13, u'label': u'P'}

白黑
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'UNKNOWN', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 10, u'label': u'APPOS'}

)
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'PUNCT', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 13, u'label': u'P'}

善长
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'UNKNOWN', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 16, u'label': u'NSUBJ'}

联络
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'VERB', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 16, u'label': u'ROOT'}

社群
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'NOUN', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 16, u'label': u'DOBJ'}

和
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'CONJ', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 17, u'label': u'CC'}

教育
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'NOUN', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 20, u'label': u'NN'}

资源
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'NOUN', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 17, u'label': u'CONJ'}

。
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'PUNCT', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 16, u'label': u'P'}

促进
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'VERB', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 22, u'label': u'ROOT'}

各
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'DET', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 26, u'label': u'DET'}

大
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'AFFIX', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 25, u'label': u'PREF'}

公司
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'NOUN', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 26, u'label': u'NN'}

组织
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'NOUN', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 40, u'label': u'NSUBJ'}

使用
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'VERB', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 40, u'label': u'VMOD'}

先进
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'ADJ', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 31, u'label': u'AMOD'}

的
{u'case': u'RELATIVE_CASE', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'PRT', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 28, u'label': u'RCMODREL'}

数据
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'NOUN', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 31, u'label': u'NN'}

科学
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'NOUN', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 27, u'label': u'DOBJ'}

、
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'PUNCT', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 31, u'label': u'P'}

机器
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'NOUN', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 34, u'label': u'NN'}

学习
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'VERB', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 31, u'label': u'CONJ'}

和
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'CONJ', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 31, u'label': u'CC'}

云计算
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'UNKNOWN', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 38, u'label': u'NN'}

技
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'NOUN', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 38, u'label': u'SUFF'}

术
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'AFFIX', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 31, u'label': u'CONJ'}

来
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'ADV', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 40, u'label': u'PRT'}

获
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'VERB', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 22, u'label': u'CCOMP'}

取数
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'UNKNOWN', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 42, u'label': u'NSUBJ'}

据
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'ADP', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 40, u'label': u'CCOMP'}

洞
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'UNKNOWN', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 44, u'label': u'NSUBJ'}

见
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'VERB', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 42, u'label': u'PCOMP'}

。
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'PUNCT', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 22, u'label': u'P'}

激励
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'VERB', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 46, u'label': u'ROOT'}

下
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'ADP', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 46, u'label': u'PRT'}

一
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'NUM', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 49, u'label': u'NUM'}

代
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'NOUN', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 53, u'label': u'UNKNOWN'}

人工智
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'UNKNOWN', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 53, u'label': u'NN'}

能
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'VERB', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 52, u'label': u'AUX'}

爱好
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'VERB', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 53, u'label': u'SUFF'}

者
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'AFFIX', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 46, u'label': u'DOBJ'}

和
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'CONJ', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 53, u'label': u'CC'}

领导
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'VERB', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 56, u'label': u'SUFF'}

者
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'AFFIX', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 53, u'label': u'CONJ'}

。
{u'case': u'CASE_UNKNOWN', u'reciprocity': u'RECIPROCITY_UNKNOWN', u'mood': u'MOOD_UNKNOWN', u'form': u'FORM_UNKNOWN', u'gender': u'GENDER_UNKNOWN', u'number': u'NUMBER_UNKNOWN', u'person': u'PERSON_UNKNOWN', u'tag': u'PUNCT', u'tense': u'TENSE_UNKNOWN', u'aspect': u'ASPECT_UNKNOWN', u'proper': u'NOT_PROPER', u'voice': u'VOICE_UNKNOWN'}
{u'headTokenIndex': 46, u'label': u'P'}

< End of interactive demo >
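
Each token in the dump above carries a `headTokenIndex` and a `label`, which together describe one edge of the dependency tree. The following is a minimal sketch of how those indices could be resolved into readable "token --label--> head" lines. It assumes the token layout returned by the Natural Language API (`text.content` and `dependencyEdge` per token) and that a root token points at its own index; the helper name `describe_dependencies` and the three-token English sample are made up for illustration.

```python
def describe_dependencies(tokens):
    """Turn dependency edges into readable 'token --label--> head' lines."""
    lines = []
    for i, token in enumerate(tokens):
        edge = token['dependencyEdge']
        word = token['text']['content']
        head_index = edge['headTokenIndex']
        if head_index == i:
            # Assumption: a root token refers to its own index as its head.
            lines.append('{} <ROOT>'.format(word))
        else:
            head_word = tokens[head_index]['text']['content']
            lines.append('{} --{}--> {}'.format(word, edge['label'], head_word))
    return lines


# Hypothetical sample in the same shape as the API response, not real API output.
sample_tokens = [
    {'text': {'content': 'I'},
     'dependencyEdge': {'headTokenIndex': 1, 'label': 'NSUBJ'}},
    {'text': {'content': 'like'},
     'dependencyEdge': {'headTokenIndex': 1, 'label': 'ROOT'}},
    {'text': {'content': 'data'},
     'dependencyEdge': {'headTokenIndex': 1, 'label': 'DOBJ'}},
]

for line in describe_dependencies(sample_tokens):
    print(line)
```

With a real response, `responses['tokens']` would be passed in place of `sample_tokens`.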

Define a small helper that formats the NLP analysis results as a text message for the WeChat reply:


In [20]:
def KudosData_nlp_generate_reply(responses):
    nlp_reply = u'[ NLP 自然语言处理结果 ]'
    
    # 1. Document-level sentiment analysis of the whole message
    nlp_reply += '\n'
    nlp_reply += '\n' + u'[ 整篇消息 语种 : {} ]\n( 褒贬程度 : {} | 语彩累积 : {} )'.format(
            responses['language']
            , responses['documentSentiment']['score']
            , responses['documentSentiment']['magnitude']
        )

    # 2. Sentence-level sentiment analysis of the message text
    nlp_reply += '\n'
    for i in range(len(responses['sentences'])):
        nlp_reply += '\n' + u'[ 语句 {} : {} ]\n( 褒贬程度 : {} | 语彩累积 : {} )'.format(
              i+1
            , responses['sentences'][i]['text']['content']
            , responses['sentences'][i]['sentiment']['score']
            , responses['sentences'][i]['sentiment']['magnitude']
        )
                
    # 3. Named-entity detection in the message text
    nlp_reply += '\n'
    for i in range(len(responses['entities'])): 
        nlp_reply += '\n' + u'[ 实体 {} : {} ]\n  实体类别 : {}\n  重要程度 : {}'.format(
              i+1
            , responses['entities'][i]['name']
            , responses['entities'][i]['type']
            , responses['entities'][i]['salience']
        )
        if 'sentiment' in responses['entities'][i]:
            nlp_reply += '\n' + u'  褒贬程度 : {}\n  语彩累积 : {}'.format(
                  responses['entities'][i]['sentiment']['score']
                , responses['entities'][i]['sentiment']['magnitude']
            )
        if responses['entities'][i]['metadata'] != {}:
            if 'wikipedia_url' in responses['entities'][i]['metadata']:
                nlp_reply += '\n  ' + responses['entities'][i]['metadata']['wikipedia_url']
                           
    # 4. Syntax / grammar analysis of the sentences (disabled: the token dump is too long for a chat reply)
#     nlp_reply += '\n'
#     for i in range(len(responses['tokens'])): 
#         nlp_reply += '\n' + str(responses['tokens'][i])
    
    return nlp_reply

In [21]:
print(KudosData_nlp_generate_reply(responses))


[ NLP 自然语言处理结果 ]

[ 整篇消息 语种 : zh ]
( 褒贬程度 : 0.6 | 语彩累积 : 2 )

[ 语句 1 : 作为酷豆数据科学的顾问和培训师,Sam Gu (白黑) 善长联络社群和教育资源。 ]
( 褒贬程度 : 0.8 | 语彩累积 : 0.8 )
[ 语句 2 : 促进各大公司组织使用先进的数据科学、机器学习和云计算技术来获取数据洞见。 ]
( 褒贬程度 : 0.9 | 语彩累积 : 0.9 )
[ 语句 3 : 激励下一代人工智能爱好者和领导者。 ]
( 褒贬程度 : 0.2 | 语彩累积 : 0.2 )

[ 实体 1 : 酷豆数据科学 ]
  实体类别 : OTHER
  重要程度 : 0.21968429
  https://en.wikipedia.org/wiki/Data_science
[ 实体 2 : 顾问 ]
  实体类别 : PERSON
  重要程度 : 0.14425991
[ 实体 3 : Sam Gu ]
  实体类别 : PERSON
  重要程度 : 0.13641302
[ 实体 4 : 培训师 ]
  实体类别 : PERSON
  重要程度 : 0.118462436
[ 实体 5 : 白黑 ]
  实体类别 : OTHER
  重要程度 : 0.118462436
[ 实体 6 : 数据科学 ]
  实体类别 : OTHER
  重要程度 : 0.04529309
[ 实体 7 : 云计算技术 ]
  实体类别 : OTHER
  重要程度 : 0.039374225
[ 实体 8 : 社群 ]
  实体类别 : OTHER
  重要程度 : 0.035438944
[ 实体 9 : 教育资源 ]
  实体类别 : OTHER
  重要程度 : 0.035438944
[ 实体 10 : 机器学习 ]
  实体类别 : OTHER
  重要程度 : 0.03370535
[ 实体 11 : 公司组织 ]
  实体类别 : ORGANIZATION
  重要程度 : 0.030995347
[ 实体 12 : 人工智能爱好者 ]
  实体类别 : PERSON
  重要程度 : 0.021235999
[ 实体 13 : 领导者 ]
  实体类别 : PERSON
  重要程度 : 0.021235999
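
In the reply above, 褒贬程度 is the API's sentiment `score` and 语彩累积 is its `magnitude`. In the Natural Language API the score ranges from -1.0 (negative) to 1.0 (positive), while the magnitude is non-negative and grows with the amount of emotional content in the text. A minimal sketch of how the two numbers could be turned into a coarse label follows; the function name and both thresholds are hypothetical choices for illustration, not values defined by the API.

```python
def describe_sentiment(score, magnitude, neutral_band=0.25, strong_magnitude=2.0):
    """Map a (score, magnitude) pair to a coarse sentiment label.

    neutral_band and strong_magnitude are illustrative thresholds only.
    """
    if score >= neutral_band:
        return 'positive'
    if score <= -neutral_band:
        return 'negative'
    # Score near zero: strong magnitude suggests positive and negative
    # statements cancelling out; weak magnitude suggests little emotion.
    if magnitude >= strong_magnitude:
        return 'mixed'
    return 'neutral'


# Document-level values from the reply above: score 0.6, magnitude 2.
print(describe_sentiment(0.6, 2))
```

Under these thresholds the third sentence in the reply (score 0.2, magnitude 0.2) would be labelled neutral, which shows how sensitive such labels are to the cut-offs chosen.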

Scan the QR code image with the WeChat app to log in automatically


In [ ]:
itchat.auto_login(hotReload=True) # hotReload=True: cache the login state on exit, so restarting within a certain time does not require scanning the QR code again.

In [ ]:
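# Obtain my own nickname and build a regex matching "@<nickname>" plus trailing whitespace
MySelf = itchat.search_friends()
# re.escape keeps characters such as '.' in the nickname from acting as regex operators
NickName4RegEx = '@' + re.escape(MySelf['NickName']) + r'\s*'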
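
The pattern stored in `NickName4RegEx` is later used to strip the "@nickname" prefix from group messages before they are sent for analysis. The sketch below isolates that step so it can be tried without logging in. The helper names and the sample nickname are hypothetical, and it assumes Python 3 string patterns, where `\s` also matches Unicode spaces such as U+2005, which WeChat clients are commonly reported to place after a mention.

```python
import re


def build_mention_pattern(nick_name):
    """Regex matching '@<nick_name>' plus any trailing whitespace."""
    # re.escape makes '.', '*', '(' and similar characters match literally.
    return '@' + re.escape(nick_name) + r'\s*'


def strip_mention(content, nick_name):
    """Remove every '@<nick_name>' mention from a message."""
    return re.sub(build_mention_pattern(nick_name), '', content)


# Hypothetical nickname containing a regex metacharacter.
print(strip_mention('@Sam.Gu hello', 'Sam.Gu'))
```

Without the escaping, a nickname like `Sam.Gu` would also match `SamXGu`, because an unescaped `.` matches any character.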

In [ ]:
# One-on-one chat: run the NLP analysis automatically and return the results as text:
@itchat.msg_register([TEXT, MAP, CARD, NOTE, SHARING])
def text_reply(msg):
    text4nlp = msg['Content']
    # call NLP API:
    nlp_responses = KudosData_nlp(text4nlp
                        , parm_nlp_extractDocumentSentiment
                        , parm_nlp_extractEntities
                        , parm_nlp_extractEntitySentiment
                        , parm_nlp_extractSyntax)
    # Format NLP results:
    nlp_reply = KudosData_nlp_generate_reply(nlp_responses)
    print(nlp_reply)
    return nlp_reply

In [ ]:
# Group chat: when a text message @-mentions me, run the NLP analysis automatically and return the results as text:
@itchat.msg_register(TEXT, isGroupChat=True)
def text_reply(msg):
    if msg['isAt']:
        text4nlp = re.sub(NickName4RegEx, '', msg['Content'])
        # call NLP API:
        nlp_responses = KudosData_nlp(text4nlp
                            , parm_nlp_extractDocumentSentiment
                            , parm_nlp_extractEntities
                            , parm_nlp_extractEntitySentiment
                            , parm_nlp_extractSyntax)
        # Format NLP results:
        nlp_reply = KudosData_nlp_generate_reply(nlp_responses)
        print(nlp_reply)
        return nlp_reply
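
Both handlers call the NLP API and the formatter directly, so a failed request or a response missing an expected key would raise inside the handler and no reply would be sent. One possible way to guard against that is sketched below. `safe_nlp_reply`, `FALLBACK_REPLY`, and the `analyze` / `format_reply` parameters are hypothetical names; in this notebook they would stand in for `KudosData_nlp` (with its flags bound) and `KudosData_nlp_generate_reply`.

```python
FALLBACK_REPLY = 'Sorry, the NLP analysis is unavailable right now.'


def safe_nlp_reply(text, analyze, format_reply, fallback=FALLBACK_REPLY):
    """Run analyze(text) and format the result, returning a fallback on any error."""
    try:
        return format_reply(analyze(text))
    except Exception as err:
        # Log the problem locally; the chat user only sees the fallback text.
        print('NLP request failed: {}'.format(err))
        return fallback


# Stand-in functions for demonstration only; they do not call any real API.
def fake_analyze(text):
    return {'language': 'en', 'length': len(text)}


def fake_format(responses):
    return '[ {} | {} chars ]'.format(responses['language'], responses['length'])


print(safe_nlp_reply('hello', fake_analyze, fake_format))
```

Catching the broad `Exception` is a deliberate simplification for a chat bot that should always answer; a stricter version could catch only the HTTP and key errors it expects.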

In [ ]:
itchat.run()

In [ ]:
# interrupt the kernel, then log out
itchat.logout() # log out safely

Lesson 4: Natural Language Processing 2 (Semantic and Sentiment Analysis)

  • Named-entity detection in message text
  • Sentence-level sentiment analysis of message text
  • Document-level sentiment analysis of the whole message
  • Syntax / grammar analysis of sentences

Next lesson:

Lesson 5: Video Recognition & Processing

  • Recognize objects in video
  • Detect scenery in video
  • Search content in video