The MIT License (MIT)
Copyright (c) 2016, Lumen Novus Incorporated d/b/a SharePoint Experience
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
In [1]:
%matplotlib inline
import requests
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
import spacy
from gensim import corpora, models, similarities
from lxml import etree
from bs4 import BeautifulSoup
from IPython.core.display import display, HTML
import re
In [2]:
pd.set_option('display.max_colwidth', -1)
plt.rcdefaults()
nlp = spacy.English()
Time to define which SharePoint Online site will be used and the credentials for signing into it. Since SharePoint Online uses SAML-based authentication, a template is created for the security token request.
You'll need to update the four variables here to match your SharePoint Online environment. If you're not using SharePoint Online, you'll need to change the code from the following cell all the way down to where sp_request is defined.
In [3]:
# Your SharePoint Online domain root goes here, including the / at the end
endpoint = 'https://????.sharepoint.com/'
# Your SharePoint Online user goes here (the user you want to use for authenticating)
username = '????@????.onmicrosoft.com'
# And your password. Imagine that.
password = 'pass@word1'
# The URL to the root of the site collection you want to use goes here, including the / at the end
the_site = 'sites/contoso/Employee/ITWeb/Information%20Technology/'
api_base = '{}{}_api/web/'.format(endpoint, the_site)
request_headers = {
    'Accept': 'application/json',
    'odata': 'verbose',
}
saml_request_template = '<s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope" xmlns:a="http://www.w3.org/2005/08/addressing" xmlns:u="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd"><s:Header><a:Action s:mustUnderstand="1">http://schemas.xmlsoap.org/ws/2005/02/trust/RST/Issue</a:Action><a:ReplyTo><a:Address>http://www.w3.org/2005/08/addressing/anonymous</a:Address></a:ReplyTo><a:To s:mustUnderstand="1">https://login.microsoftonline.com/extSTS.srf</a:To><o:Security s:mustUnderstand="1" xmlns:o="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd"><o:UsernameToken><o:Username>{username}</o:Username><o:Password>{password}</o:Password></o:UsernameToken></o:Security></s:Header><s:Body><t:RequestSecurityToken xmlns:t="http://schemas.xmlsoap.org/ws/2005/02/trust"><wsp:AppliesTo xmlns:wsp="http://schemas.xmlsoap.org/ws/2004/09/policy"><a:EndpointReference><a:Address>{endpoint}</a:Address></a:EndpointReference></wsp:AppliesTo><t:KeyType>http://schemas.xmlsoap.org/ws/2005/05/identity/NoProofKey</t:KeyType><t:RequestType>http://schemas.xmlsoap.org/ws/2005/02/trust/Issue</t:RequestType><t:TokenType>urn:oasis:names:tc:SAML:1.0:assertion</t:TokenType></t:RequestSecurityToken></s:Body></s:Envelope>'
saml_request_body = saml_request_template.format(
endpoint=endpoint, username=username, password=password)
<s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope" xmlns:a="http://www.w3.org/2005/08/addressing" xmlns:u="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd">
  <s:Header>
    <a:Action s:mustUnderstand="1">http://schemas.xmlsoap.org/ws/2005/02/trust/RST/Issue</a:Action>
    <a:ReplyTo>
      <a:Address>http://www.w3.org/2005/08/addressing/anonymous</a:Address>
    </a:ReplyTo>
    <a:To s:mustUnderstand="1">https://login.microsoftonline.com/extSTS.srf</a:To>
    <o:Security s:mustUnderstand="1" xmlns:o="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd">
      <o:UsernameToken>
        <o:Username>????@????.onmicrosoft.com</o:Username>
        <o:Password>pass@word1</o:Password>
      </o:UsernameToken>
    </o:Security>
  </s:Header>
  <s:Body>
    <t:RequestSecurityToken xmlns:t="http://schemas.xmlsoap.org/ws/2005/02/trust">
      <wsp:AppliesTo xmlns:wsp="http://schemas.xmlsoap.org/ws/2004/09/policy">
        <a:EndpointReference>
          <a:Address>https://????.sharepoint.com/</a:Address>
        </a:EndpointReference>
      </wsp:AppliesTo>
      <t:KeyType>http://schemas.xmlsoap.org/ws/2005/05/identity/NoProofKey</t:KeyType>
      <t:RequestType>http://schemas.xmlsoap.org/ws/2005/02/trust/Issue</t:RequestType>
      <t:TokenType>urn:oasis:names:tc:SAML:1.0:assertion</t:TokenType>
    </t:RequestSecurityToken>
  </s:Body>
</s:Envelope>
An XML document can actually combine a bunch of different grammars, each one identified by a namespace. A tag with a prefix, like <SharePoint:AwesomeSauce>, means that the AwesomeSauce tag comes from the SharePoint namespace.
These namespaces may be useful, so they'll be stored in a Python dictionary for use later.
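As a quick, self-contained sketch of that idea (the SharePoint namespace URI below is made up purely for illustration), here is how lxml represents and queries a namespaced tag:
In [ ]:
# Hypothetical namespace URI, just to illustrate how a prefix maps to a namespace
doc = etree.XML(
    '<SharePoint:AwesomeSauce xmlns:SharePoint="http://example.com/sharepoint">'
    'yum</SharePoint:AwesomeSauce>')
# lxml stores the tag in "Clark notation": the namespace URI in braces, then the local name
print(doc.tag)  # {http://example.com/sharepoint}AwesomeSauce
# XPath queries need the prefix-to-URI mapping passed in explicitly
print(doc.xpath('/sp:AwesomeSauce/text()',
                namespaces={'sp': 'http://example.com/sharepoint'}))  # ['yum']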
In [4]:
nsmap = {
    'S': 'http://www.w3.org/2003/05/soap-envelope',
    'wsse': 'http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd',
    'wsu': 'http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd',
    'wsa': 'http://www.w3.org/2005/08/addressing',
    'wst': 'http://schemas.xmlsoap.org/ws/2005/02/trust'
}
In [5]:
def saml_request(body):
    # SharePoint Online REST API - Get Request Security Token
    # POST https://login.microsoftonline.com/extSTS.srf
    try:
        response = requests.post(
            url="https://login.microsoftonline.com/extSTS.srf",
            headers={
                "Accept": "application/json",
            },
            data=body
        )
        return response
    except requests.exceptions.RequestException:
        print('HTTP Request failed')
In [6]:
s = saml_request(saml_request_body)
saml_response = etree.XML(s.content)
saml_token = saml_response.xpath(
'//wsse:BinarySecurityToken[@Id="Compact0"]', namespaces=nsmap)[0].text
saml_token
Out[6]:
Now that the token is returned, it's time to exchange it with the SharePoint Online site used in this example. It's kinda like logging in, except the SharePoint site isn't given the login and password; it's given the ticket that the earlier request returned to us. It will respond with two cookies (yum!) that will be used again (and again) for any future requests to the SharePoint Online site.
In [7]:
a = requests.post(
url="{}_forms/default.aspx?wa=wsignin1.0".format(endpoint),
data=saml_token
)
In [8]:
authcookies = {
    'FedAuth': a.cookies['FedAuth'],
    'rtFa': a.cookies['rtFa']
}
In [9]:
authcookies
Out[9]:
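Those two cookies get passed explicitly on every request below. As an aside, a requests.Session could carry them instead so they ride along automatically; here's a minimal sketch of that alternative (not used in the rest of this notebook):
In [ ]:
# Alternative sketch (not used below): park the auth cookies on a Session so every
# later call to SharePoint Online sends them without passing cookies= each time
sp_session = requests.Session()
sp_session.cookies.update(authcookies)
# e.g. sp_session.get(api_base + 'siteusers', headers=request_headers)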
In [10]:
def sp_request(path_from_base, fields_to_keep, index_on_id=True):
    # Call the SharePoint REST API with the auth cookies and return the
    # requested fields as a pandas DataFrame
    r = requests.get(
        url="{}{}".format(api_base, path_from_base),
        cookies=authcookies,
        headers=request_headers
    )
    j = r.json()
    v = j['value']
    df = pd.DataFrame(v)
    df = df[fields_to_keep]
    if index_on_id:
        df.set_index('Id', inplace=True)
    return df
In [11]:
df_users = sp_request('siteusers', ['Id', 'Email', 'Title'])
In [12]:
df_users
Out[12]:
In [13]:
df_lists_all = sp_request(
'lists', ['Title', 'Description', 'ItemCount', 'Id', 'Hidden'])
df_lists = df_lists_all[df_lists_all.Hidden == False]
In [14]:
# Top 5 lists
df_lists.sort_values(by='ItemCount', ascending=False).head()
Out[14]:
In [15]:
discussion_id = df_lists[df_lists.Title == 'Discussions List'].index.values[0]
tmp_discussions = sp_request(
path_from_base="lists(guid'{}')/items".format(discussion_id),
fields_to_keep=[
'Title', 'Body', 'AuthorId', 'Id', 'ParentItemID', 'BestAnswerId'],
index_on_id=False
)
In [16]:
tmp_discussions
Out[16]:
Natural Language Processing works better when you're processing a single natural language (in this case, English) rather than text mixed with markup, so the makeplain function uses the Python module BeautifulSoup to strip out all the HTML. A couple of string replacements also remove some stray whitespace characters (zero-width spaces and non-breaking spaces) to help clean up the text.
In [17]:
def makeplain(html):
    # Strip the HTML tags, then remove zero-width spaces and non-breaking spaces
    text = BeautifulSoup(html, 'html.parser').get_text()
    text = text.replace(u'\u200b', '')
    text = text.replace(u'\xa0', '')
    return text
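As a quick check on a made-up discussion body (the HTML below is invented for illustration), makeplain boils the markup down to plain text:
In [ ]:
# A made-up discussion body, just to show the effect of makeplain
sample = '<div><p>Has anyone upgraded to <strong>SharePoint 2016</strong> yet?&nbsp;\u200b</p></div>'
print(makeplain(sample))  # Has anyone upgraded to SharePoint 2016 yet?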
In [18]:
df_discussions = tmp_discussions.merge(
df_users, left_on=tmp_discussions.AuthorId, right_on=df_users.index.values, suffixes=['_disc', '_user'])
df_discussions.set_index('Id', inplace=True)
df_discussions['Body_Text'] = df_discussions.apply(
lambda row: makeplain(row['Body']), axis=1)
df_discussions = df_discussions[['Title_user', 'Title_disc', 'Body_Text', 'Body', 'AuthorId', 'ParentItemID', 'BestAnswerId',
'Email']]
Here, the discussions are grouped by author, and a graph is plotted to show how many posts each user has made and how many posts they've marked as an answer, a metric that may show how much they're truly contributing to the community rather than "leeching" from it.
In [19]:
%matplotlib inline
plt.rcParams.update({
'figure.facecolor': 'white',
'font.size': '24',
'axes.grid': 'true',
})
grouped = df_discussions.groupby(['Title_user']).count().sort_values(
    by='Title_disc', ascending=False)
grouped[['Title_disc', 'BestAnswerId']].plot(
kind='barh', title='Frequent Posters', figsize=(18, 12), colormap='rainbow')
plt.legend(['Posts by User', 'Other Answers Marked by User'])
Out[19]:
In [20]:
best_answer_ids = df_discussions.BestAnswerId.dropna().values
best_answers = df_discussions[df_discussions.index.isin(best_answer_ids)].Body.values
for answer in best_answers:
    display(HTML(answer))
In [21]:
df_discussions.sort_index(inplace=True)
discussion_bodies = df_discussions.Body_Text.values
discussion_bodies[:5]
Out[21]:
Some words don't really contribute to the ability to discern important words or topics. These words, like the, of, altogether, or whereby, are called stopwords. spaCy has its own list of them, and a few bits of punctuation and partial contractions are added to that list. Finally, texts is set to a list of lists of words, one list for each of the original discussions.
In [22]:
my_stop_list = set('the of is - . , ? ! \'s ca n\'t'.split())
stoplist = my_stop_list.union(spacy.en.STOPWORDS)
texts = [
    [word.lower_ for word in nlp(discussion)
     if word.lower_ not in stoplist]
    for discussion in discussion_bodies
]
frequency is set as a dictionary mapping each word to the number of times it appears across all the discussions. Words that appear only once in the whole corpus are then filtered out, and the result is assigned back to texts. Since this is still a list (one entry per discussion) of lists (of the "relevant" words that appear more than once in the entire corpus), it's easily assigned as a new column in the df_discussions DataFrame.
In [23]:
from collections import defaultdict
frequency = defaultdict(int)
for text in texts:
    for token in text:
        frequency[token] += 1
texts = [[token for token in text if frequency[token] > 1]
         for text in texts]
df_discussions['relevant_words'] = texts
In [24]:
i = 0
for text in texts:
    i += 1
    print(i, text)
In [25]:
dictionary = corpora.Dictionary(texts)
dictionary.save('contoso.dict')
In [26]:
print(dictionary.token2id)
Sounds like a new charity, but in this case the gensim library is used to convert the corpus into a representation called bag of words (BoW) and save it to disk. There are a few file formats for serializing such a corpus, and the Matrix Market format is one of the most common, which could be helpful if this corpus ever needs to be processed in its numeric form by another library.
In [27]:
corpus = [dictionary.doc2bow(text) for text in texts]
corpora.MmCorpus.serialize('contoso.mm', corpus)
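To make the bag-of-words structure concrete, here's a tiny toy example (the token ids are assigned by gensim, so the exact numbers may differ):
In [ ]:
# Toy example: each document becomes a list of (token id, count) pairs
toy_texts = [['sharepoint', 'upgrade', 'sharepoint'], ['upgrade', 'workflow']]
toy_dictionary = corpora.Dictionary(toy_texts)
print(toy_dictionary.token2id)  # e.g. {'sharepoint': 0, 'upgrade': 1, 'workflow': 2}
print([toy_dictionary.doc2bow(t) for t in toy_texts])  # e.g. [[(0, 2), (1, 1)], [(1, 1), (2, 1)]]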
Now, a model, or another form of numeric representation of the corpus, is trained based on a technique called Term Frequency / Inverse Document Frequency.
Just because a word appears often in the entire corpus doesn't mean it can be used to classify individual discussions. The stopword list won't catch industry-specific terms, for example. If a word or phrase is used a lot in only a few posts, that word or phrase could be relevant when determining the topic of those discussions. TF/IDF is a great way to identify truly relevant words or phrases. In this example, only single words are considered.
In [28]:
tfidf = models.TfidfModel(corpus, normalize=True)
The corpus is now transformed to match the trained TF/IDF model. Before, the numeric representation was the plain bag of words (how many times each word appears in a discussion); after the transformation, each word's weight is that count scaled down by how many individual discussions across the whole corpus use the word. The actual math is considerably more complex than that:
$$idf(t,D) = \log{\frac{N}{|\{d \in D : t \in d\}|}}$$
where $idf(t,D)$ is the inverse document frequency of the term $t$ in our set of discussions $D$, $N$ is the total number of discussions in the corpus, and the denominator counts the discussions that contain $t$.
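As a quick sanity check with made-up numbers: if the corpus held 20 discussions and a term appeared in only 2 of them, then
$$idf(t,D) = \log{\frac{20}{2}} = \log{10}$$
while a term found in all 20 discussions would get $\log{1} = 0$, so words that show up everywhere contribute nothing to telling discussions apart. (The base of the logarithm varies by implementation and only rescales the weights.)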
In [29]:
corpus_tfidf = tfidf[corpus]
The numeric representation of the corpus is transformed again using another method called Latent Semantic Indexing, which uses some clever maths to discover words that tend to appear together frequently, suggesting they contribute to a shared overall topic. In this case, the number of topics is limited to 5.
In [30]:
lsi = models.LsiModel(corpus_tfidf, id2word=dictionary, num_topics=5)
corpus_lsi = lsi[corpus_tfidf]
topics = lsi.show_topics()
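Each entry of corpus_lsi is one discussion expressed as a list of (topic id, weight) pairs, which is what the best-topic selection a couple of cells below relies on. Here's a quick peek (the weights in the comment are invented for illustration):
In [ ]:
# Each discussion projects onto the 5 LSI topics as (topic id, weight) pairs,
# e.g. [(0, 0.41), (1, -0.12), (2, 0.07), ...]
for doc_topics in list(corpus_lsi)[:3]:
    print(doc_topics)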
In [31]:
topic_words = []
for topic in topics:
    # Strip the numeric weights from each topic string, keeping only the words
    topic = re.split('[^A-Za-z]', topic)
    topic = [token for token in topic if token != '']
    topic_words.append(topic)
    print(topic)
topic_labels = ['Topic A', 'Topic B', 'Topic C', 'Topic D', 'Topic E']
Each discussion's "topic words" are stored with the discussion in the df_discussions dataframe, in case they're useful later. They're sorted by their relative contribution to the overall topic. The topic names and IDs are then calculated and stored in the df_discussions dataframe.
In [32]:
best_topics = []
best_topic_labels = []
lsi_topics = [lsi for lsi in corpus_lsi]
for lsi in lsi_topics:
    if lsi:
        # Sort this discussion's (topic id, weight) pairs by weight and keep the strongest
        lsi.sort(key=lambda tup: tup[1], reverse=True)
        topic_id = lsi[0][0]
        best_topics.append(topic_id)
        best_topic_labels.append(topic_labels[topic_id])
    else:
        best_topics.append('unknown')
        best_topic_labels.append('unknown')
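The cells above don't show the assignment back onto the DataFrame, but since best_topics and best_topic_labels line up row-for-row with df_discussions (both were built from the same sorted order), it would presumably look something like this (the column names are my assumption):
In [ ]:
# Not shown in the original cells; the column names here are assumptions
df_discussions['best_topic_id'] = best_topics
df_discussions['best_topic_label'] = best_topic_labels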
In [33]:
df_discussions
Out[33]:
In [34]:
disc_np = []
disc_op = []
for entry in discussion_bodies:
    doc = nlp(entry)
    # Noun phrases (and the head word each one hangs off of) from spaCy's parser
    nounphrases_and_head = [[np.orth_, np.root.head.orth_] for np in doc.noun_chunks]
    nounphrases = [np.orth_ for np in doc.noun_chunks]
    # Named entities, keeping only organizations and people
    entities = list(doc.ents)
    orgs_and_people = [entity.orth_ for entity in entities if entity.label_ in ['ORG', 'PERSON']]
    disc_np.append(', '.join(nounphrases))
    disc_op.append(', '.join(orgs_and_people))
df_discussions['noun_phrases'] = disc_np
df_discussions['orgs_and_people'] = disc_op
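To see what those two spaCy features pull out, here's a made-up sentence run through the same calls (the exact chunks and labels depend on the model, so treat the commented output as illustrative):
In [ ]:
# Made-up sentence, just to show what noun chunks and entity labels look like
toy_doc = nlp('Jane Smith asked whether Contoso will migrate the intranet to SharePoint Online.')
print([np.orth_ for np in toy_doc.noun_chunks])
# e.g. ['Jane Smith', 'Contoso', 'the intranet', 'SharePoint Online']
print([(ent.orth_, ent.label_) for ent in toy_doc.ents])
# e.g. [('Jane Smith', 'PERSON'), ('Contoso', 'ORG'), ('SharePoint Online', 'ORG')]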
In [35]:
df_discussions
Out[35]:
In [36]:
df_discussions.to_csv('contoso.csv')
Using the list of AuthorId values from the df_discussions DataFrame, the active_users variable is set to a filtered copy of the original df_users DataFrame. The inverse of that list of authors is used to populate inactive_users with another filtered copy of df_users, and users without an email address are dropped from that inactive list.
In [37]:
active_user_ids = set(df_discussions.AuthorId.values)
active_users = df_users[df_users.index.isin(active_user_ids)]
inactive_users = df_users[~df_users.index.isin(active_user_ids)]
inactive_users = inactive_users[~(inactive_users.Email == "")]
display(HTML('<h2>Active Users</h2>'))
display(HTML(active_users.to_html()))
display(HTML('<h2>Inactive Users</h2>'))
display(HTML(inactive_users.to_html()))
In [38]:
%matplotlib inline
from wordcloud import WordCloud
wctext = ' '.join([word for sublist in texts for word in sublist])
# take relative word frequencies into account, lower max_font_size
wc = WordCloud(background_color="white", max_font_size=40)
wc.generate(wctext)
plt.figure(figsize=(12,12))
plt.imshow(wc)
plt.axis("off")
plt.show()