PokitDok Interview Homework - Part 1 - Data Visualization

Data Set: Geographic Variation Public Use File - URL

Sources:

[1]

[2] (Used throughout)

[3] (The visualization I used for D3.js)

Setup:

The following commands clone a GitHub gist that provides a function for parsing the Excel file into a readable dictionary for each row. They then move the script into the working directory and rename it so that it can be imported in the Jupyter notebook.

git clone https://gist.github.com/639082.git
mv 639082/* .
rm -rf 639082/
mv xls-dict-reader.py xls_dict_reader.py
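
The rename in the last command matters because a hyphen is not allowed in a Python module name: `import xls-dict-reader` would be a syntax error. A quick check, assuming Python 3 (where `str.isidentifier` is available):

```python
# Module names used in an `import` statement must be valid identifiers.
original = "xls-dict-reader"
renamed = "xls_dict_reader"

print(original.isidentifier())  # False: '-' is not allowed in an identifier
print(renamed.isidentifier())   # True: usable as `import xls_dict_reader`
```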

I ended up having to do some manual editing of the data set before parsing it into JSON. I originally planned to display the statistics for US > State > County, but the resulting JSON file was too large for simple analysis, so I kept only the national and state-level rows.
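
The trimming was done by hand, but it could also be scripted. A minimal sketch, assuming county-level rows can be recognised by a non-empty "County" column (a hypothetical column name) and using made-up rows purely to show the effect on payload size:

```python
import json

def keep_state_rows(rows, county_key="County"):
    """Drop county-level rows, keeping national and state-level ones.

    `county_key` is a hypothetical column name; adjust it to the real sheet.
    """
    return [row for row in rows if not row.get(county_key)]

# Made-up rows, not taken from the real data set.
rows = [
    {"State": "National", "County": "", "Percent Male": 45.0},
    {"State": "SC", "County": "", "Percent Male": 44.0},
    {"State": "SC", "County": "Charleston", "Percent Male": 43.0},
    {"State": "SC", "County": "Berkeley", "Percent Male": 46.0},
]

state_rows = keep_state_rows(rows)
# Compare the size of the serialized payload before and after trimming.
print(len(json.dumps(rows)), "->", len(json.dumps(state_rows)))
```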

The following code opens the Excel file, uses the gist retrieved above to convert it into a list of dictionaries, and serializes that list to JSON in Python. Finally, the data is attached to the front-end 'window' object so that the JavaScript cells can access it (typically not a good idea, but I figured it would be safe since everything is confined to the notebook). [1]


In [1]:
import json
from xls_dict_reader import XLSDictReader
from IPython.display import Javascript

# Open in binary mode: an .xlsx workbook is not a text file.
with open('Geographic_Healthcare_Data_States.xlsx', 'rb') as xlsFile:
    # Each row comes back as a dict keyed by column header.
    reader = XLSDictReader(xlsFile)
    out = json.dumps([row for row in reader])

# Expose the rows to the JavaScript cells as window.data.
javascript = 'window.data={};'.format(out)
Javascript(javascript)


Out[1]:
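
The hand-off from Python to the browser in the cell above relies on JSON text also being a valid JavaScript literal. A minimal sketch of just that step, without the Excel file; `build_js_assignment` is a hypothetical helper and the row is illustrative:

```python
import json

def build_js_assignment(rows, name="data"):
    """Return a JavaScript statement that stores `rows` on `window`."""
    payload = json.dumps(rows)  # JSON arrays and objects are valid JS literals
    return "window.{}={};".format(name, payload)

# Illustrative row; in the notebook the rows come from XLSDictReader.
rows = [{"State": "National", "Percent Male": 45.0}]
statement = build_js_assignment(rows)
print(statement)
```

Passing `statement` to `IPython.display.Javascript` would then make the rows available to later `%%javascript` cells as `window.data`.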

Import D3 for use in the JavaScript cells.


In [2]:
%%javascript 
require.config({
    paths: {
        d3: '//cdnjs.cloudflare.com/ajax/libs/d3/3.4.8/d3.min'
    }
});


Visualization creation.


In [62]:
%%javascript
require(['d3'], function(d3) {
    //Control the size of the pie charts
    var wrapperWidth = 900;
    var wrapperHeight = 1200;
    
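    //Clean-up in case you run this command multiple times.
    //In the classic notebook `element` is a jQuery object, so unwrap it to reach the DOM node.
    var root = element.get ? element.get(0) : element;
    while (root.firstChild) {
      root.removeChild(root.firstChild);
    }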
    //Creation of HTML elements
    var dropDownDiv = document.createElement("div");
    var wrapper = document.createElement("div");
    wrapper.id = "wrapper";
    var wrapperCss = 'font-family: "Helvetica Neue", Helvetica, Arial, sans-serif; width:'+wrapperWidth.toString()+'px; height: '+wrapperHeight.toString()+'px; position: relative;';
    wrapper.style.cssText = wrapperCss;
    
    //sexPieChartDiv
    var sexVisualizationDiv = document.createElement("div");
    sexVisualizationDiv.id = "sexVisualizationDiv";
    sexVisualizationDiv.style.height = (wrapperHeight/3).toString()+"px";
    sexVisualizationDiv.style.width = (wrapperWidth/2).toString()+"px";
    sexVisualizationDiv.style.display = "inline-block";
    
    //ethnicityPieChartDiv
    var ethnicityVisualizationDiv = document.createElement("div");
    ethnicityVisualizationDiv.id = "ethnicityVisualizationDiv";
    ethnicityVisualizationDiv.style.height = (wrapperHeight/3).toString()+"px";
    ethnicityVisualizationDiv.style.width = (wrapperWidth/2).toString()+"px";
    ethnicityVisualizationDiv.style.display = "inline-block";
    
    //diseasePieChartDiv
    var diseaseVisualizationDiv = document.createElement("div");
    diseaseVisualizationDiv.id = "diseaseVisualizationDiv";
    diseaseVisualizationDiv.style.height = (wrapperHeight*(2/3)).toString()+"px";
    diseaseVisualizationDiv.style.width = (wrapperWidth).toString()+"px";
    
    var dropDown = document.createElement('Select');  
    dropDown.id = 'dropDown';
    var option;
    for (var index in data) {
      if(!data.hasOwnProperty(index)) continue;
      option = new Option(data[index]["State"],data[index]["State"]);
      dropDown.options.add(option);
    }
    dropDownDiv.appendChild(dropDown);
    element.append(dropDownDiv);
    element.append(wrapper);
    wrapper.appendChild(sexVisualizationDiv);
    wrapper.appendChild(ethnicityVisualizationDiv);
    wrapper.appendChild(diseaseVisualizationDiv);
    
    //sexPieChart Creation
    var sexPieChartLabelsColors = {
        labels: ["Male", "Female"],
        colors: ["#98abc5", "#8a89a6"]};
    var sexPieChart = createPie("#sexVisualizationDiv",wrapperWidth/2,wrapperHeight/3,sexPieChartLabelsColors);
    var sexPieChartSearch = ['Percent Male', 'Percent Female'];
    
    //ethnicityPieChart Creation
    var ethnicityPieChartLabelsColors = {
        labels: ["White", "African American","Hispanic","Other"],
        colors: ["#7b6888", "#6b486b", "#a05d56", "#d0743c"]};
    var ethnicityPieChart = createPie("#ethnicityVisualizationDiv",wrapperWidth/2,wrapperHeight/3,ethnicityPieChartLabelsColors);
    var ethnicityPieChartSearch = ['Percent Non-Hispanic White', 'Percent African American', 'Percent Hispanic','Percent Other/Unknown'];
    
    //diseasePieChart Creation
    var diseasePieChartLabelsColors = {
        labels: ["HA", "AF","CKD","OPD","Depression","Diabetes","HF","IHD","BC","CC","LC","PC","Asthma","HyperTension","HC","Arthritis","Osteoporosis","ARD","Stroke"],
        colors: ['#3182bd', '#6baed6', '#9ecae1', '#c6dbef',
                '#e6550d', '#fd8d3c', '#fdae6b', '#fdd0a2',
                '#31a354', '#74c476', '#a1d99b', '#c7e9c0',
                '#756bb1', '#9e9ac8', '#bcbddc', '#dadaeb',
                '#636363', '#969696', '#bdbdbd']};
    var diseasePieChart = createPie("#diseaseVisualizationDiv",wrapperWidth,wrapperHeight*(2/3)+100,diseasePieChartLabelsColors);
    var diseasePieChartSearch = ['Percent of Medicare beneficiaries who have had a heart attack',
                                 "Percent of Medicare beneficiaries with atrial fibrillation",
                                 "Percent of Medicare beneficiaries with chronic kidney disease",
                                 "Percent of Medicare beneficiaries with chronic obstructive pulmonary disease",
                                 "Percent of Medicare beneficiaries with depression",
                                 "Percent of Medicare beneficiaries with diabetes",
                                 "Percent of Medicare beneficiaries with heart failure",
                                 "Percent Medicare beneficiaries with ischemic heart disease",
                                 "Percent of Medicare beneficiaries with breast cancer",
                                 "Percent of Medicare beneficiaries with colorectal cancer",
                                 "Percent of Medicare beneficiaries with lung cancer",
                                 "Percent of Medicare beneficiaries with prostate cancer",
                                 "Percent of Medicare beneficiaries with asthma",
                                 "Percent of Medicare beneficiaries with hypertension",
                                 "Percent of Medicare beneficiaries with high cholesterol",
                                 "Percent of Medicare beneficiaries with arthritis",
                                 "Percent of Medicare beneficiaries with osteoporosis",
                                 "Percent of Medicare beneficiaries with Alzheimer's and related disorders",
                                 "Percent of Medicare beneficiaries with stroke"]
    
    //Build the [{label, value}, ...] array for one state.
    //searchData[i] is the spreadsheet column that feeds labels[i].
    function generateData (pieChart, state, searchData){
        var labels = pieChart.color.domain();
        var stateData = data.filter( function(obj) {
            return obj["State"] === state;
        })[0];
        return labels.map(function(label, i){
            return { label: label, value: stateData[searchData[i]] };
        });
    }

    //Join data to elements by label. Defined before the first change() call,
    //otherwise the key is still undefined during initialization.
    var key = function(d){ return d.data.label; };

    //Initialization for National
    change(generateData(sexPieChart, "National", sexPieChartSearch), sexPieChart);
    change(generateData(ethnicityPieChart, "National", ethnicityPieChartSearch), ethnicityPieChart);
    change(generateData(diseasePieChart, "National", diseasePieChartSearch), diseasePieChart);
    
    d3.select("#dropDown")
        .on("change",function(){
            var dropDown = document.getElementById('dropDown');
            var state = dropDown[dropDown.selectedIndex].value;
            change(generateData(sexPieChart, state , sexPieChartSearch), sexPieChart);
            change(generateData(ethnicityPieChart, state, ethnicityPieChartSearch), ethnicityPieChart);
            change(generateData(diseasePieChart, state, diseasePieChartSearch), diseasePieChart);
        });
    
    function createPie(appendElement, pieWidth, pieHeight, labelsColor) {
      var svg = d3.select(appendElement)
        .append("svg")
          .style("width","100%")
          .style("height","100%")
        .append("g");

      svg.append("g")
        .attr("class", "slices");
      svg.append("g")
        .attr("class", "labels");
      svg.append("g")
        .attr("class", "lines");

      var width = pieWidth,
      height = pieHeight-150,
      radius = Math.min(width,height)/2;

      var pie = d3.layout.pie()
        .sort(null)
        .value(function(d) {
          return d.value;
        });

      var arc = d3.svg.arc()
        .outerRadius(radius * 0.8)
        .innerRadius(radius * 0.4);

      var outerArc = d3.svg.arc()
        .innerRadius(radius * 0.9)
        .outerRadius(radius * 0.9);

      svg.attr("transform", "translate(" + width / 2 + "," + height / 2 + ")");

      var color = d3.scale.ordinal()
        .domain(labelsColor.labels)
        .range(labelsColor.colors);

      return {
        svg: svg,
        radius: radius,
        pie: pie,
        arc: arc,
        outerArc: outerArc,
        color:color
      };
    }

    
    function change(data, pieChart) {

    /* ------- PIE SLICES -------*/
    var slice = pieChart.svg.select(".slices").selectAll("path.slice")
        .data(pieChart.pie(data), key)
        .style("stroke-width","2px");

    slice.enter()
        .insert("path")
        .style("fill", function(d) { return pieChart.color(d.data.label); })
        .attr("class", "slice");

    slice
        .transition().duration(1000)
        .attrTween("d", function(d) {
            this._current = this._current || d;
            var interpolate = d3.interpolate(this._current, d);
            this._current = interpolate(0);
            return function(t) {
                return pieChart.arc(interpolate(t));
            };
        })

    slice.exit()
        .remove();
        
    /* ------- TEXT LABELS -------*/
    var i = 0;
    var text = pieChart.svg.select(".labels").selectAll("text")
        .data(pieChart.pie(data), key);
        
    text.enter()
        .append("text")
        .attr("y", ".35em")
        .text(function(d) {
            return d.data.label;
        });

    function midAngle(d){
        return d.startAngle + (d.endAngle - d.startAngle)/2;
    }
    text.transition().duration(1000)
        .attrTween("transform", function(d) {
            var i = 0;
            this._current = this._current || d;
            var interpolate = d3.interpolate(this._current, d);
            this._current = interpolate(0);
            return function(t) {
                var d2 = interpolate(t);
                var pos = pieChart.outerArc.centroid(d2);
                //labelArray.push(pos);
                //Changing things slightly to fix overlapping. Not ideal but it works
                //pos[0] = pieChart.radius * (midAngle(d2) < Math.PI ? 1 : -1);
                return "translate("+ pos +")";
            };
        })
        .styleTween("text-anchor", function(d){
        this._current = this._current || d;
        var interpolate = d3.interpolate(this._current, d);
            this._current = interpolate(0);
            return function(t) {
                var d2 = interpolate(t);
                return midAngle(d2) < Math.PI ? "start":"end";
            };
        });
    text.exit()
        .remove();
        
    /* ------- SLICE TO TEXT POLYLINES -------*/

    var polyline = pieChart.svg.select(".lines").selectAll("polyline")
        .data(pieChart.pie(data), key);

    polyline.enter()
        .append("polyline")
            .style("opacity",".3")
            .style("stroke","black")
            .style("stroke-width","2px")
            .style("fill","none");

    polyline.transition().duration(1000)
        .attrTween("points", function(d){
            this._current = this._current || d;
            var interpolate = d3.interpolate(this._current, d);
            this._current = interpolate(0);
            return function(t) {
                var d2 = interpolate(t);
                var pos = pieChart.outerArc.centroid(d2);
                // Changing things slightly to fix overlapping. Not ideal but it works
                //pos[0] = pieChart.radius * 0.95 * (midAngle(d2) < Math.PI ? 1 : -1);
                return [pieChart.arc.centroid(d2), pieChart.outerArc.centroid(d2), pos];
            };
        });
    polyline.exit()
        .remove();
    };  
});
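
For readers more comfortable in Python, the lookup that `generateData` performs in the cell above can be sketched as follows. This is an illustrative translation rather than code from the notebook, and the row values are made up:

```python
def generate_data(rows, state, labels, columns):
    """Pair each chart label with the value of its column for one state."""
    # Take the first row whose "State" matches, as the JavaScript filter does.
    state_row = next(row for row in rows if row["State"] == state)
    return [{"label": label, "value": state_row[column]}
            for label, column in zip(labels, columns)]

# Made-up values, for illustration only.
rows = [{"State": "National", "Percent Male": 45.0, "Percent Female": 55.0}]
slices = generate_data(rows, "National",
                       ["Male", "Female"],
                       ["Percent Male", "Percent Female"])
print(slices)
```

Labels and columns are matched purely by position, so the two lists must be kept in the same order.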


Quick Explanation of the Data:

Chart 1 (Top Left):

This pie chart shows the percentages of male and female Medicare beneficiaries in either the nation or the selected state.

Chart 2 (Top Right):

This pie chart shows the percentage of Medicare beneficiaries in each ethnic group, in either the nation or the selected state.

Chart 3 (Bottom):

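This pie chart shows the percentage of Medicare beneficiaries affected by each disease, in either the nation or the selected state. Because one beneficiary can have several conditions, these percentages need not sum to 100, so each slice shows a condition's size relative to the others.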

Acronyms used in Chart:

  • HA (Heart Attack), AF (Atrial Fibrillation), CKD (Chronic Kidney Disease), OPD (Chronic Obstructive Pulmonary Disease), HF (Heart Failure), IHD (Ischemic Heart Disease), BC (Breast Cancer), CC (Colorectal Cancer), LC (Lung Cancer), PC (Prostate Cancer), HC (High Cholesterol), ARD (Alzheimer's and Related Disorders)
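
To make the slice sizes concrete: a pie layout with sorting disabled gives each value an arc proportional to its share of the total, in input order. A minimal sketch of that arithmetic, using made-up prevalence figures rather than values from the data set:

```python
import math

def pie_angles(values):
    """Return (start, end) angles in radians, in input order (no sorting)."""
    total = float(sum(values))
    angles, start = [], 0.0
    for value in values:
        # Each arc spans the value's fraction of the full circle.
        end = start + 2 * math.pi * value / total
        angles.append((start, end))
        start = end
    return angles

# Made-up percentages that deliberately do not sum to 100.
prevalence = [30.0, 20.0, 10.0]
angles = pie_angles(prevalence)
for start, end in angles:
    # Fraction of the circle taken by each slice: value / sum(values).
    print(round((end - start) / (2 * math.pi), 3))
```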

Additional Notes:

I was planning on creating a pie chart for costs as well (they are included in the Excel sheet), but after completing the first three I felt they did a good enough job of getting across the point I was trying to convey. On another note, I probably spent the most time trying to fix overlapping labels, frustratingly without success. I ended up commenting out some lines to keep the overlap to a minimum. Sadly, this took away from the pie chart design I was originally aiming for.
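
One common way to reduce label overlap, which this notebook does not implement, is to sort the labels on each side of the chart by vertical position and push each one down until it clears its neighbour. A minimal sketch of that idea; `spread_labels` and `min_gap` are hypothetical names:

```python
def spread_labels(ys, min_gap):
    """Shift label y-positions down so neighbours are at least `min_gap` apart.

    Returns the new positions in the same order as the input.
    """
    order = sorted(range(len(ys)), key=lambda i: ys[i])
    result = list(ys)
    previous = None
    for i in order:
        # Push the label down only if it crowds the one placed before it.
        if previous is not None and result[i] - previous < min_gap:
            result[i] = previous + min_gap
        previous = result[i]
    return result

print(spread_labels([0.0, 2.0, 3.0, 20.0], min_gap=5.0))  # [0.0, 5.0, 10.0, 20.0]
```

In a D3 chart the adjusted positions would replace the y-coordinate of each label's translate, with the polyline end points moved to match.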

