# Analysis of the collected data

This notebook uses IPython to analyze and display the data collected during production. The data analyzed come from the bq filament produced on 20 July 2015.



In [3]:

    
# Import the libraries used
import numpy as np
import pandas as pd
import seaborn as sns



In [4]:

    
# Show the version of each library in use
print("Numpy v{}".format(np.__version__))
print("Pandas v{}".format(pd.__version__))
print("Seaborn v{}".format(sns.__version__))









    



Numpy v1.9.2
Pandas v0.16.2
Seaborn v0.6.0



In [5]:

    
# Open the CSV file with the sample data
datos = pd.read_csv('BQ.CSV')



In [20]:

    
%pylab inline









    



Populating the interactive namespace from numpy and matplotlib



In [47]:

    
# Show a statistical summary of the collected data
datos.describe()
# To show only the mean of both diameters:
# datos.describe().loc['mean', ['Diametro X [mm]', 'Diametro Y [mm]']]









    Out[47]:






  
    
      
       Tmp Husillo [C]  Tmp Nozzle [C]  Diametro X [mm]  Diametro Y [mm]  MARCHA   PARO           RPM
count       333.000000      333.000000       333.000000       333.000000     333    333  3.330000e+02
mean         23.987688       24.117117         1.761381         1.735165       1      0  3.219000e+00
std           0.032907        0.037723         0.018183         0.019583       0      0  2.446166e-14
min          23.900000       24.100000         1.670000         1.690000    True  False  3.219000e+00
25%          24.000000       24.100000         1.750000         1.720000       1      0  3.219000e+00
50%          24.000000       24.100000         1.770000         1.740000       1      0  3.219000e+00
75%          24.000000       24.100000         1.770000         1.750000       1      0  3.219000e+00
max          24.000000       24.200000         1.800000         1.780000    True  False  3.219000e+00



In [48]:

    
# Store in a list the columns of the file that we are going to work with
columns = ['Diametro X [mm]', 'Diametro Y [mm]', 'RPM']



In [49]:

    
# Plot the data collected during the test, one chart per column
datos[columns].plot(subplots=True, figsize=(20,20))









    Out[49]:





array([<matplotlib.axes._subplots.AxesSubplot object at 0x10cc3d0b8>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x10ce38c50>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x10c9f22b0>], dtype=object)

We plot both diameters on the same chart.



In [8]:

    
# Label-based column slice; .loc replaces the deprecated .ix indexer
datos.loc[:, "Diametro X [mm]":"Diametro Y [mm]"].plot(figsize=(16,3))









    Out[8]:





<matplotlib.axes._subplots.AxesSubplot at 0x1097df198>



In [7]:

    
# Box plot of both diameters; .loc replaces the deprecated .ix indexer
datos.loc[:, "Diametro X [mm]":"Diametro Y [mm]"].boxplot(return_type='axes')









    Out[7]:





<matplotlib.axes._subplots.AxesSubplot at 0x7defe10>

We plot the moving average of the samples, using a 50-sample window.



In [10]:

    
pd.rolling_mean(datos[columns], 50).plot(subplots=True, figsize=(12,12))









    Out[10]:





array([<matplotlib.axes._subplots.AxesSubplot object at 0x10b66f630>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x10b6de748>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x10b918780>], dtype=object)
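
`pd.rolling_mean` is available in the pandas version used here (v0.16.2) but was deprecated and later removed in newer releases. As a minimal sketch, assuming a newer pandas (0.18 or later), the same 50-sample moving average can be written with the `.rolling()` method. The `demo` frame is synthetic, made-up data that only stands in for `datos[columns]`.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for datos[columns]; the values are illustrative only.
rng = np.random.RandomState(0)
demo = pd.DataFrame({
    'Diametro X [mm]': 1.76 + 0.02 * rng.randn(200),
    'Diametro Y [mm]': 1.73 + 0.02 * rng.randn(200),
})

# Equivalent of pd.rolling_mean(demo, 50) in newer pandas versions.
window = 50
demo_rolling = demo.rolling(window).mean()

# The first window - 1 rows are NaN because the window is not yet full.
print(demo_rolling.iloc[window - 2:window + 1])
```

Calling `.plot(subplots=True, figsize=(12,12))` on the result would give the same kind of chart as the cell above.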

Comparison of diameter X against diameter Y to examine the X/Y ratio of the filament.



In [11]:

    
plt.scatter(x=datos['Diametro X [mm]'], y=datos['Diametro Y [mm]'], marker='.')









    Out[11]:





<matplotlib.collections.PathCollection at 0x10bfe5be0>

Data filtering

We treat samples with $d_x < 0.9$ or $d_y < 0.9$ (in mm) as sensor errors, so we remove them and keep only the samples where both diameters are at least 0.9.



In [12]:

    
datos_filtrados = datos[(datos['Diametro X [mm]'] >= 0.9) & (datos['Diametro Y [mm]'] >= 0.9)]
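
The boolean mask used above can be checked on a small example. This is a minimal sketch on made-up readings (the `demo` frame and the names `min_valid` and `is_valid` are illustrative, not part of the original notebook); it keeps a row only when both diameters reach the cut-off.

```python
import pandas as pd

# Made-up readings: rows 1 and 3 contain an implausibly small diameter.
demo = pd.DataFrame({
    'Diametro X [mm]': [1.76, 0.10, 1.75, 1.77],
    'Diametro Y [mm]': [1.73, 1.74, 1.72, 0.00],
})

min_valid = 0.9  # same cut-off as in the cell above, in mm
is_valid = (demo['Diametro X [mm]'] >= min_valid) & (demo['Diametro Y [mm]'] >= min_valid)

# Keep only the rows where both sensors report a plausible diameter.
demo_filtrados = demo[is_valid]
print("kept", len(demo_filtrados), "of", len(demo), "rows")
```

In the data set analyzed here the minimum diameters in the summary (1.67 mm and 1.69 mm) are well above the cut-off, so the filter removes no rows.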

Plot of X against Y



In [13]:

    
plt.scatter(x=datos_filtrados['Diametro X [mm]'], y=datos_filtrados['Diametro Y [mm]'], marker='.')









    Out[13]:





<matplotlib.collections.PathCollection at 0x10c0cfe10>

We analyze the ratio data.



In [14]:

    
ratio = datos_filtrados['Diametro X [mm]']/datos_filtrados['Diametro Y [mm]']
ratio.describe()









    Out[14]:





count    333.000000
mean       1.015286
std        0.018370
min        0.948864
25%        1.005682
50%        1.017241
75%        1.029070
max        1.047059
dtype: float64
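
A ratio of 1 means that both axes measure the same diameter, so one way to read this summary is as a percentage deviation from 1. The sketch below only re-expresses the figures printed above; `ratio_stats` and `deviation_pct` are names introduced here for illustration.

```python
# Values copied from the ratio.describe() output above.
ratio_stats = {'min': 0.948864, 'mean': 1.015286, 'max': 1.047059}

# Deviation from a ratio of 1 (equal X and Y diameters), in percent.
deviation_pct = {name: (value - 1.0) * 100 for name, value in ratio_stats.items()}

for name in ('min', 'mean', 'max'):
    print("{}: {:+.2f} %".format(name, deviation_pct[name]))
```

On average, diameter X is about 1.5 % larger than diameter Y in this run.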



In [15]:

    
rolling_mean = pd.rolling_mean(ratio, 50)
rolling_std = pd.rolling_std(ratio, 50)
rolling_mean.plot(figsize=(12,6))
# plt.fill_between(ratio, y1=rolling_mean+rolling_std, y2=rolling_mean-rolling_std, alpha=0.5)
ratio.plot(figsize=(12,6), alpha=0.6, ylim=(0.5,1.5))









    Out[15]:





<matplotlib.axes._subplots.AxesSubplot at 0x10c0a6780>
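
The commented-out `fill_between` call in the cell above passes `ratio` as the x argument, whereas the x values should be the sample positions (the index). The following is a minimal sketch of the mean ± one standard deviation band, assuming a newer pandas (0.18 or later, for `.rolling()`) and using a synthetic series in place of `ratio`; all `demo_*` names are illustrative.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic stand-in for the X/Y ratio; illustrative values only.
rng = np.random.RandomState(1)
demo_ratio = pd.Series(1.015 + 0.018 * rng.randn(300))

window = 50
demo_mean = demo_ratio.rolling(window).mean()
demo_std = demo_ratio.rolling(window).std()

fig, ax = plt.subplots(figsize=(12, 6))
demo_ratio.plot(ax=ax, alpha=0.6, ylim=(0.5, 1.5))
demo_mean.plot(ax=ax)
# x is the index (sample position), y1/y2 bound the band around the rolling mean.
band = ax.fill_between(demo_ratio.index.values,
                       (demo_mean - demo_std).values,
                       (demo_mean + demo_std).values,
                       alpha=0.5)
```

The band is empty for the first 49 samples, where the rolling statistics are still NaN.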

Quality limits

We count how many samples fall outside the quality limits $Th^+ = 1.85$ and $Th^- = 1.65$ (in mm).



In [35]:

    
Th_u = 1.85
Th_d = 1.65



In [36]:

    
data_violations = datos[(datos['Diametro X [mm]'] > Th_u) | (datos['Diametro X [mm]'] < Th_d) |
                       (datos['Diametro Y [mm]'] > Th_u) | (datos['Diametro Y [mm]'] < Th_d)]
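
To get the count itself rather than the offending rows, the same condition can be reduced to one boolean per sample. This is a minimal sketch on made-up readings; the `demo` frame and the names `diametros`, `out_of_tolerance` and `n_violations` are introduced here for illustration.

```python
import pandas as pd

Th_u = 1.85  # upper limit, mm
Th_d = 1.65  # lower limit, mm

# Made-up readings: row 1 (X too large) and row 2 (Y too small) are out of tolerance.
demo = pd.DataFrame({
    'Diametro X [mm]': [1.76, 1.90, 1.75, 1.70],
    'Diametro Y [mm]': [1.73, 1.74, 1.60, 1.80],
})

diametros = demo[['Diametro X [mm]', 'Diametro Y [mm]']]
# True for every row where at least one diameter lies outside [Th_d, Th_u].
out_of_tolerance = ((diametros > Th_u) | (diametros < Th_d)).any(axis=1)

n_violations = int(out_of_tolerance.sum())
print("{} of {} samples out of tolerance".format(n_violations, len(demo)))
```

For the data in this notebook the count is 0: the summary of `datos` shows both diameters between 1.67 mm and 1.80 mm, inside the limits.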



In [37]:

    
data_violations.describe()









    



/Users/darkomen/anaconda/lib/python3.4/site-packages/numpy/core/_methods.py:59: RuntimeWarning: Mean of empty slice.
  warnings.warn("Mean of empty slice.", RuntimeWarning)
/Users/darkomen/anaconda/lib/python3.4/site-packages/numpy/core/_methods.py:83: RuntimeWarning: Degrees of freedom <= 0 for slice
  warnings.warn("Degrees of freedom <= 0 for slice", RuntimeWarning)






    Out[37]:






  
    
      
       Tmp Husillo [C]  Tmp Nozzle [C]  Diametro X [mm]  Diametro Y [mm]  MARCHA  PARO  RPM
count                0               0                0                0       0     0    0
mean               NaN             NaN              NaN              NaN     NaN   NaN  NaN
std                NaN             NaN              NaN              NaN     NaN   NaN  NaN
min                NaN             NaN              NaN              NaN     NaN   NaN  NaN
25%                NaN             NaN              NaN              NaN     NaN   NaN  NaN
50%                NaN             NaN              NaN              NaN     NaN   NaN  NaN
75%                NaN             NaN              NaN              NaN     NaN   NaN  NaN
max                NaN             NaN              NaN              NaN     NaN   NaN  NaN



In [34]:

    
data_violations.plot(subplots=True, figsize=(12,12))









    Out[34]:





array([<matplotlib.axes._subplots.AxesSubplot object at 0x10c6aca58>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x10c6ea5f8>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x10c70dd68>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x10c76eeb8>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x10c7bac88>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x10c91d390>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x10c969240>], dtype=object)


