Toy1 Dataset - Geometric Solids

This is a simple dataset of 10,000 rows that I use in my deep learning course. It has no noise and no missing values, and the weight is 100% predictable if you know the rule. The following fields are present:

  • metal - The metal that the object is made of. Values of gold, silver, bronze, tin, and platinum.
  • shape - The shape of the geometric solid. Values of box, cylinder, sphere.
  • height - The height (in cm) of the object. A cylinder's axis always runs along the height.
  • length - The length (in cm), or depth, of the object. For spheres and cylinders, the length and width are always the same.
  • width - The width (in cm) of the object. For spheres and cylinders, the length and width are always the same.
  • weight - The weight (in grams) of the object.

The following code loads the dataset and shows the first 10 rows.


In [31]:
import os
import pandas as pd

path = "./data/"

filename = os.path.join(path, "toy1.csv")
# Treat 'NA' and '?' as missing values when parsing
df = pd.read_csv(filename, na_values=['NA', '?'])

# Show the first 10 rows
df[0:10]


Out[31]:
metal shape height length width weight
0 silver cylinder 6 5 5 1235.82
1 bronze cylinder 2 6 6 525.34
2 bronze sphere 2 2 2 38.91
3 silver sphere 6 6 6 1186.39
4 tin cylinder 10 6 6 2066.85
5 tin box 10 2 2 292.40
6 tin box 6 1 4 175.44
7 platinum box 8 2 4 1349.76
8 silver cylinder 5 3 3 370.75
9 platinum box 2 9 10 3796.20

The dataset records the weights of geometric solids that vary in shape, size, and metal. Use the columns metal, shape, height, length, and width to predict the weight.
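
Before a model can use these columns, the two text columns (metal and shape) need a numeric encoding. The following is a minimal sketch of one possible approach, one-hot encoding with pandas. The `sample` frame is a stand-in built from three rows of the output above, and the names `x` and `y` are illustrative; the course may prepare the data differently.

```python
import pandas as pd

# Stand-in data: three rows copied from the sample output above
sample = pd.DataFrame({
    "metal":  ["silver", "bronze", "tin"],
    "shape":  ["cylinder", "sphere", "box"],
    "height": [6, 2, 10],
    "length": [5, 2, 2],
    "width":  [5, 2, 2],
    "weight": [1235.82, 38.91, 292.40],
})

# One-hot encode the categorical columns; numeric columns pass through unchanged
x = pd.get_dummies(sample.drop(columns="weight"), columns=["metal", "shape"])
y = sample["weight"]

print(x.columns.tolist())
```

With only three rows, the encoded frame contains dummy columns just for the metals and shapes that appear in the sample. On the full dataset, every metal and shape would get its own column.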

The geometric solids and their volume formulas are:

Box

$ V = lwh $

Cylinder

$ V = \pi r^2 h = \pi \left(\frac{w}{2}\right)^2 h $

Sphere

$ V = \frac{4}{3} \pi r^3 = \frac{4}{3} \pi \left(\frac{h}{2}\right)^3 $
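
As a worked example, take row 0 of the sample output: a silver cylinder with height 6 and width 5. Using the silver density of 10.49 g/cm³ from the code below, the weight works out as follows:

$ V = \pi \left(\frac{5}{2}\right)^2 \cdot 6 = 37.5\pi \approx 117.81 \text{ cm}^3 $

$ \text{weight} = 37.5\pi \times 10.49 \approx 1235.82 \text{ g} $

This matches the weight listed for that row.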

The following code shows how to calculate the exact weight for any row in the dataset. The goal, of course, is to train a model that can reproduce the same value.


In [35]:
import math

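def calculate_weight(metal, shape, h, l, w):
    # Densities in g/cm^3, in the same order as metal_name
    metal_name = ['gold', 'silver', 'bronze', 'tin', 'platinum']
    metal_density = [19.32, 10.49, 9.29, 7.31, 21.09]
    shape_name = ['sphere', 'box', 'cylinder']

    # Convert names to list indices; an unknown name raises ValueError
    metal = metal_name.index(metal)
    shape = shape_name.index(shape)

    if shape == 0:
        # sphere: the length is used as the diameter
        vol = (4.0/3.0) * math.pi * ((l/2.0)**3)
    elif shape == 1:
        # box
        vol = l * w * h
    elif shape == 2:
        # cylinder: the width is the diameter, the axis runs along the height
        vol = math.pi * ((w/2.0)**2.0) * h

    # weight (g) = volume (cm^3) * density (g/cm^3)
    weight = vol * metal_density[metal]

    return weight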
    
print(calculate_weight('silver', 'cylinder', 6, 5, 5))
print(calculate_weight('bronze', 'cylinder', 2, 6, 6))
print(calculate_weight('bronze', 'sphere', 2, 2, 2))
print(calculate_weight('silver', 'sphere', 6, 6, 6))
print(calculate_weight('tin', 'cylinder', 10, 6, 6))
print(calculate_weight('tin', 'box', 10, 2, 2))
print(calculate_weight('tin', 'box', 6, 1, 4))


1235.8240101058848
525.3371235332852
38.913861002465566
1186.3910497016493
2066.8538067967247
292.4
175.44
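
The prints above cover rows 0 through 6 of the sample output. As a hedged sketch, the same rule can be checked against rows 7 through 9 as well. The `DENSITY` table and `volume` helper below are illustrative restatements of `calculate_weight`, not part of the original notebook, and the 0.01 tolerance is an assumption that allows for the two-decimal rounding in the table.

```python
import math

# Densities in g/cm^3, copied from calculate_weight above
DENSITY = {"gold": 19.32, "silver": 10.49, "bronze": 9.29,
           "tin": 7.31, "platinum": 21.09}

def volume(shape, h, l, w):
    """Volume in cm^3, using the same formulas as calculate_weight."""
    if shape == "sphere":
        return (4.0 / 3.0) * math.pi * (l / 2.0) ** 3
    if shape == "box":
        return l * w * h
    if shape == "cylinder":
        return math.pi * (w / 2.0) ** 2 * h
    raise ValueError(f"unknown shape: {shape}")

# Rows 7-9 of the sample output: metal, shape, height, length, width, weight
rows = [
    ("platinum", "box", 8, 2, 4, 1349.76),
    ("silver", "cylinder", 5, 3, 3, 370.75),
    ("platinum", "box", 2, 9, 10, 3796.20),
]

# Collect any row whose computed weight differs from the listed weight
mismatches = []
for metal, shape, h, l, w, expected in rows:
    computed = volume(shape, h, l, w) * DENSITY[metal]
    if not math.isclose(computed, expected, abs_tol=0.01):
        mismatches.append((metal, shape, computed, expected))

print("mismatches:", mismatches)
```

An empty list means the rule reproduces every listed weight to within the rounding tolerance.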
