Fashion-MNIST is a dataset of Zalando's article images, consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes. We intend Fashion-MNIST to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms. It shares the same image size and structure of training and testing splits.
Here is an example of how the data looks (each class takes three rows):
You can use direct links to download the dataset. The data is stored in the same format as the original MNIST data.
Name | Content | Examples | Size | Link | MD5 Checksum |
---|---|---|---|---|---|
train-images-idx3-ubyte.gz | training set images | 60,000 | 26 MBytes | Download | 8d4fb7e6c68d591d4c3dfef9ec88bf0d |
train-labels-idx1-ubyte.gz | training set labels | 60,000 | 29 KBytes | Download | 25c81989df183df01b3e8a0aad5dffbe |
t10k-images-idx3-ubyte.gz | test set images | 10,000 | 4.3 MBytes | Download | bef4ecab320f06d8554ea6380940ec79 |
t10k-labels-idx1-ubyte.gz | test set labels | 10,000 | 5.1 KBytes | Download | bb300cfdad3c16e7a12a480ee83cd310 |
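The MD5 checksums in the table can be used to confirm that each download is intact. The following is a minimal sketch, not part of the original notebook: the helper names `md5_of_file` and `verify_downloads` are hypothetical, and the default directory `./data/fashion` is an assumption about where the files were saved.

```python
import hashlib
import os

# Expected checksums copied from the download table.
EXPECTED_MD5 = {
    "train-images-idx3-ubyte.gz": "8d4fb7e6c68d591d4c3dfef9ec88bf0d",
    "train-labels-idx1-ubyte.gz": "25c81989df183df01b3e8a0aad5dffbe",
    "t10k-images-idx3-ubyte.gz": "bef4ecab320f06d8554ea6380940ec79",
    "t10k-labels-idx1-ubyte.gz": "bb300cfdad3c16e7a12a480ee83cd310",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(data_dir="./data/fashion"):
    """Map each file name to True (match), False (mismatch) or None (missing).

    `data_dir` is an assumed location; adjust it to wherever the files live.
    """
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = os.path.join(data_dir, name)
        if not os.path.exists(path):
            results[name] = None
        else:
            results[name] = md5_of_file(path) == expected
    return results
```

Calling `verify_downloads()` returns a dictionary keyed by file name, so a `False` entry points at a corrupted or partial download and a `None` entry at a file that has not been downloaded yet.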
Each training and test example is assigned to one of the following labels:
Label | Description |
---|---|
0 | T-shirt/top |
1 | Trouser |
2 | Pullover |
3 | Dress |
4 | Coat |
5 | Sandal |
6 | Shirt |
7 | Sneaker |
8 | Bag |
9 | Ankle boot |
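For readable output it can help to turn the integer labels back into their descriptions. A minimal sketch, assuming labels are integers in the range 0 to 9 as listed in the table; `CLASS_NAMES` and `label_to_name` are hypothetical names, not part of the dataset's tooling.

```python
# Class descriptions in label order, copied from the label table.
CLASS_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]


def label_to_name(label):
    """Return the class description for an integer label in 0..9."""
    index = int(label)
    if not 0 <= index < len(CLASS_NAMES):
        raise ValueError("label must be in 0..9, got %r" % (label,))
    return CLASS_NAMES[index]


print(label_to_name(9))  # Ankle boot
```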
In [1]:
# -*- coding: utf-8 -*-
import sys
import numpy as np
from keras.utils import np_utils
from PIL import Image
In [2]:
def load_mnist(path, kind='train'):
    """Load MNIST data from `path`"""
    import os
    import gzip
    import numpy as np

    labels_path = os.path.join(path, '%s-labels-idx1-ubyte.gz' % kind)
    images_path = os.path.join(path, '%s-images-idx3-ubyte.gz' % kind)

    # Label file: skip the 8-byte header, then one uint8 label per example
    with gzip.open(labels_path, 'rb') as lbpath:
        labels = np.frombuffer(lbpath.read(), dtype=np.uint8,
                               offset=8)

    # Image file: skip the 16-byte header, then one flattened 28x28 image per row
    with gzip.open(images_path, 'rb') as imgpath:
        images = np.frombuffer(imgpath.read(), dtype=np.uint8,
                               offset=16).reshape(len(labels), 784)

    return images, labels
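The `offset=8` and `offset=16` arguments in `load_mnist` skip the file headers. As a sketch of what those headers contain, the hypothetical helper `read_idx_header` below reads them, assuming the standard MNIST IDX layout: a 4-byte big-endian magic number whose last byte gives the number of dimensions, followed by one 4-byte big-endian size per dimension.

```python
import gzip
import struct


def read_idx_header(path):
    """Return (magic, dims) from a gzip-compressed IDX file.

    Assumes the MNIST IDX layout: a big-endian 32-bit magic number whose
    last byte is the number of dimensions, then one big-endian 32-bit
    size per dimension.
    """
    with gzip.open(path, "rb") as f:
        (magic,) = struct.unpack(">I", f.read(4))
        ndim = magic & 0xFF  # lowest byte of the magic number
        dims = struct.unpack(">" + "I" * ndim, f.read(4 * ndim))
    return magic, dims
```

Under that layout a label file has one dimension, so its header is 4 + 4 = 8 bytes, and an image file has three dimensions, so its header is 4 + 3 × 4 = 16 bytes, which matches the two offsets used when loading.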
In [3]:
import os

X_train, y_train = load_mnist('./data/fashion', kind='train')
X_test, y_test = load_mnist('./data/fashion', kind='t10k')

# Make sure the output directory exists before saving
os.makedirs("out", exist_ok=True)

# Training image (pixels are already uint8 in 0-255; multiplying by 255 would overflow)
train_no = 0
outImg = Image.fromarray(X_train[train_no].reshape((28, 28))).convert("RGB")
outImg.save("out/train.png")

# Test image
test_no = 0
outImg = Image.fromarray(X_test[test_no].reshape((28, 28))).convert("RGB")
outImg.save("out/test.png")
The extracted images are saved as 'train.png' and 'test.png'.
Each image is stored at a size of 28×28 pixels.
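Because `load_mnist` returns every image as a flat row of 784 values, the whole array can be reshaped to 28×28 images in one step rather than one image at a time. A small sketch, using a synthetic stand-in array so that it runs without the dataset files (that array is an assumption for illustration); the helper name `to_images` is hypothetical.

```python
import numpy as np


def to_images(flat):
    """Reshape an (N, 784) array of flattened images to (N, 28, 28)."""
    flat = np.asarray(flat)
    if flat.ndim != 2 or flat.shape[1] != 28 * 28:
        raise ValueError("expected shape (N, 784), got %r" % (flat.shape,))
    return flat.reshape(flat.shape[0], 28, 28)


# Synthetic stand-in for X_train: three blank uint8 "images".
demo = np.zeros((3, 784), dtype=np.uint8)
print(to_images(demo).shape)  # (3, 28, 28)
```

With the real data, `to_images(X_train)[train_no]` would give the same 28×28 array as `X_train[train_no].reshape((28, 28))`.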
Source: http://www.cs.toronto.edu/%7Ekriz/cifar.html
The CIFAR-10 dataset contains images from the following 10 categories, with 50,000 images for training and 10,000 for testing.