Part 1 - Introduction


In [1]:
from mdf_forge.forge import Forge  # This is the only required import for Forge.

Authentication

Authentication is handled automatically. Just follow the prompt once and let Forge take care of the rest.


In [2]:
# You can set up Forge with no arguments. Forge will automatically authenticate and connect to MDF.
mdf = Forge()

Basic Queries

Using the search() method, you can perform a basic text search of the data in MDF. You will get back a list of matching entries (up to 10,000).

Let's say we want to find data on aluminum. We can just search for "Al" like so:


In [3]:
res = mdf.search("Al")
res[0]


Out[3]:
{'files': [{'data_type': 'ASCII text, with very long lines, with no line terminators',
   'filename': 'nist_xps_41530.json',
   'globus': 'globus://e38ee745-6d04-11e5-ba46-22000b92c6ec/MDF/mdf_connect/prod/data/nist_xps_db_v1/nist_xps_41530.json',
   'length': 1248,
   'mime_type': 'text/plain',
   'sha512': '69912ca91261bba53dc0df956338baebf81a3f9d1281f4e9108200c3b8473f073ffdff7437a55c8ac3d08d40074a68a5509bbeb1a391426f838427398f3963dd',
   'url': 'https://e38ee745-6d04-11e5-ba46-22000b92c6ec.e.globus.org/MDF/mdf_connect/prod/data/nist_xps_db_v1/nist_xps_41530.json'}],
 'material': {'composition': 'Al', 'elements': ['Al']},
 'mdf': {'ingest_date': '2018-11-06T16:57:59.847843Z',
  'mdf_id': '5be1c83f2ef3883312753d4a',
  'parent_id': '5be1c8172ef388331274efdf',
  'resource_type': 'record',
  'scroll_id': 19819,
  'source_id': 'nist_xps_db_v1',
  'source_name': 'nist_xps_db',
  'version': 1},
 'nist_xps_db': {'binding_energy_ev': '72.5',
  'energy_uncertainty_ev': '',
  'notes': 'Al(111).',
  'temperature_k': '300'}}
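
Each entry is a plain nested dictionary, so ordinary dictionary access is enough to pull out the fields you care about. The sketch below works on a trimmed copy of the record shown above, so it runs without a connection to MDF. The helper name summarize_entry is hypothetical and not part of Forge. It also assumes that source-specific metadata sits under a key named after mdf.source_name, which is what the sample record suggests but is not guaranteed here.

```python
# Trimmed copy of the record shown in Out[3]; only the keys used below are kept.
entry = {
    "files": [{"filename": "nist_xps_41530.json", "length": 1248}],
    "material": {"composition": "Al", "elements": ["Al"]},
    "mdf": {"source_name": "nist_xps_db", "resource_type": "record"},
    "nist_xps_db": {"binding_energy_ev": "72.5", "temperature_k": "300"},
}


def summarize_entry(entry):
    """Hypothetical helper: collect a few commonly used fields from one result."""
    mdf_block = entry.get("mdf", {})
    material = entry.get("material", {})
    source_name = mdf_block.get("source_name")
    return {
        "source_name": source_name,
        "resource_type": mdf_block.get("resource_type"),
        "composition": material.get("composition"),
        "filenames": [f.get("filename") for f in entry.get("files", [])],
        # Assumption: source-specific metadata lives under a key named after the source.
        "source_fields": entry.get(source_name, {}),
    }


summary = summarize_entry(entry)
print(summary["composition"], summary["filenames"])
```

Using .get() with a default keeps the helper from raising KeyError on entries that lack a block, such as dataset entries with no 'material' key.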

Advanced-mode searches

You can also query more precisely by passing the advanced=True argument. The basic form of an advanced query is key.subkey:value. The full documentation for the query syntax can be found here: http://globus-search-docs.s3-website-us-east-1.amazonaws.com/stable/api/search.html#_query_syntax

In this example, we search for "Al" inside the "material.elements" key.

We're also going to limit the number of results to 10.


In [4]:
res = mdf.search("material.elements:Al", advanced=True, limit=10)
res[0]


Out[4]:
{'cip': {'bv': '79.0',
  'energy': '-3.36',
  'forcefield': 'Al99.eam.alloy',
  'gv': '29.4',
  'mpid': 'mp-134',
  'totenergy': '-107.52'},
 'files': [{'data_type': 'ASCII text, with very long lines, with no line terminators',
   'filename': 'classical_interatomic_potentials.json',
   'globus': 'globus://e38ee745-6d04-11e5-ba46-22000b92c6ec/MDF/mdf_connect/prod/data/cip_v1/classical_interatomic_potentials.json',
   'length': 1841203,
   'mime_type': 'text/plain',
   'sha512': '96635ee0c15d1d0187b18805653a02b1a6dfa5648db82153467045de18adcc08c753e2897d2b48a78a2167a442219e9aeff6b1103732c2158facac8fa4911b33',
   'url': 'https://e38ee745-6d04-11e5-ba46-22000b92c6ec.e.globus.org/MDF/mdf_connect/prod/data/cip_v1/classical_interatomic_potentials.json'}],
 'material': {'composition': 'Al32', 'elements': ['Al']},
 'mdf': {'ingest_date': '2018-10-29T17:47:57.468388Z',
  'mdf_id': '5bd747cf2ef3880b0f213904',
  'parent_id': '5bd747cd2ef3880b0f2135d1',
  'resource_type': 'record',
  'scroll_id': 819,
  'source_id': 'cip_v1',
  'source_name': 'cip',
  'version': 1}}
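
If you build advanced queries in code, a small helper can keep the key.subkey:value form consistent, and a second one can confirm that a returned entry really lists the element you asked for. This is a minimal sketch: field_query and has_element are hypothetical names, not Forge functions, and the sample entry is a trimmed copy of the record above.

```python
def field_query(field_path, value):
    """Hypothetical helper: build a key.subkey:value string for an advanced search."""
    if not field_path or ":" in field_path:
        raise ValueError("field_path must be a non-empty dotted key without ':'")
    return f"{field_path}:{value}"


def has_element(entry, symbol):
    """Return True if the entry lists `symbol` under material.elements."""
    return symbol in entry.get("material", {}).get("elements", [])


query = field_query("material.elements", "Al")
print(query)  # material.elements:Al

# Trimmed copy of the record shown in Out[4].
sample = {"material": {"composition": "Al32", "elements": ["Al"]}}
print(has_element(sample, "Al"))

# With an authenticated client, the query string would be passed as:
# res = mdf.search(query, advanced=True, limit=10)
```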

If you want to search on a value with special characters, such as a colon or space, you must wrap the value in double quotes. Otherwise, you may get unexpected results.


In [5]:
res = mdf.search('dc.titles.title:"High-throughput Ab-initio Dilute Solute Diffusion Database"', advanced=True)
res[0]


Out[5]:
{'data': {'endpoint_path': 'globus://e38ee745-6d04-11e5-ba46-22000b92c6ec/MDF/mdf_connect/prod/data/ab_initio_solute_database_v1-2/',
  'link': 'https://www.globus.org/app/transfer?origin_id=e38ee745-6d04-11e5-ba46-22000b92c6ec&origin_path=/MDF/mdf_connect/prod/data/ab_initio_solute_database_v1-2/'},
 'dc': {'contributors': [{'affiliations': ['University of Wisconsin-Madison'],
    'contributorName': 'Morgan, Dane',
    'contributorType': 'ContactPerson',
    'familyName': 'Morgan',
    'givenName': 'Dane'}],
  'creators': [{'affiliations': ['University of Wisconsin-Madison'],
    'creatorName': 'Morgan, Dane',
    'familyName': 'Morgan',
    'givenName': 'Dane'},
   {'affiliations': ['University of Wisconsin-Madison'],
    'creatorName': 'Mayeshiba, Tam',
    'familyName': 'Mayeshiba',
    'givenName': 'Tam'},
   {'affiliations': ['University of Wisconsin-Madison'],
    'creatorName': 'Henry, Wu',
    'familyName': 'Henry',
    'givenName': 'Wu'}],
  'dates': [{'date': '2017-08-07T16:07:32.938812Z', 'dateType': 'Collected'}],
  'descriptions': [{'description': 'We demonstrate automated generation of diffusion databases from high-throughput density functional theory (DFT) calculations. A total of more than 230 dilute solute diffusion systems in Mg, Al, Cu, Ni, Pd, and Pt host lattices have been determined using multi-frequency diffusion models. We apply a correction method for solute diffusion in alloys using experimental and simulated values of host self-diffusivity.',
    'descriptionType': 'Other'}],
  'publicationYear': '2016',
  'publisher': 'MDF (placeholder)',
  'relatedIdentifiers': [{'relatedIdentifier': 'http://dx.doi.org/10.1038/sdata.2016.54',
    'relatedIdentifierType': 'DOI',
    'relationType': 'IsPartOf'}],
  'resourceType': {'resourceType': 'JSON', 'resourceTypeGeneral': 'Dataset'},
  'subjects': [{'subject': 'dilute'},
   {'subject': 'solute'},
   {'subject': 'DFT'},
   {'subject': 'diffusion'},
   {'subject': 'dataset'}],
  'titles': [{'title': 'High-throughput Ab-initio Dilute Solute Diffusion Database'}]},
 'mdf': {'ingest_date': '2018-11-24T08:12:11.852893Z',
  'mdf_id': '5bf907db2ef3885ee1191ae0',
  'resource_type': 'dataset',
  'scroll_id': 0,
  'source_id': 'ab_initio_solute_database_v1-2',
  'source_name': 'ab_initio_solute_database',
  'version': 1},
 'services': {'mdf_search': 'This dataset was ingested to MDF Search.'}}
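
The quoting rule above is easy to forget when values come from variables, so it can help to apply it in one place. The sketch below is one conservative way to do that, not the library's own behavior: quote_value is a hypothetical helper that leaves plain alphanumeric values bare and wraps everything else in double quotes. How embedded double quotes should be escaped is not covered in this tutorial, so the helper rejects them instead of guessing.

```python
import re


def quote_value(value):
    """Hypothetical helper: wrap a search value in double quotes when needed."""
    text = str(value)
    # Letters, digits and underscores are left bare; anything else triggers quoting.
    if re.fullmatch(r"\w+", text):
        return text
    if '"' in text:
        # Escaping rules for embedded quotes are not covered here.
        raise ValueError("value contains a double quote")
    return f'"{text}"'


title = "High-throughput Ab-initio Dilute Solute Diffusion Database"
query = f"dc.titles.title:{quote_value(title)}"
print(query)

# With an authenticated client, the query string would be passed as:
# res = mdf.search(query, advanced=True)
```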
