Features

Discovery

  • Curated, cleaned, and standardized public data products via GEMSOpen
  • Search using terms from 10 community-of-practice-supported ontologies [Crop Ontology, ICASA, Planteome, Agrovoc, etc.]
  • Find protected data products and request access from data providers
  • Identify potential research partners
  • Data products packaged with provider-supplied metadata
  • Access third-party data via API [NASA, Ag Data Commons, etc.; sketched below]
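
As one illustration of the third-party access above, here is a minimal sketch in Python that pulls daily weather for a single point from NASA's public POWER API. The coordinates, date range, and variables are arbitrary examples, and the snippet shows a generic access pattern, not necessarily the endpoints G.E.M.S itself wraps.

    # Sketch: pulling third-party data over HTTP. NASA POWER is a public API;
    # the location, date range, and variables are arbitrary examples.
    import requests

    resp = requests.get(
        "https://power.larc.nasa.gov/api/temporal/daily/point",
        params={
            "parameters": "T2M,PRECTOTCORR",  # 2 m air temp, precipitation
            "community": "AG",                # agroclimatology products
            "latitude": 44.98,                # example point near St. Paul, MN
            "longitude": -93.18,
            "start": "20230101",
            "end": "20230131",
            "format": "JSON",
        },
        timeout=30,
    )
    resp.raise_for_status()
    daily = resp.json()["properties"]["parameter"]
    print(daily["T2M"]["20230115"])  # mean 2 m temperature for one example day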

Privacy and Sharing

  • Authenticate through your home institution or Globus ID
  • Upload data via Globus, SCP, or HTTPS
  • Data encrypted at rest and in flight
  • Private sandboxes [no data registration required], managed sharing, or open access
  • Automatic versioning of registered data
  • Rollback to earlier versions *
  • Control data discovery independently of data access by setting the visibility of metadata for each data product
  • Grant granular access to subsets of data products
  • Grant access to teams or individuals of your choosing
  • Control access privileges by user and team [read-only, edit, delete, manage access]
  • Single or multiple access managers
  • Data de-identification [household, farm, and other personal/proprietary data] *
  • Geographic fuzzing [aggregation, jitter; sketched below] *
  • Monitor data usage *
  • DOI minting *
  • All of the above functionality applied to the privacy and sharing of tools *
  • Federate G.E.M.S or deploy a private GEMShare installation *
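
Geographic fuzzing trades location precision for privacy. Below is a minimal sketch of the two techniques named above, in generic form rather than as G.E.M.S internals; the offset and cell size are illustrative values, not platform settings.

    # Sketch: two common location-fuzzing techniques. The offset and cell
    # size are illustrative values, not G.E.M.S defaults.
    import math
    import random

    def jitter(lat, lon, max_offset_deg=0.01):
        # Displace a point by a bounded random offset (~1 km at mid-latitudes).
        return (lat + random.uniform(-max_offset_deg, max_offset_deg),
                lon + random.uniform(-max_offset_deg, max_offset_deg))

    def aggregate_to_cell(lat, lon, cell_deg=0.1):
        # Snap a point to the center of its grid cell so nearby records pool.
        return (math.floor(lat / cell_deg) * cell_deg + cell_deg / 2,
                math.floor(lon / cell_deg) * cell_deg + cell_deg / 2)

    print(jitter(44.9778, -93.2650))             # slightly displaced point
    print(aggregate_to_cell(44.9778, -93.2650))  # cell center, ~(44.95, -93.25)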

Data Cleaning and Standardization

  • Computer-assisted spelling correction [sketched below]
  • Map data to the 10 currently supported community-of-practice ontologies [Crop Ontology, ICASA, Planteome, Agrovoc, etc.]
  • Cross-domain ontology integration *
  • Computer-assisted metadata creation
  • Computer-assisted numerical outlier detection [sketched below] *
  • Computer-assisted correction of geospatial entry errors *
  • Computer-assisted weather data cleaning *
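
The computer-assisted spelling correction above is, in generic terms, fuzzy matching of free-text entries against a controlled vocabulary. A minimal sketch, with a tiny hand-written term list standing in for real ontology terms:

    # Sketch: suggest corrections for free-text crop names by fuzzy matching.
    # The term list is an illustrative stand-in for real ontology terms.
    import difflib

    ONTOLOGY_TERMS = ["maize", "wheat", "barley", "sorghum", "soybean", "cassava"]

    def suggest(entry, cutoff=0.7):
        # Return up to three closest vocabulary terms above a similarity cutoff.
        return difflib.get_close_matches(entry.lower(), ONTOLOGY_TERMS,
                                         n=3, cutoff=cutoff)

    print(suggest("miaze"))   # ['maize']
    print(suggest("sorgum"))  # ['sorghum']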
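
Likewise, the numerical outlier detection in development is best read as a statistical screen that flags suspicious values for human review rather than deleting them. One standard approach is Tukey's interquartile-range rule, sketched here; the 1.5x multiplier is the conventional default, not a G.E.M.S setting.

    # Sketch: flag values outside the IQR fences for human review.
    import statistics

    def iqr_outliers(values, k=1.5):
        q1, _, q3 = statistics.quantiles(values, n=4)
        low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
        return [v for v in values if v < low or v > high]

    yields = [5.1, 4.8, 5.3, 4.9, 5.0, 51.0]  # 51.0 looks like an entry error
    print(iqr_outliers(yields))               # [51.0]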

Analysis and Visualization

  • Use Jupyter, RStudio, or a remote desktop session on a Linux node
  • Customizable Python or R environments with libraries of your choice
  • Parallelize workflows with Apache Spark or Dask [sketched below]
  • Transfer data to the cloud, an HPC data center, or a laptop for analysis
  • Curate data products to power third-party websites *
  • Visualizations of spatial data *
  • Interactive web UI for analyzing data *
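
To make the Spark/Dask item above concrete, here is a minimal Dask sketch that spreads a simple aggregation across partitions of a CSV too large for memory. The file name and column names are hypothetical placeholders, not a real G.E.M.S data product.

    # Sketch: parallel aggregation with Dask. 'trials.csv' and its columns
    # are hypothetical placeholders.
    import dask.dataframe as dd

    df = dd.read_csv("trials.csv")           # lazily partitioned, not loaded
    mean_yield = df.groupby("variety")["yield_t_ha"].mean()
    print(mean_yield.compute())              # triggers parallel execution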

* denotes a feature in development