Metaversity on the Internet Computer: Data Science Pipeline For Academic Research & Web 3.0 Education
Introduction
Overview of Metaversity on the Internet Computer
The Metaversity on the Internet Computer project consists of three phases.
- Phase I: Socrates-Lau Tzu
- Phase II: Plato-Confucius
- Phase III Aristotle-Mencius
This project belongs to the first of the three phases of Metaversity on the Internet Computer. Metaversity on the Internet Computer aims to create an open, innovative, and sustainable academy on the Internet Computer, providing scholars easy access to blockchain data, a platform for generating meaningful behavioral data, and a sustainable ecosystem for the future of academia. In the first phase, Metaversity will address the need to provide a data science pipeline for academic research and web 3.0 education. In the second phase, we plan to build an on-chain platform for behavioral experiments in academia. In the third phase, we plan to create a metaversity of token economy.
Motivation
Existing blockchain data source, including token dynamics and allocation, governance related proposals, transaction data, dApp analytics, has three common issues:
- Not research and education oriented. Most of the data sources that exist today are not research and education oriented, but intended for purely documentary, display purposes, or industrial usage, which has caused unnecessary barriers to using blockchain data in academia. One of the biggest obstacles is the lack of uniform standards or guidelines for data access and usage, which significantly slows down the efficiency of blockchain-related research iterations.
- Lack of integral and reasonable research-wise documentation. Currently, it’s hard to find data sources with supplementary research-wise documentation. The quality of documentation of blockchain data varies, and it is difficult to get information from it that is useful for scientific research. One obstacle is that much of the useful information in the blockchain industry, which includes a reasonable interpretation of the data sources are not understandable for the interdisciplinary audiences. Thus, there is a critical need for an integrated platform to enable an inclusive ecosystem of researchers and educators from multiple disciplines.
- Lack of supporting data science pipeline. Most of the current data sources exist in standalone form. However, related research requires a range of supporting data science tools; for example, easy-to-use and highly compatible database APIs, data monitoring dashboards, integration with several programming languages, etc.
Phase I solutions
We will build a data science pipeline intended for academia usage including database, API, detailed documentations, common programming language integration solutions, and data monitoring dashboards for research and educational purpose. Furthermore, we will invite scholars to join academic projects using our data science pipeline, together create sample deliverables, and conduct user-experience studies to further improve our product.
First, we will query data from several existing ICP data sources. There are currently a few online APIs that provide ICP data, for example, ICP Public Dashboard API and ICP Explorer. Most of the data in these sources contain only real-time information. However, most academic research requires historical data. Hence, we will identify the data of academic value and collect the historical data.
Then, we will use the data to build a database and provide an academic-friendly API. We plan to provide a conventional RESTful API. The format will be similar to the Public Dashboard API provided by the DFINITY Foundation. We first place the data on a centralized server to ensure that we could quickly develop features upfront. In the future, we plan to migrate the database to the Internet Computer and build databases through libraries such as sudograph. Furthermore, we will provide API integration solutions for common programming languages such as Python similar to the Python wrapper of the Alpha API.
Finally, we will build a series of data science accessories to serve the database usage, including a dashboard for data monitoring and detailed data descriptors. We will provide visualization tools similar to the Internet Computer Network Status dashboard. Our data descriptors will be similar to the online documentation of the Alpha API in presentation, but providing more guidance in academic usage.
Project Roadmap & Milestones
Phase I - Milestone I: Data Collection and Database Construction
Deliverables:
- Create ICP Historical Database
collect and record by constantly querying real-time data through DFINITY public API
- Create Online Database API
Reference: DFINITY public dashboard API
Phase I - Milestone II: API Documentation
Deliverables:
- Create API Documentation Site
Reference: Alpha API
Phase I - Milestone III: Data Science Tools Development
Deliverables:
- Create Programming language wrapper for the API
Reference: Python wrapper of the Alpha API
- Create Interactive Dashboard for Researchers and Educators
Reference: Internet Computer Network Status dashboard
Phase I - Milestone IV: User Experience Study for Improvements
- Invite professors and students in multiple disciplines to joint research and educational programs using our data science pipeline
- Conduct user-experience studies to collect feedback for further improving our data science pipeline
To Internet Computer
Metaversity will be the first platform intended for serving an academic purpose on the Internet Computer. Our project will attract global professors and students to innovate on the Internet Computer for the great cause of research and education. Blockchain has empowered an open internet for many industrial sectors, but the application for academia still awaits. Our project will enable IC to take the first essential step into the academic space, attracting intellectuals able to innovate at the frontier of human civilizations.