KDD 2020 will be held in San Diego, CA, USA from August 23 to 27, 2020. The Automatic Graph Representation Learning challenge (AutoGraph), the first ever AutoML challenge applied to Graph-structured data, is the AutoML track challenge in KDD Cup 2020 provided by 4Paradigm, ChaLearn, Stanford and Google.
Machine learning on graph-structured data. Graph-structured data are ubiquitous in the real world, for example in social networks, scholarly networks, and knowledge graphs. Graph representation learning has become a very active research topic. Its goal is to learn a low-dimensional representation of each node in the graph, which is then used for downstream tasks such as friend recommendation in a social network or classifying academic papers into subjects in a citation network. Traditionally, heuristics are used to extract features for each node from the graph, e.g., degree statistics or random-walk-based similarities. In recent years, however, more sophisticated models such as graph neural networks (GNNs) have been proposed for graph representation learning, leading to state-of-the-art results in many tasks such as node classification and link prediction.
Challenges in developing versatile models. Nevertheless, whether one uses traditional heuristic methods or recent GNN-based methods, substantial computation and expertise must be invested to achieve satisfactory performance on a given task. For example, in DeepWalk and node2vec, two well-known random-walk-based methods, hyper-parameters such as the length and number of walks per node and the window size have to be fine-tuned to obtain good performance. When using GNN models such as GraphSAGE or GAT, considerable time has to be spent choosing the aggregation function in GraphSAGE or the number of self-attention heads in GAT. The heavy demand on human experts during fine-tuning therefore limits the application of existing graph representation models.
AutoGraph Challenge. AutoML/AutoDL (https://autodl.chalearn.org) is a promising approach to lowering the manpower costs of machine learning applications, and it has achieved encouraging successes in hyper-parameter tuning, model selection, neural architecture search, and feature engineering. To enable more people and organizations to fully exploit their graph-structured data, we organize the AutoGraph challenge, which is dedicated to such data.
1st Prize: 15000 USD
2nd Prize: 10000 USD
3rd Prize: 5000 USD
4th - 10th prize: 500 USD each
Please contact the organizers if you have any problem concerning this challenge.
- Wei-Wei Tu, 4Paradigm Inc., China and ChaLearn, USA
- Jure Leskovec, Stanford University, USA
- Hugo Jair Escalante, INAOE, Mexico and ChaLearn, USA
- Isabelle Guyon, Université Paris-Saclay, France and ChaLearn, USA
- Qiang Yang, Hong Kong University of Science and Technology, Hong Kong, China
- Xiawei Guo, 4Paradigm Inc., China
- Shouxiang Liu, 4Paradigm Inc., China
- Zhen Xu, 4Paradigm Inc., China
- Rex Ying, Stanford University, USA
- Huan Zhao, 4Paradigm Inc., China
Previous AutoML Challenges:
Founded in early 2015, 4Paradigm is one of the world’s leading AI technology and service providers for industrial applications. 4Paradigm’s flagship product – the AI Prophet – is an AI development platform that enables enterprises to effortlessly build their own AI applications, and thereby significantly increase the efficiency of their operations. Using the AI Prophet, a company can develop a data-driven “AI Core System”, which could be largely regarded as a second core system next to the traditional transaction-oriented Core Banking System (IBM Mainframe) often found in banks. Beyond this, 4Paradigm has also successfully developed more than 100 AI solutions for use in various settings such as finance, telecommunication and internet applications. These solutions include, but are not limited to, smart pricing, real-time anti-fraud systems, precision marketing, personalized recommendation and more. And while it is clear that 4Paradigm can establish a new paradigm for how an organization uses its data, its scope of services does not stop there. 4Paradigm uses state-of-the-art machine learning technologies and practical experience to bring together a team of experts ranging from scientists to architects. This team has successfully built China’s largest machine learning system and the world’s first commercial deep learning system. However, 4Paradigm’s success does not stop there. With its core team pioneering the research of “Transfer Learning,” 4Paradigm takes the lead in this area, and as a result, has drawn great attention from tech giants worldwide.
ChaLearn is a non-profit organization with vast experience in the organization of academic challenges. ChaLearn is interested in all aspects of challenge organization, including data gathering procedures, evaluation protocols, novel challenge scenarios (e.g., competitions), training for challenge organizers, challenge analytics, result dissemination and, ultimately, advancing the state-of-the-art through challenges.
This is a challenge with code submission. We provide one baseline above for test purposes.
To make a test submission, download the starting kit and follow the instructions in the readme.md file. Then click the blue button "Upload a Submission" in the upper right corner of the page and re-upload it. You must first click the orange tab "Feedback Phase" if you want to submit on all datasets simultaneously and get ranked in the challenge. You may also submit on a single dataset at a time (for debugging purposes). To check the progress of your submissions, go to the "My Submissions" tab. Your last submission is shown on the leaderboard visible under the "Results" tab.
The starting kit contains everything you need to create your own code submission (just by modifying the file model.py) and to test it on your local computer, with the same handling programs and Docker image as those of the Codalab platform (but the hardware environment is in general different).
The starting kit contains sample data. Besides that, 5 public datasets are also provided so that you can develop your solutions offline. These 5 public datasets can be downloaded from the link at the beginning.
Note that the CUDA version in this Docker image is 10. If the CUDA version on your own machine is lower than 10, you may be unable to use the GPU inside the container.
You can test your code in the exact same environment as the Codalab environment using docker. You are able to run the ingestion program (to produce predictions) and the scoring program (to evaluate your predictions) on toy sample data.
1. If you are new to docker, install docker (version > 19) from https://docs.docker.com/get-started/.
2. In a shell, change to the starting-kit directory and run
docker run --gpus=0 -it --rm -v "$(pwd):/app/autograph" -w /app/autograph nehzux/kddcup2020:v2
3. You are now in a bash shell inside the Docker container. Run the local test program
python run_local_test.py --dataset_dir=path_to_dataset --code_dir=path_to_model_file
This runs the ingestion and scoring programs simultaneously. The predictions and scoring results are written to the sample_result_submissions and scoring_output directories.
The interface is simple and generic: you must supply a Python file model.py, in which a Model class is defined following the API described on the "Evaluation" page.
To make submissions, zip model.py and its dependency files (without the directory), then use the "Upload a Submission" button. Please note that you must click first the orange tab "Feedback Phase" if you want to make a submission simultaneously on all datasets and get ranked in the challenge. You may also submit on a single dataset at a time (for debug purposes). Besides that, the ranking in the public leaderboard is determined by the LAST code submission of the participants.
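The "zip without the directory" requirement above can be sketched in Python as follows. This is a minimal sketch rather than an official tool: the function name make_submission_zip and the file names in the usage note are hypothetical, and it assumes that all dependency files live in the same directory tree as model.py.

```python
import zipfile
from pathlib import Path


def make_submission_zip(code_dir, zip_path):
    """Zip every file under code_dir using paths relative to code_dir.

    model.py therefore sits at the root of the archive instead of
    inside a wrapping directory.
    """
    code_dir = Path(code_dir)
    zip_path = Path(zip_path)
    if not (code_dir / "model.py").is_file():
        raise FileNotFoundError("model.py not found in " + str(code_dir))
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(code_dir.rglob("*")):
            # Skip directories and the archive itself if it lives in code_dir.
            if path.is_file() and path.resolve() != zip_path.resolve():
                archive.write(path, arcname=path.relative_to(code_dir).as_posix())
    return zip_path
```

For example, make_submission_zip("my_code", "submission.zip") would produce an archive whose top level contains model.py directly, which can then be uploaded with the "Upload a Submission" button.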
Please note that for this challenge, "Detailed Results" button on the submission page is not used and provides no information.
In the starting kit, we provide a Docker image that simulates the running environment of our challenge platform. Participants can check the Python version and installed Python packages with the following commands:
For other packages or libraries that are not installed in the Docker image, participants can install them outside of the "train_predict" method, e.g. by calling os.system("pip install xxx") at the beginning of model.py. Please note that the time spent installing libraries does not count against the time budget.
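As an illustration of installing a missing dependency at the top of model.py, here is a hedged sketch. It swaps the os.system call mentioned above for subprocess with sys.executable, which is a substitution on our part and not the organizers' method. The helper name ensure_package and its installer parameter are hypothetical.

```python
import importlib.util
import subprocess
import sys


def ensure_package(module_name, pip_name=None, installer=None):
    """Install a package with pip only if module_name cannot be imported.

    module_name should be a top-level module name. Returns True if an
    install was attempted and False if the module was already available.
    """
    if importlib.util.find_spec(module_name) is not None:
        return False
    if installer is None:
        installer = subprocess.check_call
    # Use the interpreter that runs model.py so pip targets the same environment.
    installer([sys.executable, "-m", "pip", "install", pip_name or module_name])
    return True
```

Calling ensure_package at module level, before the Model class is defined, keeps the installation outside of train_predict, in line with the note above.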
On our platform, for each submission, the allocated computational resources are:
Three graph related libraries are installed:
This page describes the datasets used in AutoGraph challenge. 15 graph datasets are prepared for this competition. 5 public datasets, which can be downloaded, are provided to the participants so that they can develop their solutions offline. Besides that, another 5 feedback datasets are also provided to participants to evaluate the public leaderboard scores of their AutoGraph solutions. Afterwards, their solutions will be evaluated with 5 final datasets without human intervention.
This challenge focuses on the problem of graph representation learning, where node classification is chosen as the task to evaluate the quality of learned representations.
You can also debug your solutions on additional datasets from the Open Graph Benchmark and the SNAP project from Stanford University.
The datasets are collected from real-world business, and are shuffled and split into training and testing parts. Each dataset contains two node files (training and testing), an edge file, a feature file, two label files (training and testing) and a metadata file.
Please note that the data files are read by our program and sent to the participant's program. For details, please see the Evaluation page.
The training node file (train_node_id.txt) and testing node file (test_node_id.txt) list all node indices used for training and testing, respectively. The node indices are of type int.
node_index
0
1
2
3
4
5
6
7
8
The edge file (edge.tsv) contains a set of triplets. A triplet in the form (src_idx, dst_idx, edge_weight) describes a connection from node index src_idx to node dst_idx with the edge weight edge_weight. The type of edge_weight is numerical (float or int)
src_idx dst_idx edge_weight
0 62 1
0 40 1
0 127 1
0 178 1
0 53 1
0 67 1
0 189 1
0 135 1
0 48 1
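To illustrate how the edge triplets can feed the degree-style heuristics mentioned in the introduction, here is a minimal sketch using only the standard library. It assumes the file is tab-separated with a single header row, as the sample suggests. The function names and the SAMPLE_EDGES constant are hypothetical. On the platform, the edges arrive as a pandas.DataFrame rather than raw text.

```python
import csv
import io
from collections import defaultdict


def read_edges(text):
    """Parse edge.tsv content into (src_idx, dst_idx, edge_weight) triplets."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    next(reader, None)  # skip the header row (assumed present)
    edges = []
    for row in reader:
        if row:  # ignore blank lines
            src, dst, weight = row
            edges.append((int(src), int(dst), float(weight)))
    return edges


def out_degree_stats(edges):
    """Return per-node out-degree and summed outgoing edge weight."""
    degree = defaultdict(int)
    strength = defaultdict(float)
    for src, _dst, weight in edges:
        degree[src] += 1
        strength[src] += weight
    return dict(degree), dict(strength)


# First three rows of the sample above, tab-separated.
SAMPLE_EDGES = "src_idx\tdst_idx\tedge_weight\n0\t62\t1\n0\t40\t1\n0\t127\t1\n"
```

Edges are treated as directed here because the format describes a connection from src_idx to dst_idx. Whether a given dataset should be symmetrized is a modeling decision left to the participant.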
The feature file (feature.tsv) is in tsv format. Each line of the file has the format (node_index f0 f1 ...), where node_index is the index of a node and f0, f1, ... are its features. All features are numerical.
node_index f0 f1 f2 f3 f4
0 0.47775876104073356 0.05387578793865644 0.729954200019264 0.6908184238803438 0.9235037015600726
1 0.34224099072954905 0.6693042243297719 0.08736572053032532 0.07358721227831977 0.27398819586899037
2 0.8259856025619777 0.4421366756096389 0.9872258141866499 0.4865590790508849 0.12633483872234397
3 0.11177231902956064 0.40446709473609854 0.2293892960354328 0.4021930454713125 0.40698138834963693
4 0.34427740190016 0.26622372452918375 0.8042497280547812 0.0022605424347530434 0.8903425653304337
5 0.08640169107378592 0.43038539444039425 0.6635778390235518 0.9229371884297638 0.8912709075205572
6 0.6765202023072282 0.9039673560303431 0.986304900152288 0.23661480664770496 0.7140162062880935
7 0.043651531427249424 0.010090830922163785 0.758404203984433 0.05315076246728134 0.8017402643849966
8 0.49802375200717 0.6735698429117265 0.04292694482433346 0.3033723691640159 0.43132281219124635
The training label file (train_label.tsv) and the testing label file (test_label.tsv) are also in tsv format and contain the label information of the training and testing nodes, respectively. Each line in these files has the format (node_index class), where node_index is the index of a node and class is its label.
node_index class
0 1
1 3
2 1
3 1
4 3
5 1
6 1
7 3
8 1
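A quick way to inspect the class balance of a label file is sketched below. It assumes integer labels and whitespace-separated columns with one header row, as in the sample. The function names and the SAMPLE_LABELS constant are hypothetical.

```python
from collections import Counter


def read_labels(text):
    """Parse label file content into a {node_index: class} mapping."""
    labels = {}
    lines = text.strip().splitlines()
    for line in lines[1:]:  # skip the header row (assumed present)
        node_index, label = line.split()
        labels[int(node_index)] = int(label)
    return labels


def class_distribution(labels):
    """Count how many nodes carry each class label."""
    return Counter(labels.values())


# The nine sample rows shown above, tab-separated.
SAMPLE_LABELS = (
    "node_index\tclass\n"
    "0\t1\n1\t3\n2\t1\n3\t1\n4\t3\n5\t1\n6\t1\n7\t3\n8\t1\n"
)
```

On the sample rows this yields six nodes of class 1 and three of class 3, the kind of imbalance worth knowing about before choosing a model.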
The metadata file (config.yml) is in yaml format. It provides meta-information of the datasets, including:
time_budget: 5000
n_class: 7
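For local experiments, the flat key-value pairs shown above can be read with a few lines of Python. This is a deliberately minimal sketch: it only handles flat "key: value" lines, the function name parse_flat_config is hypothetical, and a real YAML library would be the safer choice if the file contains nested entries. On the platform, the metadata is read by the organizers' program and passed to train_predict.

```python
def parse_flat_config(text):
    """Parse flat 'key: value' lines, converting integer values to int."""
    config = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        key, _, value = line.partition(":")
        value = value.strip()
        try:
            config[key.strip()] = int(value)
        except ValueError:
            config[key.strip()] = value  # keep non-integer values as strings
    return config
```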
This challenge has three phases. The participants are provided with 5 public datasets, which can be downloaded, so that they can develop their solutions offline. The code is then uploaded to the platform, and participants receive immediate feedback on the performance of their method on another 5 feedback datasets. After the Feedback Phase terminates, there is a Check Phase, in which participants are allowed to submit their code only once on the final datasets in order to debug. Participants will not be able to read detailed logs, but they can see whether their code reports errors. Last, in the Final Phase, participants' solutions are evaluated on the 5 final datasets. The ranking in the Final Phase counts towards determining the winners.
Code submitted is trained and tested automatically, without any human intervention. Code submitted on Feedback (resp. Final) Phase is run on all 5 feedback (resp. final) datasets in parallel on separate compute workers, each one with its own time budget.
The identities of the datasets used for testing on the platform are concealed. The data are provided in a raw form (no feature extraction) to encourage researchers to use Deep Learning methods performing automatic feature learning, although this is NOT a requirement. All problems are node classification problems. The tasks are constrained by the time budget.
Here is some pseudo-code of the evaluation protocol:
# For each dataset, our evaluation program calls the model constructor:

# load the dataset
dataset = Dataset(args.dataset_dir)
# get information about the dataset
time_budget = dataset.get_metadata().get("time_budget")
n_class = dataset.get_metadata().get("n_class")
schema = dataset.get_metadata().get("schema")
# import and initialize the participant's Model class
umodel = init_usermodel()
# initialize the timer
timer = _init_timer(time_budget)
# train the model and predict the labels of testing data
predictions = _train_predict(umodel, dataset, timer, n_class, schema)
For both Feedback Phase and Final Phase, Accuracy is evaluated on each dataset. The submissions will be ranked by the averaged rank on all datasets of a phase.
Note that if a submission fails on a certain dataset, a default score (-1 in this challenge) will be recorded for that dataset on the leaderboard.
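The scoring rule described above (accuracy per dataset, then ranking by the average rank across datasets) can be sketched as follows. This is an illustration, not the organizers' scoring program: the function names are hypothetical, and the tie handling (tied participants share the better rank) is an assumption, since the page does not specify it.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def average_ranks(scores):
    """Average each participant's rank over all datasets.

    scores maps a participant name to a list of per-dataset scores,
    where higher is better and a failed run is given as -1.
    Rank 1 is best; tied participants share the better rank (assumption).
    """
    names = list(scores)
    n_datasets = len(next(iter(scores.values())))
    totals = {name: 0 for name in names}
    for j in range(n_datasets):
        for name in names:
            better = sum(1 for other in names if scores[other][j] > scores[name][j])
            totals[name] += better + 1
    return {name: totals[name] / n_datasets for name in names}
```

With the default score of -1, a failed run is ranked below every successful run on that dataset, so a single failure can noticeably raise a participant's average rank.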
The participants should implement a class Model with a method train_predict, which is described as follows:
class Model:
    """user model"""

    def __init__(self):
        # init
        pass

    def train_predict(self, data, time_budget, n_class, schema):
        """train and prediction

        This method will be called by the competition platform and is
        constrained by time_budget.

        Parameters:
        -----------
        data: dict, store all input data. keys and values are:
            'fea_table': pandas.DataFrame, features for training and testing dataset,
            'edge_file': pandas.DataFrame, edge information of the graph,
                dtypes of all columns are int
            'train_indices': list of int, indices of all training nodes
            'test_indices': list of int, indices of all testing nodes
            'train_label': pandas.DataFrame, labels of training nodes
            for the details, please check the format of data files.
        n_class: int, the number of classes in this task
        schema: this is deprecated

        Return
        ------
        pred: list (or pandas.Series / 1D numpy.ndarray)
            pred contains predictions for all testing samples, and they
            are in the same order as test_indices
        """
        return pred
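As a concrete but deliberately trivial instance of this interface, the sketch below predicts the most frequent training class for every test node. It is not the provided baseline. The label column name "class" is an assumption taken from the label file header shown in the data description and may differ in the DataFrame that the platform passes in.

```python
from collections import Counter


class Model:
    """Majority-class baseline following the train_predict signature."""

    # Assumption: column name taken from the label file header.
    label_column = "class"

    def __init__(self):
        pass

    def train_predict(self, data, time_budget, n_class, schema):
        # Works with a pandas.DataFrame or any mapping of column -> values.
        labels = list(data["train_label"][self.label_column])
        majority = Counter(labels).most_common(1)[0][0]
        # One prediction per test node, in the order of test_indices.
        return [majority] * len(data["test_indices"])
```

A baseline like this ignores the graph and the features entirely, so it is mainly useful for checking that a submission runs end to end and returns predictions in the expected shape.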
It is the responsibility of the participants to make sure that the "train_predict" method does not exceed the time budget.
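One way to honor the time budget is to track a deadline inside train_predict and stop training before it is reached. The sketch below is an assumption-laden illustration: the names BudgetTimer and train_until_deadline, the safety margin, and the idea of step-wise training are all hypothetical and not part of the challenge API.

```python
import time


class BudgetTimer:
    """Track how much of a time budget (in the budget's own unit) is left."""

    def __init__(self, time_budget, safety_margin=0.1, clock=time.monotonic):
        # Reserve a fraction of the budget for prediction and cleanup.
        self._clock = clock
        self._deadline = clock() + time_budget * (1.0 - safety_margin)

    def remaining(self):
        return max(0.0, self._deadline - self._clock())

    def expired(self):
        return self.remaining() <= 0.0


def train_until_deadline(step, timer, max_steps):
    """Call step() until max_steps is reached or the timer expires.

    Returns the number of steps that were completed.
    """
    done = 0
    while done < max_steps and not timer.expired():
        step()
        done += 1
    return done
```

Inside train_predict, a timer could be created from the time_budget argument at the very start, with each training epoch wrapped as a step, so that predictions are always returned before the platform kills the process.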
No, they can make entries that show on the leaderboard for test purposes and to stimulate participation, but they are excluded from winning prizes.
No, except accepting the TERMS AND CONDITIONS.
No, you can join the challenge until one week before the end of feedback phase. After that, we will require real personal identification (notified by organizers) to avoid duplicate accounts.
You can download "practice datasets" only from the Instructions page. The data on which your code is evaluated cannot be downloaded. It is visible only to your code, on the Codalab platform.
To make a valid challenge entry, click the blue button on the upper right side "Upload a Submission". This will ensure that you submit on all 5 datasets of the challenge simultaneously. You may also make a submission on a single dataset for debug purposes, but it will not count towards the final ranking.
We provide a Starting Kit in Python with step-by-step instructions in "README.md".
1st place: 15000 USD
2nd place: 10000 USD
3rd place: 5000 USD
4th - 10th place: 500 USD each
Yes, participation is by code submission.
No. You just grant to the ORGANIZERS a license to use your code for evaluation purposes during the challenge. You retain all other rights.
Yes, we will provide the fact sheet in due course.
We are running your submissions on Google Cloud workers, each of which will have one NVIDIA Tesla P100 GPU (running CUDA 10 with drivers cuDNN 7.5) and 4 vCPUs, with 30 GB of memory, 200 GB disk.
The PARTICIPANTS will be informed if the computational resources increase. They will NOT decrease.
This is not explicitly forbidden, but it is discouraged. We prefer if all calculations are performed on the server. If you submit a pre-trained model, you will have to disclose it in the fact sheets.
YES. The ranking of participants will be made from a final blind test made by evaluating a SINGLE SUBMISSION made on the final test submission site. The submission will be evaluated on five new test datasets in a completely "blind testing" manner. The final test ranking will determine the winners.
Each dataset has a predefined time budget associated in the meta information.
In principle no more than its time budget. We kill the process if the time budget is exceeded. Submissions are queued and run on a first-come, first-served basis. We are using several identical servers. Contact us if your submission is stuck for more than 24 hours. You can check the execution time on the leaderboard.
3 submissions per day. This may be subject to change, according to the number of participants. Please respect other users. It is forbidden to register under multiple user IDs to gain an advantage and make more submissions. Violators will be DISQUALIFIED FROM THE CONTEST.
Failed submissions will be counted. Please contact us if you think the failure is due to the platform rather than to your code and we will try to resolve the problem promptly. If a submission fails, a default score (-1 in this challenge) will be marked in the leaderboard.
This should be avoided. In the case where a submission exceeds the time budget for a particular task (dataset), the submission handling process (ingestion program in particular) will be killed when time budget is used up and predictions made so far (with their corresponding timestamps) will be used for evaluation. In the other case where a submission exceeds the total compute time per day, all running tasks will be killed by CodaLab and the status will be marked 'Failed' and a default score will be produced. See previous question for more details.
No, sorry, not for this challenge.
Please go to 'Get Started' -> 'Evaluation' -> 'Metrics' section.
The code was tested under Python 3.6.8. We are running Python 3.6.8 on the server and the same libraries are available.
Yes. Any Linux executable can run on the system, provided that it fulfills our Python interface and you bundle all necessary libraries with your submission.
When you submit code to Codalab, your code is executed inside a Docker container. This environment can be exactly reproduced on your local machine by downloading the corresponding docker image. The docker environment of the challenge contains common Machine Learning libraries, TensorFlow, and PyTorch (among other things).
Your last submission is shown automatically on the leaderboard. You cannot choose which submission to select. If you want a submission other than your last one to "count" and be displayed on the leaderboard, you need to re-submit it.
No. If you accidentally register multiple times or have multiple accounts from members of the same team, please notify the ORGANIZERS. Teams or solo PARTICIPANTS with multiple accounts will be disqualified.
We have disabled Codalab team registration. To join as a team, just share one account with your team. The team leader is responsible for making submissions and observing the results.
It is up to you and the team leader to make arrangements. However, you cannot participate in multiple teams.
ALL INFORMATION, SOFTWARE, DOCUMENTATION, AND DATA ARE PROVIDED "AS-IS". UPSUD, CHALEARN, IDF, AND/OR OTHER ORGANIZERS AND SPONSORS DISCLAIM ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR ANY PARTICULAR PURPOSE, AND THE WARRANTY OF NON-INFRINGEMENT OF ANY THIRD PARTY'S INTELLECTUAL PROPERTY RIGHTS. IN NO EVENT SHALL ISABELLE GUYON AND/OR OTHER ORGANIZERS BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF SOFTWARE, DOCUMENTS, MATERIALS, PUBLICATIONS, OR INFORMATION MADE AVAILABLE FOR THE CHALLENGE. In case of dispute or possible exclusion/disqualification from the competition, the PARTICIPANTS agree not to take immediate legal action against the ORGANIZERS or SPONSORS. Decisions can be appealed by submitting a letter to the CHALEARN president, and disputes will be resolved by the CHALEARN board of directors. See contact information.
For questions of general interest, THE PARTICIPANTS should post their questions to the forum.
Other questions should be directed to the organizers.
Start: March 26, 2020, midnight
Description: Please make submissions by clicking the 'Submit' button below. You can then view the results of your algorithm on each dataset in the corresponding tab (Dataset 1, Dataset 2, etc.).
|Dataset 1||None||March 26, 2020, midnight|
|Dataset 2||None||March 26, 2020, midnight|
|Dataset 3||None||March 26, 2020, midnight|
|Dataset 4||None||March 26, 2020, midnight|
|Dataset 5||None||March 26, 2020, midnight|
June 9, 2020, 3:59 p.m.